content-types 0.2.2__tar.gz → 0.3.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- content_types-0.3.0/.cursor/commands/create-release.md +25 -0
- content_types-0.3.0/.cursor/commands/tag-release.md +18 -0
- content_types-0.3.0/PKG-INFO +216 -0
- content_types-0.3.0/README.md +192 -0
- content_types-0.3.0/WARP.md +86 -0
- content_types-0.3.0/change-log.md +125 -0
- {content_types-0.2.2 → content_types-0.3.0}/content_types/__init__.py +224 -13
- content_types-0.3.0/plans/placeholder.txt +0 -0
- {content_types-0.2.2 → content_types-0.3.0}/pyproject.toml +2 -2
- content_types-0.3.0/samples/compare_to_builtin.py +44 -0
- content_types-0.2.2/PKG-INFO +0 -77
- content_types-0.2.2/README.md +0 -53
- {content_types-0.2.2 → content_types-0.3.0}/.gitignore +0 -0
- {content_types-0.2.2 → content_types-0.3.0}/LICENSE +0 -0
- {content_types-0.2.2 → content_types-0.3.0}/content_types/py.typed +0 -0
- {content_types-0.2.2 → content_types-0.3.0}/ruff.toml +0 -0
--- /dev/null
+++ content_types-0.3.0/.cursor/commands/create-release.md
@@ -0,0 +1,25 @@
+---
+description: Create new release from unreleased changes in change-log
+---
+
+# Create release from change log
+
+## Task
+Create a new release by updating the `change-log.md` file. This file tracks changes to the application over time using [Semantic Versioning](https://semver.org/) (M.m.b format).
+
+## Process
+1. Review the `[Unreleased]` section in `change-log.md`
+2. Determine the appropriate release version (version of package)
+3. Move unreleased changes to a new release section
+4. Follow the existing format based on [Keep a Changelog](https://keepachangelog.com/)
+
+## Additional Context Tool
+You can use this command to see git history since the last release:
+
+```bash
+gitwhat --release LAST_VERSION_TAG --no-copy --quiet
+```
+
+Where `LAST_VERSION_TAG` is in the format `v0.5.1` (if the last version was 0.5.1).
+
+**Important**: Treat the contents of `change-log.md` as authoritative. Use the `commit_what_main.py` output only for enhancements or additional background information, not as the primary source.
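Step 2 of the create-release process ("Determine the appropriate release version") relies on the M.m.b scheme. The following is a minimal sketch of that arithmetic, not something shipped in the package; the function name `bump_version` and its `part` argument are hypothetical and exist only for illustration.

```python
def bump_version(version: str, part: str) -> str:
    """Return the next M.m.b version string (hypothetical helper, not part of content-types)."""
    major, minor, patch = (int(piece) for piece in version.split('.'))
    if part == 'major':
        # Breaking change: reset minor and patch.
        return f'{major + 1}.0.0'
    if part == 'minor':
        # New backwards-compatible functionality: reset patch.
        return f'{major}.{minor + 1}.0'
    if part == 'patch':
        # Backwards-compatible fix.
        return f'{major}.{minor}.{patch + 1}'
    raise ValueError(f'unknown version part: {part!r}')


print(bump_version('0.2.3', 'minor'))  # 0.3.0
```

Which part to bump remains a judgment call made from the `[Unreleased]` entries; the sketch only covers the numbering.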
--- /dev/null
+++ content_types-0.3.0/.cursor/commands/tag-release.md
@@ -0,0 +1,18 @@
+---
+description: Tag source with release version from change-log
+---
+
+# Tag source with release version
+
+## Task
+Read the latest release version from `change-log.md` (not the Unreleased section), create a git tag with that version, and push the tag to GitHub (origin).
+
+## Requirements
+1. Find the most recent version number in `change-log.md` (format: M.m.b)
+2. Create a git tag in the format `vVERSION` (e.g., for version 0.5.1, create tag `v0.5.1`)
+3. Push the tag to origin (GitHub)
+
+## Example
+If the latest release in `change-log.md` is `## [0.5.1]`, then:
+- Create tag: `v0.5.1`
+- Push tag: `git push origin v0.5.1`
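The tag-release requirements above (find the newest `## [M.m.b]` heading, skip `[Unreleased]`, prefix with `v`) can be sketched in a few lines. This is an illustrative sketch under the assumption that releases are listed newest first, as they are in the `change-log.md` in this diff; `latest_release_tag` is a hypothetical name, and the sketch stops short of running `git`.

```python
import re

# Matches headings such as "## [0.5.1]"; "## [Unreleased]" has no digits and is skipped.
RELEASE_HEADING = re.compile(r'^## \[(\d+\.\d+\.\d+)\]', re.MULTILINE)


def latest_release_tag(changelog_text: str) -> str:
    """Return the tag name for the first released version heading (hypothetical helper)."""
    match = RELEASE_HEADING.search(changelog_text)
    if match is None:
        raise ValueError('no released version heading found')
    return f'v{match.group(1)}'


sample = """## [Unreleased]

## [0.5.1] - 2025-01-01

## [0.5.0] - 2024-12-01
"""
print(latest_release_tag(sample))  # v0.5.1
```

The resulting name is what would then be passed to `git tag` and `git push origin`.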
--- /dev/null
+++ content_types-0.3.0/PKG-INFO
@@ -0,0 +1,216 @@
+Metadata-Version: 2.4
+Name: content-types
+Version: 0.3.0
+Summary: A library to map file extensions to content types and vice versa.
+Project-URL: Homepage, https://github.com/mikeckennedy/content-types
+Project-URL: Bug Reports, https://github.com/mikeckennedy/content-types/issues
+Project-URL: Source, https://github.com/mikeckennedy/content-types
+Author-email: Michael Kennedy <mikeckennedy@gmail.com>
+License-Expression: MIT
+License-File: LICENSE
+Keywords: content-type,file extensions,mapping,mime
+Classifier: Intended Audience :: Developers
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Operating System :: OS Independent
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
+Classifier: Programming Language :: Python :: 3.14
+Classifier: Topic :: Internet :: WWW/HTTP
+Classifier: Topic :: Software Development :: Libraries :: Python Modules
+Requires-Python: >=3.10
+Description-Content-Type: text/markdown
+
+
+# content-types 🗃️🔎
+
+A comprehensive Python library to map file extensions to MIME types with **360+ supported formats**.
+It also provides a CLI for quick lookups right from your terminal.
+If no known mapping is found, the tool returns `application/octet-stream`.
+
+Unlike other libraries, this one does **not** try to access the file
+or parse the bytes of the file or stream. It just looks at the extension
+which is valuable when you don't have access to the file directly.
+For example, you know the filename but it is stored in s3 and you don't want
+to download it just to fully inspect the file.
+
+## Extensive Format Support
+
+With **360+ file extensions** mapped, content-types covers:
+
+- 🎨 **Images** - Standard formats plus RAW camera files (Canon, Nikon, Sony, Adobe DNG, etc.)
+- 🎵 **Audio** - MP3, FLAC, AAC, MIDI, WMA, ALAC, DSD, and more
+- 🎬 **Video** - MP4, MKV, WebM, FLV, and modern codecs
+- 📦 **Archives** - ZIP, TAR, 7Z, RAR, plus modern formats (bz2, xz, zstd, brotli)
+- 📄 **Documents** - PDF, Office formats (DOCX, XLSX, PPTX), OpenDocument
+- 💻 **Programming** - Python, JavaScript, TypeScript, Rust, Go, Java, C++, Swift, Kotlin, and 25+ languages
+- 🔬 **Data Science** - Parquet, Jupyter notebooks, HDF5, Arrow, Pickle, NumPy, R, Stata, SAS, SPSS
+- ⚙️ **Configuration** - YAML, TOML, JSON, INI, ENV, dotfiles
+- 🐳 **DevOps** - Dockerfiles, Terraform, Kubernetes configs, Nomad
+- 🎨 **Creative Suite** - Adobe (PSD, InDesign, Premiere, After Effects), CAD files (AutoCAD, SketchUp, Blender)
+- 🎮 **Game Development** - Unity, Unreal Engine, PAK files
+- 🔬 **Scientific** - FITS, DICOM, NIfTI, PDB (protein data)
+- ⛓️ **Blockchain** - Solidity, Vyper smart contracts
+- 🗄️ **Databases** - SQLite, Access, MySQL files
+- 📝 **Documentation** - Markdown, AsciiDoc, Org-mode, BibTeX
+
+...and much more!
+
+Why not just use Python's built-in `mimetypes`? Or the excellent `python-magic` package?
+[See below](#more-correct-than-pythons-mimetypes).
+
+## Installation
+
+```bash
+uv pip install content-types
+```
+
+## Usage
+
+```python
+import content_types
+
+# Forward lookup: filename -> MIME type
+the_type = content_types.get_content_type("example.jpg")
+print(the_type) # "image/jpeg"
+
+# Works with any supported extension
+print(content_types.get_content_type("data.parquet")) # "application/vnd.apache.parquet"
+print(content_types.get_content_type("notebook.ipynb")) # "application/x-ipynb+json"
+print(content_types.get_content_type("photo.cr2")) # "image/x-canon-cr2"
+print(content_types.get_content_type("model.blend")) # "application/x-blender"
+print(content_types.get_content_type("contract.sol")) # "text/x-solidity"
+
+# For very common files, you have shortcuts:
+print(f'Content-Type for webp is {content_types.webp}.')
+# Content-Type for webp is image/webp.
+
+# Data science shortcuts
+print(content_types.parquet) # "application/vnd.apache.parquet"
+print(content_types.ipynb) # "application/x-ipynb+json"
+print(content_types.yaml) # "text/yaml"
+print(content_types.toml) # "application/toml"
+
+# Works with Path objects too
+from pathlib import Path
+path = Path("document.pdf")
+print(content_types.get_content_type(path)) # "application/pdf"
+```
+
+## CLI
+
+To use the library as a CLI tool, just install it with **uv** or **pipx**.
+
+```bash
+uv tool install content-types
+```
+
+Now it will be available machine-wide.
+
+```bash
+content-types example.jpg
+# Outputs: image/jpeg
+
+content-types data.parquet
+# Outputs: application/vnd.apache.parquet
+
+content-types notebook.ipynb
+# Outputs: application/x-ipynb+json
+
+content-types photo.cr2
+# Outputs: image/x-canon-cr2
+```
+
+## More correct than Python's `mimetypes`
+
+When I first learned about Python's mimetypes module, I thought it was exactly what I need. However,
+it doesn't have all the MIME types. And, it recommends deprecated, out-of-date answers for very obvious types.
+
+For example, mimetypes has `.xml` as text/xml where it should be `application/xml`
+(see [MDN](https://developer.mozilla.org/en-US/docs/Web/HTTP/MIME_types/Common_types)).
+
+And mimetypes is missing important types such as:
+
+- .m4v -> video/mp4
+- .tgz -> application/gzip
+- .flac -> audio/flac
+- .epub -> application/epub+zip
+- .parquet -> application/vnd.apache.parquet
+- .ipynb -> application/x-ipynb+json
+- .mkv -> video/x-matroska
+- .toml -> application/toml
+- .yaml -> text/yaml
+- .rs -> text/x-rust
+- .go -> text/x-go
+- .tsx -> text/tsx
+- .psd -> image/vnd.adobe.photoshop
+- .dwg -> application/acad
+- ... and 300+ more
+
+With this library, you get **360+ file extensions** properly mapped, compared to Python's `mimetypes`
+which only has around 100 and includes outdated MIME types.
+
+## Popular Format Examples
+
+Here are some commonly used formats by category:
+
+**Data Science & Analytics:**
+- `.parquet` - Apache Parquet columnar storage
+- `.ipynb` - Jupyter Notebooks
+- `.pkl`, `.pickle` - Python pickle files
+- `.npy`, `.npz` - NumPy arrays
+- `.arrow`, `.feather` - Apache Arrow
+- `.hdf5`, `.h5` - HDF5 scientific data
+- `.mat` - MATLAB data files
+- `.dta` - Stata data files
+- `.sav` - SPSS data files
+
+**Modern Programming Languages:**
+- `.rs` - Rust
+- `.go` - Go/Golang
+- `.ts`, `.tsx` - TypeScript/React
+- `.jsx` - React JavaScript
+- `.vue` - Vue.js components
+- `.swift` - Swift
+- `.kt`, `.kts` - Kotlin
+- `.dart` - Dart
+- `.sol` - Solidity (smart contracts)
+
+**Configuration & Infrastructure:**
+- `.yaml`, `.yml` - YAML configs
+- `.toml` - TOML configs
+- `.env` - Environment variables
+- `.dockerfile` - Docker files
+- `.tf`, `.tfvars` - Terraform
+- `.ini`, `.conf`, `.cfg` - Configuration files
+
+**Creative & Design:**
+- `.psd`, `.psb` - Adobe Photoshop
+- `.indd` - Adobe InDesign
+- `.aep` - Adobe After Effects
+- `.dwg`, `.dxf` - AutoCAD
+- `.skp` - SketchUp
+- `.blend` - Blender
+- `.cr2`, `.cr3` - Canon RAW
+- `.nef` - Nikon RAW
+- `.dng` - Adobe DNG RAW
+
+**Modern Media:**
+- `.mkv` - Matroska video
+- `.webp` - WebP images
+- `.avif` - AVIF images
+- `.opus` - Opus audio
+- `.flac` - FLAC audio
+- `.midi`, `.mid` - MIDI
+
+## Works when python-magic package doesn't
+
+Why not the excellent python-magic package? That one works by reading the header bytes of
+binary files which requires access to the file data. The whole goal of this project is
+to avoid accessing or needing the file data. They are for different use-cases.
+
+## Contributing
+
+Contributions are welcome! Check out [the GitHub repo](https://github.com/mikeckennedy/content-types)
+for more details on how to get involved.
--- /dev/null
+++ content_types-0.3.0/README.md
@@ -0,0 +1,192 @@
+
+# content-types 🗃️🔎
+
+A comprehensive Python library to map file extensions to MIME types with **360+ supported formats**.
+It also provides a CLI for quick lookups right from your terminal.
+If no known mapping is found, the tool returns `application/octet-stream`.
+
+Unlike other libraries, this one does **not** try to access the file
+or parse the bytes of the file or stream. It just looks at the extension
+which is valuable when you don't have access to the file directly.
+For example, you know the filename but it is stored in s3 and you don't want
+to download it just to fully inspect the file.
+
+## Extensive Format Support
+
+With **360+ file extensions** mapped, content-types covers:
+
+- 🎨 **Images** - Standard formats plus RAW camera files (Canon, Nikon, Sony, Adobe DNG, etc.)
+- 🎵 **Audio** - MP3, FLAC, AAC, MIDI, WMA, ALAC, DSD, and more
+- 🎬 **Video** - MP4, MKV, WebM, FLV, and modern codecs
+- 📦 **Archives** - ZIP, TAR, 7Z, RAR, plus modern formats (bz2, xz, zstd, brotli)
+- 📄 **Documents** - PDF, Office formats (DOCX, XLSX, PPTX), OpenDocument
+- 💻 **Programming** - Python, JavaScript, TypeScript, Rust, Go, Java, C++, Swift, Kotlin, and 25+ languages
+- 🔬 **Data Science** - Parquet, Jupyter notebooks, HDF5, Arrow, Pickle, NumPy, R, Stata, SAS, SPSS
+- ⚙️ **Configuration** - YAML, TOML, JSON, INI, ENV, dotfiles
+- 🐳 **DevOps** - Dockerfiles, Terraform, Kubernetes configs, Nomad
+- 🎨 **Creative Suite** - Adobe (PSD, InDesign, Premiere, After Effects), CAD files (AutoCAD, SketchUp, Blender)
+- 🎮 **Game Development** - Unity, Unreal Engine, PAK files
+- 🔬 **Scientific** - FITS, DICOM, NIfTI, PDB (protein data)
+- ⛓️ **Blockchain** - Solidity, Vyper smart contracts
+- 🗄️ **Databases** - SQLite, Access, MySQL files
+- 📝 **Documentation** - Markdown, AsciiDoc, Org-mode, BibTeX
+
+...and much more!
+
+Why not just use Python's built-in `mimetypes`? Or the excellent `python-magic` package?
+[See below](#more-correct-than-pythons-mimetypes).
+
+## Installation
+
+```bash
+uv pip install content-types
+```
+
+## Usage
+
+```python
+import content_types
+
+# Forward lookup: filename -> MIME type
+the_type = content_types.get_content_type("example.jpg")
+print(the_type) # "image/jpeg"
+
+# Works with any supported extension
+print(content_types.get_content_type("data.parquet")) # "application/vnd.apache.parquet"
+print(content_types.get_content_type("notebook.ipynb")) # "application/x-ipynb+json"
+print(content_types.get_content_type("photo.cr2")) # "image/x-canon-cr2"
+print(content_types.get_content_type("model.blend")) # "application/x-blender"
+print(content_types.get_content_type("contract.sol")) # "text/x-solidity"
+
+# For very common files, you have shortcuts:
+print(f'Content-Type for webp is {content_types.webp}.')
+# Content-Type for webp is image/webp.
+
+# Data science shortcuts
+print(content_types.parquet) # "application/vnd.apache.parquet"
+print(content_types.ipynb) # "application/x-ipynb+json"
+print(content_types.yaml) # "text/yaml"
+print(content_types.toml) # "application/toml"
+
+# Works with Path objects too
+from pathlib import Path
+path = Path("document.pdf")
+print(content_types.get_content_type(path)) # "application/pdf"
+```
+
+## CLI
+
+To use the library as a CLI tool, just install it with **uv** or **pipx**.
+
+```bash
+uv tool install content-types
+```
+
+Now it will be available machine-wide.
+
+```bash
+content-types example.jpg
+# Outputs: image/jpeg
+
+content-types data.parquet
+# Outputs: application/vnd.apache.parquet
+
+content-types notebook.ipynb
+# Outputs: application/x-ipynb+json
+
+content-types photo.cr2
+# Outputs: image/x-canon-cr2
+```
+
+## More correct than Python's `mimetypes`
+
+When I first learned about Python's mimetypes module, I thought it was exactly what I need. However,
+it doesn't have all the MIME types. And, it recommends deprecated, out-of-date answers for very obvious types.
+
+For example, mimetypes has `.xml` as text/xml where it should be `application/xml`
+(see [MDN](https://developer.mozilla.org/en-US/docs/Web/HTTP/MIME_types/Common_types)).
+
+And mimetypes is missing important types such as:
+
+- .m4v -> video/mp4
+- .tgz -> application/gzip
+- .flac -> audio/flac
+- .epub -> application/epub+zip
+- .parquet -> application/vnd.apache.parquet
+- .ipynb -> application/x-ipynb+json
+- .mkv -> video/x-matroska
+- .toml -> application/toml
+- .yaml -> text/yaml
+- .rs -> text/x-rust
+- .go -> text/x-go
+- .tsx -> text/tsx
+- .psd -> image/vnd.adobe.photoshop
+- .dwg -> application/acad
+- ... and 300+ more
+
+With this library, you get **360+ file extensions** properly mapped, compared to Python's `mimetypes`
+which only has around 100 and includes outdated MIME types.
+
+## Popular Format Examples
+
+Here are some commonly used formats by category:
+
+**Data Science & Analytics:**
+- `.parquet` - Apache Parquet columnar storage
+- `.ipynb` - Jupyter Notebooks
+- `.pkl`, `.pickle` - Python pickle files
+- `.npy`, `.npz` - NumPy arrays
+- `.arrow`, `.feather` - Apache Arrow
+- `.hdf5`, `.h5` - HDF5 scientific data
+- `.mat` - MATLAB data files
+- `.dta` - Stata data files
+- `.sav` - SPSS data files
+
+**Modern Programming Languages:**
+- `.rs` - Rust
+- `.go` - Go/Golang
+- `.ts`, `.tsx` - TypeScript/React
+- `.jsx` - React JavaScript
+- `.vue` - Vue.js components
+- `.swift` - Swift
+- `.kt`, `.kts` - Kotlin
+- `.dart` - Dart
+- `.sol` - Solidity (smart contracts)
+
+**Configuration & Infrastructure:**
+- `.yaml`, `.yml` - YAML configs
+- `.toml` - TOML configs
+- `.env` - Environment variables
+- `.dockerfile` - Docker files
+- `.tf`, `.tfvars` - Terraform
+- `.ini`, `.conf`, `.cfg` - Configuration files
+
+**Creative & Design:**
+- `.psd`, `.psb` - Adobe Photoshop
+- `.indd` - Adobe InDesign
+- `.aep` - Adobe After Effects
+- `.dwg`, `.dxf` - AutoCAD
+- `.skp` - SketchUp
+- `.blend` - Blender
+- `.cr2`, `.cr3` - Canon RAW
+- `.nef` - Nikon RAW
+- `.dng` - Adobe DNG RAW
+
+**Modern Media:**
+- `.mkv` - Matroska video
+- `.webp` - WebP images
+- `.avif` - AVIF images
+- `.opus` - Opus audio
+- `.flac` - FLAC audio
+- `.midi`, `.mid` - MIDI
+
+## Works when python-magic package doesn't
+
+Why not the excellent python-magic package? That one works by reading the header bytes of
+binary files which requires access to the file data. The whole goal of this project is
+to avoid accessing or needing the file data. They are for different use-cases.
+
+## Contributing
+
+Contributions are welcome! Check out [the GitHub repo](https://github.com/mikeckennedy/content-types)
+for more details on how to get involved.
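The README's comparison with the standard library can be checked locally. The sketch below is an illustration only and is not the package's `samples/compare_to_builtin.py`; it uses the standard `mimetypes.guess_type` call, and the `README_CLAIMS` table and `compare_with_builtin` function are hypothetical names holding a few values quoted from the README. What `mimetypes` returns varies by Python version and operating system, so no expected output is shown.

```python
import mimetypes

# A few extension -> type pairs quoted from the README above (illustrative subset).
README_CLAIMS = {
    'feed.xml': 'application/xml',
    'clip.m4v': 'video/mp4',
    'song.flac': 'audio/flac',
    'data.parquet': 'application/vnd.apache.parquet',
}


def compare_with_builtin(claims):
    """Return (filename, claimed type, builtin guess, agrees) rows (hypothetical helper)."""
    rows = []
    for filename, claimed in claims.items():
        # guess_type returns (type, encoding); type is None when the extension is unknown.
        builtin_guess, _encoding = mimetypes.guess_type(filename)
        rows.append((filename, claimed, builtin_guess, builtin_guess == claimed))
    return rows


for filename, claimed, builtin_guess, agrees in compare_with_builtin(README_CLAIMS):
    status = 'same' if agrees else 'differs'
    print(f'{filename}: README says {claimed}, mimetypes says {builtin_guess} ({status})')
```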
--- /dev/null
+++ content_types-0.3.0/WARP.md
@@ -0,0 +1,86 @@
+# WARP.md
+
+This file provides guidance to WARP (warp.dev) when working with code in this repository.
+
+## Development Commands and Workflow
+
+### Building the Package
+```bash
+# Build wheel and source distribution
+python -m build
+```
+
+### Code Quality and Formatting
+```bash
+# Run linting (configured for 120-char line length, single quotes, E/F rules)
+ruff check .
+
+# Format code (enforces single-quote style)
+ruff format .
+```
+
+### Installation and Testing
+```bash
+# Install in editable mode for development
+pip install -e .
+
+# Test the CLI tool
+content-types example.jpg
+content-types .webp
+content-types --help # May not be implemented
+
+# Run comparison with Python's built-in mimetypes
+python samples/compare_to_builtin.py
+```
+
+### Package Management
+```bash
+# Install via uv (recommended in README)
+uv pip install content-types
+
+# Install CLI globally with uv
+uv tool install content-types
+```
+
+## High-level Architecture and Code Structure
+
+### Core Data Structure
+- **`EXTENSION_TO_CONTENT_TYPE`** dictionary in `content_types/__init__.py` contains ~225 file extension mappings
+- Extensions are stored without the dot (e.g., `'jpg': 'image/jpeg'`)
+- Provides more accurate and complete mappings than Python's built-in `mimetypes` module
+
+### Main API Function
+```python
+def get_content_type(filename_or_extension: str | Path, treat_as_binary: bool = True) -> str
+```
+- Accepts both string filenames and `pathlib.Path` objects
+- Extracts extension from filename (handles complex cases like `archive.tar.gz`)
+- Falls back to `application/octet-stream` (binary mode) or `text/plain` (text mode)
+- Handles extensions with or without leading dot
+
+### Convenience Constants
+Pre-defined shortcuts for common types:
+```python
+webp, png, jpg, mp3, json, pdf, zip, xml, csv, md
+```
+
+### CLI Entry Point
+- Defined in `pyproject.toml` as `content-types = "content_types:cli"`
+- Simple usage: `content-types filename` outputs the MIME type
+- Exits with usage message if no arguments provided
+
+### Comparison Script
+- `samples/compare_to_builtin.py` compares this library against Python's `mimetypes`
+- Demonstrates 5 disagreements and 31 types not in built-in module
+- Useful for validating improvements over standard library
+
+## Key Technical Details
+
+- **Python Version**: Requires Python 3.10+
+- **Dependencies**: Zero runtime dependencies (pure Python)
+- **Build System**: Hatchling backend
+- **Code Style**: Single quotes enforced via Ruff formatter
+- **Type Support**: Handles both `str` and `pathlib.Path` inputs
+- **Fallback Strategy**: `application/octet-stream` (binary) vs `text/plain` (text)
+- **Package Structure**: Single module with everything in `__init__.py`
+- **Testing**: No formal test suite detected; relies on comparison script for validation
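The lookup behaviour WARP.md describes (dot-less keys, input with or without a leading dot, `str` or `Path`, binary versus text fallback) can be illustrated with a small sketch. This is not the library's implementation, which is only partly visible in this diff: `SAMPLE_MAP` and `lookup_content_type` are hypothetical names, the map is a three-entry subset of values quoted in the diff, lowercasing the input is an assumption, and only the final suffix is considered, so multi-part names such as `archive.tar.gz` are outside its scope.

```python
from pathlib import Path

# Tiny subset of mappings quoted in this diff; keys carry no leading dot.
SAMPLE_MAP = {
    'jpg': 'image/jpeg',
    'webp': 'image/webp',
    'csv': 'text/csv',
}


def lookup_content_type(filename_or_extension, treat_as_binary=True):
    """Map a filename, Path, or bare extension to a content type (illustrative sketch)."""
    # Lowercasing is an assumption of this sketch, not documented library behaviour.
    name = Path(str(filename_or_extension).strip().lower()).name
    # 'photo.jpg' -> 'jpg', '.webp' -> 'webp', 'webp' -> 'webp'
    extension = name.rsplit('.', 1)[1] if '.' in name else name
    fallback = 'application/octet-stream' if treat_as_binary else 'text/plain'
    return SAMPLE_MAP.get(extension, fallback)


print(lookup_content_type('example.jpg'))                         # image/jpeg
print(lookup_content_type('.webp'))                               # image/webp
print(lookup_content_type(Path('reports/2024.CSV')))              # text/csv
print(lookup_content_type('notes.unknown', treat_as_binary=False))  # text/plain
```

A plain dictionary lookup with a fallback never touches the file itself, which is the property the README emphasizes.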
--- /dev/null
+++ content_types-0.3.0/change-log.md
@@ -0,0 +1,125 @@
+# Change Log
+
+All notable changes to this project will be documented in this file.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## [Unreleased]
+
+### Added
+-
+
+### Changed
+-
+
+### Deprecated
+-
+
+### Removed
+-
+
+### Fixed
+-
+
+### Security
+-
+
+---
+
+## [0.3.0] - 2025-10-01
+
+### Added
+- 137 new file extensions across 17 categories, expanding format recognition capabilities
+- Comprehensive support for data-science MIME types (e.g., `application/vnd.pandas`, `application/x-ipynb+json`)
+- Project now supports 360+ file formats
+- Warp Project Summary / Index document for contributors and users
+- Files: `content_types/__init__.py`, `WARP.md`
+
+### Changed
+- Implemented alphabetical sorting of output listings for better navigation
+- Enhanced README to highlight 360+ supported file formats
+- Improved docstrings throughout codebase for clarity
+- Files: `content_types/__init__.py`, `README.md`
+
+### Fixed
+- Corrected CLI help text instructions for clearer usage guidance
+- Adjusted code indentation for better visual consistency
+- Files: `content_types/__init__.py`
+
+---
+
+## [0.2.3] - 2025-02-01
+
+### Changed
+- Changed `.js` back to `text/javascript`
+- Added a few new content types
+- Files: `content_types/__init__.py`
+
+### Added
+- Added comparison to builtin mimetypes
+- Files: `samples/compare_to_builtin.py`
+
+---
+
+## [0.2.2] - 2025-01-31
+
+### Added
+- Added `py.typed` file to suppress mypy typing warnings (Thanks @sanders41)
+- Files: `content_types/py.typed`
+
+### Changed
+- Now available on PyPI
+
+---
+
+## [0.2.1] - 2025-01-31
+
+### Added
+- Many more file extensions as known types
+- Files: `content_types/__init__.py`
+
+---
+
+## [0.2.0] - 2025-01-31
+
+### Added
+- Initial public release
+- Files: `content_types/__init__.py`, `pyproject.toml`, `README.md`
+
+---
+
+## Template for Future Entries
+
+<!--
+## [X.Y.Z] - YYYY-MM-DD
+
+### Added
+- New features or capabilities
+- Files: `path/to/new/file.ext`, `another/file.ext`
+
+### Changed
+- Modifications to existing functionality
+- Files: `path/to/modified/file.ext` (summary if many files)
+
+### Deprecated
+- Features that will be removed in future versions
+- Files affected: `path/to/deprecated/file.ext`
+
+### Removed
+- Features or files that were deleted
+- Files: `path/to/removed/file.ext`
+
+### Fixed
+- Bug fixes and corrections
+- Files: `path/to/fixed/file.ext`
+
+### Security
+- Security patches or vulnerability fixes
+- Files: `path/to/security/file.ext`
+
+### Notes
+- Additional context or important information
+- Major dependencies updated
+- Breaking changes explanation
+-->
--- content_types-0.2.2/content_types/__init__.py
+++ content_types-0.3.0/content_types/__init__.py
@@ -2,7 +2,7 @@ import sys
 from pathlib import Path
 from typing import Dict
 
-__VERSION__ = '0.
+__VERSION__ = '0.3.0'
 
 # This dictionary maps file extensions (no dot) to the most specific content type.
 
@@ -16,9 +16,9 @@ EXTENSION_TO_CONTENT_TYPE: Dict[str, str] = {
 'csv': 'text/csv',
 'tsv': 'text/tab-separated-values',
 # JavaScript
-'js': '
+'js': 'text/javascript',
 # MJS for ES modules
-'mjs': '
+'mjs': 'text/javascript',
 # JSON
 'json': 'application/json',
 'map': 'application/json',
@@ -33,7 +33,7 @@ EXTENSION_TO_CONTENT_TYPE: Dict[str, str] = {
 'webp': 'image/webp',
 'avif': 'image/avif',
 # Some new ones:
-'ico': 'image/
+'ico': 'image/vnd.microsoft.icon',
 'svg': 'image/svg+xml',
 'tif': 'image/tiff',
 'tiff': 'image/tiff',
@@ -50,6 +50,20 @@ EXTENSION_TO_CONTENT_TYPE: Dict[str, str] = {
 'xbm': 'image/x-xbitmap',
 'xpm': 'image/x-xpixmap',
 'xwd': 'image/x-xwindowdump',
+# RAW Image Formats (Photography)
+'cr2': 'image/x-canon-cr2',
+'cr3': 'image/x-canon-cr3',
+'nef': 'image/x-nikon-nef',
+'nrw': 'image/x-nikon-nrw',
+'arw': 'image/x-sony-arw',
+'srf': 'image/x-sony-srf',
+'sr2': 'image/x-sony-sr2',
+'dng': 'image/x-adobe-dng',
+'orf': 'image/x-olympus-orf',
+'rw2': 'image/x-panasonic-rw2',
+'pef': 'image/x-pentax-pef',
+'raf': 'image/x-fuji-raf',
+'raw': 'image/x-raw',
 # Audio
 'mp3': 'audio/mpeg',
 'ogg': 'audio/ogg',
@@ -58,6 +72,10 @@ EXTENSION_TO_CONTENT_TYPE: Dict[str, str] = {
 'flac': 'audio/flac',
 'm4a': 'audio/mp4',
 'weba': 'audio/webm',
+'ass': 'audio/aac',
+'adts': 'audio/aac',
+'rst': 'text/x-rst',
+'loas': 'audio/aac',
 # New ones:
 'mp2': 'audio/mpeg', # new
 'opus': 'audio/opus', # new
@@ -67,6 +85,14 @@ EXTENSION_TO_CONTENT_TYPE: Dict[str, str] = {
 'au': 'audio/basic',
 'snd': 'audio/basic',
 'ra': 'audio/x-pn-realaudio',
+# Modern Audio Formats
+'midi': 'audio/midi',
+'mid': 'audio/midi',
+'ape': 'audio/x-ape',
+'wma': 'audio/x-ms-wma',
+'alac': 'audio/x-alac',
+'dsd': 'audio/dsd',
+'dsf': 'audio/x-dsf',
 # Video
 'mp4': 'video/mp4',
 'm4v': 'video/mp4',
@@ -83,11 +109,18 @@ EXTENSION_TO_CONTENT_TYPE: Dict[str, str] = {
     'mpe': 'video/mpeg',
     'qt': 'video/quicktime',
     'movie': 'video/x-sgi-movie',
+    # Modern Video Formats
+    'mkv': 'video/x-matroska',
+    'flv': 'video/x-flv',
+    'm2ts': 'video/mp2t',
+    'mts': 'video/mp2t',
+    'vob': 'video/mpeg',
+    'f4v': 'video/x-f4v',
     # 3GP family (prefer official video/*):
-    '3gp': '
-    '3gpp': '
-    '3g2': '
-    '3gpp2': '
+    '3gp': 'audio/3gpp',
+    '3gpp': 'audio/3gpp',
+    '3g2': 'audio/3gpp2',
+    '3gpp2': 'audio/3gpp2',
     # Archives / Packages
     'pdf': 'application/pdf',
     'zip': 'application/zip',
@@ -96,6 +129,23 @@ EXTENSION_TO_CONTENT_TYPE: Dict[str, str] = {
     'tar': 'application/x-tar',
     '7z': 'application/x-7z-compressed',
     'rar': 'application/vnd.rar',
+    # Modern Compression Formats
+    'bz2': 'application/x-bzip2',
+    'tbz': 'application/x-bzip2',
+    'tbz2': 'application/x-bzip2',
+    'xz': 'application/x-xz',
+    'txz': 'application/x-xz',
+    'lz': 'application/x-lzip',
+    'lzma': 'application/x-lzma',
+    'zst': 'application/zstd',
+    'zstd': 'application/zstd',
+    'br': 'application/x-br',
+    # Disk Images
+    'iso': 'application/x-iso9660-image',
+    'dmg': 'application/x-apple-diskimage',
+    'img': 'application/x-raw-disk-image',
+    'cab': 'application/vnd.ms-cab-compressed',
+    'msi': 'application/x-msi',
     # Additional
     'bin': 'application/octet-stream', # new explicit
     'a': 'application/octet-stream',
@@ -189,11 +239,41 @@ EXTENSION_TO_CONTENT_TYPE: Dict[str, str] = {
     'php': 'application/x-httpd-php',
     # Code files
     'py': 'text/x-python', # new (rather than text/plain)
-    'c': 'text/plain', # some prefer text/x-c; we
+    'c': 'text/plain', # some prefer text/x-c; we'll keep text/plain
     'h': 'text/plain',
     'ksh': 'text/plain',
     'pl': 'text/plain',
     'bat': 'text/plain',
+    # Modern Programming Languages
+    'rs': 'text/x-rust',
+    'go': 'text/x-go',
+    'swift': 'text/x-swift',
+    'kt': 'text/x-kotlin',
+    'kts': 'text/x-kotlin',
+    'java': 'text/x-java-source',
+    'scala': 'text/x-scala',
+    'rb': 'text/x-ruby',
+    'ts': 'text/typescript',
+    'tsx': 'text/tsx',
+    'jsx': 'text/jsx',
+    'vue': 'text/x-vue',
+    'dart': 'text/x-dart',
+    'lua': 'text/x-lua',
+    'r': 'text/x-r',
+    'jl': 'text/x-julia',
+    'f90': 'text/x-fortran',
+    'f95': 'text/x-fortran',
+    'f03': 'text/x-fortran',
+    'm': 'text/x-objcsrc', # Objective-C (also MATLAB, but prioritizing Objective-C)
+    'cs': 'text/x-csharp',
+    'cpp': 'text/x-c++src',
+    'cxx': 'text/x-c++src',
+    'cc': 'text/x-c++src',
+    'hpp': 'text/x-c++hdr',
+    'hxx': 'text/x-c++hdr',
+    'hh': 'text/x-c++hdr',
+    'asm': 'text/x-asm',
+    's': 'text/x-asm',
     # Packages etc.
     'apk': 'application/vnd.android.package-archive',
     'deb': 'application/x-debian-package',
@@ -216,6 +296,128 @@ EXTENSION_TO_CONTENT_TYPE: Dict[str, str] = {
     'sgm': 'text/x-sgml',
     'sgml': 'text/x-sgml',
     'vcf': 'text/x-vcard',
+    # Books
+    'epub': 'application/epub+zip',
+    # Configuration & Infrastructure Files
+    'ini': 'text/plain',
+    'conf': 'text/plain',
+    'cfg': 'text/plain',
+    'config': 'text/plain',
+    'properties': 'text/plain',
+    'env': 'text/plain',
+    'editorconfig': 'text/plain',
+    'gitignore': 'text/plain',
+    'gitattributes': 'text/plain',
+    'dockerignore': 'text/plain',
+    'npmrc': 'text/plain',
+    'yarnrc': 'text/plain',
+    'babelrc': 'application/json',
+    'eslintrc': 'application/json',
+    'prettierrc': 'application/json',
+    # Data Science / Scientific Data Formats
+    'parquet': 'application/vnd.apache.parquet',
+    'ipynb': 'application/x-ipynb+json',
+    'pkl': 'application/octet-stream', # Python pickle
+    'pickle': 'application/octet-stream', # Python pickle
+    'npy': 'application/octet-stream', # NumPy array
+    'npz': 'application/zip', # NumPy compressed arrays
+    'arrow': 'application/vnd.apache.arrow.file',
+    'feather': 'application/vnd.apache.arrow.file', # Apache Arrow IPC format
+    'hdf5': 'application/x-hdf5',
+    'yaml': 'text/yaml',
+    'yml': 'text/yaml',
+    'toml': 'application/toml',
+    'proto': 'text/plain', # Protocol Buffers definition
+    'pb': 'application/octet-stream', # Protocol Buffers binary
+    'avro': 'application/avro',
+    'rda': 'application/octet-stream', # R data
+    'rdata': 'application/octet-stream', # R data
+    'rds': 'application/octet-stream', # R serialized data
+    'dta': 'application/x-stata-dta', # Stata data
+    'sas7bdat': 'application/x-sas-data', # SAS data
+    'sav': 'application/x-spss-sav', # SPSS data
+    'mat': 'application/x-matlab-data', # MATLAB data
+    'sqlite': 'application/vnd.sqlite3', # SQLite database
+    'sqlite3': 'application/vnd.sqlite3',
+    'db': 'application/vnd.sqlite3', # Generic database file
+    'parq': 'application/vnd.apache.parquet', # Alternative parquet extension
+    # Container & DevOps Formats
+    'dockerfile': 'text/plain',
+    'tf': 'text/plain',
+    'tfvars': 'text/plain',
+    'nomad': 'text/plain',
+    'hcl': 'text/plain',
+    'kubeconfig': 'text/yaml',
+    # Build & Package Management
+    'gradle': 'text/plain',
+    'nuspec': 'application/xml',
+    'gemspec': 'text/x-ruby',
+    'podspec': 'text/x-ruby',
+    'whl': 'application/zip',
+    'egg': 'application/zip',
+    # Documentation Formats
+    'adoc': 'text/asciidoc',
+    'asciidoc': 'text/asciidoc',
+    'org': 'text/org',
+    'bib': 'text/x-bibtex',
+    'wiki': 'text/plain',
+    # Blockchain & Crypto
+    'sol': 'text/x-solidity',
+    'vy': 'text/x-vyper',
+    # Adobe Creative Suite
+    'psd': 'image/vnd.adobe.photoshop',
+    'psb': 'image/vnd.adobe.photoshop',
+    'indd': 'application/x-indesign',
+    'idml': 'application/x-indesign',
+    'prproj': 'application/x-premiere',
+    'aep': 'application/x-aftereffects',
+    'xd': 'application/x-xd',
+    # CAD & Design Files
+    'dwg': 'application/acad',
+    'dxf': 'application/dxf',
+    'skp': 'application/vnd.sketchup.skp',
+    'blend': 'application/x-blender',
+    'fbx': 'application/octet-stream',
+    'step': 'application/step',
+    'stp': 'application/step',
+    'iges': 'application/iges',
+    'igs': 'application/iges',
+    '3ds': 'application/x-3ds',
+    'max': 'application/x-3dsmax',
+    'c4d': 'application/x-cinema4d',
+    # Database & Data Warehouse
+    'accdb': 'application/msaccess',
+    'mdb': 'application/msaccess',
+    'odb': 'application/vnd.oasis.opendocument.database',
+    'frm': 'application/octet-stream',
+    'myd': 'application/octet-stream',
+    'myi': 'application/octet-stream',
+    'ibd': 'application/octet-stream',
+    # Game Development
+    'unity': 'text/plain',
+    'unitypackage': 'application/gzip',
+    'uasset': 'application/octet-stream',
+    'pak': 'application/octet-stream',
+    'bsp': 'application/octet-stream',
+    # Logs & System Files
+    'log': 'text/plain',
+    'out': 'text/plain',
+    'tmp': 'application/octet-stream',
+    'bak': 'application/octet-stream',
+    'backup': 'application/octet-stream',
+    'cache': 'application/octet-stream',
+    'pid': 'text/plain',
+    'lock': 'text/plain',
+    # Scientific/Academic Formats
+    'fits': 'application/fits',
+    'fit': 'application/fits',
+    'nii': 'application/x-nifti',
+    'dcm': 'application/dicom',
+    'pdb': 'chemical/x-pdb',
+    # Subtitle & Caption Formats
+    'ssa': 'text/x-ssa',
+    'sub': 'text/x-microdvd',
+    'idx': 'application/octet-stream',
 }
 
 
@@ -224,7 +426,8 @@ def get_content_type(filename_or_extension: str | Path, treat_as_binary: bool =
     Given a filename (or just an extension), return the most specific,
     commonly accepted MIME type based on extension.
 
-    Falls back to 'application/octet-stream' if
+    Falls back to 'application/octet-stream' if `treat_as_binary` is True (default) and 'text/plain' if it is
+    False when the extension is not known.
 
     Example:
         >>> get_content_type("picture.jpg")
@@ -236,7 +439,7 @@ def get_content_type(filename_or_extension: str | Path, treat_as_binary: bool =
         >>> get_content_type("unknown.xyz")
         'application/octet-stream'
         >>> get_content_type("unknown.xyz", treat_as_binary=False)
-        '
+        'text/plain'
     """
 
     if filename_or_extension is None:
@@ -270,13 +473,21 @@ zip: str = get_content_type('.zip') # noqa == it's fine to overwrite zip() in t
 xml: str = get_content_type('.xml')
 csv: str = get_content_type('.csv')
 md: str = get_content_type('.md')
+# Data Science
+parquet: str = get_content_type('.parquet')
+ipynb: str = get_content_type('.ipynb')
+pkl: str = get_content_type('.pkl')
+yaml: str = get_content_type('.yaml')
+toml: str = get_content_type('.toml')
+sqlite: str = get_content_type('.sqlite')
 
 
 def cli():
     """
     A simple CLI to look up the MIME type for a given filename or extension.
-
-
+    Install via uv tool install content-types
+    Usage example :
+    content-types my_file.jpg
     """
     if len(sys.argv) < 2:
         print('Usage: contenttypes [FILENAME_OR_EXTENSION]\nExample: contenttypes .jpg')
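The updated docstring describes two fallbacks for unknown extensions. The sketch below illustrates that behaviour under stated assumptions: `lookup_content_type` and `SAMPLE_TYPES` are hypothetical names, the table is a four-entry excerpt of values that appear in this diff, and the extension parsing is a guess rather than the package's actual implementation.

```python
from pathlib import Path

# Small excerpt of entries that appear in the diff; the real table is far larger.
SAMPLE_TYPES = {
    'parquet': 'application/vnd.apache.parquet',
    'yml': 'text/yaml',
    'toml': 'application/toml',
    'mkv': 'video/x-matroska',
}


def lookup_content_type(filename_or_extension, treat_as_binary=True):
    """Hypothetical stand-in for get_content_type, for illustration only."""
    name = str(filename_or_extension).lower()
    # Text after the last dot; a bare 'toml' (no dot) is used as the extension itself.
    extension = name.rsplit('.', 1)[-1]
    # Unknown extensions fall back to a binary or a text type, as the docstring states.
    fallback = 'application/octet-stream' if treat_as_binary else 'text/plain'
    return SAMPLE_TYPES.get(extension, fallback)


print(lookup_content_type('events.parquet'))
print(lookup_content_type(Path('clip.MKV')))
print(lookup_content_type('unknown.xyz'))
print(lookup_content_type('unknown.xyz', treat_as_binary=False))
```

Passing `treat_as_binary=False` may suit callers that serve mostly text assets and would rather display an unknown file than trigger a download.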
content_types-0.3.0/plans/placeholder.txt
File without changes
|
|
{content_types-0.2.2 → content_types-0.3.0}/pyproject.toml
@@ -4,14 +4,14 @@ build-backend = "hatchling.build"
 
 [project]
 name = "content-types"
-version = "0.
+version = "0.3.0"
 description = "A library to map file extensions to content types and vice versa."
 readme = "README.md"
 license = "MIT"
 authors = [
     { name = "Michael Kennedy", email = "mikeckennedy@gmail.com" }
 ]
-requires-python = ">=3.10
+requires-python = ">=3.10"
 keywords = ["mime", "content-type", "mapping", "file extensions"]
 homepage = "https://github.com/mikeckennedy/content-types"
 
content_types-0.3.0/samples/compare_to_builtin.py
@@ -0,0 +1,44 @@
+import mimetypes
+
+import content_types
+
+
+def main():
+    print('Compare types in mimetypes vs content-types.')
+    in_mime_only = set()
+    differ = set()
+    for k, v in mimetypes.types_map.items():
+        cv_v = content_types.EXTENSION_TO_CONTENT_TYPE.get(k.lower().strip('.'))
+        if not cv_v:
+            in_mime_only.add((k, v))
+            continue
+
+        if cv_v != v:
+            differ.add(((k, v), (k, cv_v)))
+            continue
+
+    only_ct = set()
+    for k, v in content_types.EXTENSION_TO_CONTENT_TYPE.items():
+        mv = mimetypes.types_map.get('.' + k)
+        if not mv:
+            only_ct.add((k, v))
+            continue
+
+    print(f'There are {len(differ):,} types where mimetypes and content-types disagree')
+    for (mk, mv), (ct_k, ct_v) in sorted(differ):
+        print(f'mimetypes: {mk} {mv}, content-types: {ct_k} {ct_v}')
+    print()
+
+    print(f'There are {len(in_mime_only):,} types in mimetypes that are not in content-types')
+    for k, v in sorted(in_mime_only):
+        print(f'{k.ljust(5)}: {v}')
+    print()
+
+    print(f'There are {len(only_ct):,} types in content-types that are not in mimetypes')
+    for k, v in sorted(only_ct):
+        print(f'.{k.ljust(5)} -> {v}')
+    print()
+
+
+if __name__ == '__main__':
+    main()
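The comparison in the sample could also be expressed as a pure function that takes both mappings as arguments, which makes it testable without depending on the interpreter's `mimetypes` table. This is only a sketch: `compare_mappings` is a hypothetical helper that is not part of the package, and the two small dictionaries are hand-made inputs for illustration, not real output of either library.

```python
def compare_mappings(builtin, custom):
    """Compare a '.ext' -> type mapping with an 'ext' -> type mapping.

    Returns three dicts keyed by bare extension:
    differing entries, entries only in builtin, entries only in custom.
    """
    differ = {}
    builtin_only = {}
    for dotted, mime in builtin.items():
        ext = dotted.lower().lstrip('.')
        other = custom.get(ext)
        if other is None:
            builtin_only[ext] = mime
        elif other != mime:
            differ[ext] = (mime, other)

    builtin_exts = {key.lower().lstrip('.') for key in builtin}
    custom_only = {ext: mime for ext, mime in custom.items() if ext not in builtin_exts}
    return differ, builtin_only, custom_only


# Hand-made inputs for illustration only.
builtin_sample = {
    '.js': 'application/javascript',
    '.csv': 'text/csv',
    '.xul': 'text/xul',
}
custom_sample = {
    'js': 'text/javascript',
    'csv': 'text/csv',
    'parquet': 'application/vnd.apache.parquet',
}

differ, builtin_only, custom_only = compare_mappings(builtin_sample, custom_sample)
print(differ)
print(builtin_only)
print(custom_only)
```

In real use, `mimetypes.types_map` and `content_types.EXTENSION_TO_CONTENT_TYPE` could be passed as the two arguments; the results then depend on the Python version and the installed package version.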
content_types-0.2.2/PKG-INFO
DELETED
|
@@ -1,77 +0,0 @@
-Metadata-Version: 2.4
-Name: content-types
-Version: 0.2.2
-Summary: A library to map file extensions to content types and vice versa.
-Project-URL: Homepage, https://github.com/mikeckennedy/content-types
-Project-URL: Bug Reports, https://github.com/mikeckennedy/content-types/issues
-Project-URL: Source, https://github.com/mikeckennedy/content-types
-Author-email: Michael Kennedy <mikeckennedy@gmail.com>
-License-Expression: MIT
-License-File: LICENSE
-Keywords: content-type,file extensions,mapping,mime
-Classifier: Intended Audience :: Developers
-Classifier: License :: OSI Approved :: MIT License
-Classifier: Operating System :: OS Independent
-Classifier: Programming Language :: Python :: 3.10
-Classifier: Programming Language :: Python :: 3.11
-Classifier: Programming Language :: Python :: 3.12
-Classifier: Programming Language :: Python :: 3.13
-Classifier: Programming Language :: Python :: 3.14
-Classifier: Topic :: Internet :: WWW/HTTP
-Classifier: Topic :: Software Development :: Libraries :: Python Modules
-Requires-Python: <=3.14,>=3.10
-Description-Content-Type: text/markdown
-
-
-# content-types 🗃️🔎
-
-A Python library to map file extensions to MIME types.
-It also provides a CLI for quick lookups right from your terminal.
-If no known mapping is found, the tool returns `application/octet-stream`.
-
-Unlike other libraries, this one does **not** try to access the file
-or parse the bytes of the file or stream. It just looks at the extension
-which is valuable when you don't have access to the file directly.
-For example, you know the filename but it is stored in s3 and you don't want
-to download it just to fully inspect the file.
-
-## Installation
-
-```bash
-uv pip install content-types
-```
-
-## Usage
-
-```python
-import content_types
-
-# Forward lookup: filename -> MIME type
-the_type = content_types.get_content_type("example.jpg")
-print(the_type) # "image/jpeg"
-
-# For very common files, you have shortcuts:
-print(f'Content-Type for webp is {content_types.webp}.')
-# Content-Type for webp is image/webp.
-```
-
-## CLI
-
-To use the library as a CLI tool, just install it with **uv** or **pipx**.
-
-```bash
-uv tool install content-types
-```
-
-Now it will be available machine-wide.
-
-```bash
-content-types example.jpg
-
-# Outputs image/jpeg
-```
-
-## Contributing
-
-Contributions are welcome! Check out [the GitHub repo](https://github.com/mikeckennedy/content-types)
-for more details on how to get involved.
content_types-0.2.2/README.md
DELETED
|
@@ -1,53 +0,0 @@
-
-# content-types 🗃️🔎
-
-A Python library to map file extensions to MIME types.
-It also provides a CLI for quick lookups right from your terminal.
-If no known mapping is found, the tool returns `application/octet-stream`.
-
-Unlike other libraries, this one does **not** try to access the file
-or parse the bytes of the file or stream. It just looks at the extension
-which is valuable when you don't have access to the file directly.
-For example, you know the filename but it is stored in s3 and you don't want
-to download it just to fully inspect the file.
-
-## Installation
-
-```bash
-uv pip install content-types
-```
-
-## Usage
-
-```python
-import content_types
-
-# Forward lookup: filename -> MIME type
-the_type = content_types.get_content_type("example.jpg")
-print(the_type) # "image/jpeg"
-
-# For very common files, you have shortcuts:
-print(f'Content-Type for webp is {content_types.webp}.')
-# Content-Type for webp is image/webp.
-```
-
-## CLI
-
-To use the library as a CLI tool, just install it with **uv** or **pipx**.
-
-```bash
-uv tool install content-types
-```
-
-Now it will be available machine-wide.
-
-```bash
-content-types example.jpg
-
-# Outputs image/jpeg
-```
-
-## Contributing
-
-Contributions are welcome! Check out [the GitHub repo](https://github.com/mikeckennedy/content-types)
-for more details on how to get involved.
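{content_types-0.2.2 → content_types-0.3.0}/.gitignore
File without changes
{content_types-0.2.2 → content_types-0.3.0}/LICENSE
File without changes
{content_types-0.2.2 → content_types-0.3.0}/content_types/py.typed
File without changes
{content_types-0.2.2 → content_types-0.3.0}/ruff.toml
File without changes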