iparq 0.1.5__tar.gz → 0.2.0__tar.gz
- iparq-0.2.0/.github/copilot-instructions.md +12 -0
- {iparq-0.1.5 → iparq-0.2.0}/.gitignore +1 -0
- iparq-0.2.0/CONTRIBUTING.md +32 -0
- iparq-0.1.5/README.md → iparq-0.2.0/PKG-INFO +91 -1
- iparq-0.1.5/PKG-INFO → iparq-0.2.0/README.md +73 -18
- {iparq-0.1.5 → iparq-0.2.0}/pyproject.toml +6 -5
- iparq-0.2.0/src/iparq/py.typed +1 -0
- iparq-0.2.0/src/iparq/source.py +259 -0
- iparq-0.2.0/tests/conftest.py +6 -0
- iparq-0.2.0/tests/test_cli.py +35 -0
- {iparq-0.1.5 → iparq-0.2.0}/uv.lock +104 -102
- iparq-0.1.5/src/iparq/source.py +0 -142
- iparq-0.1.5/tests/test_cli.py +0 -2
- {iparq-0.1.5 → iparq-0.2.0}/.github/dependabot.yml +0 -0
- {iparq-0.1.5 → iparq-0.2.0}/.github/workflows/merge.yml +0 -0
- {iparq-0.1.5 → iparq-0.2.0}/.github/workflows/python-package.yml +0 -0
- {iparq-0.1.5 → iparq-0.2.0}/.github/workflows/python-publish.yml +0 -0
- {iparq-0.1.5 → iparq-0.2.0}/.python-version +0 -0
- {iparq-0.1.5 → iparq-0.2.0}/.vscode/launch.json +0 -0
- {iparq-0.1.5 → iparq-0.2.0}/.vscode/settings.json +0 -0
- {iparq-0.1.5 → iparq-0.2.0}/LICENSE +0 -0
- {iparq-0.1.5 → iparq-0.2.0}/dummy.parquet +0 -0
- {iparq-0.1.5 → iparq-0.2.0}/media/iparq.png +0 -0
- {iparq-0.1.5 → iparq-0.2.0}/src/iparq/__init__.py +0 -0
|
@@ -0,0 +1,32 @@
+# Contributing to iparq
+
+Thank you for considering contributing to iparq! We're excited to collaborate with you. Here are some guidelines to help you get started:
+
+## How to Contribute
+
+1. **Fork the repository**: Click the "Fork" button at the top right of this page to create a copy of the repository.
+2. **Clone your fork**: Use `git clone <your-fork-url>` to clone your forked repository to your local machine.
+3. **Create a branch**: Use `git checkout -b <branch-name>` to create a new branch for your changes.
+4. **Make your changes**: Make the necessary changes in your local repository.
+5. **Commit your changes**: Use `git commit -m "Description of changes"` to commit your changes.
+6. **Push your changes**: Use `git push origin <branch-name>` to push your changes to your forked repository.
+7. **Create a pull request**: Go to the original repository and create a pull request from your forked repository.
+
+## Guidelines
+
+- **Code of Conduct**: Please adhere to our [Code of Conduct](CODE_OF_CONDUCT.md) to ensure a welcoming and friendly environment.
+- **Documentation**: Ensure your code changes are well-documented. Update any relevant documentation in the `docs` folder.
+- **Tests**: Include tests for your changes to ensure functionality and avoid regressions.
+- **Commit Messages**: Write clear and concise commit messages. Follow the format: `type(scope): message`.
+
+## Reporting Issues
+
+If you encounter any issues or bugs, please open an issue in the repository. Provide as much detail as possible, including steps to reproduce the issue and any relevant logs or screenshots.
+
+## License
+
+By contributing to this project, you agree that your contributions will be licensed under the [MIT License](LICENSE).
+
+Thank you for your contributions and support!
+
+Happy coding!
|
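The contributing guidelines in the hunk above ask for commit messages in the form `type(scope): message`. As a rough illustration only, that format could be checked with a small regular expression. The function name `is_valid_commit_message` and the exact characters allowed in the type and scope are assumptions; the package does not ship such a check.

```python
import re

# Hypothetical pattern for the `type(scope): message` format named in the
# guidelines; the allowed characters for type and scope are assumptions.
COMMIT_RE = re.compile(
    r"^(?P<type>[a-z]+)\((?P<scope>[^()\s]+)\): (?P<message>\S.*)$"
)


def is_valid_commit_message(subject: str) -> bool:
    """Return True when the subject line matches `type(scope): message`."""
    return COMMIT_RE.match(subject) is not None


if __name__ == "__main__":
    for subject in ("feat(cli): show bloom filter column", "fixed stuff"):
        print(subject, "->", is_valid_commit_message(subject))
```

A check like this could run from a local commit hook, though whether the project enforces the format automatically is not stated in the diff.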
@@ -1,3 +1,21 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: iparq
|
|
3
|
+
Version: 0.2.0
|
|
4
|
+
Summary: Display version compression and bloom filter information about a parquet file
|
|
5
|
+
Author-email: MiguelElGallo <miguel.zurcher@gmail.com>
|
|
6
|
+
License-File: LICENSE
|
|
7
|
+
Requires-Python: >=3.9
|
|
8
|
+
Requires-Dist: pyarrow
|
|
9
|
+
Requires-Dist: pydantic
|
|
10
|
+
Requires-Dist: rich
|
|
11
|
+
Requires-Dist: typer[all]
|
|
12
|
+
Provides-Extra: checks
|
|
13
|
+
Requires-Dist: mypy>=1.14.1; extra == 'checks'
|
|
14
|
+
Requires-Dist: ruff>=0.9.3; extra == 'checks'
|
|
15
|
+
Provides-Extra: test
|
|
16
|
+
Requires-Dist: pytest>=7.0; extra == 'test'
|
|
17
|
+
Description-Content-Type: text/markdown
|
|
18
|
+
|
|
1
19
|
# iparq
|
|
2
20
|
|
|
3
21
|
[](https://github.com/MiguelElGallo/iparq/actions/workflows/python-package.yml)
|
|
@@ -9,8 +27,24 @@
|
|
|
9
27
|

|
|
10
28
|
After reading [this blog](https://duckdb.org/2025/01/22/parquet-encodings.html), I began to wonder which Parquet version and compression methods the everyday tools we rely on actually use, only to find that there’s no straightforward way to determine this. That curiosity and the difficulty of quickly discovering such details motivated me to create iparq (Information Parquet). My goal with iparq is to help users easily identify the specifics of the Parquet files generated by different engines, making it clear which features—like newer encodings or certain compression algorithms—the creator of the parquet is using.
|
|
11
29
|
|
|
30
|
+
***New*** Bloom filters information: Displays if there are bloom filters.
|
|
31
|
+
Read more about bloom filters in this [great article](https://duckdb.org/2025/03/07/parquet-bloom-filters-in-duckdb.html).
|
|
32
|
+
|
|
33
|
+
|
|
12
34
|
## Installation
|
|
13
35
|
|
|
36
|
+
### Zero installation - Recommended
|
|
37
|
+
|
|
38
|
+
1) Make sure to have Astral’s UV installed by following the steps here:
|
|
39
|
+
|
|
40
|
+
<https://docs.astral.sh/uv/getting-started/installation/>
|
|
41
|
+
|
|
42
|
+
2) Execute the following command:
|
|
43
|
+
|
|
44
|
+
```sh
|
|
45
|
+
uvx iparq yourparquet.parquet
|
|
46
|
+
```
|
|
47
|
+
|
|
14
48
|
### Using pip
|
|
15
49
|
|
|
16
50
|
1) Install the package using pip:
|
|
@@ -63,7 +97,63 @@ iparq <filename>
|
|
|
63
97
|
|
|
64
98
|
Replace `<filename>` with the path to your .parquet file. The utility will read the metadata of the file and print the compression codecs used in the parquet file.
|
|
65
99
|
|
|
66
|
-
## Example
|
|
100
|
+
## Example output - Bloom Filters
|
|
101
|
+
|
|
102
|
+
```log
|
|
103
|
+
ParquetMetaModel(
|
|
104
|
+
created_by='DuckDB version v1.2.1 (build 8e52ec4395)',
|
|
105
|
+
num_columns=1,
|
|
106
|
+
num_rows=100000000,
|
|
107
|
+
num_row_groups=10,
|
|
108
|
+
format_version='1.0',
|
|
109
|
+
serialized_size=1196
|
|
110
|
+
)
|
|
111
|
+
Column Compression Info:
|
|
112
|
+
Row Group 0:
|
|
113
|
+
Column 'r' (Index 0): SNAPPY
|
|
114
|
+
Row Group 1:
|
|
115
|
+
Column 'r' (Index 0): SNAPPY
|
|
116
|
+
Row Group 2:
|
|
117
|
+
Column 'r' (Index 0): SNAPPY
|
|
118
|
+
Row Group 3:
|
|
119
|
+
Column 'r' (Index 0): SNAPPY
|
|
120
|
+
Row Group 4:
|
|
121
|
+
Column 'r' (Index 0): SNAPPY
|
|
122
|
+
Row Group 5:
|
|
123
|
+
Column 'r' (Index 0): SNAPPY
|
|
124
|
+
Row Group 6:
|
|
125
|
+
Column 'r' (Index 0): SNAPPY
|
|
126
|
+
Row Group 7:
|
|
127
|
+
Column 'r' (Index 0): SNAPPY
|
|
128
|
+
Row Group 8:
|
|
129
|
+
Column 'r' (Index 0): SNAPPY
|
|
130
|
+
Row Group 9:
|
|
131
|
+
Column 'r' (Index 0): SNAPPY
|
|
132
|
+
Bloom Filter Info:
|
|
133
|
+
Row Group 0:
|
|
134
|
+
Column 'r' (Index 0): Has bloom filter
|
|
135
|
+
Row Group 1:
|
|
136
|
+
Column 'r' (Index 0): Has bloom filter
|
|
137
|
+
Row Group 2:
|
|
138
|
+
Column 'r' (Index 0): Has bloom filter
|
|
139
|
+
Row Group 3:
|
|
140
|
+
Column 'r' (Index 0): Has bloom filter
|
|
141
|
+
Row Group 4:
|
|
142
|
+
Column 'r' (Index 0): Has bloom filter
|
|
143
|
+
Row Group 5:
|
|
144
|
+
Column 'r' (Index 0): Has bloom filter
|
|
145
|
+
Row Group 6:
|
|
146
|
+
Column 'r' (Index 0): Has bloom filter
|
|
147
|
+
Row Group 7:
|
|
148
|
+
Column 'r' (Index 0): Has bloom filter
|
|
149
|
+
Row Group 8:
|
|
150
|
+
Column 'r' (Index 0): Has bloom filter
|
|
151
|
+
Row Group 9:
|
|
152
|
+
Column 'r' (Index 0): Has bloom filter
|
|
153
|
+
Compression codecs: {'SNAPPY'}
|
|
154
|
+
```
|
|
155
|
+
|
|
156
|
+
## Example output
|
|
67
157
|
|
|
68
158
|
```log
|
|
69
159
|
ParquetMetaModel(
|
|
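The README hunk above reports, per row group and column, whether a bloom filter is present. For readers unfamiliar with the structure, here is a minimal sketch of a textbook bloom filter. It is not Parquet's on-disk format, which uses a block-based variant; the class name, bit-array size, and SHA-256-based hashing are all assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Classic bloom filter: k hash positions per item in an m-bit array."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive k positions by salting SHA-256 with the hash index.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )
```

The useful property for a query engine is the one-sided error: a negative answer lets a reader skip a row group entirely, while a positive answer still requires reading the data.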
@@ -1,20 +1,3 @@
|
|
|
1
|
-
Metadata-Version: 2.4
|
|
2
|
-
Name: iparq
|
|
3
|
-
Version: 0.1.5
|
|
4
|
-
Summary: Display version and compression information about a parquet file
|
|
5
|
-
Author-email: MiguelElGallo <miguel.zurcher@gmail.com>
|
|
6
|
-
License-File: LICENSE
|
|
7
|
-
Requires-Python: >=3.9
|
|
8
|
-
Requires-Dist: pyarrow>=19.0.0
|
|
9
|
-
Requires-Dist: pydantic>=2.10.6
|
|
10
|
-
Requires-Dist: typer>=0.15.1
|
|
11
|
-
Provides-Extra: checks
|
|
12
|
-
Requires-Dist: mypy>=1.14.1; extra == 'checks'
|
|
13
|
-
Requires-Dist: ruff>=0.9.3; extra == 'checks'
|
|
14
|
-
Provides-Extra: test
|
|
15
|
-
Requires-Dist: pytest>=7.0; extra == 'test'
|
|
16
|
-
Description-Content-Type: text/markdown
|
|
17
|
-
|
|
18
1
|
# iparq
|
|
19
2
|
|
|
20
3
|
[](https://github.com/MiguelElGallo/iparq/actions/workflows/python-package.yml)
|
|
@@ -26,8 +9,24 @@ Description-Content-Type: text/markdown
|
|
|
26
9
|

|
|
27
10
|
After reading [this blog](https://duckdb.org/2025/01/22/parquet-encodings.html), I began to wonder which Parquet version and compression methods the everyday tools we rely on actually use, only to find that there’s no straightforward way to determine this. That curiosity and the difficulty of quickly discovering such details motivated me to create iparq (Information Parquet). My goal with iparq is to help users easily identify the specifics of the Parquet files generated by different engines, making it clear which features—like newer encodings or certain compression algorithms—the creator of the parquet is using.
|
|
28
11
|
|
|
12
|
+
***New*** Bloom filters information: Displays if there are bloom filters.
|
|
13
|
+
Read more about bloom filters in this [great article](https://duckdb.org/2025/03/07/parquet-bloom-filters-in-duckdb.html).
|
|
14
|
+
|
|
15
|
+
|
|
29
16
|
## Installation
|
|
30
17
|
|
|
18
|
+
### Zero installation - Recommended
|
|
19
|
+
|
|
20
|
+
1) Make sure to have Astral’s UV installed by following the steps here:
|
|
21
|
+
|
|
22
|
+
<https://docs.astral.sh/uv/getting-started/installation/>
|
|
23
|
+
|
|
24
|
+
2) Execute the following command:
|
|
25
|
+
|
|
26
|
+
```sh
|
|
27
|
+
uvx iparq yourparquet.parquet
|
|
28
|
+
```
|
|
29
|
+
|
|
31
30
|
### Using pip
|
|
32
31
|
|
|
33
32
|
1) Install the package using pip:
|
|
@@ -80,7 +79,63 @@ iparq <filename>
|
|
|
80
79
|
|
|
81
80
|
Replace `<filename>` with the path to your .parquet file. The utility will read the metadata of the file and print the compression codecs used in the parquet file.
|
|
82
81
|
|
|
83
|
-
## Example
|
|
82
|
+
## Example output - Bloom Filters
|
|
83
|
+
|
|
84
|
+
```log
|
|
85
|
+
ParquetMetaModel(
|
|
86
|
+
created_by='DuckDB version v1.2.1 (build 8e52ec4395)',
|
|
87
|
+
num_columns=1,
|
|
88
|
+
num_rows=100000000,
|
|
89
|
+
num_row_groups=10,
|
|
90
|
+
format_version='1.0',
|
|
91
|
+
serialized_size=1196
|
|
92
|
+
)
|
|
93
|
+
Column Compression Info:
|
|
94
|
+
Row Group 0:
|
|
95
|
+
Column 'r' (Index 0): SNAPPY
|
|
96
|
+
Row Group 1:
|
|
97
|
+
Column 'r' (Index 0): SNAPPY
|
|
98
|
+
Row Group 2:
|
|
99
|
+
Column 'r' (Index 0): SNAPPY
|
|
100
|
+
Row Group 3:
|
|
101
|
+
Column 'r' (Index 0): SNAPPY
|
|
102
|
+
Row Group 4:
|
|
103
|
+
Column 'r' (Index 0): SNAPPY
|
|
104
|
+
Row Group 5:
|
|
105
|
+
Column 'r' (Index 0): SNAPPY
|
|
106
|
+
Row Group 6:
|
|
107
|
+
Column 'r' (Index 0): SNAPPY
|
|
108
|
+
Row Group 7:
|
|
109
|
+
Column 'r' (Index 0): SNAPPY
|
|
110
|
+
Row Group 8:
|
|
111
|
+
Column 'r' (Index 0): SNAPPY
|
|
112
|
+
Row Group 9:
|
|
113
|
+
Column 'r' (Index 0): SNAPPY
|
|
114
|
+
Bloom Filter Info:
|
|
115
|
+
Row Group 0:
|
|
116
|
+
Column 'r' (Index 0): Has bloom filter
|
|
117
|
+
Row Group 1:
|
|
118
|
+
Column 'r' (Index 0): Has bloom filter
|
|
119
|
+
Row Group 2:
|
|
120
|
+
Column 'r' (Index 0): Has bloom filter
|
|
121
|
+
Row Group 3:
|
|
122
|
+
Column 'r' (Index 0): Has bloom filter
|
|
123
|
+
Row Group 4:
|
|
124
|
+
Column 'r' (Index 0): Has bloom filter
|
|
125
|
+
Row Group 5:
|
|
126
|
+
Column 'r' (Index 0): Has bloom filter
|
|
127
|
+
Row Group 6:
|
|
128
|
+
Column 'r' (Index 0): Has bloom filter
|
|
129
|
+
Row Group 7:
|
|
130
|
+
Column 'r' (Index 0): Has bloom filter
|
|
131
|
+
Row Group 8:
|
|
132
|
+
Column 'r' (Index 0): Has bloom filter
|
|
133
|
+
Row Group 9:
|
|
134
|
+
Column 'r' (Index 0): Has bloom filter
|
|
135
|
+
Compression codecs: {'SNAPPY'}
|
|
136
|
+
```
|
|
137
|
+
|
|
138
|
+
## Example output
|
|
84
139
|
|
|
85
140
|
```log
|
|
86
141
|
ParquetMetaModel(
|
|
@@ -1,16 +1,17 @@
 [project]
 name = "iparq"
-version = "0.
-description = "Display version and
+version = "0.2.0"
+description = "Display version compression and bloom filter information about a parquet file"
 readme = "README.md"
 authors = [
     { name = "MiguelElGallo", email = "miguel.zurcher@gmail.com" }
 ]
 requires-python = ">=3.9"
 dependencies = [
-    "pyarrow
-    "
-    "
+    "pyarrow",
+    "typer[all]",
+    "pydantic",
+    "rich",
 ]
 
 [project.optional-dependencies]
@@ -0,0 +1 @@
+# This empty file marks the package as typed for mypy
@@ -0,0 +1,259 @@
+from typing import List, Optional
+
+import pyarrow.parquet as pq
+import typer
+from pydantic import BaseModel
+from rich import print
+from rich.console import Console
+from rich.table import Table
+
+app = typer.Typer()
+console = Console()
+
+
+class ParquetMetaModel(BaseModel):
+    """
+    ParquetMetaModel is a data model representing metadata for a Parquet file.
+
+    Attributes:
+        created_by (str): The creator of the Parquet file.
+        num_columns (int): The number of columns in the Parquet file.
+        num_rows (int): The number of rows in the Parquet file.
+        num_row_groups (int): The number of row groups in the Parquet file.
+        format_version (str): The version of the Parquet format used.
+        serialized_size (int): The size of the serialized Parquet file in bytes.
+    """
+
+    created_by: str
+    num_columns: int
+    num_rows: int
+    num_row_groups: int
+    format_version: str
+    serialized_size: int
+
+
+class ColumnInfo(BaseModel):
+    """
+    ColumnInfo is a data model representing information about a column in a Parquet file.
+
+    Attributes:
+        row_group (int): The row group index.
+        column_name (str): The name of the column.
+        column_index (int): The index of the column.
+        compression_type (str): The compression type used for the column.
+        has_bloom_filter (bool): Whether the column has a bloom filter.
+    """
+
+    row_group: int
+    column_name: str
+    column_index: int
+    compression_type: str
+    has_bloom_filter: Optional[bool] = False
+
+
+class ParquetColumnInfo(BaseModel):
+    """
+    ParquetColumnInfo is a data model representing information about all columns in a Parquet file.
+
+    Attributes:
+        columns (List[ColumnInfo]): List of column information.
+    """
+
+    columns: List[ColumnInfo] = []
+
+
+def read_parquet_metadata(filename: str):
+    """
+    Reads the metadata of a Parquet file and extracts the compression codecs used.
+
+    Args:
+        filename (str): The path to the Parquet file.
+
+    Returns:
+        tuple: A tuple containing:
+            - parquet_metadata (pyarrow.parquet.FileMetaData): The metadata of the Parquet file.
+            - compression_codecs (set): A set of compression codecs used in the Parquet file.
+    """
+    try:
+        compression_codecs = set([])
+        parquet_metadata = pq.ParquetFile(filename).metadata
+
+        for i in range(parquet_metadata.num_row_groups):
+            for j in range(parquet_metadata.num_columns):
+                compression_codecs.add(
+                    parquet_metadata.row_group(i).column(j).compression
+                )
+
+    except FileNotFoundError:
+        console.print(
+            f"Cannot open: {filename}.", style="blink bold red underline on white"
+        )
+        exit(1)
+
+    return parquet_metadata, compression_codecs
+
+
+def print_parquet_metadata(parquet_metadata):
+    """
+    Prints the metadata of a Parquet file.
+
+    Args:
+        parquet_metadata: An object containing metadata of a Parquet file.
+            Expected attributes are:
+            - created_by: The creator of the Parquet file.
+            - num_columns: The number of columns in the Parquet file.
+            - num_rows: The number of rows in the Parquet file.
+            - num_row_groups: The number of row groups in the Parquet file.
+            - format_version: The format version of the Parquet file.
+            - serialized_size: The serialized size of the Parquet file.
+
+    Raises:
+        AttributeError: If the provided parquet_metadata object does not have the expected attributes.
+    """
+    try:
+        meta = ParquetMetaModel(
+            created_by=parquet_metadata.created_by,
+            num_columns=parquet_metadata.num_columns,
+            num_rows=parquet_metadata.num_rows,
+            num_row_groups=parquet_metadata.num_row_groups,
+            format_version=str(parquet_metadata.format_version),
+            serialized_size=parquet_metadata.serialized_size,
+        )
+        console.print(meta)
+
+    except AttributeError as e:
+        console.print(f"Error: {e}", style="blink bold red underline on white")
+    finally:
+        pass
+
+
+def print_compression_types(parquet_metadata, column_info: ParquetColumnInfo) -> None:
+    """
+    Collects compression type information for each column and adds it to the column_info model.
+
+    Args:
+        parquet_metadata: The Parquet file metadata.
+        column_info: The ParquetColumnInfo model to update.
+    """
+    try:
+        num_row_groups = parquet_metadata.num_row_groups
+        num_columns = parquet_metadata.num_columns
+
+        for i in range(num_row_groups):
+            row_group = parquet_metadata.row_group(i)
+            for j in range(num_columns):
+                column_chunk = row_group.column(j)
+                compression = column_chunk.compression
+                column_name = parquet_metadata.schema.names[j]
+
+                # Create or update column info
+                column_info.columns.append(
+                    ColumnInfo(
+                        row_group=i,
+                        column_name=column_name,
+                        column_index=j,
+                        compression_type=compression,
+                    )
+                )
+    except Exception as e:
+        console.print(
+            f"Error while collecting compression types: {e}",
+            style="blink bold red underline on white",
+        )
+
+
+def print_bloom_filter_info(parquet_metadata, column_info: ParquetColumnInfo) -> None:
+    """
+    Updates the column_info model with bloom filter information.
+
+    Args:
+        parquet_metadata: The Parquet file metadata.
+        column_info: The ParquetColumnInfo model to update.
+    """
+    try:
+        num_row_groups = parquet_metadata.num_row_groups
+        num_columns = parquet_metadata.num_columns
+
+        for i in range(num_row_groups):
+            row_group = parquet_metadata.row_group(i)
+
+            for j in range(num_columns):
+                column_chunk = row_group.column(j)
+
+                # Find the corresponding column in our model
+                for col in column_info.columns:
+                    if col.row_group == i and col.column_index == j:
+                        # Check if this column has bloom filters
+                        has_bloom_filter = (
+                            hasattr(column_chunk, "is_stats_set")
+                            and column_chunk.is_stats_set
+                        )
+                        col.has_bloom_filter = has_bloom_filter
+                        break
+    except Exception as e:
+        console.print(
+            f"Error while collecting bloom filter information: {e}",
+            style="blink bold red underline on white",
+        )
+
+
+def print_column_info_table(column_info: ParquetColumnInfo) -> None:
+    """
+    Prints the column information using a Rich table.
+
+    Args:
+        column_info: The ParquetColumnInfo model to display.
+    """
+    table = Table(title="Parquet Column Information")
+
+    # Add table columns
+    table.add_column("Row Group", justify="center", style="cyan")
+    table.add_column("Column Name", style="green")
+    table.add_column("Index", justify="center")
+    table.add_column("Compression", style="magenta")
+    table.add_column("Bloom Filter", justify="center")
+
+    # Add rows to the table
+    for col in column_info.columns:
+        table.add_row(
+            str(col.row_group),
+            col.column_name,
+            str(col.column_index),
+            col.compression_type,
+            "✅" if col.has_bloom_filter else "❌",
+        )
+
+    # Print the table
+    console.print(table)
+
+
+@app.command()
+def main(filename: str):
+    """
+    Main function to read and print Parquet file metadata.
+
+    Args:
+        filename (str): The path to the Parquet file.
+
+    Returns:
+        Metadata of the Parquet file and the compression codecs used.
+    """
+    (parquet_metadata, compression) = read_parquet_metadata(filename)
+
+    print_parquet_metadata(parquet_metadata)
+
+    # Create a model to store column information
+    column_info = ParquetColumnInfo()
+
+    # Collect information
+    print_compression_types(parquet_metadata, column_info)
+    print_bloom_filter_info(parquet_metadata, column_info)
+
+    # Print the information as a table
+    print_column_info_table(column_info)
+
+    print(f"Compression codecs: {compression}")
+
+
+if __name__ == "__main__":
+    app()
|
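The new `source.py` in the hunk above walks every (row group, column) pair three times: once to gather codecs, once to build the column records, and once more with an inner scan to attach the flag. A possible single-pass variant is sketched below, keyed by position so no inner search is needed. The `Fake*` classes are hypothetical stand-ins that mimic only the attributes the diff reads (`num_row_groups`, `num_columns`, `row_group(i).column(j).compression`, `is_stats_set`), and `collect_columns` is an illustrative name, not part of iparq. The column names are simplified to a flat `names` list instead of `schema.names`. Note that in pyarrow, `is_stats_set` reports whether column statistics are present, so the flag copied here mirrors what the diff does and is not a direct bloom filter check.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class FakeColumnChunk:
    """Stand-in for a pyarrow column chunk (hypothetical, for illustration)."""

    compression: str
    is_stats_set: bool = False


@dataclass
class FakeRowGroup:
    chunks: List[FakeColumnChunk]

    def column(self, j: int) -> FakeColumnChunk:
        return self.chunks[j]


@dataclass
class FakeMetadata:
    names: List[str]
    groups: List[FakeRowGroup] = field(default_factory=list)

    @property
    def num_row_groups(self) -> int:
        return len(self.groups)

    @property
    def num_columns(self) -> int:
        return len(self.names)

    def row_group(self, i: int) -> FakeRowGroup:
        return self.groups[i]


def collect_columns(meta) -> Tuple[Dict[Tuple[int, int], dict], Set[str]]:
    """Visit every (row group, column) pair once; key records by position."""
    records: Dict[Tuple[int, int], dict] = {}
    codecs: Set[str] = set()
    for i in range(meta.num_row_groups):
        row_group = meta.row_group(i)
        for j in range(meta.num_columns):
            chunk = row_group.column(j)
            codecs.add(chunk.compression)
            records[(i, j)] = {
                "column_name": meta.names[j],
                "compression_type": chunk.compression,
                # Same attribute the diff reads; see the note above.
                "stats_flag": bool(getattr(chunk, "is_stats_set", False)),
            }
    return records, codecs
```

Keying by `(row_group, column_index)` keeps the lookup constant-time, which matters mostly for wide files with many row groups; for small files the original nested scan behaves the same.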
@@ -0,0 +1,35 @@
|
|
|
1
|
+
from typer.testing import CliRunner
|
|
2
|
+
|
|
3
|
+
from src.iparq.source import app
|
|
4
|
+
|
|
5
|
+
|
|
6
|
+
def test_empty():
|
|
7
|
+
assert True
|
|
8
|
+
|
|
9
|
+
|
|
10
|
+
def test_parquet_info():
|
|
11
|
+
"""Test that the CLI correctly displays parquet file information."""
|
|
12
|
+
runner = CliRunner()
|
|
13
|
+
result = runner.invoke(app, ["dummy.parquet"])
|
|
14
|
+
|
|
15
|
+
assert result.exit_code == 0
|
|
16
|
+
|
|
17
|
+
expected_output = """ParquetMetaModel(
|
|
18
|
+
created_by='parquet-cpp-arrow version 14.0.2',
|
|
19
|
+
num_columns=3,
|
|
20
|
+
num_rows=3,
|
|
21
|
+
num_row_groups=1,
|
|
22
|
+
format_version='2.6',
|
|
23
|
+
serialized_size=2223
|
|
24
|
+
)
|
|
25
|
+
Parquet Column Information
|
|
26
|
+
┏━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
|
|
27
|
+
┃ Row Group ┃ Column Name ┃ Index ┃ Compression ┃ Bloom Filter ┃
|
|
28
|
+
┡━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
|
|
29
|
+
│ 0 │ one │ 0 │ SNAPPY │ ✅ │
|
|
30
|
+
│ 0 │ two │ 1 │ SNAPPY │ ✅ │
|
|
31
|
+
│ 0 │ three │ 2 │ SNAPPY │ ✅ │
|
|
32
|
+
└───────────┴─────────────┴───────┴─────────────┴──────────────┘
|
|
33
|
+
Compression codecs: {'SNAPPY'}"""
|
|
34
|
+
|
|
35
|
+
assert expected_output in result.stdout
|
|
@@ -1,4 +1,5 @@
|
|
|
1
1
|
version = 1
|
|
2
|
+
revision = 1
|
|
2
3
|
requires-python = ">=3.9"
|
|
3
4
|
|
|
4
5
|
[[package]]
|
|
@@ -51,7 +52,7 @@ wheels = [
|
|
|
51
52
|
|
|
52
53
|
[[package]]
|
|
53
54
|
name = "iparq"
|
|
54
|
-
version = "0.
|
|
55
|
+
version = "0.2.0"
|
|
55
56
|
source = { editable = "." }
|
|
56
57
|
dependencies = [
|
|
57
58
|
{ name = "pyarrow" },
|
|
@@ -77,6 +78,7 @@ requires-dist = [
|
|
|
77
78
|
{ name = "ruff", marker = "extra == 'checks'", specifier = ">=0.9.3" },
|
|
78
79
|
{ name = "typer", specifier = ">=0.15.1" },
|
|
79
80
|
]
|
|
81
|
+
provides-extras = ["test", "checks"]
|
|
80
82
|
|
|
81
83
|
[[package]]
|
|
82
84
|
name = "markdown-it-py"
|
|
@@ -101,46 +103,46 @@ wheels = [
|
|
|
101
103
|
|
|
102
104
|
[[package]]
|
|
103
105
|
name = "mypy"
|
|
104
|
-
version = "1.
|
|
106
|
+
version = "1.15.0"
|
|
105
107
|
source = { registry = "https://pypi.org/simple" }
|
|
106
108
|
dependencies = [
|
|
107
109
|
{ name = "mypy-extensions" },
|
|
108
110
|
{ name = "tomli", marker = "python_full_version < '3.11'" },
|
|
109
111
|
{ name = "typing-extensions" },
|
|
110
112
|
]
|
|
111
|
-
sdist = { url = "https://files.pythonhosted.org/packages/
|
|
113
|
+
sdist = { url = "https://files.pythonhosted.org/packages/ce/43/d5e49a86afa64bd3839ea0d5b9c7103487007d728e1293f52525d6d5486a/mypy-1.15.0.tar.gz", hash = "sha256:404534629d51d3efea5c800ee7c42b72a6554d6c400e6a79eafe15d11341fd43", size = 3239717 }
|
|
112
114
|
wheels = [
|
|
113
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
114
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
115
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
116
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
117
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
118
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
119
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
120
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
121
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
122
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
123
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
124
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
125
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
126
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
127
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
128
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
129
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
130
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
131
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
132
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
133
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
134
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
135
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
136
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
137
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
138
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
139
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
140
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
141
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
142
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
143
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
115
|
+
{ url = "https://files.pythonhosted.org/packages/68/f8/65a7ce8d0e09b6329ad0c8d40330d100ea343bd4dd04c4f8ae26462d0a17/mypy-1.15.0-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:979e4e1a006511dacf628e36fadfecbcc0160a8af6ca7dad2f5025529e082c13", size = 10738433 },
|
|
116
|
+
{ url = "https://files.pythonhosted.org/packages/b4/95/9c0ecb8eacfe048583706249439ff52105b3f552ea9c4024166c03224270/mypy-1.15.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:c4bb0e1bd29f7d34efcccd71cf733580191e9a264a2202b0239da95984c5b559", size = 9861472 },
|
|
117
|
+
{ url = "https://files.pythonhosted.org/packages/84/09/9ec95e982e282e20c0d5407bc65031dfd0f0f8ecc66b69538296e06fcbee/mypy-1.15.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:be68172e9fd9ad8fb876c6389f16d1c1b5f100ffa779f77b1fb2176fcc9ab95b", size = 11611424 },
|
|
118
|
+
{ url = "https://files.pythonhosted.org/packages/78/13/f7d14e55865036a1e6a0a69580c240f43bc1f37407fe9235c0d4ef25ffb0/mypy-1.15.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:c7be1e46525adfa0d97681432ee9fcd61a3964c2446795714699a998d193f1a3", size = 12365450 },
|
|
119
|
+
{ url = "https://files.pythonhosted.org/packages/48/e1/301a73852d40c241e915ac6d7bcd7fedd47d519246db2d7b86b9d7e7a0cb/mypy-1.15.0-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:2e2c2e6d3593f6451b18588848e66260ff62ccca522dd231cd4dd59b0160668b", size = 12551765 },
|
|
120
|
+
{ url = "https://files.pythonhosted.org/packages/77/ba/c37bc323ae5fe7f3f15a28e06ab012cd0b7552886118943e90b15af31195/mypy-1.15.0-cp310-cp310-win_amd64.whl", hash = "sha256:6983aae8b2f653e098edb77f893f7b6aca69f6cffb19b2cc7443f23cce5f4828", size = 9274701 },
|
|
121
|
+
{ url = "https://files.pythonhosted.org/packages/03/bc/f6339726c627bd7ca1ce0fa56c9ae2d0144604a319e0e339bdadafbbb599/mypy-1.15.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:2922d42e16d6de288022e5ca321cd0618b238cfc5570e0263e5ba0a77dbef56f", size = 10662338 },
|
|
122
|
+
{ url = "https://files.pythonhosted.org/packages/e2/90/8dcf506ca1a09b0d17555cc00cd69aee402c203911410136cd716559efe7/mypy-1.15.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:2ee2d57e01a7c35de00f4634ba1bbf015185b219e4dc5909e281016df43f5ee5", size = 9787540 },
|
|
123
|
+
{ url = "https://files.pythonhosted.org/packages/05/05/a10f9479681e5da09ef2f9426f650d7b550d4bafbef683b69aad1ba87457/mypy-1.15.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:973500e0774b85d9689715feeffcc980193086551110fd678ebe1f4342fb7c5e", size = 11538051 },
|
|
124
|
+
{ url = "https://files.pythonhosted.org/packages/e9/9a/1f7d18b30edd57441a6411fcbc0c6869448d1a4bacbaee60656ac0fc29c8/mypy-1.15.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:5a95fb17c13e29d2d5195869262f8125dfdb5c134dc8d9a9d0aecf7525b10c2c", size = 12286751 },
|
|
125
|
+
{ url = "https://files.pythonhosted.org/packages/72/af/19ff499b6f1dafcaf56f9881f7a965ac2f474f69f6f618b5175b044299f5/mypy-1.15.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:1905f494bfd7d85a23a88c5d97840888a7bd516545fc5aaedff0267e0bb54e2f", size = 12421783 },
|
|
126
|
+
{ url = "https://files.pythonhosted.org/packages/96/39/11b57431a1f686c1aed54bf794870efe0f6aeca11aca281a0bd87a5ad42c/mypy-1.15.0-cp311-cp311-win_amd64.whl", hash = "sha256:c9817fa23833ff189db061e6d2eff49b2f3b6ed9856b4a0a73046e41932d744f", size = 9265618 },
|
|
127
|
+
{ url = "https://files.pythonhosted.org/packages/98/3a/03c74331c5eb8bd025734e04c9840532226775c47a2c39b56a0c8d4f128d/mypy-1.15.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:aea39e0583d05124836ea645f412e88a5c7d0fd77a6d694b60d9b6b2d9f184fd", size = 10793981 },
|
|
128
|
+
{ url = "https://files.pythonhosted.org/packages/f0/1a/41759b18f2cfd568848a37c89030aeb03534411eef981df621d8fad08a1d/mypy-1.15.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:2f2147ab812b75e5b5499b01ade1f4a81489a147c01585cda36019102538615f", size = 9749175 },
|
|
129
|
+
{ url = "https://files.pythonhosted.org/packages/12/7e/873481abf1ef112c582db832740f4c11b2bfa510e829d6da29b0ab8c3f9c/mypy-1.15.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:ce436f4c6d218a070048ed6a44c0bbb10cd2cc5e272b29e7845f6a2f57ee4464", size = 11455675 },
|
|
130
|
+
{ url = "https://files.pythonhosted.org/packages/b3/d0/92ae4cde706923a2d3f2d6c39629134063ff64b9dedca9c1388363da072d/mypy-1.15.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:8023ff13985661b50a5928fc7a5ca15f3d1affb41e5f0a9952cb68ef090b31ee", size = 12410020 },
|
|
131
|
+
{ url = "https://files.pythonhosted.org/packages/46/8b/df49974b337cce35f828ba6fda228152d6db45fed4c86ba56ffe442434fd/mypy-1.15.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:1124a18bc11a6a62887e3e137f37f53fbae476dc36c185d549d4f837a2a6a14e", size = 12498582 },
|
|
132
|
+
{ url = "https://files.pythonhosted.org/packages/13/50/da5203fcf6c53044a0b699939f31075c45ae8a4cadf538a9069b165c1050/mypy-1.15.0-cp312-cp312-win_amd64.whl", hash = "sha256:171a9ca9a40cd1843abeca0e405bc1940cd9b305eaeea2dda769ba096932bb22", size = 9366614 },
|
|
133
|
+
{ url = "https://files.pythonhosted.org/packages/6a/9b/fd2e05d6ffff24d912f150b87db9e364fa8282045c875654ce7e32fffa66/mypy-1.15.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:93faf3fdb04768d44bf28693293f3904bbb555d076b781ad2530214ee53e3445", size = 10788592 },
|
|
134
|
+
{ url = "https://files.pythonhosted.org/packages/74/37/b246d711c28a03ead1fd906bbc7106659aed7c089d55fe40dd58db812628/mypy-1.15.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:811aeccadfb730024c5d3e326b2fbe9249bb7413553f15499a4050f7c30e801d", size = 9753611 },
|
|
135
|
+
{ url = "https://files.pythonhosted.org/packages/a6/ac/395808a92e10cfdac8003c3de9a2ab6dc7cde6c0d2a4df3df1b815ffd067/mypy-1.15.0-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:98b7b9b9aedb65fe628c62a6dc57f6d5088ef2dfca37903a7d9ee374d03acca5", size = 11438443 },
|
|
136
|
+
{ url = "https://files.pythonhosted.org/packages/d2/8b/801aa06445d2de3895f59e476f38f3f8d610ef5d6908245f07d002676cbf/mypy-1.15.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:c43a7682e24b4f576d93072216bf56eeff70d9140241f9edec0c104d0c515036", size = 12402541 },
|
|
137
|
+
{ url = "https://files.pythonhosted.org/packages/c7/67/5a4268782eb77344cc613a4cf23540928e41f018a9a1ec4c6882baf20ab8/mypy-1.15.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:baefc32840a9f00babd83251560e0ae1573e2f9d1b067719479bfb0e987c6357", size = 12494348 },
|
|
138
|
+
{ url = "https://files.pythonhosted.org/packages/83/3e/57bb447f7bbbfaabf1712d96f9df142624a386d98fb026a761532526057e/mypy-1.15.0-cp313-cp313-win_amd64.whl", hash = "sha256:b9378e2c00146c44793c98b8d5a61039a048e31f429fb0eb546d93f4b000bedf", size = 9373648 },
|
|
139
|
+
{ url = "https://files.pythonhosted.org/packages/5a/fa/79cf41a55b682794abe71372151dbbf856e3008f6767057229e6649d294a/mypy-1.15.0-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:e601a7fa172c2131bff456bb3ee08a88360760d0d2f8cbd7a75a65497e2df078", size = 10737129 },
|
|
140
|
+
{ url = "https://files.pythonhosted.org/packages/d3/33/dd8feb2597d648de29e3da0a8bf4e1afbda472964d2a4a0052203a6f3594/mypy-1.15.0-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:712e962a6357634fef20412699a3655c610110e01cdaa6180acec7fc9f8513ba", size = 9856335 },
|
|
141
|
+
{ url = "https://files.pythonhosted.org/packages/e4/b5/74508959c1b06b96674b364ffeb7ae5802646b32929b7701fc6b18447592/mypy-1.15.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:f95579473af29ab73a10bada2f9722856792a36ec5af5399b653aa28360290a5", size = 11611935 },
|
|
142
|
+
{ url = "https://files.pythonhosted.org/packages/6c/53/da61b9d9973efcd6507183fdad96606996191657fe79701b2c818714d573/mypy-1.15.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:8f8722560a14cde92fdb1e31597760dc35f9f5524cce17836c0d22841830fd5b", size = 12365827 },
|
|
143
|
+
{ url = "https://files.pythonhosted.org/packages/c1/72/965bd9ee89540c79a25778cc080c7e6ef40aa1eeac4d52cec7eae6eb5228/mypy-1.15.0-cp39-cp39-musllinux_1_2_x86_64.whl", hash = "sha256:1fbb8da62dc352133d7d7ca90ed2fb0e9d42bb1a32724c287d3c76c58cbaa9c2", size = 12541924 },
|
|
144
|
+
{ url = "https://files.pythonhosted.org/packages/46/d0/f41645c2eb263e6c77ada7d76f894c580c9ddb20d77f0c24d34273a4dab2/mypy-1.15.0-cp39-cp39-win_amd64.whl", hash = "sha256:d10d994b41fb3497719bbf866f227b3489048ea4bbbb5015357db306249f7980", size = 9271176 },
|
|
145
|
+
{ url = "https://files.pythonhosted.org/packages/09/4e/a7d65c7322c510de2c409ff3828b03354a7c43f5a8ed458a7a131b41c7b9/mypy-1.15.0-py3-none-any.whl", hash = "sha256:5469affef548bd1895d86d3bf10ce2b44e33d86923c29e4d675b3e323437ea3e", size = 2221777 },
|
|
144
146
|
]
|
|
145
147
|
|
|
146
148
|
[[package]]
|
|
@@ -172,51 +174,51 @@ wheels = [
|
|
|
172
174
|
|
|
173
175
|
[[package]]
|
|
174
176
|
name = "pyarrow"
|
|
175
|
-
version = "19.0.
|
|
177
|
+
version = "19.0.1"
|
|
176
178
|
source = { registry = "https://pypi.org/simple" }
|
|
177
|
-
sdist = { url = "https://files.pythonhosted.org/packages/
|
|
179
|
+
sdist = { url = "https://files.pythonhosted.org/packages/7f/09/a9046344212690f0632b9c709f9bf18506522feb333c894d0de81d62341a/pyarrow-19.0.1.tar.gz", hash = "sha256:3bf266b485df66a400f282ac0b6d1b500b9d2ae73314a153dbe97d6d5cc8a99e", size = 1129437 }
|
|
178
180
|
wheels = [
|
|
179
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
180
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
181
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
182
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
183
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
184
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
185
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
186
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
187
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
188
|
-
{ url = "https://files.pythonhosted.org/packages/2f/
|
|
189
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
190
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
191
|
-
{ url = "https://files.pythonhosted.org/packages/b8/
|
|
192
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
193
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
194
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
195
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
196
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
197
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
198
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
199
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
200
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
201
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
202
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
203
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
204
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
205
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
206
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
207
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
208
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
209
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
210
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
211
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
212
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
213
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
214
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
215
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
216
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
217
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
218
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
219
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
181
|
+
{ url = "https://files.pythonhosted.org/packages/36/01/b23b514d86b839956238d3f8ef206fd2728eee87ff1b8ce150a5678d9721/pyarrow-19.0.1-cp310-cp310-macosx_12_0_arm64.whl", hash = "sha256:fc28912a2dc924dddc2087679cc8b7263accc71b9ff025a1362b004711661a69", size = 30688914 },
|
|
182
|
+
{ url = "https://files.pythonhosted.org/packages/c6/68/218ff7cf4a0652a933e5f2ed11274f724dd43b9813cb18dd72c0a35226a2/pyarrow-19.0.1-cp310-cp310-macosx_12_0_x86_64.whl", hash = "sha256:fca15aabbe9b8355800d923cc2e82c8ef514af321e18b437c3d782aa884eaeec", size = 32102866 },
|
|
183
|
+
{ url = "https://files.pythonhosted.org/packages/98/01/c295050d183014f4a2eb796d7d2bbfa04b6cccde7258bb68aacf6f18779b/pyarrow-19.0.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ad76aef7f5f7e4a757fddcdcf010a8290958f09e3470ea458c80d26f4316ae89", size = 41147682 },
|
|
184
|
+
{ url = "https://files.pythonhosted.org/packages/40/17/a6c3db0b5f3678f33bbb552d2acbc16def67f89a72955b67b0109af23eb0/pyarrow-19.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:d03c9d6f2a3dffbd62671ca070f13fc527bb1867b4ec2b98c7eeed381d4f389a", size = 42179192 },
|
|
185
|
+
{ url = "https://files.pythonhosted.org/packages/cf/75/c7c8e599300d8cebb6cb339014800e1c720c9db2a3fcb66aa64ec84bac72/pyarrow-19.0.1-cp310-cp310-manylinux_2_28_aarch64.whl", hash = "sha256:65cf9feebab489b19cdfcfe4aa82f62147218558d8d3f0fc1e9dea0ab8e7905a", size = 40517272 },
|
|
186
|
+
{ url = "https://files.pythonhosted.org/packages/ef/c9/68ab123ee1528699c4d5055f645ecd1dd68ff93e4699527249d02f55afeb/pyarrow-19.0.1-cp310-cp310-manylinux_2_28_x86_64.whl", hash = "sha256:41f9706fbe505e0abc10e84bf3a906a1338905cbbcf1177b71486b03e6ea6608", size = 42069036 },
|
|
187
|
+
{ url = "https://files.pythonhosted.org/packages/54/e3/d5cfd7654084e6c0d9c3ce949e5d9e0ccad569ae1e2d5a68a3ec03b2be89/pyarrow-19.0.1-cp310-cp310-win_amd64.whl", hash = "sha256:c6cb2335a411b713fdf1e82a752162f72d4a7b5dbc588e32aa18383318b05866", size = 25277951 },
|
|
188
|
+
{ url = "https://files.pythonhosted.org/packages/a0/55/f1a8d838ec07fe3ca53edbe76f782df7b9aafd4417080eebf0b42aab0c52/pyarrow-19.0.1-cp311-cp311-macosx_12_0_arm64.whl", hash = "sha256:cc55d71898ea30dc95900297d191377caba257612f384207fe9f8293b5850f90", size = 30713987 },
|
|
189
|
+
{ url = "https://files.pythonhosted.org/packages/13/12/428861540bb54c98a140ae858a11f71d041ef9e501e6b7eb965ca7909505/pyarrow-19.0.1-cp311-cp311-macosx_12_0_x86_64.whl", hash = "sha256:7a544ec12de66769612b2d6988c36adc96fb9767ecc8ee0a4d270b10b1c51e00", size = 32135613 },
|
|
190
|
+
{ url = "https://files.pythonhosted.org/packages/2f/8a/23d7cc5ae2066c6c736bce1db8ea7bc9ac3ef97ac7e1c1667706c764d2d9/pyarrow-19.0.1-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:0148bb4fc158bfbc3d6dfe5001d93ebeed253793fff4435167f6ce1dc4bddeae", size = 41149147 },
|
|
191
|
+
{ url = "https://files.pythonhosted.org/packages/a2/7a/845d151bb81a892dfb368bf11db584cf8b216963ccce40a5cf50a2492a18/pyarrow-19.0.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f24faab6ed18f216a37870d8c5623f9c044566d75ec586ef884e13a02a9d62c5", size = 42178045 },
|
|
192
|
+
{ url = "https://files.pythonhosted.org/packages/a7/31/e7282d79a70816132cf6cae7e378adfccce9ae10352d21c2fecf9d9756dd/pyarrow-19.0.1-cp311-cp311-manylinux_2_28_aarch64.whl", hash = "sha256:4982f8e2b7afd6dae8608d70ba5bd91699077323f812a0448d8b7abdff6cb5d3", size = 40532998 },
|
|
193
|
+
{ url = "https://files.pythonhosted.org/packages/b8/82/20f3c290d6e705e2ee9c1fa1d5a0869365ee477e1788073d8b548da8b64c/pyarrow-19.0.1-cp311-cp311-manylinux_2_28_x86_64.whl", hash = "sha256:49a3aecb62c1be1d822f8bf629226d4a96418228a42f5b40835c1f10d42e4db6", size = 42084055 },
|
|
194
|
+
{ url = "https://files.pythonhosted.org/packages/ff/77/e62aebd343238863f2c9f080ad2ef6ace25c919c6ab383436b5b81cbeef7/pyarrow-19.0.1-cp311-cp311-win_amd64.whl", hash = "sha256:008a4009efdb4ea3d2e18f05cd31f9d43c388aad29c636112c2966605ba33466", size = 25283133 },
|
|
195
|
+
{ url = "https://files.pythonhosted.org/packages/78/b4/94e828704b050e723f67d67c3535cf7076c7432cd4cf046e4bb3b96a9c9d/pyarrow-19.0.1-cp312-cp312-macosx_12_0_arm64.whl", hash = "sha256:80b2ad2b193e7d19e81008a96e313fbd53157945c7be9ac65f44f8937a55427b", size = 30670749 },
|
|
196
|
+
{ url = "https://files.pythonhosted.org/packages/7e/3b/4692965e04bb1df55e2c314c4296f1eb12b4f3052d4cf43d29e076aedf66/pyarrow-19.0.1-cp312-cp312-macosx_12_0_x86_64.whl", hash = "sha256:ee8dec072569f43835932a3b10c55973593abc00936c202707a4ad06af7cb294", size = 32128007 },
|
|
197
|
+
{ url = "https://files.pythonhosted.org/packages/22/f7/2239af706252c6582a5635c35caa17cb4d401cd74a87821ef702e3888957/pyarrow-19.0.1-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:4d5d1ec7ec5324b98887bdc006f4d2ce534e10e60f7ad995e7875ffa0ff9cb14", size = 41144566 },
|
|
198
|
+
{ url = "https://files.pythonhosted.org/packages/fb/e3/c9661b2b2849cfefddd9fd65b64e093594b231b472de08ff658f76c732b2/pyarrow-19.0.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f3ad4c0eb4e2a9aeb990af6c09e6fa0b195c8c0e7b272ecc8d4d2b6574809d34", size = 42202991 },
|
|
199
|
+
{ url = "https://files.pythonhosted.org/packages/fe/4f/a2c0ed309167ef436674782dfee4a124570ba64299c551e38d3fdaf0a17b/pyarrow-19.0.1-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:d383591f3dcbe545f6cc62daaef9c7cdfe0dff0fb9e1c8121101cabe9098cfa6", size = 40507986 },
|
|
200
|
+
{ url = "https://files.pythonhosted.org/packages/27/2e/29bb28a7102a6f71026a9d70d1d61df926887e36ec797f2e6acfd2dd3867/pyarrow-19.0.1-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:b4c4156a625f1e35d6c0b2132635a237708944eb41df5fbe7d50f20d20c17832", size = 42087026 },
|
|
201
|
+
{ url = "https://files.pythonhosted.org/packages/16/33/2a67c0f783251106aeeee516f4806161e7b481f7d744d0d643d2f30230a5/pyarrow-19.0.1-cp312-cp312-win_amd64.whl", hash = "sha256:5bd1618ae5e5476b7654c7b55a6364ae87686d4724538c24185bbb2952679960", size = 25250108 },
|
|
202
|
+
{ url = "https://files.pythonhosted.org/packages/2b/8d/275c58d4b00781bd36579501a259eacc5c6dfb369be4ddeb672ceb551d2d/pyarrow-19.0.1-cp313-cp313-macosx_12_0_arm64.whl", hash = "sha256:e45274b20e524ae5c39d7fc1ca2aa923aab494776d2d4b316b49ec7572ca324c", size = 30653552 },
|
|
203
|
+
{ url = "https://files.pythonhosted.org/packages/a0/9e/e6aca5cc4ef0c7aec5f8db93feb0bde08dbad8c56b9014216205d271101b/pyarrow-19.0.1-cp313-cp313-macosx_12_0_x86_64.whl", hash = "sha256:d9dedeaf19097a143ed6da37f04f4051aba353c95ef507764d344229b2b740ae", size = 32103413 },
|
|
204
|
+
{ url = "https://files.pythonhosted.org/packages/6a/fa/a7033f66e5d4f1308c7eb0dfcd2ccd70f881724eb6fd1776657fdf65458f/pyarrow-19.0.1-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:6ebfb5171bb5f4a52319344ebbbecc731af3f021e49318c74f33d520d31ae0c4", size = 41134869 },
|
|
205
|
+
{ url = "https://files.pythonhosted.org/packages/2d/92/34d2569be8e7abdc9d145c98dc410db0071ac579b92ebc30da35f500d630/pyarrow-19.0.1-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f2a21d39fbdb948857f67eacb5bbaaf36802de044ec36fbef7a1c8f0dd3a4ab2", size = 42192626 },
|
|
206
|
+
{ url = "https://files.pythonhosted.org/packages/0a/1f/80c617b1084fc833804dc3309aa9d8daacd46f9ec8d736df733f15aebe2c/pyarrow-19.0.1-cp313-cp313-manylinux_2_28_aarch64.whl", hash = "sha256:99bc1bec6d234359743b01e70d4310d0ab240c3d6b0da7e2a93663b0158616f6", size = 40496708 },
|
|
207
|
+
{ url = "https://files.pythonhosted.org/packages/e6/90/83698fcecf939a611c8d9a78e38e7fed7792dcc4317e29e72cf8135526fb/pyarrow-19.0.1-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:1b93ef2c93e77c442c979b0d596af45e4665d8b96da598db145b0fec014b9136", size = 42075728 },
|
|
208
|
+
{ url = "https://files.pythonhosted.org/packages/40/49/2325f5c9e7a1c125c01ba0c509d400b152c972a47958768e4e35e04d13d8/pyarrow-19.0.1-cp313-cp313-win_amd64.whl", hash = "sha256:d9d46e06846a41ba906ab25302cf0fd522f81aa2a85a71021826f34639ad31ef", size = 25242568 },
|
|
209
|
+
{ url = "https://files.pythonhosted.org/packages/3f/72/135088d995a759d4d916ec4824cb19e066585b4909ebad4ab196177aa825/pyarrow-19.0.1-cp313-cp313t-macosx_12_0_arm64.whl", hash = "sha256:c0fe3dbbf054a00d1f162fda94ce236a899ca01123a798c561ba307ca38af5f0", size = 30702371 },
|
|
210
|
+
{ url = "https://files.pythonhosted.org/packages/2e/01/00beeebd33d6bac701f20816a29d2018eba463616bbc07397fdf99ac4ce3/pyarrow-19.0.1-cp313-cp313t-macosx_12_0_x86_64.whl", hash = "sha256:96606c3ba57944d128e8a8399da4812f56c7f61de8c647e3470b417f795d0ef9", size = 32116046 },
|
|
211
|
+
{ url = "https://files.pythonhosted.org/packages/1f/c9/23b1ea718dfe967cbd986d16cf2a31fe59d015874258baae16d7ea0ccabc/pyarrow-19.0.1-cp313-cp313t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:8f04d49a6b64cf24719c080b3c2029a3a5b16417fd5fd7c4041f94233af732f3", size = 41091183 },
|
|
212
|
+
{ url = "https://files.pythonhosted.org/packages/3a/d4/b4a3aa781a2c715520aa8ab4fe2e7fa49d33a1d4e71c8fc6ab7b5de7a3f8/pyarrow-19.0.1-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:5a9137cf7e1640dce4c190551ee69d478f7121b5c6f323553b319cac936395f6", size = 42171896 },
|
|
213
|
+
{ url = "https://files.pythonhosted.org/packages/23/1b/716d4cd5a3cbc387c6e6745d2704c4b46654ba2668260d25c402626c5ddb/pyarrow-19.0.1-cp313-cp313t-manylinux_2_28_aarch64.whl", hash = "sha256:7c1bca1897c28013db5e4c83944a2ab53231f541b9e0c3f4791206d0c0de389a", size = 40464851 },
|
|
214
|
+
{ url = "https://files.pythonhosted.org/packages/ed/bd/54907846383dcc7ee28772d7e646f6c34276a17da740002a5cefe90f04f7/pyarrow-19.0.1-cp313-cp313t-manylinux_2_28_x86_64.whl", hash = "sha256:58d9397b2e273ef76264b45531e9d552d8ec8a6688b7390b5be44c02a37aade8", size = 42085744 },
|
|
215
|
+
{ url = "https://files.pythonhosted.org/packages/16/26/0ec396ebe98adefaffc0fff8e0dc14c8912e61093226284cf4b76faffd22/pyarrow-19.0.1-cp39-cp39-macosx_12_0_arm64.whl", hash = "sha256:b9766a47a9cb56fefe95cb27f535038b5a195707a08bf61b180e642324963b46", size = 30701112 },
|
|
216
|
+
{ url = "https://files.pythonhosted.org/packages/ba/10/c35d96686bf7f13e55bb87f06fe06e7d95533c271ef7f9a5a76e26b16fc2/pyarrow-19.0.1-cp39-cp39-macosx_12_0_x86_64.whl", hash = "sha256:6c5941c1aac89a6c2f2b16cd64fe76bcdb94b2b1e99ca6459de4e6f07638d755", size = 32117180 },
|
|
217
|
+
{ url = "https://files.pythonhosted.org/packages/8c/0d/81881a55302b6847ea2ea187517faa039c219d80b55050904e354c2eddde/pyarrow-19.0.1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:fd44d66093a239358d07c42a91eebf5015aa54fccba959db899f932218ac9cc8", size = 41161334 },
|
|
218
|
+
{ url = "https://files.pythonhosted.org/packages/af/17/ea60a07ec6f6bb0740f11715e0d22ab8fdfcc94bc729832321f498370d75/pyarrow-19.0.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:335d170e050bcc7da867a1ed8ffb8b44c57aaa6e0843b156a501298657b1e972", size = 42190375 },
|
|
219
|
+
{ url = "https://files.pythonhosted.org/packages/f2/87/4ef05a088b18082cde4950bdfca752dd31effb3ec201b8026e4816d0f3fa/pyarrow-19.0.1-cp39-cp39-manylinux_2_28_aarch64.whl", hash = "sha256:1c7556165bd38cf0cd992df2636f8bcdd2d4b26916c6b7e646101aff3c16f76f", size = 40530649 },
|
|
220
|
+
{ url = "https://files.pythonhosted.org/packages/59/1e/9fb9a66a64eae4ff332a8f149d803d8c6c556714803d20d54ed2e9524a3b/pyarrow-19.0.1-cp39-cp39-manylinux_2_28_x86_64.whl", hash = "sha256:699799f9c80bebcf1da0983ba86d7f289c5a2a5c04b945e2f2bcf7e874a91911", size = 42081576 },
|
|
221
|
+
{ url = "https://files.pythonhosted.org/packages/1b/ee/c110d8da8bdde8e832ccf1ff90be747cb684874e2dc8acf26840058b0c32/pyarrow-19.0.1-cp39-cp39-win_amd64.whl", hash = "sha256:8464c9fbe6d94a7fe1599e7e8965f350fd233532868232ab2596a71586c5a429", size = 25465593 },
|
|
220
222
|
]
|
|
221
223
|
|
|
222
224
|
[[package]]
|
|
@@ -341,7 +343,7 @@ wheels = [
|
|
|
341
343
|
|
|
342
344
|
[[package]]
|
|
343
345
|
name = "pytest"
|
|
344
|
-
version = "8.3.
|
|
346
|
+
version = "8.3.5"
|
|
345
347
|
source = { registry = "https://pypi.org/simple" }
|
|
346
348
|
dependencies = [
|
|
347
349
|
{ name = "colorama", marker = "sys_platform == 'win32'" },
|
|
@@ -351,9 +353,9 @@ dependencies = [
|
|
|
351
353
|
{ name = "pluggy" },
|
|
352
354
|
{ name = "tomli", marker = "python_full_version < '3.11'" },
|
|
353
355
|
]
|
|
354
|
-
sdist = { url = "https://files.pythonhosted.org/packages/
|
|
356
|
+
sdist = { url = "https://files.pythonhosted.org/packages/ae/3c/c9d525a414d506893f0cd8a8d0de7706446213181570cdbd766691164e40/pytest-8.3.5.tar.gz", hash = "sha256:f4efe70cc14e511565ac476b57c279e12a855b11f48f212af1080ef2263d3845", size = 1450891 }
|
|
355
357
|
wheels = [
|
|
356
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
358
|
+
{ url = "https://files.pythonhosted.org/packages/30/3d/64ad57c803f1fa1e963a7946b6e0fea4a70df53c1a7fed304586539c2bac/pytest-8.3.5-py3-none-any.whl", hash = "sha256:c69214aa47deac29fad6c2a4f590b9c4a9fdb16a403176fe154b79c0b4d4d820", size = 343634 },
|
|
357
359
|
]
|
|
358
360
|
|
|
359
361
|
[[package]]
|
|
@@ -372,27 +374,27 @@ wheels = [
|
|
|
372
374
|
|
|
373
375
|
[[package]]
|
|
374
376
|
name = "ruff"
|
|
375
|
-
version = "0.9.
|
|
377
|
+
version = "0.9.10"
|
|
376
378
|
source = { registry = "https://pypi.org/simple" }
|
|
377
|
-
sdist = { url = "https://files.pythonhosted.org/packages/
|
|
379
|
+
sdist = { url = "https://files.pythonhosted.org/packages/20/8e/fafaa6f15c332e73425d9c44ada85360501045d5ab0b81400076aff27cf6/ruff-0.9.10.tar.gz", hash = "sha256:9bacb735d7bada9cfb0f2c227d3658fc443d90a727b47f206fb33f52f3c0eac7", size = 3759776 }
|
|
378
380
|
wheels = [
|
|
379
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
380
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
381
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
382
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
383
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
384
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
385
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
386
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
387
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
388
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
389
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
390
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
391
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
392
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
393
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
394
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
395
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
381
|
+
{ url = "https://files.pythonhosted.org/packages/73/b2/af7c2cc9e438cbc19fafeec4f20bfcd72165460fe75b2b6e9a0958c8c62b/ruff-0.9.10-py3-none-linux_armv6l.whl", hash = "sha256:eb4d25532cfd9fe461acc83498361ec2e2252795b4f40b17e80692814329e42d", size = 10049494 },
|
|
382
|
+
{ url = "https://files.pythonhosted.org/packages/6d/12/03f6dfa1b95ddd47e6969f0225d60d9d7437c91938a310835feb27927ca0/ruff-0.9.10-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:188a6638dab1aa9bb6228a7302387b2c9954e455fb25d6b4470cb0641d16759d", size = 10853584 },
|
|
383
|
+
{ url = "https://files.pythonhosted.org/packages/02/49/1c79e0906b6ff551fb0894168763f705bf980864739572b2815ecd3c9df0/ruff-0.9.10-py3-none-macosx_11_0_arm64.whl", hash = "sha256:5284dcac6b9dbc2fcb71fdfc26a217b2ca4ede6ccd57476f52a587451ebe450d", size = 10155692 },
|
|
384
|
+
{ url = "https://files.pythonhosted.org/packages/5b/01/85e8082e41585e0e1ceb11e41c054e9e36fed45f4b210991052d8a75089f/ruff-0.9.10-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:47678f39fa2a3da62724851107f438c8229a3470f533894b5568a39b40029c0c", size = 10369760 },
|
|
385
|
+
{ url = "https://files.pythonhosted.org/packages/a1/90/0bc60bd4e5db051f12445046d0c85cc2c617095c0904f1aa81067dc64aea/ruff-0.9.10-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:99713a6e2766b7a17147b309e8c915b32b07a25c9efd12ada79f217c9c778b3e", size = 9912196 },
|
|
386
|
+
{ url = "https://files.pythonhosted.org/packages/66/ea/0b7e8c42b1ec608033c4d5a02939c82097ddcb0b3e393e4238584b7054ab/ruff-0.9.10-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:524ee184d92f7c7304aa568e2db20f50c32d1d0caa235d8ddf10497566ea1a12", size = 11434985 },
|
|
387
|
+
{ url = "https://files.pythonhosted.org/packages/d5/86/3171d1eff893db4f91755175a6e1163c5887be1f1e2f4f6c0c59527c2bfd/ruff-0.9.10-py3-none-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:df92aeac30af821f9acf819fc01b4afc3dfb829d2782884f8739fb52a8119a16", size = 12155842 },
|
|
388
|
+
{ url = "https://files.pythonhosted.org/packages/89/9e/700ca289f172a38eb0bca752056d0a42637fa17b81649b9331786cb791d7/ruff-0.9.10-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:de42e4edc296f520bb84954eb992a07a0ec5a02fecb834498415908469854a52", size = 11613804 },
|
|
389
|
+
{ url = "https://files.pythonhosted.org/packages/f2/92/648020b3b5db180f41a931a68b1c8575cca3e63cec86fd26807422a0dbad/ruff-0.9.10-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:d257f95b65806104b6b1ffca0ea53f4ef98454036df65b1eda3693534813ecd1", size = 13823776 },
|
|
390
|
+
{ url = "https://files.pythonhosted.org/packages/5e/a6/cc472161cd04d30a09d5c90698696b70c169eeba2c41030344194242db45/ruff-0.9.10-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:b60dec7201c0b10d6d11be00e8f2dbb6f40ef1828ee75ed739923799513db24c", size = 11302673 },
|
|
391
|
+
{ url = "https://files.pythonhosted.org/packages/6c/db/d31c361c4025b1b9102b4d032c70a69adb9ee6fde093f6c3bf29f831c85c/ruff-0.9.10-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:d838b60007da7a39c046fcdd317293d10b845001f38bcb55ba766c3875b01e43", size = 10235358 },
|
|
392
|
+
{ url = "https://files.pythonhosted.org/packages/d1/86/d6374e24a14d4d93ebe120f45edd82ad7dcf3ef999ffc92b197d81cdc2a5/ruff-0.9.10-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:ccaf903108b899beb8e09a63ffae5869057ab649c1e9231c05ae354ebc62066c", size = 9886177 },
|
|
393
|
+
{ url = "https://files.pythonhosted.org/packages/00/62/a61691f6eaaac1e945a1f3f59f1eea9a218513139d5b6c2b8f88b43b5b8f/ruff-0.9.10-py3-none-musllinux_1_2_i686.whl", hash = "sha256:f9567d135265d46e59d62dc60c0bfad10e9a6822e231f5b24032dba5a55be6b5", size = 10864747 },
|
|
394
|
+
{ url = "https://files.pythonhosted.org/packages/ee/94/2c7065e1d92a8a8a46d46d9c3cf07b0aa7e0a1e0153d74baa5e6620b4102/ruff-0.9.10-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:5f202f0d93738c28a89f8ed9eaba01b7be339e5d8d642c994347eaa81c6d75b8", size = 11360441 },
|
|
395
|
+
{ url = "https://files.pythonhosted.org/packages/a7/8f/1f545ea6f9fcd7bf4368551fb91d2064d8f0577b3079bb3f0ae5779fb773/ruff-0.9.10-py3-none-win32.whl", hash = "sha256:bfb834e87c916521ce46b1788fbb8484966e5113c02df216680102e9eb960029", size = 10247401 },
|
|
396
|
+
{ url = "https://files.pythonhosted.org/packages/4f/18/fb703603ab108e5c165f52f5b86ee2aa9be43bb781703ec87c66a5f5d604/ruff-0.9.10-py3-none-win_amd64.whl", hash = "sha256:f2160eeef3031bf4b17df74e307d4c5fb689a6f3a26a2de3f7ef4044e3c484f1", size = 11366360 },
|
|
397
|
+
{ url = "https://files.pythonhosted.org/packages/35/85/338e603dc68e7d9994d5d84f24adbf69bae760ba5efd3e20f5ff2cec18da/ruff-0.9.10-py3-none-win_arm64.whl", hash = "sha256:5fd804c0327a5e5ea26615550e706942f348b197d5475ff34c19733aee4b2e69", size = 10436892 },
|
|
396
398
|
]
|
|
397
399
|
|
|
398
400
|
[[package]]
|
|
@@ -445,7 +447,7 @@ wheels = [
|
|
|
445
447
|
|
|
446
448
|
[[package]]
|
|
447
449
|
name = "typer"
|
|
448
|
-
version = "0.15.
|
|
450
|
+
version = "0.15.2"
|
|
449
451
|
source = { registry = "https://pypi.org/simple" }
|
|
450
452
|
dependencies = [
|
|
451
453
|
{ name = "click" },
|
|
@@ -453,9 +455,9 @@ dependencies = [
|
|
|
453
455
|
{ name = "shellingham" },
|
|
454
456
|
{ name = "typing-extensions" },
|
|
455
457
|
]
|
|
456
|
-
sdist = { url = "https://files.pythonhosted.org/packages/
|
|
458
|
+
sdist = { url = "https://files.pythonhosted.org/packages/8b/6f/3991f0f1c7fcb2df31aef28e0594d8d54b05393a0e4e34c65e475c2a5d41/typer-0.15.2.tar.gz", hash = "sha256:ab2fab47533a813c49fe1f16b1a370fd5819099c00b119e0633df65f22144ba5", size = 100711 }
|
|
457
459
|
wheels = [
|
|
458
|
-
{ url = "https://files.pythonhosted.org/packages/
|
|
460
|
+
{ url = "https://files.pythonhosted.org/packages/7f/fc/5b29fea8cee020515ca82cc68e3b8e1e34bb19a3535ad854cac9257b414c/typer-0.15.2-py3-none-any.whl", hash = "sha256:46a499c6107d645a9c13f7ee46c5d5096cae6f5fc57dd11eccbbb9ae3e44ddfc", size = 45061 },
|
|
459
461
|
]
|
|
460
462
|
|
|
461
463
|
[[package]]
|
iparq-0.1.5/src/iparq/source.py
DELETED
|
@@ -1,142 +0,0 @@
|
|
|
1
|
-
import pyarrow.parquet as pq
|
|
2
|
-
import typer
|
|
3
|
-
from pydantic import BaseModel
|
|
4
|
-
from rich import print
|
|
5
|
-
from rich.console import Console
|
|
6
|
-
|
|
7
|
-
app = typer.Typer()
|
|
8
|
-
console = Console()
|
|
9
|
-
|
|
10
|
-
|
|
11
|
-
class ParquetMetaModel(BaseModel):
|
|
12
|
-
"""
|
|
13
|
-
ParquetMetaModel is a data model representing metadata for a Parquet file.
|
|
14
|
-
|
|
15
|
-
Attributes:
|
|
16
|
-
created_by (str): The creator of the Parquet file.
|
|
17
|
-
num_columns (int): The number of columns in the Parquet file.
|
|
18
|
-
num_rows (int): The number of rows in the Parquet file.
|
|
19
|
-
num_row_groups (int): The number of row groups in the Parquet file.
|
|
20
|
-
format_version (str): The version of the Parquet format used.
|
|
21
|
-
serialized_size (int): The size of the serialized Parquet file in bytes.
|
|
22
|
-
"""
|
|
23
|
-
|
|
24
|
-
created_by: str
|
|
25
|
-
num_columns: int
|
|
26
|
-
num_rows: int
|
|
27
|
-
num_row_groups: int
|
|
28
|
-
format_version: str
|
|
29
|
-    serialized_size: int
-
-
-def read_parquet_metadata(filename: str):
-    """
-    Reads the metadata of a Parquet file and extracts the compression codecs used.
-
-    Args:
-        filename (str): The path to the Parquet file.
-
-    Returns:
-        tuple: A tuple containing:
-            - parquet_metadata (pyarrow.parquet.FileMetaData): The metadata of the Parquet file.
-            - compression_codecs (set): A set of compression codecs used in the Parquet file.
-    """
-    try:
-        compression_codecs = set([])
-        parquet_metadata = pq.ParquetFile(filename).metadata
-
-        for i in range(parquet_metadata.num_row_groups):
-            for j in range(parquet_metadata.num_columns):
-                compression_codecs.add(
-                    parquet_metadata.row_group(i).column(j).compression
-                )
-
-    except FileNotFoundError:
-        console.print(
-            f"Cannot open: {filename}.", style="blink bold red underline on white"
-        )
-        exit(1)
-
-    return parquet_metadata, compression_codecs
-
-
-def print_parquet_metadata(parquet_metadata):
-    """
-    Prints the metadata of a Parquet file.
-
-    Args:
-        parquet_metadata: An object containing metadata of a Parquet file.
-            Expected attributes are:
-            - created_by: The creator of the Parquet file.
-            - num_columns: The number of columns in the Parquet file.
-            - num_rows: The number of rows in the Parquet file.
-            - num_row_groups: The number of row groups in the Parquet file.
-            - format_version: The format version of the Parquet file.
-            - serialized_size: The serialized size of the Parquet file.
-
-    Raises:
-        AttributeError: If the provided parquet_metadata object does not have the expected attributes.
-    """
-    try:
-        meta = ParquetMetaModel(
-            created_by=parquet_metadata.created_by,
-            num_columns=parquet_metadata.num_columns,
-            num_rows=parquet_metadata.num_rows,
-            num_row_groups=parquet_metadata.num_row_groups,
-            format_version=str(parquet_metadata.format_version),
-            serialized_size=parquet_metadata.serialized_size,
-        )
-        console.print(meta)
-
-    except AttributeError as e:
-        console.print(f"Error: {e}", style="blink bold red underline on white")
-    finally:
-        pass
-
-
-def print_compression_types(parquet_metadata) -> None:
-    """
-    Prints the compression type for each column in each row group of the Parquet file.
-    """
-    try:
-        num_row_groups = parquet_metadata.num_row_groups
-        num_columns = parquet_metadata.num_columns
-        console.print("[bold underline]Column Compression Info:[/bold underline]")
-        for i in range(num_row_groups):
-            console.print(f"[bold]Row Group {i}:[/bold]")
-            for j in range(num_columns):
-                column_chunk = parquet_metadata.row_group(i).column(j)
-                compression = column_chunk.compression
-                column_name = parquet_metadata.schema.column(j).name
-                console.print(
-                    f" Column '{column_name}' (Index {j}): [italic]{compression}[/italic]"
-                )
-    except Exception as e:
-        console.print(
-            f"Error while printing compression types: {e}",
-            style="blink bold red underline on white",
-        )
-    finally:
-        pass
-
-
-@app.command()
-def main(filename: str):
-    """
-    Main function to read and print Parquet file metadata.
-
-    Args:
-        filename (str): The path to the Parquet file.
-
-    Returns:
-        Metadata of the Parquet file and the compression codecs used.
-    """
-    (parquet_metadata, compression) = read_parquet_metadata(filename)
-
-    print_parquet_metadata(parquet_metadata)
-    print_compression_types(parquet_metadata)
-    print(f"Compression codecs: {compression}")
-
-
-if __name__ == "__main__":
-    app()
iparq-0.1.5/tests/test_cli.py
DELETED
File without changes