atlas-ftag-tools 0.2.10__py3-none-any.whl → 0.2.11__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- atlas_ftag_tools-0.2.11.dist-info/METADATA +53 -0
- {atlas_ftag_tools-0.2.10.dist-info → atlas_ftag_tools-0.2.11.dist-info}/RECORD +13 -11
- {atlas_ftag_tools-0.2.10.dist-info → atlas_ftag_tools-0.2.11.dist-info}/WHEEL +1 -1
- {atlas_ftag_tools-0.2.10.dist-info → atlas_ftag_tools-0.2.11.dist-info}/entry_points.txt +1 -0
- atlas_ftag_tools-0.2.11.dist-info/licenses/LICENSE +201 -0
- ftag/__init__.py +11 -11
- ftag/flavours.yaml +17 -12
- ftag/hdf5/__init__.py +5 -3
- ftag/hdf5/h5add_col.py +391 -0
- ftag/hdf5/h5writer.py +12 -1
- ftag/utils/__init__.py +2 -2
- ftag/vds.py +39 -4
- atlas_ftag_tools-0.2.10.dist-info/METADATA +0 -151
- {atlas_ftag_tools-0.2.10.dist-info → atlas_ftag_tools-0.2.11.dist-info}/top_level.txt +0 -0
@@ -0,0 +1,53 @@
+Metadata-Version: 2.4
+Name: atlas-ftag-tools
+Version: 0.2.11
+Summary: ATLAS Flavour Tagging Tools
+Author: Sam Van Stroud, Philipp Gadow
+License: MIT
+Project-URL: Homepage, https://github.com/umami-hep/atlas-ftag-tools/
+Requires-Python: <3.12,>=3.8
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: h5py>=3.0
+Requires-Dist: numpy>=2.2.3
+Requires-Dist: PyYAML>=5.1
+Requires-Dist: scipy>=1.15.2
+Provides-Extra: dev
+Requires-Dist: ruff==0.6.2; extra == "dev"
+Requires-Dist: mypy==1.11.2; extra == "dev"
+Requires-Dist: pre-commit==3.1.1; extra == "dev"
+Requires-Dist: pytest==7.2.2; extra == "dev"
+Requires-Dist: pytest-cov==4.0.0; extra == "dev"
+Requires-Dist: pytest_notebook==0.10.0; extra == "dev"
+Requires-Dist: ipykernel==6.21.3; extra == "dev"
+Dynamic: license-file
+
+[](https://github.com/psf/black)
+[](https://umami-hep.github.io/atlas-ftag-tools/main)
+[](https://badge.fury.io/py/atlas-ftag-tools)
+[](https://codecov.io/gh/umami-hep/atlas-ftag-tools)
+
+# ATLAS FTAG Python Tools
+
+This is a collection of Python tools for working with files produced with the FTAG [ntuple dumper](https://gitlab.cern.ch/atlas-flavor-tagging-tools/training-dataset-dumper/).
+The code is intended to be used a [library](https://iscinumpy.dev/post/app-vs-library/) for other projects.
+Please see the [example notebook](ftag/example.ipynb) for usage.
+
+# Quickstart
+
+## Installation
+
+If you want to use this package without modification, you can install from [pypi](https://pypi.org/project/atlas-ftag-tools/) using `pip`.
+
+```bash
+pip install atlas-ftag-tools
+```
+
+To additionally install the development dependencies (for formatting and linting) use
+```bash
+pip install atlas-ftag-tools[dev]
+```
+
+## Usage
+
+Extensive examples are given in the [Examples](https://umami-hep.github.io/atlas-ftag-tools/main/examples/index.html)
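The `Requires-Dist` headers in this METADATA file mix core requirements with ones gated on the `dev` extra. Core metadata uses an RFC 822 style header layout, so as a rough illustration (not part of the package) Python's standard `email` parser can separate the two groups. The sketch embeds a shortened copy of the headers; the helper name `split_requirements` is hypothetical.

```python
from email import message_from_string

# Shortened copy of the METADATA headers from the hunk (illustration only).
METADATA = """\
Metadata-Version: 2.4
Name: atlas-ftag-tools
Version: 0.2.11
Requires-Python: <3.12,>=3.8
Requires-Dist: h5py>=3.0
Requires-Dist: numpy>=2.2.3
Requires-Dist: PyYAML>=5.1
Requires-Dist: scipy>=1.15.2
Provides-Extra: dev
Requires-Dist: ruff==0.6.2; extra == "dev"
Requires-Dist: mypy==1.11.2; extra == "dev"
"""


def split_requirements(text):
    """Return (core, extra_only) requirement strings from METADATA text."""
    msg = message_from_string(text)
    core, extra_only = [], []
    for req in msg.get_all("Requires-Dist", []):
        spec, _, marker = req.partition(";")
        if "extra ==" in marker:
            # Only installed when the extra is requested, e.g. `[dev]`.
            extra_only.append(spec.strip())
        else:
            core.append(req.strip())
    return core, extra_only


core, dev = split_requirements(METADATA)
print("core:", core)
print("dev: ", dev)
```

A real tool would more likely read the installed metadata through `importlib.metadata` than parse the text by hand; the manual parse is used here only to keep the example self-contained.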
@@ -1,8 +1,9 @@
-
+atlas_ftag_tools-0.2.11.dist-info/licenses/LICENSE,sha256=R4o6bZfajQ1KxwcIeavTC00qYTdL33YGNe1hzfV53gM,11349
+ftag/__init__.py,sha256=BGQ1MtuhqCHFXRAh9S9f_ZnOCLWB5RA0ZtL9lW2tofs,748
 ftag/cli_utils.py,sha256=w3TtQmUHSyAKChS3ewvOtcSDAUJAZGIIomaNi8f446U,298
 ftag/cuts.py,sha256=9_ooLZHaO3SnIQBNxwbaPZn-qptGdKnB27FdKQGTiTY,2933
 ftag/flavours.py,sha256=ShH4M2UjQZpZ_NlCctTm2q1tJbzYxjmGteioQ2GcqEU,114
-ftag/flavours.yaml,sha256=
+ftag/flavours.yaml,sha256=CrVTJKndHeL15LT2nkjPodi6Ck9mk_oUtdRby6X_Rcc,9921
 ftag/fraction_optimization.py,sha256=IlMEJe5fD0soX40f-LO4dYAYld2gMqgZRuBLctoPn9A,5566
 ftag/git_check.py,sha256=Y-XqM80CVXZ5ZKrDdZcYOJt3X64uU6W3OP6Z0D7AZU0,1663
 ftag/labeller.py,sha256=IXUgU9UBir39PxVWRKs5r5fqI66Tv0x7nJD3-RYpbrg,2780
@@ -12,19 +13,20 @@ ftag/region.py,sha256=ANv0dGI2W6NJqD9fp7EfqAUReH4FOjc1gwl_Qn8llcM,360
 ftag/sample.py,sha256=3N0FrRcu9l1sX8ohuGOHuMYGD0See6gMO4--7NzR2tE,2538
 ftag/track_selector.py,sha256=fJNk_kIBQriBqV4CPT_3ReJbOUnavDDzO-u3EQlRuyk,2654
 ftag/transform.py,sha256=uEGGJSnqoKOzLYQv650XdK_kDNw4Aw-5dc60z9Dp_y0,3963
-ftag/vds.py,sha256=
+ftag/vds.py,sha256=wqj1cA6mIJ4enk8inkearo7ccTw5KCbvuNo2oon51fc,4565
 ftag/working_points.py,sha256=RJws2jPMEDQDspCbXUZBifS1CCBmlMJ5ax0eMyDzCRA,15949
-ftag/hdf5/__init__.py,sha256=
+ftag/hdf5/__init__.py,sha256=8yzVQITge-HKkBQQ60eJwWmWDycYZjgVs-qVg4ShVr0,385
+ftag/hdf5/h5add_col.py,sha256=htS5wn4Tm4S3U6mrJ8s24VUnbI7o28Z6Ll-J_V68xTA,12558
 ftag/hdf5/h5move.py,sha256=oYpRu0IDCIJIQ2ML52HBAdoyDxmKkHTeM9JdbPEgKfI,947
 ftag/hdf5/h5reader.py,sha256=i31pDAqmOSaxdeRhc4iSBlld8xJ0pmp4rNd7CugNzw0,13706
 ftag/hdf5/h5split.py,sha256=4Wy6Xc3J58MdD9aBaSZHf5ZcVFnJSkWsm42R5Pgo-R4,2448
 ftag/hdf5/h5utils.py,sha256=-4zKTMtNCrDZr_9Ww7uzfsB7M7muBKpmm_1IkKJnHOI,3222
-ftag/hdf5/h5writer.py,sha256=
-ftag/utils/__init__.py,sha256=
+ftag/hdf5/h5writer.py,sha256=2gBztierWdwZIqcFItoYz8oua_7hphOI8mbDg7xBdPs,5784
+ftag/utils/__init__.py,sha256=U3YyLY77-FzxRUbudxciieDoy_mnLlY3OfBquA3PnTE,524
 ftag/utils/logging.py,sha256=54NaQiC9Bh4vSznSqzoPfR-7tj1PXfmoH7yKgv_ZHZk,3192
 ftag/utils/metrics.py,sha256=zQI4nPeRDSyzqKpdOPmu0GU560xSWoW1wgL13rrja-I,12664
-atlas_ftag_tools-0.2.
-atlas_ftag_tools-0.2.
-atlas_ftag_tools-0.2.
-atlas_ftag_tools-0.2.
-atlas_ftag_tools-0.2.
+atlas_ftag_tools-0.2.11.dist-info/METADATA,sha256=DVmllPN7YQNNmyDcTs3hEGo8mX8ogSReXq9gs6MwUR0,2152
+atlas_ftag_tools-0.2.11.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
+atlas_ftag_tools-0.2.11.dist-info/entry_points.txt,sha256=acr7WwxMIJ3x2I7AheNxNnpWE7sS8XE9MA1eUJGcU5A,169
+atlas_ftag_tools-0.2.11.dist-info/top_level.txt,sha256=qiYQuKcAvMim-31FwkT3MTQu7WQm0s58tPAia5KKWqs,5
+atlas_ftag_tools-0.2.11.dist-info/RECORD,,
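Each RECORD row has the form `path,sha256=<digest>,<size in bytes>`, where the digest is the SHA-256 of the file encoded as URL-safe base64 with the trailing `=` padding removed. A minimal sketch of how such a row can be produced is shown here; the function name `record_entry` and the example path are hypothetical, and the hashes listed in this diff cannot be reproduced without the actual files.

```python
import base64
import hashlib


def record_entry(path, data):
    """Build one RECORD-style row for a file with the given bytes."""
    digest = hashlib.sha256(data).digest()
    # URL-safe base64 with the "=" padding stripped, as used in wheel RECORD files.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{path},sha256={encoded},{len(data)}"


# An empty file, for example an empty __init__.py (hypothetical path).
entry = record_entry("pkg/__init__.py", b"")
print(entry)
```

The RECORD file lists itself with empty hash and size fields (`RECORD,,`), since it cannot contain its own digest.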
@@ -0,0 +1,201 @@
+                                 Apache License
+                           Version 2.0, January 2004
+                        http://www.apache.org/licenses/
+
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+   1. Definitions.
+
+      "License" shall mean the terms and conditions for use, reproduction,
+      and distribution as defined by Sections 1 through 9 of this document.
+
+      "Licensor" shall mean the copyright owner or entity authorized by
+      the copyright owner that is granting the License.
+
+      "Legal Entity" shall mean the union of the acting entity and all
+      other entities that control, are controlled by, or are under common
+      control with that entity. For the purposes of this definition,
+      "control" means (i) the power, direct or indirect, to cause the
+      direction or management of such entity, whether by contract or
+      otherwise, or (ii) ownership of fifty percent (50%) or more of the
+      outstanding shares, or (iii) beneficial ownership of such entity.
+
+      "You" (or "Your") shall mean an individual or Legal Entity
+      exercising permissions granted by this License.
+
+      "Source" form shall mean the preferred form for making modifications,
+      including but not limited to software source code, documentation
+      source, and configuration files.
+
+      "Object" form shall mean any form resulting from mechanical
+      transformation or translation of a Source form, including but
+      not limited to compiled object code, generated documentation,
+      and conversions to other media types.
+
+      "Work" shall mean the work of authorship, whether in Source or
+      Object form, made available under the License, as indicated by a
+      copyright notice that is included in or attached to the work
+      (an example is provided in the Appendix below).
+
+      "Derivative Works" shall mean any work, whether in Source or Object
+      form, that is based on (or derived from) the Work and for which the
+      editorial revisions, annotations, elaborations, or other modifications
+      represent, as a whole, an original work of authorship. For the purposes
+      of this License, Derivative Works shall not include works that remain
+      separable from, or merely link (or bind by name) to the interfaces of,
+      the Work and Derivative Works thereof.
+
+      "Contribution" shall mean any work of authorship, including
+      the original version of the Work and any modifications or additions
+      to that Work or Derivative Works thereof, that is intentionally
+      submitted to Licensor for inclusion in the Work by the copyright owner
+      or by an individual or Legal Entity authorized to submit on behalf of
+      the copyright owner. For the purposes of this definition, "submitted"
+      means any form of electronic, verbal, or written communication sent
+      to the Licensor or its representatives, including but not limited to
+      communication on electronic mailing lists, source code control systems,
+      and issue tracking systems that are managed by, or on behalf of, the
+      Licensor for the purpose of discussing and improving the Work, but
+      excluding communication that is conspicuously marked or otherwise
+      designated in writing by the copyright owner as "Not a Contribution."
+
+      "Contributor" shall mean Licensor and any individual or Legal Entity
+      on behalf of whom a Contribution has been received by Licensor and
+      subsequently incorporated within the Work.
+
+   2. Grant of Copyright License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      copyright license to reproduce, prepare Derivative Works of,
+      publicly display, publicly perform, sublicense, and distribute the
+      Work and such Derivative Works in Source or Object form.
+
+   3. Grant of Patent License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      (except as stated in this section) patent license to make, have made,
+      use, offer to sell, sell, import, and otherwise transfer the Work,
+      where such license applies only to those patent claims licensable
+      by such Contributor that are necessarily infringed by their
+      Contribution(s) alone or by combination of their Contribution(s)
+      with the Work to which such Contribution(s) was submitted. If You
+      institute patent litigation against any entity (including a
+      cross-claim or counterclaim in a lawsuit) alleging that the Work
+      or a Contribution incorporated within the Work constitutes direct
+      or contributory patent infringement, then any patent licenses
+      granted to You under this License for that Work shall terminate
+      as of the date such litigation is filed.
+
+   4. Redistribution. You may reproduce and distribute copies of the
+      Work or Derivative Works thereof in any medium, with or without
+      modifications, and in Source or Object form, provided that You
+      meet the following conditions:
+
+      (a) You must give any other recipients of the Work or
+          Derivative Works a copy of this License; and
+
+      (b) You must cause any modified files to carry prominent notices
+          stating that You changed the files; and
+
+      (c) You must retain, in the Source form of any Derivative Works
+          that You distribute, all copyright, patent, trademark, and
+          attribution notices from the Source form of the Work,
+          excluding those notices that do not pertain to any part of
+          the Derivative Works; and
+
+      (d) If the Work includes a "NOTICE" text file as part of its
+          distribution, then any Derivative Works that You distribute must
+          include a readable copy of the attribution notices contained
+          within such NOTICE file, excluding those notices that do not
+          pertain to any part of the Derivative Works, in at least one
+          of the following places: within a NOTICE text file distributed
+          as part of the Derivative Works; within the Source form or
+          documentation, if provided along with the Derivative Works; or,
+          within a display generated by the Derivative Works, if and
+          wherever such third-party notices normally appear. The contents
+          of the NOTICE file are for informational purposes only and
+          do not modify the License. You may add Your own attribution
+          notices within Derivative Works that You distribute, alongside
+          or as an addendum to the NOTICE text from the Work, provided
+          that such additional attribution notices cannot be construed
+          as modifying the License.
+
+      You may add Your own copyright statement to Your modifications and
+      may provide additional or different license terms and conditions
+      for use, reproduction, or distribution of Your modifications, or
+      for any such Derivative Works as a whole, provided Your use,
+      reproduction, and distribution of the Work otherwise complies with
+      the conditions stated in this License.
+
+   5. Submission of Contributions. Unless You explicitly state otherwise,
+      any Contribution intentionally submitted for inclusion in the Work
+      by You to the Licensor shall be under the terms and conditions of
+      this License, without any additional terms or conditions.
+      Notwithstanding the above, nothing herein shall supersede or modify
+      the terms of any separate license agreement you may have executed
+      with Licensor regarding such Contributions.
+
+   6. Trademarks. This License does not grant permission to use the trade
+      names, trademarks, service marks, or product names of the Licensor,
+      except as required for reasonable and customary use in describing the
+      origin of the Work and reproducing the content of the NOTICE file.
+
+   7. Disclaimer of Warranty. Unless required by applicable law or
+      agreed to in writing, Licensor provides the Work (and each
+      Contributor provides its Contributions) on an "AS IS" BASIS,
+      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+      implied, including, without limitation, any warranties or conditions
+      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+      PARTICULAR PURPOSE. You are solely responsible for determining the
+      appropriateness of using or redistributing the Work and assume any
+      risks associated with Your exercise of permissions under this License.
+
+   8. Limitation of Liability. In no event and under no legal theory,
+      whether in tort (including negligence), contract, or otherwise,
+      unless required by applicable law (such as deliberate and grossly
+      negligent acts) or agreed to in writing, shall any Contributor be
+      liable to You for damages, including any direct, indirect, special,
+      incidental, or consequential damages of any character arising as a
+      result of this License or out of the use or inability to use the
+      Work (including but not limited to damages for loss of goodwill,
+      work stoppage, computer failure or malfunction, or any and all
+      other commercial damages or losses), even if such Contributor
+      has been advised of the possibility of such damages.
+
+   9. Accepting Warranty or Additional Liability. While redistributing
+      the Work or Derivative Works thereof, You may choose to offer,
+      and charge a fee for, acceptance of support, warranty, indemnity,
+      or other liability obligations and/or rights consistent with this
+      License. However, in accepting such obligations, You may act only
+      on Your own behalf and on Your sole responsibility, not on behalf
+      of any other Contributor, and only if You agree to indemnify,
+      defend, and hold each Contributor harmless for any liability
+      incurred by, or claims asserted against, such Contributor by reason
+      of your accepting any such warranty or additional liability.
+
+   END OF TERMS AND CONDITIONS
+
+   APPENDIX: How to apply the Apache License to your work.
+
+      To apply the Apache License to your work, attach the following
+      boilerplate notice, with the fields enclosed by brackets "[]"
+      replaced with your own identifying information. (Don't include
+      the brackets!) The text should be enclosed in the appropriate
+      comment syntax for the file format. We also recommend that a
+      file or class name and description of purpose be included on the
+      same "printed page" as the copyright notice for easier
+      identification within third-party archives.
+
+   Copyright [2025] [Alexander Froch]
+
+   Licensed under the Apache License, Version 2.0 (the "License");
+   you may not use this file except in compliance with the License.
+   You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
ftag/__init__.py
CHANGED
@@ -2,18 +2,18 @@
 
 from __future__ import annotations
 
-__version__ = "v0.2.
+__version__ = "v0.2.11"
 
-from
-from
-from
-from
-from
-from
-from
-from
-from
-from
+from . import hdf5, utils
+from .cuts import Cuts
+from .flavours import Flavours
+from .fraction_optimization import calculate_best_fraction_values
+from .labeller import Labeller
+from .labels import Label, LabelContainer
+from .mock import get_mock_file
+from .sample import Sample
+from .transform import Transform
+from .working_points import get_working_points
 
 __all__ = [
     "Cuts",
ftag/flavours.yaml
CHANGED
@@ -107,42 +107,47 @@
   colour: "#38761D"
   category: xbb
 - name: qcdbb
-  label: QCD
-  cuts: ["R10TruthLabel_R22v1 == 10", "GhostBHadronsFinalCount
+  label: $\mathrm{QCD} \rightarrow b \bar{b}$
+  cuts: ["R10TruthLabel_R22v1 == 10", "GhostBHadronsFinalCount >= 2"]
   colour: "red"
   category: xbb
 - name: qcdnonbb
-  label: QCD
+  label: $\mathrm{QCD} \rightarrow \mathrm{non-} b \bar{b}$
   cuts: ["R10TruthLabel_R22v1 == 10", "GhostBHadronsFinalCount != 2"]
   colour: "silver"
   category: xbb
 - name: qcdbx
-  label: QCD
+  label: $\mathrm{QCD} \rightarrow bX$
   cuts: ["R10TruthLabel_R22v1 == 10", "GhostBHadronsFinalCount == 1"]
   colour: "gold"
   category: xbb
 - name: qcdcx
-  label: QCD
+  label: $\mathrm{QCD} \rightarrow cX$
   cuts: ["R10TruthLabel_R22v1 == 10", "GhostCHadronsFinalCount >= 1", "GhostBHadronsFinalCount == 0"]
   colour: "pink"
   category: xbb
 - name: qcdll
-  label: QCD
+  label: $\mathrm{QCD} \rightarrow ll$
   cuts: ["R10TruthLabel_R22v1 == 10", "GhostBHadronsFinalCount == 0", "GhostCHadronsFinalCount == 0"]
   colour: "green"
   category: xbb
-- name:
-  label: $
+- name: Wqq
+  label: $W \rightarrow q\bar{q}$
+  cuts: ["R10TruthLabel_R22v1 == 2", "GhostBHadronsFinalCount < 2", "GhostCHadronsFinalCount < 2"]
+  colour: "purple"
+  category: xbb
+- name: htautauel
+  label: $H \rightarrow \tau_{\mathrm{had}} \tau_{e}$
   cuts: ["R10TruthLabel_R22v1 == 14"]
   colour: "#b40612"
   category: xbb
-- name:
-  label: $H \rightarrow \
+- name: htautaumu
+  label: $H \rightarrow \tau_{\mathrm{had}} \tau_{\mu}$
   cuts: ["R10TruthLabel_R22v1 == 15"]
   colour: "#b40657"
   category: xbb
-- name:
-  label: $H \rightarrow \
+- name: htautauhad
+  label: $H \rightarrow \tau_{\mathrm{had}} \tau_{\mathrm{had}}$
   cuts: ["R10TruthLabel_R22v1 == 16"]
   colour: "#b406a0"
   category: xbb
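To see what the cut strings in this hunk select, the sketch below evaluates three of the QCD definitions against a few jets described as plain dictionaries. It is a toy evaluator written for illustration only, assuming every cut has the form `variable operator integer`; it is not the `Cuts` class shipped in `ftag`, and the names `passes` and `matching` are hypothetical.

```python
import operator

# Comparison operators that appear in the cut strings of this hunk.
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}

# Cut lists copied from the new side of the hunk.
FLAVOURS = {
    "qcdbb": ["R10TruthLabel_R22v1 == 10", "GhostBHadronsFinalCount >= 2"],
    "qcdnonbb": ["R10TruthLabel_R22v1 == 10", "GhostBHadronsFinalCount != 2"],
    "qcdbx": ["R10TruthLabel_R22v1 == 10", "GhostBHadronsFinalCount == 1"],
}


def passes(cuts, jet):
    """True if the jet satisfies every 'variable op value' cut."""
    for cut in cuts:
        var, op, value = cut.split()
        if not OPS[op](jet[var], int(value)):
            return False
    return True


def matching(jet):
    """Names of all flavours whose cuts the jet passes, in definition order."""
    return [name for name, cuts in FLAVOURS.items() if passes(cuts, jet)]


for n_b in (1, 2, 3):
    jet = {"R10TruthLabel_R22v1": 10, "GhostBHadronsFinalCount": n_b}
    print(n_b, matching(jet))
```

With these definitions a jet with three b-hadrons passes both `qcdbb` (`>= 2`) and `qcdnonbb` (`!= 2`). Whether that overlap matters depends on which labels are combined in practice, which this diff does not show.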
ftag/hdf5/__init__.py
CHANGED
@@ -1,14 +1,16 @@
 from __future__ import annotations
 
-from
-from
-from
+from .h5add_col import h5_add_column
+from .h5reader import H5Reader
+from .h5utils import cast_dtype, get_dtype, join_structured_arrays, structured_from_dict
+from .h5writer import H5Writer
 
 __all__ = [
     "H5Reader",
     "H5Writer",
     "cast_dtype",
     "get_dtype",
+    "h5_add_column",
     "join_structured_arrays",
     "structured_from_dict",
 ]
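The newly exported `h5_add_column` expects, according to its docstring in `ftag/hdf5/h5add_col.py`, one or more append functions that take a batch and return a dictionary of the form `{group: {new_column: data}}`. The sketch below shows what such a function might look like. It assumes a `jets` group with a `pt` field stored in MeV; both are assumptions, as is the name `add_pt_gev`. Instead of calling `h5_add_column` on a real file, the last lines mimic the name and shape checks and the `append_fields` merge that the new module performs, so the example runs without any input data.

```python
import numpy as np
from numpy.lib import recfunctions as rfn


def add_pt_gev(batch):
    """Hypothetical append function: jet pT in GeV from a `pt` field in MeV."""
    jets = batch["jets"]
    return {"jets": {"pt_gev": jets["pt"] / 1000.0}}


# Stand-in for one batch: a dict of structured arrays keyed by group name.
batch = {
    "jets": np.array(
        [(25000.0, 0.1), (40000.0, -1.2)],
        dtype=[("pt", "f4"), ("eta", "f4")],
    )
}

new_columns = add_pt_gev(batch)

# The same two conditions the diff checks before writing:
# the new name must not already exist, and the shape must match the group.
for name, values in new_columns["jets"].items():
    assert name not in batch["jets"].dtype.names
    assert values.shape == batch["jets"].shape

# Merge the new column into the structured array.
combined = rfn.append_fields(
    batch["jets"],
    list(new_columns["jets"].keys()),
    list(new_columns["jets"].values()),
    usemask=False,
)
print(combined.dtype.names)
```

With a real file, the call would presumably follow the signature shown in the diff, for example `h5_add_column("in.h5", "out.h5", add_pt_gev)`, where the file names are placeholders.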
ftag/hdf5/h5add_col.py
ADDED
@@ -0,0 +1,391 @@
+# Utils to take an input h5 file, and append one or more columns to it
+from __future__ import annotations
+
+import argparse
+import importlib.util
+from pathlib import Path
+from typing import Callable
+
+import h5py
+import numpy as np
+
+from ftag.hdf5.h5reader import H5Reader
+from ftag.hdf5.h5writer import H5Writer
+
+
+def merge_dicts(dicts: list[dict[str, dict[str, np.ndarray]]]) -> dict[str, dict[str, np.ndarray]]:
+    """Merges a list of dictionaries.
+
+    Each dict is of the form:
+        {
+            group1: {
+                variable_1: np.array
+                variable_2: np.array
+            },
+            group2: {
+                variable_1: np.array
+                variable_2: np.array
+            }
+        }
+
+    E.g.
+
+    dict1 = {
+        "jets": {
+            "pt": np.array([1, 2, 3]),
+            "eta": np.array([4, 5, 6])
+        },
+    }
+    dict2 = {
+        "jets": {
+            "phi": np.array([7, 8, 9]),
+            "energy": np.array([10, 11, 12])
+        },
+    }
+
+    merged = {
+        "jets": {
+            "pt": np.array([1, 2, 3]),
+            "eta": np.array([4, 5, 6]),
+            "phi": np.array([7, 8, 9]),
+            "energy": np.array([10, 11, 12])
+        }
+    }
+
+    Parameters
+    ----------
+    dicts : list[dict[str, dict[str, np.ndarray]]]
+        List of dictionaries to merge. Each dictionary should be of the form:
+
+    Returns
+    -------
+    dict[str, dict[str, np.ndarray]]
+        Merged dictionary of the form:
+        {
+            group1: {
+                variable_1: np.array
+                variable_2: np.array
+            },
+            group2: {
+                variable_1: np.array
+                variable_2: np.array
+            }
+        }
+
+    Raises
+    ------
+    ValueError
+        If a variable already exists in the merged dictionary.
+    """
+    merged: dict[str, dict[str, np.ndarray]] = {}
+    for d in dicts:
+        for group, variables in d.items():
+            if group not in merged:
+                merged[group] = {}
+            for variable, data in variables.items():
+                if variable not in merged[group]:
+                    merged[group][variable] = data
+                else:
+                    raise ValueError(f"Variable {variable} already exists in group {group}.")
+    return merged
+
+
+def get_shape(num_jets: int, batch: dict[str, np.ndarray]) -> dict[str, tuple[int, ...]]:
+    """Returns a dictionary with the correct output shapes for the H5Writer.
+
+    Parameters
+    ----------
+    num_jets : int
+        Number of jets to write in total
+    batch : dict[str, np.ndarray]
+        Dictionary representing the batch
+
+    Returns
+    -------
+    dict[str, tuple[int, ...]]
+        Dictionary with the shapes of the output arrays
+    """
+    shape: dict[str, tuple[int, ...]] = {}
+
+    for key, values in batch.items():
+        if values.ndim == 1:
+            shape[key] = (num_jets,)
+        else:
+            shape[key] = (num_jets,) + values.shape[1:]
+    return shape
+
+
+def get_all_groups(file: Path | str) -> dict[str, None]:
+    """Returns a dictionary with all the groups in the h5 file.
+
+    Parameters
+    ----------
+    file : Path | str
+        Path to the h5 file
+
+    Returns
+    -------
+    dict[str, None]
+        A dictionary with all the groups in the h5 file as keys and None as values,
+        such that h5read.stream(all_groups) will return all the groups in the file.
+    """
+    with h5py.File(file, "r") as f:
+        groups = list(f.keys())
+        return dict.fromkeys(groups)
+
+
137
|
+
def h5_add_column(
|
138
|
+
input_file: str | Path,
|
139
|
+
output_file: str | Path,
|
140
|
+
append_function: Callable | list[Callable],
|
141
|
+
num_jets: int = -1,
|
142
|
+
input_groups: list[str] | None = None,
|
143
|
+
output_groups: list[str] | None = None,
|
144
|
+
reader_kwargs: dict | None = None,
|
145
|
+
writer_kwargs: dict | None = None,
|
146
|
+
overwrite: bool = False,
|
147
|
+
) -> None:
|
148
|
+
"""Appends one or more columns to one or more groups in an h5 file.
|
149
|
+
|
150
|
+
Parameters
|
151
|
+
----------
|
152
|
+
input_file : str | Path
|
153
|
+
Input h5 file to read from.
|
154
|
+
output_file : str | Path
|
155
|
+
Output h5 file to write to.
|
156
|
+
append_function : callable | list[callable]
|
157
|
+
A function, or list of functions, which take a batch from H5Reader and returns a dictionary
|
158
|
+
of the form:
|
159
|
+
{
|
160
|
+
group1 : {
|
161
|
+
new_column1 : data,
|
162
|
+
new_column2 : data,
|
163
|
+
},
|
164
|
+
group2 : {
|
165
|
+
new_column3 : data,
|
166
|
+
new_column4 : data,
|
167
|
+
},
|
168
|
+
...
|
169
|
+
}
|
170
|
+
num_jets : int, optional
|
171
|
+
Number of jets to read from the input file. If -1, reads all jets. By default -1.
|
172
|
+
input_groups : list[str] | None, optional
|
173
|
+
List of groups to read from the input file. If None, reads all groups. By default None.
|
174
|
+
output_groups : list[str] | None, optional
|
175
|
+
List of groups to write to the output file. If None, writes all groups. By default None.
|
176
|
+
Note that this is a subset of the input groups, and must include all groups that the
|
177
|
+
append functions wish to write to.
|
178
|
+
reader_kwargs : dict, optional
|
179
|
+
Additional arguments to pass to the H5Reader. By default None.
|
180
|
+
writer_kwargs : dict, optional
|
181
|
+
Additional arguments to pass to the H5Writer. By default None.
|
182
|
+
overwrite : bool, optional
|
183
|
+
If True, will overwrite the output file if it exists. By default False.
|
184
|
+
If False, will raise a FileExistsError if the output file exists.
|
185
|
+
If None, will check if the output file exists and raise an error if it does unless
|
186
|
+
overwrite is True.
|
187
|
+
|
188
|
+
Raises
|
189
|
+
------
|
190
|
+
FileNotFoundError
|
191
|
+
If the input file does not exist.
|
192
|
+
FileExistsError
|
193
|
+
If the output file exists and overwrite is False.
|
194
|
+
ValueError
|
195
|
+
If the new variable already exists, shape is incorrect, or the output group is not in
|
196
|
+
the input groups.
|
197
|
+
|
198
|
+
"""
|
199
|
+
input_file = Path(input_file)
|
200
|
+
output_file = Path(output_file) if output_file is not None else None
|
201
|
+
|
202
|
+
if not input_file.exists():
|
203
|
+
raise FileNotFoundError(f"Input file {input_file} does not exist.")
|
204
|
+
if output_file is not None and output_file.exists() and not overwrite:
|
205
|
+
raise FileExistsError(
|
206
|
+
f"Output file {output_file} already exists. Please choose a different name."
|
207
|
+
)
|
208
|
+
if not reader_kwargs:
|
209
|
+
reader_kwargs = {}
|
210
|
+
if not writer_kwargs:
|
211
|
+
writer_kwargs = {}
|
212
|
+
if output_file is None:
|
213
|
+
output_file = input_file.with_name(input_file.name.replace(".h5", "_additional.h5"))
|
214
|
+
|
215
|
+
if not isinstance(append_function, list):
|
216
|
+
append_function = [append_function]
|
217
|
+
|
218
|
+
reader = H5Reader(input_file, shuffle=False, **reader_kwargs)
|
219
|
+
if "precision" not in writer_kwargs:
|
220
|
+
writer_kwargs["precision"] = "full"
|
221
|
+
|
222
|
+
njets = reader.num_jets if num_jets == -1 else num_jets
|
223
|
+
writer = None
|
224
|
+
|
225
|
+
input_variables = (
|
226
|
+
get_all_groups(input_file) if input_groups is None else dict.fromkeys(input_groups)
|
227
|
+
)
|
228
|
+
if output_groups is None:
|
229
|
+
output_groups = list(input_variables.keys())
|
230
|
+
|
231
|
+
assert all(
|
232
|
+
o in input_variables for o in output_groups
|
233
|
+
), f"Output groups {output_groups} not in input groups {input_variables.keys()}"
|
234
|
+
|
235
|
+
num_batches = njets // reader.batch_size + 1
|
236
|
+
for i, batch in enumerate(reader.stream(input_variables, num_jets=njets)):
|
237
|
+
if (i + 1) % 10 == 0:
|
238
|
+
print(f"Processing batch {i + 1}/{num_batches} ({(i + 1) / num_batches * 100:.2f}%)")
|
239
|
+
|
240
|
+
to_append = merge_dicts([af(batch) for af in append_function])
|
241
|
+
for k, newvars in to_append.items():
|
242
|
+
if k not in output_groups:
|
243
|
+
raise ValueError(f"Trying to output to {k} but only {output_groups} are allowed")
|
244
|
+
for newkey, newval in newvars.items():
|
245
|
+
if newkey in batch[k].dtype.names:
|
246
|
+
raise ValueError(
|
247
|
+
f"Trying to append {newkey} to {k} but it already exists in batch"
|
248
|
+
)
|
249
|
+
if newval.shape != batch[k].shape:
|
250
|
+
raise ValueError(
|
251
|
+
f"Trying to append {newkey} to {k} but the shape is not correct"
|
252
|
+
)
|
253
|
+
|
254
|
+
to_write = {}
|
255
|
+
|
256
|
+
for key, str_array in batch.items():
|
257
|
+
if key not in output_groups:
|
258
|
+
continue
|
259
|
+
if key in to_append:
|
260
|
+
combined = np.lib.recfunctions.append_fields(
|
261
|
+
str_array,
|
262
|
+
list(to_append[key].keys()),
|
263
|
+
list(to_append[key].values()),
|
264
|
+
usemask=False,
|
265
|
+
)
|
266
|
+
to_write[key] = combined
|
267
|
+
else:
|
268
|
+
to_write[key] = str_array
|
269
|
+
if writer is None:
|
270
|
+
writer = H5Writer(
|
271
|
+
output_file,
|
272
|
+
dtypes={key: str_array.dtype for key, str_array in to_write.items()},
|
273
|
+
shapes=get_shape(njets, to_write),
|
274
|
+
shuffle=False,
|
275
|
+
**writer_kwargs,
|
276
|
+
)
|
277
|
+
|
278
|
+
writer.write(to_write)
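Note on the loop above: each append function receives the batch (a dict of structured arrays keyed by group) and is expected to return a nested dict of the form `{group: {new_variable: array}}`, where every new array has the same shape as the group it is appended to. The following is a minimal sketch of that contract, not code from the package. The function `add_pt_gev`, the group name `jets`, the field names `pt`/`eta` and the toy batch are all assumptions made for illustration.

```python
import numpy as np
from numpy.lib import recfunctions as rfn


def add_pt_gev(batch: dict) -> dict:
    """Hypothetical append function: derive a new per-jet column from an existing one."""
    jets = batch["jets"]
    # The returned array must have the same shape as the group it is appended to.
    return {"jets": {"pt_gev": jets["pt"] / 1000.0}}


# Toy batch standing in for what the reader yields: one structured array per group.
batch = {
    "jets": np.array(
        [(25000.0, 0.5), (40000.0, -1.2)],
        dtype=[("pt", "f4"), ("eta", "f4")],
    )
}

to_append = add_pt_gev(batch)
for group, new_vars in to_append.items():
    for name, values in new_vars.items():
        # Same guards as in the loop above: no duplicate names, matching shapes.
        assert name not in batch[group].dtype.names
        assert values.shape == batch[group].shape

# Merge the new columns into the existing structured array.
combined = rfn.append_fields(
    batch["jets"],
    list(to_append["jets"].keys()),
    list(to_append["jets"].values()),
    usemask=False,
)
print(combined.dtype.names)
```

Importing `numpy.lib.recfunctions` explicitly, as done here, avoids relying on the submodule having been loaded already when it is accessed as `np.lib.recfunctions`.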
|
279
|
+
|
280
|
+
|
281
|
+
def parse_append_function(func_path: str) -> Callable:
|
282
|
+
"""Attempts to load the function specified by func_path.
|
283
|
+
The function should be specified as 'path/to/file.py:function_name'.
|
284
|
+
|
285
|
+
Parameters
|
286
|
+
----------
|
287
|
+
func_path : str
|
288
|
+
Path to the function to load. Should be of the form 'path/to/file.py:function_name'.
|
289
|
+
|
290
|
+
Returns
|
291
|
+
-------
|
292
|
+
Callable
|
293
|
+
The function specified by func_path.
|
294
|
+
|
295
|
+
Raises
|
296
|
+
------
|
297
|
+
ValueError
|
298
|
+
If the function path is not of the form 'path/to/file.py:function_name'.
|
299
|
+
FileNotFoundError
|
300
|
+
If the file does not exist.
|
301
|
+
ImportError
|
302
|
+
If the file cannot be imported.
|
303
|
+
AttributeError
|
304
|
+
If the function does not exist in the file.
|
305
|
+
"""
|
306
|
+
if isinstance(func_path, Path):
|
307
|
+
func_path = str(func_path)
|
308
|
+
if ":" not in func_path:
|
309
|
+
print(func_path)
|
310
|
+
raise ValueError("Function should be specified as 'path/to/file.py:function_name'")
|
311
|
+
|
312
|
+
file_str, func_name = func_path.split(":")
|
313
|
+
file_path = Path(file_str).resolve()
|
314
|
+
|
315
|
+
if not file_path.is_file():
|
316
|
+
raise FileNotFoundError(f"No such file: {file_path}")
|
317
|
+
|
318
|
+
module_name = file_path.stem # Just the filename without extension
|
319
|
+
|
320
|
+
spec = importlib.util.spec_from_file_location(module_name, str(file_path))
|
321
|
+
if spec is None or spec.loader is None:
|
322
|
+
raise ImportError(f"Cannot load spec for {file_path}")
|
323
|
+
|
324
|
+
module = importlib.util.module_from_spec(spec)
|
325
|
+
spec.loader.exec_module(module)
|
326
|
+
|
327
|
+
if not hasattr(module, func_name):
|
328
|
+
raise AttributeError(f"Module {module_name} has no attribute {func_name}")
|
329
|
+
|
330
|
+
return getattr(module, func_name)
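Note on `parse_append_function`: the loading steps above (build a spec from a file path, create the module, execute it, look up the attribute) can be tried in isolation. This is a minimal sketch under stated assumptions: the module file `my_funcs.py` and its function `double` are hypothetical, and the path is split with `rpartition` rather than `split`, which is a small variation on the code above so that paths containing an extra colon do not fail to unpack.

```python
import importlib.util
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical user module containing a single function.
    module_file = Path(tmp) / "my_funcs.py"
    module_file.write_text("def double(x):\n    return 2 * x\n")

    func_path = f"{module_file}:double"
    # Split on the last colon only: everything before it is the file path.
    file_str, _, func_name = func_path.rpartition(":")

    spec = importlib.util.spec_from_file_location(Path(file_str).stem, file_str)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load spec for {file_str}")

    # Create the module object and run its top-level code.
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    func = getattr(module, func_name)
    result = func(21)

print(result)
```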
|
331
|
+
|
332
|
+
|
333
|
+
def get_args(args):
|
334
|
+
parser = argparse.ArgumentParser(description="Append columns to an h5 file.")
|
335
|
+
parser.add_argument("--input", "-i", type=str, required=True, help="Input h5 file")
|
336
|
+
parser.add_argument(
|
337
|
+
"--append_function",
|
338
|
+
type=str,
|
339
|
+
nargs="+",
|
340
|
+
help="Function to append to the h5 file. Can be a list of functions.",
|
341
|
+
required=True,
|
342
|
+
)
|
343
|
+
parser.add_argument("--output", type=str, help="Output h5 file")
|
344
|
+
parser.add_argument(
|
345
|
+
"--num_jets", type=int, default=-1, help="Number of jets to read from the input file"
|
346
|
+
)
|
347
|
+
parser.add_argument(
|
348
|
+
"--input_groups",
|
349
|
+
type=str,
|
350
|
+
nargs="+",
|
351
|
+
default=None,
|
352
|
+
help="List of groups to read from the input file",
|
353
|
+
)
|
354
|
+
parser.add_argument(
|
355
|
+
"--output_groups",
|
356
|
+
type=str,
|
357
|
+
nargs="+",
|
358
|
+
default=None,
|
359
|
+
help="List of groups to write to the output file",
|
360
|
+
)
|
361
|
+
parser.add_argument(
|
362
|
+
"--reader_kwargs", type=dict, default=None, help="Additional arguments for H5Reader"
|
363
|
+
)
|
364
|
+
parser.add_argument(
|
365
|
+
"--writer_kwargs", type=dict, default=None, help="Additional arguments for H5Writer"
|
366
|
+
)
|
367
|
+
parser.add_argument(
|
368
|
+
"--overwrite", action="store_true", help="Overwrite the output file if it exists"
|
369
|
+
)
|
370
|
+
|
371
|
+
return parser.parse_args(args)
|
372
|
+
|
373
|
+
|
374
|
+
def main(args=None):
|
375
|
+
args = get_args(args)
|
376
|
+
append_function = [
|
377
|
+
parse_append_function(func_path) if isinstance(func_path, str) else func_path
|
378
|
+
for func_path in args.append_function
|
379
|
+
]
|
380
|
+
|
381
|
+
h5_add_column(
|
382
|
+
args.input,
|
383
|
+
args.output,
|
384
|
+
append_function,
|
385
|
+
num_jets=args.num_jets,
|
386
|
+
input_groups=args.input_groups,
|
387
|
+
output_groups=args.output_groups,
|
388
|
+
reader_kwargs=args.reader_kwargs,
|
389
|
+
writer_kwargs=args.writer_kwargs,
|
390
|
+
overwrite=args.overwrite,
|
391
|
+
)
|
ftag/hdf5/h5writer.py
CHANGED
@@ -31,8 +31,11 @@ class H5Writer:
|
|
31
31
|
Compression algorithm to use. Default is "lzf".
|
32
32
|
precision : str | None, optional
|
33
33
|
Precision to use. Default is None.
|
34
|
+
full_precision_vars : list[str] | None, optional
|
35
|
+
List of variables to store in full precision. Default is None.
|
34
36
|
shuffle : bool, optional
|
35
37
|
Whether to shuffle the jets before writing. Default is True.
|
38
|
+
|
36
39
|
"""
|
37
40
|
|
38
41
|
dst: Path | str
|
@@ -42,6 +45,7 @@ class H5Writer:
|
|
42
45
|
add_flavour_label: bool = False
|
43
46
|
compression: str = "lzf"
|
44
47
|
precision: str = "full"
|
48
|
+
full_precision_vars: list[str] | None = None
|
45
49
|
shuffle: bool = True
|
46
50
|
|
47
51
|
def __post_init__(self):
|
@@ -85,8 +89,15 @@ class H5Writer:
|
|
85
89
|
dtype = np.dtype([*dtype.descr, ("flavour_label", "i4")])
|
86
90
|
|
87
91
|
# adjust dtype based on specified precision
|
92
|
+
full_precision_vars = [] if self.full_precision_vars is None else self.full_precision_vars
|
93
|
+
# If the field is in full_precision_vars, use the full precision dtype
|
88
94
|
dtype = np.dtype([
|
89
|
-
(
|
95
|
+
(
|
96
|
+
field,
|
97
|
+
self.fp_dtype
|
98
|
+
if field not in full_precision_vars and np.issubdtype(dt, np.floating)
|
99
|
+
else dt,
|
100
|
+
)
|
90
101
|
for field, dt in dtype.descr
|
91
102
|
])
|
92
103
|
|
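Note on `full_precision_vars`: the dtype adjustment in the hunk above downcasts floating-point fields to the writer's reduced-precision type unless the field is listed in `full_precision_vars`. The sketch below reproduces that rule on a toy dtype. The field names and the choice of `f2` for the reduced-precision type are assumptions for illustration; in the package the target type comes from `self.fp_dtype`.

```python
import numpy as np

# Toy structured dtype; field names are illustrative only.
dtype = np.dtype([("pt", "f4"), ("mass", "f4"), ("n_tracks", "i4")])
fp_dtype = np.dtype("f2")  # assumed reduced-precision float type
full_precision_vars = ["mass"]

adjusted = np.dtype([
    (
        field,
        # Downcast only floating-point fields that are not protected.
        fp_dtype
        if field not in full_precision_vars and np.issubdtype(np.dtype(dt), np.floating)
        else dt,
    )
    for field, dt in dtype.descr
])

print(adjusted)
```

Here `pt` is reduced to half precision, `mass` keeps its original `f4` type because it is listed, and the integer field `n_tracks` is untouched either way.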
ftag/utils/__init__.py
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
from __future__ import annotations
|
2
2
|
|
3
|
-
from
|
4
|
-
from
|
3
|
+
from .logging import logger, set_log_level
|
4
|
+
from .metrics import (
|
5
5
|
calculate_efficiency,
|
6
6
|
calculate_efficiency_error,
|
7
7
|
calculate_rejection,
|
ftag/vds.py
CHANGED
@@ -2,6 +2,9 @@ from __future__ import annotations
|
|
2
2
|
|
3
3
|
import argparse
|
4
4
|
import glob
|
5
|
+
import os
|
6
|
+
import re
|
7
|
+
import sys
|
5
8
|
from pathlib import Path
|
6
9
|
|
7
10
|
import h5py
|
@@ -13,6 +16,8 @@ def parse_args(args):
|
|
13
16
|
)
|
14
17
|
parser.add_argument("pattern", type=Path, help="quotes-enclosed glob pattern of files to merge")
|
15
18
|
parser.add_argument("output", type=Path, help="path to output virtual file")
|
19
|
+
parser.add_argument("--use_regex", help="if provided pattern is a regex", action="store_true")
|
20
|
+
parser.add_argument("--regex_path", type=str, required="--use_regex" in sys.argv, default=None)
|
16
21
|
return parser.parse_args(args)
|
17
22
|
|
18
23
|
|
@@ -43,13 +48,36 @@ def get_virtual_layout(fnames: list[str], group: str):
|
|
43
48
|
return layout
|
44
49
|
|
45
50
|
|
51
|
+
def glob_re(pattern, regex_path):
|
52
|
+
return list(filter(re.compile(pattern).match, os.listdir(regex_path)))
|
53
|
+
|
54
|
+
|
55
|
+
def regex_files_from_dir(reg_matched_fnames, regex_path):
|
56
|
+
parent_dir = regex_path or str(Path.cwd())
|
57
|
+
full_paths = [parent_dir + "/" + fname for fname in reg_matched_fnames]
|
58
|
+
paths_to_glob = [fname + "/*.h5" if Path(fname).is_dir() else fname for fname in full_paths]
|
59
|
+
nested_fnames = [glob.glob(fname) for fname in paths_to_glob]
|
60
|
+
return sum(nested_fnames, [])
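Note on the regex mode: `glob_re` filters the entries of one directory with `re.match`, which anchors the pattern at the start of each name but not at the end, and `regex_files_from_dir` then prefixes the parent directory. A minimal sketch of that behaviour on a temporary directory follows. The file names and the pattern are made up for illustration.

```python
import os
import re
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical directory contents.
    for name in ["user.a.601229.h5", "user.a.601230.h5", "notes.txt"]:
        (Path(tmp) / name).touch()

    pattern = r"user\..*\.6012(29|30)\.h5"
    # re.match anchors at the start of the entry name only.
    matched = sorted(filter(re.compile(pattern).match, os.listdir(tmp)))
    # Prefix the parent directory to get usable paths.
    full_paths = [str(Path(tmp) / name) for name in matched]

print(matched)
```

Because the match is not anchored at the end, a pattern such as `user` would also select every entry that merely starts with `user`; appending `$` to the pattern is one way to tighten it.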
|
61
|
+
|
62
|
+
|
46
63
|
def create_virtual_file(
|
47
|
-
pattern: Path | str,
|
64
|
+
pattern: Path | str,
|
65
|
+
out_fname: Path | None = None,
|
66
|
+
use_regex: bool = False,
|
67
|
+
regex_path: str | None = None,
|
68
|
+
overwrite: bool = False,
|
48
69
|
):
|
49
70
|
# get list of filenames
|
50
|
-
|
71
|
+
pattern_str = str(pattern)
|
72
|
+
if use_regex:
|
73
|
+
reg_matched_fnames = glob_re(pattern_str, regex_path)
|
74
|
+
print("reg matched fnames: ", reg_matched_fnames)
|
75
|
+
fnames = regex_files_from_dir(reg_matched_fnames, regex_path)
|
76
|
+
else:
|
77
|
+
fnames = glob.glob(pattern_str)
|
51
78
|
if not fnames:
|
52
79
|
raise FileNotFoundError(f"No files matched pattern {pattern}")
|
80
|
+
print("Files to merge to vds: ", fnames)
|
53
81
|
|
54
82
|
# infer output path if not given
|
55
83
|
if out_fname is None:
|
@@ -94,8 +122,15 @@ def create_virtual_file(
|
|
94
122
|
|
95
123
|
def main(args=None) -> None:
|
96
124
|
args = parse_args(args)
|
97
|
-
|
98
|
-
|
125
|
+
matching_mode = "Applying regex to" if args.use_regex else "Globbing"
|
126
|
+
print(f"{matching_mode} {args.pattern}...")
|
127
|
+
create_virtual_file(
|
128
|
+
args.pattern,
|
129
|
+
args.output,
|
130
|
+
use_regex=args.use_regex,
|
131
|
+
regex_path=args.regex_path,
|
132
|
+
overwrite=True,
|
133
|
+
)
|
99
134
|
with h5py.File(args.output) as f:
|
100
135
|
key = next(iter(f.keys()))
|
101
136
|
num = len(f[key])
|
@@ -1,151 +0,0 @@
|
|
1
|
-
Metadata-Version: 2.4
|
2
|
-
Name: atlas-ftag-tools
|
3
|
-
Version: 0.2.10
|
4
|
-
Summary: ATLAS Flavour Tagging Tools
|
5
|
-
Author: Sam Van Stroud, Philipp Gadow
|
6
|
-
License: MIT
|
7
|
-
Project-URL: Homepage, https://github.com/umami-hep/atlas-ftag-tools/
|
8
|
-
Requires-Python: <3.12,>=3.8
|
9
|
-
Description-Content-Type: text/markdown
|
10
|
-
Requires-Dist: h5py>=3.0
|
11
|
-
Requires-Dist: numpy>=2.2.3
|
12
|
-
Requires-Dist: PyYAML>=5.1
|
13
|
-
Requires-Dist: scipy>=1.15.2
|
14
|
-
Provides-Extra: dev
|
15
|
-
Requires-Dist: ruff==0.6.2; extra == "dev"
|
16
|
-
Requires-Dist: mypy==1.11.2; extra == "dev"
|
17
|
-
Requires-Dist: pre-commit==3.1.1; extra == "dev"
|
18
|
-
Requires-Dist: pytest==7.2.2; extra == "dev"
|
19
|
-
Requires-Dist: pytest-cov==4.0.0; extra == "dev"
|
20
|
-
Requires-Dist: pytest_notebook==0.10.0; extra == "dev"
|
21
|
-
Requires-Dist: ipykernel==6.21.3; extra == "dev"
|
22
|
-
|
23
|
-
[](https://github.com/psf/black)
|
24
|
-
[](https://badge.fury.io/py/atlas-ftag-tools)
|
25
|
-
[](https://codecov.io/gh/umami-hep/atlas-ftag-tools)
|
26
|
-
|
27
|
-
# ATLAS FTAG Python Tools
|
28
|
-
|
29
|
-
This is a collection of Python tools for working with files produced with the FTAG [ntuple dumper](https://gitlab.cern.ch/atlas-flavor-tagging-tools/training-dataset-dumper/).
|
30
|
-
The code is intended to be used as a [library](https://iscinumpy.dev/post/app-vs-library/) for other projects.
|
31
|
-
Please see the [example notebook](ftag/example.ipynb) for usage.
|
32
|
-
|
33
|
-
# Quickstart
|
34
|
-
|
35
|
-
## Installation
|
36
|
-
|
37
|
-
If you want to use this package without modification, you can install from [pypi](https://pypi.org/project/atlas-ftag-tools/) using `pip`.
|
38
|
-
|
39
|
-
```bash
|
40
|
-
pip install atlas-ftag-tools
|
41
|
-
```
|
42
|
-
|
43
|
-
To additionally install the development dependencies (for formatting and linting) use
|
44
|
-
```bash
|
45
|
-
pip install atlas-ftag-tools[dev]
|
46
|
-
```
|
47
|
-
|
48
|
-
## Development
|
49
|
-
|
50
|
-
If you plan on making changes to the code, instead clone the repository and install the package from source in editable mode with
|
51
|
-
|
52
|
-
```bash
|
53
|
-
python -m pip install -e .
|
54
|
-
```
|
55
|
-
|
56
|
-
Include development dependencies with
|
57
|
-
|
58
|
-
```bash
|
59
|
-
python -m pip install -e ".[dev]"
|
60
|
-
```
|
61
|
-
|
62
|
-
You can set up and run pre-commit hooks with
|
63
|
-
|
64
|
-
```bash
|
65
|
-
pre-commit install
|
66
|
-
pre-commit run --all-files
|
67
|
-
```
|
68
|
-
|
69
|
-
To run the tests you can use the `pytest` or `coverage` command, for example
|
70
|
-
|
71
|
-
```bash
|
72
|
-
coverage run --source ftag -m pytest --show-capture=stdout
|
73
|
-
```
|
74
|
-
|
75
|
-
Running `coverage report` will display the test coverage.
|
76
|
-
|
77
|
-
|
78
|
-
# Usage
|
79
|
-
|
80
|
-
Please see the [example notebook](ftag/example.ipynb) for full usage.
|
81
|
-
Additional functionality is also documented below.
|
82
|
-
|
83
|
-
## Calculate WPs
|
84
|
-
|
85
|
-
This package contains a script to calculate tagger working points (WPs).
|
86
|
-
The script is `working_points.py` and can be run after installing this package with
|
87
|
-
|
88
|
-
```
|
89
|
-
wps \
|
90
|
-
--ttbar "path/to/ttbar/*.h5" \
|
91
|
-
--tagger GN2v01 \
|
92
|
-
--fc 0.1
|
93
|
-
```
|
94
|
-
|
95
|
-
Both the `--tagger` and `--fc` options accept a list if you want to get the WPs for multiple taggers.
|
96
|
-
If you are doing c-tagging or xbb-tagging, dedicated fx arguments are available (you can find them all with `-h`).
|
97
|
-
|
98
|
-
If you want to use the `ttbar` WPs to get the efficiencies and rejections for the `zprime` sample, you can add `--zprime "path/to/zprime/*.h5"` to the command.
|
99
|
-
Note that a default selection of $p_T > 250 ~GeV$ is applied to jets in the `zprime` sample.
|
100
|
-
|
101
|
-
If instead of defining the working points for a series of signal efficiencies, you wish to calculate a WP corresponding to a specific background rejection, the `--rejection` option can be given along with the desired background.
|
102
|
-
|
103
|
-
By default the working points are printed to the terminal, but you can save the results to a YAML file with the `--outfile` option.
|
104
|
-
|
105
|
-
See `wps --help` for more options and information.
|
106
|
-
|
107
|
-
## Calculate efficiency at discriminant cut
|
108
|
-
|
109
|
-
The same script can be used to calculate the efficiency and rejection values at a given discriminant cut value.
|
110
|
-
The script `working_points.py` can be run after installing this package as follows
|
111
|
-
|
112
|
-
```
|
113
|
-
wps \
|
114
|
-
--ttbar "path/to/ttbar/*.h5" \
|
115
|
-
--tagger GN2v01 \
|
116
|
-
--fx 0.1 \
|
117
|
-
--disc_cuts 1.0 1.5
|
118
|
-
```
|
119
|
-
The `--tagger`, `--fx`, and `--outfile` options behave the same way as described in the 'Calculate WPs' section above.
|
120
|
-
|
121
|
-
## H5 Utils
|
122
|
-
|
123
|
-
### Create virtual file
|
124
|
-
|
125
|
-
This package contains a script to easily merge a set of H5 files.
|
126
|
-
A virtual file is a fast and lightweight way to wrap a set of files.
|
127
|
-
See the [h5py documentation](https://docs.h5py.org/en/stable/vds.html) for more information on virtual datasets.
|
128
|
-
|
129
|
-
The script is `vds.py` and can be run after installing this package with
|
130
|
-
|
131
|
-
```
|
132
|
-
vds <pattern> <output path>
|
133
|
-
```
|
134
|
-
|
135
|
-
The `<pattern>` argument should be a quote-enclosed [glob pattern](https://en.wikipedia.org/wiki/Glob_(programming)), for example `"dsid/path/*.h5"`.
|
136
|
-
|
137
|
-
See `vds --help` for more options and information.
|
138
|
-
|
139
|
-
|
140
|
-
### [h5move](ftag/hdf5/h5move.py)
|
141
|
-
|
142
|
-
A script to move/rename datasets inside an h5file.
|
143
|
-
Useful for correcting discrepancies between group names.
|
144
|
-
See [h5move.py](ftag/hdf5/h5move.py) for more info.
|
145
|
-
|
146
|
-
|
147
|
-
### [h5split](ftag/hdf5/h5split.py)
|
148
|
-
|
149
|
-
A script to split a large h5 file into several smaller files.
|
150
|
-
Useful if output files are too large for EOS/grid storage.
|
151
|
-
See [h5split.py](ftag/hdf5/h5split.py) for more info.
|
File without changes
|