eba-xbridge 1.5.0rc2__py3-none-any.whl → 1.5.0rc4__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- eba_xbridge-1.5.0rc4.dist-info/METADATA +308 -0
- {eba_xbridge-1.5.0rc2.dist-info → eba_xbridge-1.5.0rc4.dist-info}/RECORD +11 -9
- eba_xbridge-1.5.0rc4.dist-info/entry_points.txt +3 -0
- xbridge/__init__.py +1 -1
- xbridge/__main__.py +82 -0
- xbridge/api.py +10 -1
- xbridge/converter.py +81 -130
- xbridge/instance.py +60 -3
- xbridge/modules.py +6 -4
- eba_xbridge-1.5.0rc2.dist-info/METADATA +0 -62
- {eba_xbridge-1.5.0rc2.dist-info → eba_xbridge-1.5.0rc4.dist-info}/WHEEL +0 -0
- {eba_xbridge-1.5.0rc2.dist-info → eba_xbridge-1.5.0rc4.dist-info}/licenses/LICENSE +0 -0
@@ -0,0 +1,308 @@
Metadata-Version: 2.4
Name: eba-xbridge
Version: 1.5.0rc4
Summary: XBRL-XML to XBRL-CSV converter for EBA Taxonomy (version 4.1)
License: Apache 2.0
License-File: LICENSE
Keywords: xbrl,eba,taxonomy,csv,xml
Author: MeaningfulData
Author-email: info@meaningfuldata.eu
Maintainer: Antonio Olleros
Maintainer-email: antonio.olleros@meaningfuldata.eu
Requires-Python: >=3.9
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Information Technology
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Typing :: Typed
Requires-Dist: lxml (>=5.2.1,<6.0)
Requires-Dist: numpy (>=1.23.2,<2) ; python_version < "3.13"
Requires-Dist: numpy (>=2.1.0) ; python_version >= "3.13"
Requires-Dist: pandas (>=2.1.4,<3.0)
Project-URL: Documentation, https://docs.xbridge.meaningfuldata.eu
Project-URL: IssueTracker, https://github.com/Meaningful-Data/xbridge/issues
Project-URL: MeaningfulData, https://www.meaningfuldata.eu/
Project-URL: Repository, https://github.com/Meaningful-Data/xbridge
Description-Content-Type: text/x-rst

XBridge (eba-xbridge)
#####################

.. image:: https://img.shields.io/pypi/v/eba-xbridge.svg
   :target: https://pypi.org/project/eba-xbridge/
   :alt: PyPI version

.. image:: https://img.shields.io/pypi/pyversions/eba-xbridge.svg
   :target: https://pypi.org/project/eba-xbridge/
   :alt: Python versions

.. image:: https://img.shields.io/github/license/Meaningful-Data/xbridge.svg
   :target: https://github.com/Meaningful-Data/xbridge/blob/main/LICENSE
   :alt: License

.. image:: https://img.shields.io/github/actions/workflow/status/Meaningful-Data/xbridge/testing.yml?branch=main
   :target: https://github.com/Meaningful-Data/xbridge/actions
   :alt: Build status

Overview
========

XBridge is a Python library for converting XBRL-XML files into XBRL-CSV files using the EBA (European Banking Authority) taxonomy. It provides a simple, reliable way to transform regulatory reporting data from XML format to CSV format.

The library currently supports **EBA Taxonomy version 4.2** and includes support for DORA (Digital Operational Resilience Act) CSV conversion. The library must be updated with each new EBA taxonomy version release.

Key Features
============

* **XBRL-XML to XBRL-CSV Conversion**: Seamlessly convert XBRL-XML instance files to XBRL-CSV format
* **Command-Line Interface**: Quick conversions without writing code using the ``xbridge`` CLI
* **Python API**: Programmatic conversion for integration with other tools and workflows
* **EBA Taxonomy 4.2 Support**: Built for the latest EBA taxonomy specification
* **DORA CSV Conversion**: Support for Digital Operational Resilience Act reporting
* **Configurable Validation**: Flexible filing indicator validation with strict or warning modes
* **Decimal Handling**: Intelligent decimal precision handling with configurable options
* **Type Safety**: Fully typed codebase with MyPy strict mode compliance
* **Python 3.9+**: Supports Python 3.9 through 3.13

Prerequisites
=============

* **Python**: 3.9 or higher
* **7z Command-Line Tool**: Required for loading compressed taxonomy files (7z or ZIP format)

  * On Ubuntu/Debian: ``sudo apt-get install p7zip-full``
  * On macOS: ``brew install p7zip``
  * On Windows: Download from `7-zip.org <https://www.7-zip.org/>`_

Installation
============

Install XBridge from PyPI using pip:

.. code-block:: bash

    pip install eba-xbridge

For development installation, see `CONTRIBUTING.md <CONTRIBUTING.md>`_.

Quick Start
===========

XBridge offers two ways to convert XBRL-XML files to XBRL-CSV: a command-line interface (CLI) for quick conversions, and a Python API for programmatic use.

Command-Line Interface
----------------------

The CLI provides a quick way to convert files without writing code:

.. code-block:: bash

    # Basic conversion (output to same directory as input)
    xbridge instance.xbrl

    # Specify output directory
    xbridge instance.xbrl --output-path ./output

    # Continue with warnings instead of errors
    xbridge instance.xbrl --no-strict-validation

    # Include headers as datapoints
    xbridge instance.xbrl --headers-as-datapoints

**CLI Options:**

* ``--output-path PATH``: Output directory (default: same as input file)
* ``--headers-as-datapoints``: Treat headers as datapoints (default: False)
* ``--strict-validation``: Raise errors on validation failures (default: True)
* ``--no-strict-validation``: Emit warnings instead of errors

For more CLI options, run ``xbridge --help``.

Python API - Basic Conversion
-----------------------------

Convert an XBRL-XML instance file to XBRL-CSV using the Python API:

.. code-block:: python

    from xbridge.api import convert_instance

    # Basic conversion
    input_path = "path/to/instance.xbrl"
    output_path = "path/to/output"

    convert_instance(input_path, output_path)

The converted XBRL-CSV files will be saved as a ZIP archive in the output directory.

Python API - Advanced Usage
---------------------------

Customize the conversion with additional parameters:

.. code-block:: python

    from xbridge.api import convert_instance

    # Conversion with custom options
    convert_instance(
        instance_path="path/to/instance.xbrl",
        output_path="path/to/output",
        headers_as_datapoints=True,  # Treat headers as datapoints
        validate_filing_indicators=True,  # Validate filing indicators
        strict_validation=False,  # Emit warnings instead of errors for orphaned facts
    )

Loading an Instance
-------------------

Load and inspect an XBRL-XML instance without converting:

.. code-block:: python

    from xbridge.api import load_instance

    instance = load_instance("path/to/instance.xbrl")

    # Access instance properties
    print(f"Entity: {instance.entity}")
    print(f"Period: {instance.period}")
    print(f"Facts count: {len(instance.facts)}")

How XBridge Works
=================

XBridge performs the conversion in several steps:

1. **Load the XBRL-XML instance**: Parse and extract facts, contexts, scenarios, and filing indicators
2. **Load the EBA taxonomy**: Access pre-processed taxonomy modules containing tables and variables
3. **Match and validate**: Join instance facts with taxonomy definitions
4. **Generate CSV files**: Create XBRL-CSV files including:

   * Data tables with facts and dimensions
   * Filing indicators showing reported tables
   * Parameters (entity, period, base currency, decimals)

5. **Package output**: Bundle all CSV files into a ZIP archive

Output Structure
----------------

The output ZIP file contains:

* **META-INF/**: JSON report package metadata
* **reports/**: CSV files for each reported table
* **filing-indicators.csv**: Table reporting indicators
* **parameters.csv**: Report-level parameters
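The archive layout above can be checked programmatically with only the standard library. A minimal sketch (the ``report.zip`` file name and the helper name are illustrative, not part of the xbridge API):

```python
import zipfile


def list_report_contents(zip_path: str) -> list:
    """Return the member paths of a converted XBRL-CSV report package, sorted."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(archive.namelist())


# Example (assuming a converted package exists at report.zip):
# members = list_report_contents("report.zip")
# "META-INF/" and "reports/" entries are expected per the structure above
```

This is handy for smoke-testing a batch of conversions without unpacking each archive.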
Documentation
=============

Comprehensive documentation is available at `docs.xbridge.meaningfuldata.eu <https://docs.xbridge.meaningfuldata.eu>`_.

The documentation includes:

* **API Reference**: Complete API documentation
* **Quickstart Guide**: Step-by-step tutorials
* **Technical Notes**: Architecture and design details
* **FAQ**: Frequently asked questions

Taxonomy Loading
================

If you need to work with the EBA taxonomy directly, you can load it using:

.. code-block:: bash

    python -m xbridge.taxonomy_loader --input_path path/to/FullTaxonomy.7z

This generates an ``index.json`` file containing module references and pre-processed taxonomy data.

.. warning::
    Loading the taxonomy from a 7z package may take several minutes. Ensure the ``7z`` command is available on your system.

Configuration Options
=====================

convert_instance Parameters
---------------------------

* **instance_path** (str | Path): Path to the XBRL-XML instance file
* **output_path** (str | Path | None): Output directory for CSV files (default: current directory)
* **headers_as_datapoints** (bool): Treat table headers as datapoints (default: False)
* **validate_filing_indicators** (bool): Validate that facts belong to reported tables (default: True)
* **strict_validation** (bool): Raise errors on validation failures; if False, emit warnings (default: True)

Troubleshooting
===============

Common Issues
-------------

**7z command not found**
    Install the 7z command-line tool using your system's package manager (see Prerequisites).

**Taxonomy version mismatch**
    Ensure you're using the correct version of XBridge for your taxonomy version. XBridge 1.5.x supports EBA Taxonomy 4.1.

**Orphaned facts warning/error**
    This is raised for facts that don't belong to any reported table. Set ``strict_validation=False`` to continue with warnings instead of errors.

**Decimal precision issues**
    XBridge automatically handles decimal precision from the taxonomy. Check the parameters.csv file for applied decimal settings.

For more issues, see our `FAQ <https://docs.xbridge.meaningfuldata.eu/faq.html>`_ or `open an issue <https://github.com/Meaningful-Data/xbridge/issues>`_.

Contributing
============

We welcome contributions! Please see `CONTRIBUTING.md <CONTRIBUTING.md>`_ for:

* Development setup instructions
* Code style guidelines
* Testing requirements
* Pull request process

Before contributing, please read our `Code of Conduct <CODE_OF_CONDUCT.md>`_.

Changelog
=========

See `CHANGELOG.md <CHANGELOG.md>`_ for a detailed history of changes.

Support
=======

* **Documentation**: https://docs.xbridge.meaningfuldata.eu
* **Issue Tracker**: https://github.com/Meaningful-Data/xbridge/issues
* **Email**: info@meaningfuldata.eu
* **Company**: https://www.meaningfuldata.eu/

Security
========

For security issues, please see our `Security Policy <SECURITY.md>`_.

License
=======

This project is licensed under the Apache License 2.0 - see the `LICENSE <LICENSE>`_ file for details.

Authors & Maintainers
=====================

**MeaningfulData** - https://www.meaningfuldata.eu/

Maintainers:

* Antonio Olleros (antonio.olleros@meaningfuldata.eu)
* Jesus Simon (jesus.simon@meaningfuldata.eu)
* Francisco Javier Hernandez del Caño (javier.hernandez@meaningfuldata.eu)
* Guillermo Garcia Martin (guillermo.garcia@meaningfuldata.eu)

Acknowledgments
===============

This project is designed to work with the European Banking Authority (EBA) taxonomy for regulatory reporting.
@@ -1,7 +1,8 @@
-xbridge/__init__.py,sha256=
-xbridge/
-xbridge/
-xbridge/
+xbridge/__init__.py,sha256=joASbfhYee_2irYZhRCZ6J4oTn6u1fjt6ilQbXwL4M4,68
+xbridge/__main__.py,sha256=trtFEv7TRJgrLL84leIapPvgC_iVTj05qLHRRS1Olts,2219
+xbridge/api.py,sha256=NCBz7VRJWE3gID6ndgL4Awoxw0w1yMIIf_OTLRuZyyQ,1559
+xbridge/converter.py,sha256=uu6djzgGZcmq0nibrkmg5lW-npcolB4XtQoNWu1p_3o,23498
+xbridge/instance.py,sha256=KQpXhsZIM9oTYJf2hyrzc9pqFY2-1JBF5y1xbnLbqk8,29991
 xbridge/modules/ae_ae_4.2.json,sha256=AdFvwZqX0KVP3jF1iHeQc5QSnSMvvT3GvoA2G1AgXis,460165
 xbridge/modules/ae_con_cir-680-2014_2017-04-04.json,sha256=4n0t9dKJNU8Nb5QHpssrDs8ZLwzI-Mw75ax-ar9pLu0,363273
 xbridge/modules/ae_con_cir-680-2014_2018-03-31.json,sha256=aVWeLLs20p39kQQUthUzqrxBGKTycqhgX9WLk1rVlNw,363538
@@ -379,10 +380,11 @@ xbridge/modules/sbpimv_ind_its-2016-svbxx_2016-02-01.json,sha256=SED-dW--UKxhHNY
 xbridge/modules/sbpimv_sbp_4.2.json,sha256=Bj4z7zofZngG9EJ7-q74F-JF41O1FK_mX8RTfYdLP9I,7023
 xbridge/modules/sepa_ipr_pay_4.1.json,sha256=awsJeBUDhMIFs5so6CWUQmlcHSDcGMd8fnLy_r_iMik,27054
 xbridge/modules/sepa_ipr_pay_4.2.json,sha256=JLJvR02LOAJy6SWPRuhV1TT02oXQhsG83FBn176KWsA,27742
-xbridge/modules.py,sha256=
+xbridge/modules.py,sha256=bTvBXtp3w4Gad2DpEQE7Hb-UfuUQLlRl8gywRstQtpU,22399
 xbridge/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
 xbridge/taxonomy_loader.py,sha256=K0lnJVryvkKsaoK3fMis-L2JpmwLO6z3Ruq3yj9FxDY,9317
-eba_xbridge-1.5.
-eba_xbridge-1.5.
-eba_xbridge-1.5.
-eba_xbridge-1.5.
+eba_xbridge-1.5.0rc4.dist-info/METADATA,sha256=5BAX_xFnRrIxcQiJbNi3y68A_42F8dR-qpL6Z-bBT0U,10430
+eba_xbridge-1.5.0rc4.dist-info/WHEEL,sha256=zp0Cn7JsFoX2ATtOhtaFYIiE2rmFAD4OcMhtUki8W3U,88
+eba_xbridge-1.5.0rc4.dist-info/entry_points.txt,sha256=FATct4icSewM04cegjhybtm7xcQWhaSahL-DTtuFdZw,49
+eba_xbridge-1.5.0rc4.dist-info/licenses/LICENSE,sha256=xx0jnfkXJvxRnG63LTGOxlggYnIysveWIZ6H3PNdCrQ,11357
+eba_xbridge-1.5.0rc4.dist-info/RECORD,,
xbridge/__init__.py
CHANGED
xbridge/__main__.py
ADDED
@@ -0,0 +1,82 @@
"""Command-line interface for xbridge."""

import argparse
import sys
from pathlib import Path

from xbridge.api import convert_instance


def main() -> None:
    """Main CLI entry point for xbridge converter."""
    parser = argparse.ArgumentParser(
        description="Convert XBRL-XML instances to XBRL-CSV format",
        prog="xbridge",
    )

    parser.add_argument(
        "input_file",
        type=str,
        help="Path to the input XBRL-XML file",
    )

    parser.add_argument(
        "--output-path",
        type=str,
        default=None,
        help="Output directory path (default: same folder as input file)",
    )

    parser.add_argument(
        "--headers-as-datapoints",
        action="store_true",
        default=False,
        help="Treat headers as datapoints (default: False)",
    )

    parser.add_argument(
        "--strict-validation",
        action="store_true",
        default=True,
        help="Raise errors on validation failures (default: True)",
    )

    parser.add_argument(
        "--no-strict-validation",
        action="store_false",
        dest="strict_validation",
        help="Emit warnings instead of errors for validation failures",
    )

    args = parser.parse_args()

    # Determine output path
    input_path = Path(args.input_file)
    if not input_path.exists():
        print(f"Error: Input file not found: {args.input_file}", file=sys.stderr)
        sys.exit(1)

    if args.output_path is None:
        output_path = input_path.parent
    else:
        output_path = Path(args.output_path)
        if not output_path.exists():
            print(f"Error: Output path does not exist: {args.output_path}", file=sys.stderr)
            sys.exit(1)

    try:
        result_path = convert_instance(
            instance_path=input_path,
            output_path=output_path,
            headers_as_datapoints=args.headers_as_datapoints,
            validate_filing_indicators=True,
            strict_validation=args.strict_validation,
        )
        print(f"Conversion successful: {result_path}")
    except Exception as e:
        print(f"Conversion failed: {e}", file=sys.stderr)
        sys.exit(1)


if __name__ == "__main__":
    main()
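The paired `--strict-validation` / `--no-strict-validation` flags above share one `strict_validation` destination, so whichever flag appears last on the command line wins. A standalone sketch of the pattern (independent of xbridge):

```python
import argparse

# Two flags writing to the same dest: a store_true with default=True,
# and a store_false opt-out.
parser = argparse.ArgumentParser()
parser.add_argument("--strict-validation", action="store_true", default=True,
                    dest="strict_validation")
parser.add_argument("--no-strict-validation", action="store_false",
                    dest="strict_validation")

print(parser.parse_args([]).strict_validation)                           # True
print(parser.parse_args(["--no-strict-validation"]).strict_validation)   # False
```

Since Python 3.9, `argparse.BooleanOptionalAction` expresses the same `--flag` / `--no-flag` pair with a single `add_argument` call.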
xbridge/api.py
CHANGED
@@ -14,6 +14,7 @@ def convert_instance(
     output_path: Optional[Union[str, Path]] = None,
     headers_as_datapoints: bool = False,
     validate_filing_indicators: bool = True,
+    strict_validation: bool = True,
 ) -> Path:
     """
     Convert one single instance of XBRL-XML file to a CSV file
@@ -27,6 +28,9 @@ def convert_instance(
     :param validate_filing_indicators: If True, validate that no facts are orphaned
         (belong only to non-reported tables). Default is True.

+    :param strict_validation: If True (default), raise an error on orphaned facts. If False,
+        emit a warning instead and continue.
+
     :return: Converted CSV file.

     """
@@ -34,7 +38,12 @@ def convert_instance(
         output_path = Path(".")

     converter = Converter(instance_path)
-    return converter.convert(
+    return converter.convert(
+        output_path,
+        headers_as_datapoints,
+        validate_filing_indicators,
+        strict_validation,
+    )


 def load_instance(instance_path: Union[str, Path]) -> Instance:
xbridge/converter.py
CHANGED
@@ -6,10 +6,11 @@ from __future__ import annotations

 import csv
 import json
+import warnings
 from pathlib import Path
 from shutil import rmtree
 from tempfile import TemporaryDirectory
-from typing import Any, Dict,
+from typing import Any, Dict, Union
 from zipfile import ZipFile

 import pandas as pd
@@ -76,6 +77,7 @@ class Converter:
         output_path: Union[str, Path],
         headers_as_datapoints: bool = False,
         validate_filing_indicators: bool = True,
+        strict_validation: bool = False,
     ) -> Path:
         """Convert the ``XML Instance`` to a CSV file or between CSV formats"""
         if not output_path:
@@ -90,7 +92,9 @@ class Converter:
             raise ValueError("Module of the instance file not found in the taxonomy")

         if isinstance(self.instance, XmlInstance):
-            return self.convert_xml(
+            return self.convert_xml(
+                output_path, headers_as_datapoints, validate_filing_indicators, strict_validation
+            )
         elif isinstance(self.instance, CsvInstance):
             if self.module.architecture != "headers":
                 raise ValueError("Cannot convert CSV instance with non-headers architecture")
@@ -103,6 +107,7 @@ class Converter:
         output_path: Path,
         headers_as_datapoints: bool = False,
         validate_filing_indicators: bool = True,
+        strict_validation: bool = True,
     ) -> Path:
         module_filind_codes = [table.filing_indicator_code for table in self.module.tables]

@@ -147,7 +152,7 @@ class Converter:
         self._convert_filing_indicator(report_dir)

         if validate_filing_indicators:
-            self._validate_filing_indicators()
+            self._validate_filing_indicators(strict_validation=strict_validation)

         with open(MAPPING_PATH / self.module.dim_dom_file_name, "r", encoding="utf-8") as fl:
             mapping_dict: Dict[str, str] = json.load(fl)
@@ -280,111 +285,44 @@ class Converter:
         instance_df = instance_df.loc[mask]
         instance_df.drop(columns=nrd_list, inplace=True)

-
-
-
-        self, table_df: pd.DataFrame, datapoint_df: pd.DataFrame
-    ) -> pd.DataFrame:
-        """
-        Normalizes fact values against allowed_values for each variable.
-
-        For variables with allowed_values:
-        1. Extracts code part from fact values (after ":")
-        2. Maps to correct namespaced value from allowed_values
-        3. Updates dimension columns with normalized values
-        4. Validates no unmatched codes remain
-
-        :param table_df: The merged dataframe with facts and variables
-        :param datapoint_df: The dataframe with variable definitions including allowed_values
-        :return: The normalized dataframe
-        """
-        if "allowed_values" not in datapoint_df.columns:
-            return table_df
-
-        # Build mapping: datapoint → {code → full_value}
-        datapoint_allowed_map: Dict[str, Dict[str, str]] = {}
+        # Rows missing values for required open keys do not belong to the table
+        if open_keys:
+            instance_df.dropna(subset=list(open_keys), inplace=True)

-
-            datapoint = row.get("datapoint")
-            allowed_values = row.get("allowed_values")
-
-            if not datapoint or not allowed_values or len(allowed_values) == 0:
-                continue
+        return instance_df

-
-
-
-
-            for allowed_val in allowed_values:
-                if ":" in allowed_val:
-                    code = allowed_val.split(":")[-1]
-                    code_map[code] = allowed_val
-
-            if code_map:
-                datapoint_allowed_map[datapoint] = code_map
-
-        if not datapoint_allowed_map:
-            return table_df
-
-        # Identify columns to normalize
-        # We normalize both dimension columns AND the value column (for enumerated values)
-        exclude_cols = {"datapoint", "decimals", "unit", "data_type", "allowed_values"}
-        columns_to_check = [col for col in table_df.columns if col not in exclude_cols]
-
-        # For each column that might contain namespaced values
-        for dim_col in columns_to_check:
-            if dim_col not in table_df.columns or table_df[dim_col].isna().all():
-                continue
+    def _matching_fact_indices(self, table: Table) -> set[int]:
+        """Return indices of instance facts that actually match the table definition."""
+        if self.instance.instance_df is None:
+            return set()

-
-
-                continue
+        instance_df = self._get_instance_df(table)
+        if instance_df.empty or table.variable_df is None:
+            return set()

-
-        if not has_namespace:
-            continue
+        open_keys = set(table.open_keys)

-
-            mask = table_df[dim_col].notna()
-            temp_code_col = f"_{dim_col}_temp_code"
-            table_df.loc[mask, temp_code_col] = (
-                table_df.loc[mask, dim_col].astype(str).str.split(":").str[-1]
-            )
+        datapoint_df = table.variable_df.copy()

-
-
-
+        # For validation we match minimally on metric (concept) and any open keys present
+        merge_cols: list[str] = []
+        if "metric" in datapoint_df.columns and "metric" in instance_df.columns:
+            merge_cols.append("metric")
+        merge_cols.extend(
+            [key for key in open_keys if key in datapoint_df.columns and key in instance_df.columns]
+        )

-
-
+        instance_df = instance_df.copy()
+        instance_df["_idx"] = instance_df.index

-
-        original_values = table_df.loc[dp_mask, dim_col].copy()
-
-        # Map codes to correct full values
-        normalized_values = table_df.loc[dp_mask, temp_code_col].map(code_map)
-
-        # Update only the values that were successfully mapped
-        mapped_mask = dp_mask & normalized_values.notna()
-        table_df.loc[mapped_mask, dim_col] = normalized_values[mapped_mask]
-
-        # Check for values that couldn't be mapped (validation errors)
-        unmapped_mask = dp_mask & normalized_values.isna()
-        if unmapped_mask.any():
-            invalid_codes = table_df.loc[unmapped_mask, temp_code_col].unique()
-            valid_codes = list(code_map.keys())
-            raise ValueError(
-                f"Invalid values for datapoint '{datapoint}' in column '{dim_col}': "
-                f"Found codes {list(invalid_codes)} but only {valid_codes} are allowed. "
-                f"Original values: {original_values[unmapped_mask].tolist()}"
-            )
+        merged_df = pd.merge(datapoint_df, instance_df, on=merge_cols, how="inner")

-
-        if
-
+        if open_keys:
+            valid_open_keys = [key for key in open_keys if key in merged_df.columns]
+            if valid_open_keys:
+                merged_df.dropna(subset=valid_open_keys, inplace=True)

-        return
+        return set(merged_df["_idx"].tolist())

     def _variable_generator(self, table: Table) -> pd.DataFrame:
         """Returns the dataframe with the CSV file for the table
@@ -406,7 +344,7 @@ class Converter:
         )

         # Do the intersection and drop from datapoints the columns and records
-        datapoint_df = table.variable_df
+        datapoint_df = table.variable_df.copy()
         missing_cols = list(variable_columns - instance_columns)
         if "data_type" in missing_cols:
             missing_cols.remove("data_type")
@@ -417,10 +355,8 @@ class Converter:

         # Join the dataframes on the datapoint_columns
         merge_cols = list(variable_columns & instance_columns)
-        table_df = pd.merge(datapoint_df, instance_df, on=merge_cols, how="inner")

-
-        table_df = self._normalize_allowed_values(table_df, datapoint_df)
+        table_df = pd.merge(datapoint_df, instance_df, on=merge_cols, how="inner")

         if "data_type" in table_df.columns and "decimals" in table_df.columns:
             decimals_table = table_df[["decimals", "data_type"]].drop_duplicates()
@@ -432,17 +368,27 @@ class Converter:
             decimals = row["decimals"]

             if data_type not in self._decimals_parameters:
-                self._decimals_parameters[data_type] =
+                self._decimals_parameters[data_type] = (
+                    int(decimals) if decimals not in {"INF", "#none"} else decimals
+                )
             else:
                 # If new value is a special value, skip it (prefer numeric values)
                 if decimals in {"INF", "#none"}:
                     pass
                 # If new value is numeric
                 else:
+                    try:
+                        decimals = int(decimals)
+                    except ValueError:
+                        raise ValueError(
+                            f"Invalid decimals value: {decimals}, "
+                            "should be integer, 'INF' or '#none'"
+                        )
+
                     # If existing value is special, replace with numeric
-                    if
-
-
+                    if (
+                        self._decimals_parameters[data_type] in {"INF", "#none"}
+                        or decimals < self._decimals_parameters[data_type]
+                    ):
                         self._decimals_parameters[data_type] = decimals
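The decimals bookkeeping above keeps, per data type, the smallest numeric precision observed, falling back to the special values `"INF"` / `"#none"` only while no numeric value has been seen. A simplified standalone sketch of that rule (function and variable names are illustrative, not the converter's own):

```python
SPECIAL = {"INF", "#none"}


def merge_decimals(params: dict, data_type: str, decimals: str) -> None:
    """Fold one (data_type, decimals) observation into the accumulated parameters."""
    if data_type not in params:
        # First observation: store numeric value, or the special marker as-is
        params[data_type] = int(decimals) if decimals not in SPECIAL else decimals
        return
    if decimals in SPECIAL:
        return  # prefer an already-seen numeric value
    value = int(decimals)  # raises ValueError for anything else, as in the converter
    # Replace a special marker, or keep the smallest numeric precision
    if params[data_type] in SPECIAL or value < params[data_type]:
        params[data_type] = value


params = {}
for dt, dec in [("monetary", "INF"), ("monetary", "-3"), ("monetary", "0"), ("percentage", "4")]:
    merge_decimals(params, dt, dec)
print(params)  # {'monetary': -3, 'percentage': 4}
```

Note that "smallest" here follows XBRL semantics: a lower `decimals` value means coarser precision, and the report-level parameter must not overstate precision.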
@@ -497,13 +443,6 @@ class Converter:
|
|
|
497
443
|
# Defined by the EBA in the JSON files. We take them from the taxonomy
|
|
498
444
|
# Because EBA is using exactly those for the JSON files.
|
|
499
445
|
|
|
500
|
-
for open_key in table.open_keys:
|
|
501
|
-
if open_key in datapoints.columns:
|
|
502
|
-
dim_name = mapping_dict.get(open_key)
|
|
503
|
-
# For open keys, there are no dim_names (they are not mapped)
|
|
504
|
-
if dim_name and not datapoints.empty:
|
|
505
|
-
datapoints[open_key] = dim_name + ":" + datapoints[open_key].astype(str)
|
|
506
|
-
|
|
507
446
|
datapoints.sort_values(by=["datapoint"], ascending=True, inplace=True)
|
|
508
447
|
output_path_table = temp_dir_path / (table.url or "table.csv")
|
|
509
448
|
|
|
@@ -550,7 +489,7 @@ class Converter:
         if fil_ind.value and fil_ind.table:
             self._reported_tables.append(fil_ind.table)

-    def _validate_filing_indicators(self) -> None:
+    def _validate_filing_indicators(self, strict_validation: bool = True) -> None:
         """Validate that no facts are orphaned (belong only to non-reported tables).

         Raises:
@@ -559,44 +498,56 @@ class Converter:
         if self.instance.instance_df is None or self.instance.instance_df.empty:
             return

-        # Step 1:
-
+        # Step 1: Track which facts belong to ANY reported table without materializing a huge set
+        reported_mask = pd.Series(False, index=self.instance.instance_df.index)
         for table in self.module.tables:
             if table.filing_indicator_code in self._reported_tables:
-
-                if
-
-                    reported_fact_indices.update(instance_df.index)
+                reported_indices = self._matching_fact_indices(table)
+                if reported_indices:
+                    reported_mask.loc[list(reported_indices)] = True

         # Step 2: Find facts that belong ONLY to non-reported tables
-
+        orphaned_mask = pd.Series(False, index=self.instance.instance_df.index)
         orphaned_per_table = {}

         for table in self.module.tables:
             if table.filing_indicator_code not in self._reported_tables:
-
-                if
-                    #
-                    orphaned_in_this_table =
+                orphaned_indices = self._matching_fact_indices(table)
+                if orphaned_indices:
+                    # Facts in this table that never appear in a reported table
+                    orphaned_in_this_table = [
+                        idx for idx in orphaned_indices if not reported_mask.loc[idx]
+                    ]
                     if orphaned_in_this_table:
+                        orphaned_mask.loc[orphaned_in_this_table] = True
                         orphaned_per_table[table.filing_indicator_code] = len(
                             orphaned_in_this_table
                         )
-                        all_orphaned_indices.update(orphaned_in_this_table)

-
+        total_orphaned = int(orphaned_mask.sum())
+
+        if total_orphaned:
             error_msg = (
                 f"Filing indicator inconsistency detected:\n"
-                f"Found {
+                f"Found {total_orphaned} fact(s) that belong ONLY"
                 f" to non-reported tables:\n"
             )
             for table_code, count in orphaned_per_table.items():
                 error_msg += f" - {table_code}: {count} fact(s)\n"
+
+            if strict_validation:
+                error_msg += (
+                    "\nThe conversion process will not continue due to strict validation mode. "
+                    "Either set filed=true for the relevant tables "
+                    "or remove these facts from the XML."
+                )
+                raise ValueError(error_msg)
             error_msg += (
                 "\nThese facts will be excluded from the output. "
-                "
+                "Consider setting filed=true for the relevant tables "
+                "or removing these facts from the XML."
             )
-
+            warnings.warn(error_msg)

     def _convert_parameters(self, temp_dir_path: Path) -> None:
         # Workaround;
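The validation above marks a fact "orphaned" when it matches only non-reported tables, using two boolean masks over the fact index instead of large index sets. A minimal, self-contained sketch of that mask logic, using made-up tables and indices (the `reported`/`non_reported` dicts below are illustrative data, not the library's structures):

```python
import pandas as pd

# Hypothetical fact table and per-table index matches
facts = pd.DataFrame({"value": [10, 20, 30]}, index=[0, 1, 2])
reported = {"T1": {0, 1}}       # indices matched by reported tables
non_reported = {"T2": {1, 2}}   # indices matched by non-reported tables

# Step 1: mark every fact that appears in ANY reported table
reported_mask = pd.Series(False, index=facts.index)
for indices in reported.values():
    reported_mask.loc[list(indices)] = True

# Step 2: a fact is orphaned if it matches a non-reported table
# and never appears in a reported one
orphaned_mask = pd.Series(False, index=facts.index)
orphaned_per_table = {}
for code, indices in non_reported.items():
    orphaned = [i for i in indices if not reported_mask.loc[i]]
    if orphaned:
        orphaned_mask.loc[orphaned] = True
        orphaned_per_table[code] = len(orphaned)

print(int(orphaned_mask.sum()), orphaned_per_table)  # → 1 {'T2': 1}
```

Fact 1 appears in both a reported and a non-reported table, so only fact 2 is orphaned; in strict mode the real method raises `ValueError` on any orphan, otherwise it warns and drops them.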
xbridge/instance.py  CHANGED

@@ -13,6 +13,59 @@ from zipfile import ZipFile
 import pandas as pd
 from lxml import etree

+# Cache namespace → CSV prefix derivations to avoid repeated string work during parse
+_namespace_prefix_cache: Dict[str, str] = {}
+
+
+def _derive_csv_prefix(namespace_uri: str) -> Optional[str]:
+    """Derive the fixed CSV prefix from a namespace URI using the EBA convention."""
+    if not namespace_uri:
+        return None
+
+    cached = _namespace_prefix_cache.get(namespace_uri)
+    if cached is not None:
+        return cached
+
+    cleaned = namespace_uri.rstrip("#/")
+    if "#" in namespace_uri:
+        segment = namespace_uri.rsplit("#", 1)[-1]
+    else:
+        segment = cleaned.rsplit("/", 1)[-1] if "/" in cleaned else cleaned
+
+    if not segment:
+        return None
+
+    prefix = f"eba_{segment}"
+    _namespace_prefix_cache[namespace_uri] = prefix
+    return prefix
+
+
+def _normalize_namespaced_value(
+    value: Optional[str], nsmap: Dict[Optional[str], str]
+) -> Optional[str]:
+    """
+    Normalize a namespaced value (e.g., 'dom:qAE' or '{uri}qAE') to the CSV prefix convention.
+    Returns the original value if no namespace can be resolved.
+    """
+    if value is None:
+        return None
+
+    # Clark notation: {uri}local
+    if value.startswith("{") and "}" in value:
+        uri, local = value[1:].split("}", 1)
+        derived = _derive_csv_prefix(uri)
+        return f"{derived}:{local}" if derived else value
+
+    # Prefixed notation: prefix:local
+    if ":" in value:
+        potential_prefix, local = value.split(":", 1)
+        namespace_uri = nsmap.get(potential_prefix)
+        if namespace_uri:
+            derived = _derive_csv_prefix(namespace_uri)
+            return f"{derived}:{local}" if derived else value
+
+    return value
+

 class Instance:
     """
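The two helpers added above turn the last segment of a namespace URI into an `eba_*` CSV prefix and apply it to values in either Clark (`{uri}local`) or prefixed (`dom:local`) notation. A trimmed, runnable copy to illustrate the behavior (function names shortened; the URIs below are examples, not taken from the taxonomy):

```python
from typing import Dict, Optional

def derive_csv_prefix(namespace_uri: str) -> Optional[str]:
    """'eba_' + last URI segment (after '#' if present, else after the last '/')."""
    if not namespace_uri:
        return None
    if "#" in namespace_uri:
        segment = namespace_uri.rsplit("#", 1)[-1]
    else:
        cleaned = namespace_uri.rstrip("#/")
        segment = cleaned.rsplit("/", 1)[-1] if "/" in cleaned else cleaned
    return f"eba_{segment}" if segment else None

def normalize(value: Optional[str], nsmap: Dict[Optional[str], str]) -> Optional[str]:
    """Rewrite '{uri}local' or 'prefix:local' to the derived eba_* prefix."""
    if value is None:
        return None
    if value.startswith("{") and "}" in value:  # Clark notation
        uri, local = value[1:].split("}", 1)
        derived = derive_csv_prefix(uri)
        return f"{derived}:{local}" if derived else value
    if ":" in value:  # prefixed notation, resolved via the element's nsmap
        prefix, local = value.split(":", 1)
        uri = nsmap.get(prefix)
        if uri:
            derived = derive_csv_prefix(uri)
            return f"{derived}:{local}" if derived else value
    return value

nsmap = {"dom": "http://www.eba.europa.eu/xbrl/crr/dict/dom/qAE"}
print(normalize("dom:x1", nsmap))                       # → eba_qAE:x1
print(normalize("{http://example.com/dom/met}mi", {}))  # → eba_met:mi
```

Values whose prefix cannot be resolved through `nsmap` pass through unchanged, which is why `Scenario` falls back to `or ""` in the hunk below.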
@@ -548,7 +601,7 @@ class Scenario:
                 continue
             dimension = dimension_raw.split(":")[1]
             value = self.get_value(child)
-            value = value.
+            value = _normalize_namespaced_value(value, child.nsmap) or ""
             self.dimensions[dimension] = value

     @staticmethod
@@ -667,7 +720,7 @@ class Fact:
     def parse(self) -> None:
         """Parse the XML node with the `fact <https://www.xbrl.org/guidance/xbrl-glossary/#:~:text=accounting%20standards%20body.-,Fact,-A%20fact%20is>`_."""
         self.metric = self.fact_xml.tag
-        self.value = self.fact_xml.text
+        self.value = _normalize_namespaced_value(self.fact_xml.text, self.fact_xml.nsmap)
         self.decimals = self.fact_xml.attrib.get("decimals")
         self.context = self.fact_xml.attrib.get("contextRef")
         self.unit = self.fact_xml.attrib.get("unitRef")
@@ -675,7 +728,11 @@ class Fact:
     def __dict__(self) -> Dict[str, Any]:  # type: ignore[override]
         metric_clean = ""
         if self.metric:
-
+            # Normalize metric to use consistent eba_* prefix like other dimensions
+            metric_clean = _normalize_namespaced_value(self.metric, self.fact_xml.nsmap) or ""
+            # If still in Clark notation, extract the local name
+            if metric_clean.startswith("{") and "}" in metric_clean:
+                metric_clean = metric_clean.split("}", 1)[1]

         return {
             "metric": metric_clean,
xbridge/modules.py  CHANGED

@@ -306,9 +306,9 @@ class Table:
         variable_info: dict[str, Any] = {}
         for dim_k, dim_v in variable.dimensions.items():
             if dim_k not in ("unit", "decimals"):
-                variable_info[dim_k] = dim_v
+                variable_info[dim_k] = dim_v
         if "concept" in variable.dimensions:
-            variable_info["metric"] = variable.dimensions["concept"]
+            variable_info["metric"] = variable.dimensions["concept"]
             del variable_info["concept"]

         if variable.code is None:
@@ -324,9 +324,11 @@ class Table:
         if "dimensions" in column:
             for dim_k, dim_v in column["dimensions"].items():
                 if dim_k == "concept":
-                    variable_info["metric"] = dim_v
+                    variable_info["metric"] = dim_v
                 elif dim_k not in ("unit", "decimals"):
-
+                    # Keep the full dimension key and value with prefixes
+                    dim_k_clean = dim_k.split(":")[1] if ":" in dim_k else dim_k
+                    variable_info[dim_k_clean] = dim_v

         if "decimals" in column:
             variable_info["data_type"] = column["decimals"]
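The added lines strip a namespace prefix from dimension keys while leaving unprefixed keys untouched. A one-liner sketch of that cleaning rule in isolation:

```python
def clean_key(dim_k: str) -> str:
    # Drop the leading "prefix:" if present, otherwise return the key unchanged
    return dim_k.split(":")[1] if ":" in dim_k else dim_k

print(clean_key("eba_dim:BAS"))  # → BAS
print(clean_key("BAS"))          # → BAS
```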
eba_xbridge-1.5.0rc2.dist-info/METADATA  DELETED

@@ -1,62 +0,0 @@
-Metadata-Version: 2.4
-Name: eba-xbridge
-Version: 1.5.0rc2
-Summary: XBRL-XML to XBRL-CSV converter for EBA Taxonomy (version 4.1)
-License: Apache 2.0
-License-File: LICENSE
-Keywords: xbrl,eba,taxonomy,csv,xml
-Author: MeaningfulData
-Author-email: info@meaningfuldata.eu
-Maintainer: Antonio Olleros
-Maintainer-email: antonio.olleros@meaningfuldata.eu
-Requires-Python: >=3.9
-Classifier: Development Status :: 5 - Production/Stable
-Classifier: Intended Audience :: Developers
-Classifier: Intended Audience :: Information Technology
-Classifier: Intended Audience :: Science/Research
-Classifier: Programming Language :: Python :: 3
-Classifier: Typing :: Typed
-Requires-Dist: lxml (>=5.2.1,<6.0)
-Requires-Dist: numpy (>=1.23.2,<2) ; python_version < "3.13"
-Requires-Dist: numpy (>=2.1.0) ; python_version >= "3.13"
-Requires-Dist: pandas (>=2.1.4,<3.0)
-Project-URL: Documentation, https://docs.xbridge.meaningfuldata.eu
-Project-URL: IssueTracker, https://github.com/Meaningful-Data/xbridge/issues
-Project-URL: MeaningfulData, https://www.meaningfuldata.eu/
-Project-URL: Repository, https://github.com/Meaningful-Data/xbridge
-Description-Content-Type: text/x-rst
-
-Overview
-============
-XBridge is a Python library which main function is to convert XBRL-XML files into XBRL-CSV files by using EBA's taxonomy.
-It works with EBA Taxonomy latest published version (4.1). Library must be updated on each new EBA taxonomy version.
-
-Installation
-============
-
-To install the library, run the following command:
-
-.. code:: bash
-
-    pip install eba-xbridge
-
-
-How XBridge works:
-=========================
-
-Firstly, an XBRL-XML file has to be selected to convert it. Then, that XBRL-XML file is input in the following function contained in the ``API`` package:
-
-.. code:: python
-
-    >>> from xbridge.api import convert_instance
-
-    >>> input_path = "data/input"
-
-    >>> output_path = "data/output"
-
-    >>> convert_instance(input_path, output_path)
-
-The sources to do this process are two: The XML-instances and EBA´s taxonomy.
-
-The output is the converted XBRL-CSV file placed in the output_path, as zip format
-
eba_xbridge-1.5.0rc2.dist-info/WHEEL  File without changes
eba_xbridge-1.5.0rc2.dist-info/licenses/LICENSE  File without changes