notoecd 0.1.0__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,114 @@
+Metadata-Version: 2.4
+Name: notoecd
+Version: 0.1.0
+Summary: Library for interacting with the OECD Data Explorer through Python
+Author-email: Daniel Vegara Balsa <daniel.vegarabalsa@oecd.org>
+License-Expression: MIT
+Project-URL: Homepage, https://github.com/dani-37/notoecd
+Requires-Python: >=3.10
+Description-Content-Type: text/markdown
+Requires-Dist: pandas>=2.0
+Requires-Dist: requests>=2.31
+
+# notoecd
+
+⚠️ **Unofficial package, not endorsed by the OECD.**
+
+A lightweight Python interface for exploring OECD SDMX structures and downloading OECD regional datasets.
+The package provides utilities for:
+
+- Discovering dataset metadata
+- Searching for relevant datasets using keyword matching
+- Exploring the structure and code lists of a dataset
+- Fetching filtered SDMX data directly into a pandas DataFrame
+
+------------------------------------------------------------
+
+## Installation
+
+You can install the package by running:
+
+    pip install notoecd
+
+------------------------------------------------------------
+
+## Quick Start
+
+    import notoecd
+
+The main functions in this module are:
+
+    search_keywords(keywords) -> pd.DataFrame
+    get_structure(agencyID, dataflowID) -> Structure
+    get_df(agencyID, dataflowID, filters) -> pd.DataFrame
+
+------------------------------------------------------------
+
+## Searching for datasets
+
+`search_keywords` performs:
+
+- Normalized text matching
+- Accent-insensitive search
+- Multi-keyword OR matching
+- Ranking by number of matched keywords
+
+Example:
+
+    hits = notoecd.search_keywords(['gross domestic product', 'tl2', 'tl3'])
+
+This returns datasets that mention GDP and regional levels (TL2/TL3). It gives their name, description, and identifiers (agencyID and dataflowID), which we will need for the next step.
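The matching rules listed above can be illustrated with a minimal sketch. It is not notoecd's implementation: the helper names `normalize` and `rank_by_keywords` and the sample catalogue entries are hypothetical, and plain dictionaries stand in for the DataFrame that `search_keywords` returns.

```python
import unicodedata


def normalize(text: str) -> str:
    """Strip accents (combining marks) and fold case for comparison."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_marks.casefold()


def rank_by_keywords(records: list[dict], keywords: list[str]) -> list[dict]:
    """Return the records matching at least one keyword, best matches first."""
    wanted = [normalize(k) for k in keywords]
    scored = []
    for record in records:
        haystack = normalize(f"{record['name']} {record['description']}")
        hits = sum(1 for k in wanted if k in haystack)
        if hits:  # OR matching: one matched keyword is enough
            scored.append((hits, record))
    # list.sort() is stable, so records with equal scores keep their input order
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored]


# Hypothetical catalogue entries, for illustration only.
catalogue = [
    {"name": "Gross domestic product - Regions", "description": "GDP at TL2 and TL3 level"},
    {"name": "Population", "description": "Population by TL3 region"},
    {"name": "Produit intérieur brut", "description": "Séries nationales"},
]

results = rank_by_keywords(catalogue, ["gross domestic product", "tl2", "tl3"])
accented = rank_by_keywords(catalogue, ["interieur"])
```

Here `results` lists the regional GDP entry first (three matched keywords), then the population entry (one match), while `accented` finds the French title even though the query omits the accent.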
+
+------------------------------------------------------------
+
+## Inspecting dataset structure
+
+Once a dataset is identified, load its SDMX structure:
+
+    dataset = 'Gross domestic product - Regions'
+    agencyID = 'OECD.CFE.EDS'
+    dataflowID = 'DSD_REG_ECO@DF_GDP'
+
+    s = notoecd.get_structure(agencyID, dataflowID)
+
+### Table of contents
+
+    s.toc
+
+This shows all filters and their available values.
+
+### Exploring code values
+
+    s.explain_vals('MEASURE')
+    s.explain_vals('UNIT_MEASURE')
+
+This shows the available measures and units used in the dataset.
+
+------------------------------------------------------------
+
+## Filtering and downloading data
+
+To download data, build a dictionary of filters.
+Keys correspond to SDMX dimensions, values are strings or lists (for multiple values):
+
+    filters = {
+        'territorial_level': ['tl2', 'tl3'],
+        'measure': 'gdp',
+        'prices': 'Q',
+        'unit_measure': 'USD_PPP_PS'
+    }
+
+Fetch the filtered dataset:
+
+    df = notoecd.get_df(agencyID, dataflowID, filters)
+    df.head()
+
+The returned object is a pandas DataFrame containing the requested subset of OECD SDMX data.
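For context, SDMX REST services conventionally take a filter as a key string: one slot per dimension, in the order the data structure defines, separated by `.`, with `+` joining several values and an empty slot meaning no filter on that dimension. The sketch below shows how a filter dictionary like the one above could be turned into such a key. The function `build_sdmx_key`, the dimension order, and the uppercasing of names and codes are assumptions made for illustration, not notoecd's documented behaviour.

```python
def build_sdmx_key(dimension_order: list[str], filters: dict) -> str:
    """Build an SDMX REST key: values in dimension order, '.'-separated.

    Several values for one dimension are joined with '+'; a dimension
    without a filter is left empty, which SDMX treats as a wildcard.
    """
    # Assumption: dimension names and codes are matched case-insensitively.
    normalized = {key.upper(): value for key, value in filters.items()}
    unknown = set(normalized) - set(dimension_order)
    if unknown:
        raise KeyError(f"Unknown dimensions: {sorted(unknown)}")

    parts = []
    for dimension in dimension_order:
        value = normalized.get(dimension, "")
        # Accept either a single string or a list of strings
        values = [value] if isinstance(value, str) else list(value)
        parts.append("+".join(v.upper() for v in values))
    return ".".join(parts)


# Hypothetical dimension order; the real order comes from the dataset structure.
DIMENSIONS = ["TERRITORIAL_LEVEL", "REF_AREA", "MEASURE", "PRICES", "UNIT_MEASURE"]

filters = {
    "territorial_level": ["tl2", "tl3"],
    "measure": "gdp",
    "prices": "Q",
    "unit_measure": "USD_PPP_PS",
}

key = build_sdmx_key(DIMENSIONS, filters)
print(key)  # TL2+TL3..GDP.Q.USD_PPP_PS
```

The empty slot between the two dots is the unfiltered `REF_AREA` dimension. Rejecting unknown dimension names early gives a clearer error than a request that silently returns nothing.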
+
+------------------------------------------------------------
+
+## Examples
+
+You can see this full example as a notebook called example.ipynb.
+
+
@@ -0,0 +1,4 @@
+notoecd-0.1.0.dist-info/METADATA,sha256=UcQeYIOzhpdpZE5HcFVm_7DCjxdJmLkcCdbSDsrPTHY,3191
+notoecd-0.1.0.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
+notoecd-0.1.0.dist-info/top_level.txt,sha256=AbpHGcgLb-kRsJGnwFEktk7uzpZOCcBY74-YBdrKVGs,1
+notoecd-0.1.0.dist-info/RECORD,,
@@ -0,0 +1 @@
+