econcomplex-1.0.0.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (64)
  1. econcomplex-1.0.0/CHANGELOG.md +35 -0
  2. econcomplex-1.0.0/LICENSE +22 -0
  3. econcomplex-1.0.0/MANIFEST.in +7 -0
  4. econcomplex-1.0.0/PKG-INFO +223 -0
  5. econcomplex-1.0.0/README.md +186 -0
  6. econcomplex-1.0.0/README.pt-BR.md +187 -0
  7. econcomplex-1.0.0/docs/api_reference_en.tex +174 -0
  8. econcomplex-1.0.0/docs/api_reference_pt.tex +174 -0
  9. econcomplex-1.0.0/docs/econcomplex_documentation_en.pdf +0 -0
  10. econcomplex-1.0.0/docs/econcomplex_documentation_en.tex +1270 -0
  11. econcomplex-1.0.0/docs/econcomplex_documentation_pt.pdf +0 -0
  12. econcomplex-1.0.0/docs/econcomplex_documentation_pt.tex +1282 -0
  13. econcomplex-1.0.0/docs/generate_api_reference.py +203 -0
  14. econcomplex-1.0.0/econcomplex/__init__.py +220 -0
  15. econcomplex-1.0.0/econcomplex/complexity/__init__.py +23 -0
  16. econcomplex-1.0.0/econcomplex/complexity/eci_pci.py +131 -0
  17. econcomplex-1.0.0/econcomplex/complexity/eigenvector.py +115 -0
  18. econcomplex-1.0.0/econcomplex/complexity/fitness.py +130 -0
  19. econcomplex-1.0.0/econcomplex/complexity/reflections.py +173 -0
  20. econcomplex-1.0.0/econcomplex/complexity/subnational.py +82 -0
  21. econcomplex-1.0.0/econcomplex/core/__init__.py +23 -0
  22. econcomplex-1.0.0/econcomplex/core/diversity.py +125 -0
  23. econcomplex-1.0.0/econcomplex/core/preprocess.py +83 -0
  24. econcomplex-1.0.0/econcomplex/core/rca.py +161 -0
  25. econcomplex-1.0.0/econcomplex/core/utils.py +137 -0
  26. econcomplex-1.0.0/econcomplex/dynamics/__init__.py +10 -0
  27. econcomplex-1.0.0/econcomplex/dynamics/entry_exit.py +248 -0
  28. econcomplex-1.0.0/econcomplex/dynamics/growth.py +146 -0
  29. econcomplex-1.0.0/econcomplex/inequality/__init__.py +11 -0
  30. econcomplex-1.0.0/econcomplex/inequality/concentration.py +148 -0
  31. econcomplex-1.0.0/econcomplex/inequality/gini.py +164 -0
  32. econcomplex-1.0.0/econcomplex/optimization/__init__.py +46 -0
  33. econcomplex-1.0.0/econcomplex/optimization/diffusion.py +379 -0
  34. econcomplex-1.0.0/econcomplex/optimization/growth_target.py +170 -0
  35. econcomplex-1.0.0/econcomplex/optimization/portfolio.py +178 -0
  36. econcomplex-1.0.0/econcomplex/optimization/steppingstone.py +267 -0
  37. econcomplex-1.0.0/econcomplex/outlook/__init__.py +6 -0
  38. econcomplex-1.0.0/econcomplex/outlook/coi_cog.py +168 -0
  39. econcomplex-1.0.0/econcomplex/patents/__init__.py +7 -0
  40. econcomplex-1.0.0/econcomplex/patents/recombination.py +135 -0
  41. econcomplex-1.0.0/econcomplex/pipeline.py +255 -0
  42. econcomplex-1.0.0/econcomplex/productivity/__init__.py +8 -0
  43. econcomplex-1.0.0/econcomplex/productivity/prody.py +218 -0
  44. econcomplex-1.0.0/econcomplex/relatedness/__init__.py +25 -0
  45. econcomplex-1.0.0/econcomplex/relatedness/cooccurrence.py +173 -0
  46. econcomplex-1.0.0/econcomplex/relatedness/cross_space.py +142 -0
  47. econcomplex-1.0.0/econcomplex/relatedness/density.py +232 -0
  48. econcomplex-1.0.0/econcomplex/relatedness/proximity.py +214 -0
  49. econcomplex-1.0.0/econcomplex/specialization/__init__.py +17 -0
  50. econcomplex-1.0.0/econcomplex/specialization/location_quotient.py +163 -0
  51. econcomplex-1.0.0/econcomplex/specialization/similarity.py +68 -0
  52. econcomplex-1.0.0/econcomplex.egg-info/PKG-INFO +223 -0
  53. econcomplex-1.0.0/econcomplex.egg-info/SOURCES.txt +62 -0
  54. econcomplex-1.0.0/econcomplex.egg-info/dependency_links.txt +1 -0
  55. econcomplex-1.0.0/econcomplex.egg-info/requires.txt +14 -0
  56. econcomplex-1.0.0/econcomplex.egg-info/top_level.txt +1 -0
  57. econcomplex-1.0.0/examples/basic_usage.py +261 -0
  58. econcomplex-1.0.0/examples/eci_optimization.py +129 -0
  59. econcomplex-1.0.0/pyproject.toml +63 -0
  60. econcomplex-1.0.0/setup.cfg +4 -0
  61. econcomplex-1.0.0/setup.py +4 -0
  62. econcomplex-1.0.0/tests/test_core.py +514 -0
  63. econcomplex-1.0.0/tests/test_diffusion.py +145 -0
  64. econcomplex-1.0.0/tests/test_optimization.py +200 -0
@@ -0,0 +1,35 @@
1
+ # Changelog
2
+
3
+ ## [1.0.0] — 2026-06-12
4
+
5
+ First official release.
6
+
7
+ ### Added
8
+ - **Single entry point for complexity**: `eci_pci(mat, method="eigenvector" | "reflections" | "fitness")` in its own module (`complexity/eci_pci.py`), mirroring the interface of the R `economiccomplexity` package. Method-specific implementations remain public for advanced use (`eci_pci_eigenvector`, `method_of_reflections`, `fitness_complexity`).
9
+ - **Pre-processing for sparse data**: `trim_core(mat, dmin, umin)` iteratively prunes degenerate units (zero diversity/ubiquity) recomputing RCA each pass; applied automatically by `eci_pci` (`trim=True`, trimmed units returned as `NaN`). Use `dmin=2, umin=2` for the well-connected core recommended for subnational data.
10
+ - **ECI Optimization** (Stojkoski & Hidalgo 2026, *Research Policy* 55:105454): `calibrate_steppingstone`, `effort_matrix`, `forecast_specialization`, and `eci_optimization` (exact 0–1 program via `scipy.optimize.milp`), plus growth targeting (`calibrate_growth_model`, `eci_target_for_growth`, `expected_growth`).
11
+ - **Strategic diffusion** (Alshamsi, Pinheiro & Hidalgo 2018, *Nat. Commun.* 9:1328): `proximity_network`, `calibrate_contagion`, `activation_probabilities`, `diversification_strategy` (5 strategies), `expected_diversification_time` (validated against the paper's closed-form eq. 2), `compare_strategies`, and `optimize_sequence` (simulated annealing).
12
+ - **Documented short API**: aliases bound to the canonical functions (`density`, `relatedness`, `hhi`, `coi`, `cog`, `pgi`, `peii`, `spec_coefficient`, `cross_space_proximity`), new functions `make_sample_data`, `cosine_proximity`, `correlation_proximity`, and long-format panel wrappers `growth_rates`, `entry_tracking`, `exit_tracking`.
13
+ - `log_fitness` option (Cristelli et al. 2015 log scale) and convergence `tol` for the Method of Reflections; non-convergence warning for Fitness-Complexity.
14
+ - `continuous_method="correlation" | "cosine"` in `proximity()`/`continuous_proximity()`.
15
+ - Runnable examples in `examples/` (`basic_usage.py`, `eci_optimization.py`) and an API map section in the documentation.
16
+ - Expanded documentation: complete auto-generated API reference (87 functions, `docs/generate_api_reference.py`), indicator interpretation guide, and rewritten bilingual READMEs with quickstarts, data format, validation notes, and BibTeX citation.
17
+
18
+ ### Changed
19
+ - Default iterations unified at **20** for reflections and fitness (matching the R `economiccomplexity` package; for fitness it is a cap with early stopping at convergence).
20
+ - Method of Reflections output is now sign-oriented like the eigenvector method (ECI correlates positively with diversity, PCI negatively with ubiquity).
21
+ - `compute_complexity` routes all methods through the `eci_pci` dispatcher (validation and trimming included).
22
+ - Internal relative relatedness of the optimization module now follows Pinheiro et al. (2022, eq. 7) exactly (z-transform over the option set).
23
+ - SciPy minimum raised to 1.9 (`scipy.optimize.milp`).
24
+ - Documentation (EN/PT) fully revised: GitHub installation, real API signatures, new sections (pre-processing, ECI Optimization, strategic diffusion), corrected references (Tacchella et al. 2012).
25
+
26
+ ### Fixed
27
+ - NumPy 2.x compatibility: `np.trapz` removal broke `locational_gini`/`hoover_gini`.
28
+ - `cross_relatedness` returned the wrong column labels (crashed whenever the two spaces had different sizes).
29
+ - Missing standard-deviation guards in the eigenvector sign correction (silent NaN on degenerate matrices).
30
+ - Edge-case guards in the optimization layer (`b1 ≈ 0` effort, `a1 + a3·z ≈ 0` growth-target inversion).
31
+ - Documentation/API mismatches reported by external testing (`rpop`, `pgi`, `peii`, `expy` signatures; `coi`/`cog` argument order; `proximity` return type).
32
+
33
+ ## [0.1.0] — 2026 (initial public release)
34
+
35
+ - Core indicators: RCA, RPOP, Mcp, diversity/ubiquity; ECI/PCI (eigenvector, reflections, fitness); proximity, relatedness density, co-occurrence, cross-space; specialization, inequality, productivity, patents, dynamics, COI/COG; `compute_complexity` pipeline.
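The `trim_core` entry in the changelog above describes iterative pruning of units whose diversity or ubiquity is too low, with RCA recomputed on each pass. The pure-Python sketch below illustrates that idea only; it is not the package's implementation. The helper names `rca_binary` and `trim_core_sketch`, the RCA >= 1 cutoff, and the dict-of-dicts input are assumptions made for this illustration.

```python
def rca_binary(values, threshold=1.0):
    """Binary specialization matrix from {location: {activity: value}}.

    Uses the Balassa index: (share of activity in location) divided by
    (share of activity in the whole sample). Hypothetical helper.
    """
    loc_tot = {l: sum(row.values()) for l, row in values.items()}
    act_tot = {}
    for row in values.values():
        for a, v in row.items():
            act_tot[a] = act_tot.get(a, 0.0) + v
    total = sum(loc_tot.values())
    mcp = {}
    for l, row in values.items():
        mcp[l] = {}
        for a, v in row.items():
            if loc_tot[l] > 0 and act_tot[a] > 0 and total > 0:
                rca = (v / loc_tot[l]) / (act_tot[a] / total)
            else:
                rca = 0.0  # empty row or column: no specialization
            mcp[l][a] = 1 if rca >= threshold else 0
    return mcp


def trim_core_sketch(values, dmin=1, umin=1):
    """Drop locations with diversity < dmin and activities with ubiquity < umin,
    recomputing RCA after every pass until nothing else is removed."""
    values = {l: dict(row) for l, row in values.items()}  # work on a copy
    while values:
        mcp = rca_binary(values)
        diversity = {l: sum(row.values()) for l, row in mcp.items()}
        ubiquity = {}
        for row in mcp.values():
            for a, m in row.items():
                ubiquity[a] = ubiquity.get(a, 0) + m
        drop_locs = {l for l, d in diversity.items() if d < dmin}
        drop_acts = {a for a, u in ubiquity.items() if u < umin}
        if not drop_locs and not drop_acts:
            break  # stable core reached
        values = {
            l: {a: v for a, v in row.items() if a not in drop_acts}
            for l, row in values.items()
            if l not in drop_locs
        }
    return values


# Toy data: location "C" has no activity at all and is pruned.
data = {
    "A": {"x": 10, "y": 0, "z": 5},
    "B": {"x": 0, "y": 8, "z": 5},
    "C": {"x": 0, "y": 0, "z": 0},
}
trimmed = trim_core_sketch(data, dmin=1, umin=1)
```

Recomputing RCA inside the loop matters because removing a unit changes the totals that the Balassa index is normalized by, so a unit that passed one pass can fail the next. With stricter thresholds the loop can legitimately end with an empty result.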
@@ -0,0 +1,22 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Elton Freitas
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
22
+
@@ -0,0 +1,7 @@
1
+ include README.md
2
+ include README.pt-BR.md
3
+ include LICENSE
4
+ include CHANGELOG.md
5
+ recursive-include docs *.pdf *.tex *.py
6
+ recursive-include examples *.py
7
+
@@ -0,0 +1,223 @@
1
+ Metadata-Version: 2.4
2
+ Name: econcomplex
3
+ Version: 1.0.0
4
+ Summary: Python library for economic complexity and regional science indicators
5
+ Author: Elton Freitas
6
+ License-Expression: MIT
7
+ Project-URL: Homepage, https://github.com/eltonfreitas/econcomplex
8
+ Project-URL: Documentation, https://github.com/eltonfreitas/econcomplex/tree/main/docs
9
+ Project-URL: Changelog, https://github.com/eltonfreitas/econcomplex/blob/main/CHANGELOG.md
10
+ Project-URL: Issues, https://github.com/eltonfreitas/econcomplex/issues
11
+ Keywords: economic complexity,regional science,economic geography,product space,relatedness
12
+ Classifier: Development Status :: 5 - Production/Stable
13
+ Classifier: Intended Audience :: Science/Research
14
+ Classifier: Programming Language :: Python :: 3
15
+ Classifier: Programming Language :: Python :: 3 :: Only
16
+ Classifier: Programming Language :: Python :: 3.9
17
+ Classifier: Programming Language :: Python :: 3.10
18
+ Classifier: Programming Language :: Python :: 3.11
19
+ Classifier: Programming Language :: Python :: 3.12
20
+ Classifier: Topic :: Scientific/Engineering :: Information Analysis
21
+ Requires-Python: >=3.9
22
+ Description-Content-Type: text/markdown
23
+ License-File: LICENSE
24
+ Requires-Dist: numpy>=1.21
25
+ Requires-Dist: pandas>=1.3
26
+ Requires-Dist: scipy>=1.9
27
+ Provides-Extra: dev
28
+ Requires-Dist: black; extra == "dev"
29
+ Requires-Dist: build; extra == "dev"
30
+ Requires-Dist: isort; extra == "dev"
31
+ Requires-Dist: pytest; extra == "dev"
32
+ Requires-Dist: pytest-cov; extra == "dev"
33
+ Requires-Dist: twine; extra == "dev"
34
+ Provides-Extra: network
35
+ Requires-Dist: networkx>=2.6; extra == "network"
36
+ Dynamic: license-file
37
+
38
+ # econcomplex
39
+
40
+ [![version](https://img.shields.io/badge/version-1.0.0-blue)](CHANGELOG.md)
41
+ [![python](https://img.shields.io/badge/python-3.9%2B-blue)](pyproject.toml)
42
+ [![license](https://img.shields.io/badge/license-MIT-green)](LICENSE)
43
+ [![tests](https://img.shields.io/badge/tests-81%20passing-brightgreen)](tests/)
44
+
45
+ **econcomplex** is a Python library for **economic complexity and regional
46
+ science indicators**. It consolidates, in a single coherent API, the tools
47
+ scattered across the reference packages of the field — `EconGeo` (R),
48
+ `economiccomplexity` (R), `py-ecomplexity`, `py-economic-complexity` — and
49
+ adds a **target-oriented optimization layer** (ECI Optimization and strategic
50
+ diffusion) that, to our knowledge, is not available in any other package.
51
+
52
+ *Leia em português: [README.pt-BR.md](README.pt-BR.md).*
53
+
54
+ ---
55
+
56
+ ## What it computes
57
+
58
+ | Group | Indicators |
59
+ |---|---|
60
+ | **Complexity** | ECI / PCI through a single entry point — `eci_pci(mat, method="eigenvector" \| "reflections" \| "fitness")` — plus subnational ECI projected with an external PCI |
61
+ | **Relatedness / Product Space** | Proximity (discrete, correlation, cosine), relatedness density, distance, relative relatedness (option-set z-score), co-occurrence indices, cross-space proximity between two activity spaces |
62
+ | **Specialization** | Location quotient, Hachman, Krugman, Hoover specialization coefficient, export similarity |
63
+ | **Inequality / Concentration** | Gini, locational Gini, Hoover-Gini, Hoover index, Herfindahl-Hirschman, Shannon entropy |
64
+ | **Productivity** | PRODY, EXPY, Product Gini Index, Product Emissions Intensity Index |
65
+ | **Patents** | Ease of recombination, modular complexity |
66
+ | **Dynamics** | Growth rates, entry/exit tracking — matrix-pair and long-panel APIs |
67
+ | **Outlook** | Complexity Outlook Index (COI) and Gain (COG) |
68
+ | **ECI Optimization** | Stepping-stone forecast model, entry-effort matrix, exact 0–1 program for minimal-effort diversification portfolios, growth targeting (Stojkoski & Hidalgo 2026) |
69
+ | **Strategic diffusion** | Complex-contagion calibration, five diversification strategies, optimal entry sequencing (Alshamsi, Pinheiro & Hidalgo 2018) |
70
+
71
+ 87 public functions in total — the PDF documentation carries a complete API
72
+ reference and an interpretation guide for every indicator family.
73
+
74
+ ## Installation
75
+
76
+ ```bash
77
+ pip install econcomplex
78
+ ```
79
+
80
+ Or, for the latest development version straight from GitHub:
81
+
82
+ ```bash
83
+ pip install git+https://github.com/eltonfreitas/econcomplex.git
84
+ ```
85
+
86
+ Requires Python ≥ 3.9 with `numpy ≥ 1.21` (1.x **and** 2.x supported),
87
+ `pandas ≥ 1.3`, `scipy ≥ 1.9`. For local development:
88
+
89
+ ```bash
90
+ git clone https://github.com/eltonfreitas/econcomplex.git
91
+ cd econcomplex
92
+ pip install -e ".[dev]"
93
+ pytest # 81 tests
94
+ ```
95
+
96
+ ## Quick start
97
+
98
+ ### 1. One call, every indicator (long-format data)
99
+
100
+ ```python
101
+ import pandas as pd
102
+ import econcomplex as ec
103
+
104
+ df = pd.read_csv("my_data.csv") # columns: region, sector, employment[, year]
105
+
106
+ result = ec.compute_complexity(
107
+ df,
108
+ cols={"loc": "region", "act": "sector", "val": "employment", "time": "year"},
109
+ method="eigenvector", # or "reflections" / "fitness"
110
+ )
111
+ # adds columns: rca, mcp, diversity, ubiquity, eci, pci, density, distance, coi, cog
112
+ # with a "time" column the pipeline recomputes everything per period automatically
113
+ ```
114
+
115
+ ### 2. Working with matrices
116
+
117
+ ```python
118
+ mat = ec.pivot_to_matrix(df, "region", "sector", "employment")
119
+
120
+ eci, pci = ec.eci_pci(mat) # eigenvector method (default)
121
+ eci2, pci2 = ec.eci_pci(mat, method="fitness") # same call, other method
122
+
123
+ phi = ec.proximity(mat)["product"] # product space
124
+ density = ec.density(mat, phi=phi) # 0–100 % relatedness density
125
+ coi = ec.coi(mat, pci, phi=phi) # diversification potential
126
+ ```
127
+
128
+ Degenerate units (zero diversity or ubiquity) are **trimmed automatically**
129
+ and returned as `NaN`; for very sparse data (e.g. municipal trade) use the
130
+ well-connected core: `ec.eci_pci(mat, dmin=2, umin=2)` or `ec.trim_core(mat, 2, 2)`.
131
+
132
+ ### 3. Diversification targets (ECI Optimization)
133
+
134
+ Requires a panel with at least the periods *t*, *t+τ* and *t+Δt*:
135
+
136
+ ```python
137
+ model = ec.calibrate_steppingstone(panel, "region", "sector", "employment",
138
+ "year", horizon=10, steppingstone=5)
139
+
140
+ portfolio = ec.eci_optimization(mat, model, delta_eci=0.1)
141
+ # → minimal-effort set of new activities per region that raises its ECI by 0.1
142
+
143
+ # Growth targeting: convert a 3.5 %/yr target into an ECI target
144
+ gm = ec.calibrate_growth_model(macro, "region", "year", "gdppc", "eci")
145
+ eci_star = ec.eci_target_for_growth(gm, 0.035, gdppc_now)
146
+ portfolio = ec.eci_optimization(mat, model, target_eci=eci_star)
147
+
148
+ # When to make unrelated bets (strategic diffusion)
149
+ adj = ec.proximity_network(mat)
150
+ fit = ec.calibrate_contagion(panel, "region", "sector", "employment", "year",
151
+ adjacency=adj)
152
+ best = ec.optimize_sequence(adj, ec.mcp(mat).loc["my_region"],
153
+ B=fit["B"], alpha=fit["alpha"])
154
+ ```
155
+
156
+ ## Data format
157
+
158
+ The high-level API expects **long-format** (tidy) data — one row per
159
+ (location, activity[, period]):
160
+
161
+ | region | sector | employment | year |
162
+ |---|---|---:|---|
163
+ | SP | cnae_10 | 12345 | 2022 |
164
+ | SP | cnae_25 | 6789 | 2022 |
165
+ | RJ | cnae_10 | 9012 | 2022 |
166
+
167
+ Requirements: no duplicate (location, activity, period) rows, non-negative
168
+ values, no `NaN`, a single geographic level and a single activity
169
+ classification per analysis. Works with employment, exports, patents,
170
+ payroll — anything shaped location × activity × value. To experiment without
171
+ data: `df = ec.make_sample_data(n_locs=50, n_acts=30, seed=42)`.
172
+
173
+ ## Documentation and examples
174
+
175
+ - **Technical documentation (PDF)** — formulas, step-by-step usage,
176
+ interpretation guide, and the complete API reference:
177
+ [English](docs/econcomplex_documentation_en.pdf) ·
178
+ [Português](docs/econcomplex_documentation_pt.pdf)
179
+ (LaTeX sources in [docs/](docs/))
180
+ - **Runnable examples**: [examples/basic_usage.py](examples/basic_usage.py)
181
+ (guided tour of every indicator group) and
182
+ [examples/eci_optimization.py](examples/eci_optimization.py)
183
+ (optimization layer end to end)
184
+ - **In-code reference**: every function has a full NumPy-style docstring —
185
+ `help(ec.eci_pci)`
186
+ - **[CHANGELOG.md](CHANGELOG.md)** — release history
187
+
188
+ The API has three layers (detailed map in the PDF): *entry points* such as
189
+ `eci_pci` and `compute_complexity`; *advanced implementations* they delegate
190
+ to (`method_of_reflections`, `fitness_complexity`, …); and short *aliases*
191
+ bound to the same objects (`density`, `hhi`, `coi`, `pgi`, …).
192
+
193
+ ## Validation
194
+
195
+ The 81-test suite includes exact validations against the literature: the
196
+ eigenvector ECI/PCI uses the proper non-symmetric solver; the strategic
197
+ diffusion module reproduces the closed-form solution of Alshamsi et al.
198
+ (2018, eq. 2) on the wheel network; relative relatedness follows Pinheiro
199
+ et al. (2022, eq. 7) exactly; and the 0–1 portfolio program is solved
200
+ exactly with `scipy.optimize.milp`. On the 2022–2024 BACI trade data the
201
+ library recovers the canonical ECI country ranking.
202
+
203
+ ## Citation
204
+
205
+ ```bibtex
206
+ @software{freitas_econcomplex_2026,
207
+ author = {Freitas, Elton},
208
+ title = {econcomplex: economic complexity and regional science indicators in Python},
209
+ year = {2026},
210
+ version = {1.0.0},
211
+ url = {https://github.com/eltonfreitas/econcomplex}
212
+ }
213
+ ```
214
+
215
+ Please also cite the original papers of the indicators you use — full list
216
+ in the PDF documentation. Key references: Hidalgo & Hausmann (2009, *PNAS*);
217
+ Hidalgo et al. (2007, *Science*); Tacchella et al. (2012, *Sci. Rep.*);
218
+ Alshamsi, Pinheiro & Hidalgo (2018, *Nat. Commun.*); Pinheiro et al. (2022,
219
+ *Res. Policy*); Stojkoski & Hidalgo (2026, *Res. Policy*).
220
+
221
+ ## License
222
+
223
+ MIT — see [LICENSE](LICENSE).
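The quick start in the README embedded above calls `ec.proximity(mat)["product"]` and `ec.density(mat, phi=phi)` without showing what those quantities are. The sketch below is a hedged, pure-Python illustration of the standard discrete proximity and relatedness density from Hidalgo et al. (2007), which the README cites. It is not taken from the package: the helper names `proximity_sketch` and `density_sketch` are hypothetical, excluding self-proximity is an assumption, and the 0 to 100 scaling simply follows the README's "0–100 %" comment.

```python
def proximity_sketch(mcp):
    """Discrete proximity between activities.

    phi[p][q] = (locations specialized in both p and q) / max(ubiquity_p, ubiquity_q),
    i.e. the minimum of the two conditional probabilities.
    `mcp` is a {location: {activity: 0 or 1}} mapping.
    """
    acts = sorted({a for row in mcp.values() for a in row})
    ubiq = {a: sum(row.get(a, 0) for row in mcp.values()) for a in acts}
    phi = {p: {} for p in acts}
    for p in acts:
        for q in acts:
            if p == q:
                phi[p][q] = 0.0  # assumption: self-proximity is excluded
                continue
            co = sum(row.get(p, 0) * row.get(q, 0) for row in mcp.values())
            denom = max(ubiq[p], ubiq[q])
            phi[p][q] = co / denom if denom else 0.0
    return phi


def density_sketch(mcp, phi):
    """Relatedness density in percent: the share of activity p's total
    proximity that points to activities the location already has."""
    dens = {}
    for loc, row in mcp.items():
        dens[loc] = {}
        for p, weights in phi.items():
            total = sum(weights.values())
            related = sum(w * row.get(q, 0) for q, w in weights.items())
            dens[loc][p] = 100.0 * related / total if total else 0.0
    return dens


# Toy binary specialization matrix (three locations, three activities).
mcp = {
    "A": {"x": 1, "y": 1, "z": 0},
    "B": {"x": 1, "y": 0, "z": 1},
    "C": {"x": 1, "y": 1, "z": 1},
}
phi = proximity_sketch(mcp)
dens = density_sketch(mcp, phi)
```

In this toy case location "A" lacks activity "z", but both neighbours of "z" are already present in "A", so its density around "z" is at the top of the scale. That is the sense in which density is read as how close a location sits to an activity it has not yet entered.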
@@ -0,0 +1,186 @@
1
+ # econcomplex
2
+
3
+ [![version](https://img.shields.io/badge/version-1.0.0-blue)](CHANGELOG.md)
4
+ [![python](https://img.shields.io/badge/python-3.9%2B-blue)](pyproject.toml)
5
+ [![license](https://img.shields.io/badge/license-MIT-green)](LICENSE)
6
+ [![tests](https://img.shields.io/badge/tests-81%20passing-brightgreen)](tests/)
7
+
8
+ **econcomplex** is a Python library for **economic complexity and regional
9
+ science indicators**. It consolidates, in a single coherent API, the tools
10
+ scattered across the reference packages of the field — `EconGeo` (R),
11
+ `economiccomplexity` (R), `py-ecomplexity`, `py-economic-complexity` — and
12
+ adds a **target-oriented optimization layer** (ECI Optimization and strategic
13
+ diffusion) that, to our knowledge, is not available in any other package.
14
+
15
+ *Leia em português: [README.pt-BR.md](README.pt-BR.md).*
16
+
17
+ ---
18
+
19
+ ## What it computes
20
+
21
+ | Group | Indicators |
22
+ |---|---|
23
+ | **Complexity** | ECI / PCI through a single entry point — `eci_pci(mat, method="eigenvector" \| "reflections" \| "fitness")` — plus subnational ECI projected with an external PCI |
24
+ | **Relatedness / Product Space** | Proximity (discrete, correlation, cosine), relatedness density, distance, relative relatedness (option-set z-score), co-occurrence indices, cross-space proximity between two activity spaces |
25
+ | **Specialization** | Location quotient, Hachman, Krugman, Hoover specialization coefficient, export similarity |
26
+ | **Inequality / Concentration** | Gini, locational Gini, Hoover-Gini, Hoover index, Herfindahl-Hirschman, Shannon entropy |
27
+ | **Productivity** | PRODY, EXPY, Product Gini Index, Product Emissions Intensity Index |
28
+ | **Patents** | Ease of recombination, modular complexity |
29
+ | **Dynamics** | Growth rates, entry/exit tracking — matrix-pair and long-panel APIs |
30
+ | **Outlook** | Complexity Outlook Index (COI) and Gain (COG) |
31
+ | **ECI Optimization** | Stepping-stone forecast model, entry-effort matrix, exact 0–1 program for minimal-effort diversification portfolios, growth targeting (Stojkoski & Hidalgo 2026) |
32
+ | **Strategic diffusion** | Complex-contagion calibration, five diversification strategies, optimal entry sequencing (Alshamsi, Pinheiro & Hidalgo 2018) |
33
+
34
+ 87 public functions in total — the PDF documentation carries a complete API
35
+ reference and an interpretation guide for every indicator family.
36
+
37
+ ## Installation
38
+
39
+ ```bash
40
+ pip install econcomplex
41
+ ```
42
+
43
+ Or, for the latest development version straight from GitHub:
44
+
45
+ ```bash
46
+ pip install git+https://github.com/eltonfreitas/econcomplex.git
47
+ ```
48
+
49
+ Requires Python ≥ 3.9 with `numpy ≥ 1.21` (1.x **and** 2.x supported),
50
+ `pandas ≥ 1.3`, `scipy ≥ 1.9`. For local development:
51
+
52
+ ```bash
53
+ git clone https://github.com/eltonfreitas/econcomplex.git
54
+ cd econcomplex
55
+ pip install -e ".[dev]"
56
+ pytest # 81 tests
57
+ ```
58
+
59
+ ## Quick start
60
+
61
+ ### 1. One call, every indicator (long-format data)
62
+
63
+ ```python
64
+ import pandas as pd
65
+ import econcomplex as ec
66
+
67
+ df = pd.read_csv("my_data.csv") # columns: region, sector, employment[, year]
68
+
69
+ result = ec.compute_complexity(
70
+ df,
71
+ cols={"loc": "region", "act": "sector", "val": "employment", "time": "year"},
72
+ method="eigenvector", # or "reflections" / "fitness"
73
+ )
74
+ # adds columns: rca, mcp, diversity, ubiquity, eci, pci, density, distance, coi, cog
75
+ # with a "time" column the pipeline recomputes everything per period automatically
76
+ ```
77
+
78
+ ### 2. Working with matrices
79
+
80
+ ```python
81
+ mat = ec.pivot_to_matrix(df, "region", "sector", "employment")
82
+
83
+ eci, pci = ec.eci_pci(mat) # eigenvector method (default)
84
+ eci2, pci2 = ec.eci_pci(mat, method="fitness") # same call, other method
85
+
86
+ phi = ec.proximity(mat)["product"] # product space
87
+ density = ec.density(mat, phi=phi) # 0–100 % relatedness density
88
+ coi = ec.coi(mat, pci, phi=phi) # diversification potential
89
+ ```
90
+
91
+ Degenerate units (zero diversity or ubiquity) are **trimmed automatically**
92
+ and returned as `NaN`; for very sparse data (e.g. municipal trade) use the
93
+ well-connected core: `ec.eci_pci(mat, dmin=2, umin=2)` or `ec.trim_core(mat, 2, 2)`.
94
+
95
+ ### 3. Diversification targets (ECI Optimization)
96
+
97
+ Requires a panel with at least the periods *t*, *t+τ* and *t+Δt*:
98
+
99
+ ```python
100
+ model = ec.calibrate_steppingstone(panel, "region", "sector", "employment",
101
+ "year", horizon=10, steppingstone=5)
102
+
103
+ portfolio = ec.eci_optimization(mat, model, delta_eci=0.1)
104
+ # → minimal-effort set of new activities per region that raises its ECI by 0.1
105
+
106
+ # Growth targeting: convert a 3.5 %/yr target into an ECI target
107
+ gm = ec.calibrate_growth_model(macro, "region", "year", "gdppc", "eci")
108
+ eci_star = ec.eci_target_for_growth(gm, 0.035, gdppc_now)
109
+ portfolio = ec.eci_optimization(mat, model, target_eci=eci_star)
110
+
111
+ # When to make unrelated bets (strategic diffusion)
112
+ adj = ec.proximity_network(mat)
113
+ fit = ec.calibrate_contagion(panel, "region", "sector", "employment", "year",
114
+ adjacency=adj)
115
+ best = ec.optimize_sequence(adj, ec.mcp(mat).loc["my_region"],
116
+ B=fit["B"], alpha=fit["alpha"])
117
+ ```
118
+
119
+ ## Data format
120
+
121
+ The high-level API expects **long-format** (tidy) data — one row per
122
+ (location, activity[, period]):
123
+
124
+ | region | sector | employment | year |
125
+ |---|---|---:|---|
126
+ | SP | cnae_10 | 12345 | 2022 |
127
+ | SP | cnae_25 | 6789 | 2022 |
128
+ | RJ | cnae_10 | 9012 | 2022 |
129
+
130
+ Requirements: no duplicate (location, activity, period) rows, non-negative
131
+ values, no `NaN`, a single geographic level and a single activity
132
+ classification per analysis. Works with employment, exports, patents,
133
+ payroll — anything shaped location × activity × value. To experiment without
134
+ data: `df = ec.make_sample_data(n_locs=50, n_acts=30, seed=42)`.
135
+
136
+ ## Documentation and examples
137
+
138
+ - **Technical documentation (PDF)** — formulas, step-by-step usage,
139
+ interpretation guide, and the complete API reference:
140
+ [English](docs/econcomplex_documentation_en.pdf) ·
141
+ [Português](docs/econcomplex_documentation_pt.pdf)
142
+ (LaTeX sources in [docs/](docs/))
143
+ - **Runnable examples**: [examples/basic_usage.py](examples/basic_usage.py)
144
+ (guided tour of every indicator group) and
145
+ [examples/eci_optimization.py](examples/eci_optimization.py)
146
+ (optimization layer end to end)
147
+ - **In-code reference**: every function has a full NumPy-style docstring —
148
+ `help(ec.eci_pci)`
149
+ - **[CHANGELOG.md](CHANGELOG.md)** — release history
150
+
151
+ The API has three layers (detailed map in the PDF): *entry points* such as
152
+ `eci_pci` and `compute_complexity`; *advanced implementations* they delegate
153
+ to (`method_of_reflections`, `fitness_complexity`, …); and short *aliases*
154
+ bound to the same objects (`density`, `hhi`, `coi`, `pgi`, …).
155
+
156
+ ## Validation
157
+
158
+ The 81-test suite includes exact validations against the literature: the
159
+ eigenvector ECI/PCI uses the proper non-symmetric solver; the strategic
160
+ diffusion module reproduces the closed-form solution of Alshamsi et al.
161
+ (2018, eq. 2) on the wheel network; relative relatedness follows Pinheiro
162
+ et al. (2022, eq. 7) exactly; and the 0–1 portfolio program is solved
163
+ exactly with `scipy.optimize.milp`. On the 2022–2024 BACI trade data the
164
+ library recovers the canonical ECI country ranking.
165
+
166
+ ## Citation
167
+
168
+ ```bibtex
169
+ @software{freitas_econcomplex_2026,
170
+ author = {Freitas, Elton},
171
+ title = {econcomplex: economic complexity and regional science indicators in Python},
172
+ year = {2026},
173
+ version = {1.0.0},
174
+ url = {https://github.com/eltonfreitas/econcomplex}
175
+ }
176
+ ```
177
+
178
+ Please also cite the original papers of the indicators you use — full list
179
+ in the PDF documentation. Key references: Hidalgo & Hausmann (2009, *PNAS*);
180
+ Hidalgo et al. (2007, *Science*); Tacchella et al. (2012, *Sci. Rep.*);
181
+ Alshamsi, Pinheiro & Hidalgo (2018, *Nat. Commun.*); Pinheiro et al. (2022,
182
+ *Res. Policy*); Stojkoski & Hidalgo (2026, *Res. Policy*).
183
+
184
+ ## License
185
+
186
+ MIT — see [LICENSE](LICENSE).
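The "Data format" section of the README above lists the input requirements: no duplicate (location, activity, period) rows, non-negative values, and no `NaN`. A pre-flight check along those lines might look like the sketch below. It uses only the standard library and does not call the package; the function name `check_long_format` and its parameters are hypothetical, and the default column names are taken from the README's example table.

```python
import math


def check_long_format(rows, loc="region", act="sector", val="employment", time=None):
    """Return a list of problems found in long-format rows (a list of dicts).

    An empty list means the rows satisfy the three requirements checked here:
    unique (location, activity, period) keys, no missing values, no negatives.
    """
    problems = []
    seen = set()
    for i, row in enumerate(rows):
        key = (row[loc], row[act], row[time] if time else None)
        if key in seen:
            problems.append(f"row {i}: duplicate key {key}")
        seen.add(key)
        value = row[val]
        if value is None or (isinstance(value, float) and math.isnan(value)):
            problems.append(f"row {i}: missing value")
        elif value < 0:
            problems.append(f"row {i}: negative value {value}")
    return problems


# The three rows from the README's example table.
rows = [
    {"region": "SP", "sector": "cnae_10", "employment": 12345, "year": 2022},
    {"region": "SP", "sector": "cnae_25", "employment": 6789, "year": 2022},
    {"region": "RJ", "sector": "cnae_10", "employment": 9012, "year": 2022},
]
problems = check_long_format(rows, time="year")
```

Returning a list of messages instead of raising on the first failure lets a caller report every offending row in one pass, which tends to be more useful when cleaning a large panel.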
@@ -0,0 +1,187 @@
1
+ # econcomplex
2
+
3
+ [![version](https://img.shields.io/badge/vers%C3%A3o-1.0.0-blue)](CHANGELOG.md)
4
+ [![python](https://img.shields.io/badge/python-3.9%2B-blue)](pyproject.toml)
5
+ [![license](https://img.shields.io/badge/licen%C3%A7a-MIT-green)](LICENSE)
6
+ [![tests](https://img.shields.io/badge/testes-81%20passando-brightgreen)](tests/)
7
+
8
+ **econcomplex** é uma biblioteca Python para **indicadores de complexidade
9
+ econômica e ciência regional**. Ela consolida, numa API única e coerente, as
10
+ ferramentas espalhadas pelos pacotes de referência da área — `EconGeo` (R),
11
+ `economiccomplexity` (R), `py-ecomplexity`, `py-economic-complexity` — e
12
+ adiciona uma **camada de otimização orientada a metas** (Otimização de ECI e
13
+ difusão estratégica) que, até onde sabemos, não existe em nenhum outro pacote.
14
+
15
+ *Read in English: [README.md](README.md).*
16
+
17
+ ---
18
+
19
+ ## O que ela calcula
20
+
21
+ | Grupo | Indicadores |
22
+ |---|---|
23
+ | **Complexidade** | ECI / PCI por uma porta de entrada única — `eci_pci(mat, method="eigenvector" \| "reflections" \| "fitness")` — além de ECI subnacional projetado com PCI externo |
24
+ | **Relatedness / Product Space** | Proximidade (discreta, correlação, cosseno), densidade de relatedness, distância, relatedness relativa (z-score no option set), índices de coocorrência, proximidade cross-space entre dois espaços de atividades |
25
+ | **Especialização** | Quociente locacional, Hachman, Krugman, coeficiente de especialização de Hoover, similaridade de exportações |
26
+ | **Desigualdade / Concentração** | Gini, Gini locacional, Hoover-Gini, índice de Hoover, Herfindahl-Hirschman, entropia de Shannon |
27
+ | **Produtividade** | PRODY, EXPY, Product Gini Index, Product Emissions Intensity Index |
28
+ | **Patentes** | Facilidade de recombinação, complexidade modular |
29
+ | **Dinâmica** | Taxas de crescimento, rastreamento de entrada/saída — APIs por pares de matrizes e por painel longo |
30
+ | **Outlook** | Complexity Outlook Index (COI) e Gain (COG) |
31
+ | **Otimização de ECI** | Modelo de previsão com steppingstone, matriz de esforço de entrada, programa 0–1 exato para portfólios de diversificação de menor esforço, metas de crescimento (Stojkoski & Hidalgo 2026) |
32
+ | **Difusão estratégica** | Calibração de contágio complexo, cinco estratégias de diversificação, sequenciamento ótimo de entrada (Alshamsi, Pinheiro & Hidalgo 2018) |
33
+
34
+ São 87 funções públicas — a documentação em PDF traz a referência completa da
35
+ API e um guia de interpretação para cada família de indicadores.
36
+
+ ## Installation
+
+ ```bash
+ pip install econcomplex
+ ```
+
+ Or, for the latest development version straight from GitHub:
+
+ ```bash
+ pip install git+https://github.com/eltonfreitas/econcomplex.git
+ ```
+
+ Requires Python ≥ 3.9 with `numpy ≥ 1.21` (compatible with 1.x **and** 2.x),
+ `pandas ≥ 1.3`, and `scipy ≥ 1.9`. For local development:
+
+ ```bash
+ git clone https://github.com/eltonfreitas/econcomplex.git
+ cd econcomplex
+ pip install -e ".[dev]"  # quoted so shells such as zsh do not expand the brackets
+ pytest  # 81 tests
+ ```
+
+ ## Getting started
+
+ ### 1. One call, all indicators (long-format data)
+
+ ```python
+ import pandas as pd
+ import econcomplex as ec
+
+ df = pd.read_csv("meus_dados.csv")  # columns: regiao, setor, emprego[, ano]
+
+ resultado = ec.compute_complexity(
+     df,
+     cols={"loc": "regiao", "act": "setor", "val": "emprego", "time": "ano"},
+     method="eigenvector",  # or "reflections" / "fitness"
+ )
+ # adds the columns rca, mcp, diversity, ubiquity, eci, pci, density,
+ # distance, coi, cog; with a time column, everything is recomputed per period
+ ```
+
+ ### 2. Working with matrices
+
+ ```python
+ mat = ec.pivot_to_matrix(df, "regiao", "setor", "emprego")
+
+ eci, pci = ec.eci_pci(mat)                      # eigenvector method (default)
+ eci2, pci2 = ec.eci_pci(mat, method="fitness")  # same call, different method
+
+ phi = ec.proximity(mat)["product"]   # product space
+ density = ec.density(mat, phi=phi)   # relatedness density (0–100 %)
+ coi = ec.coi(mat, pci, phi=phi)      # diversification potential
+ ```
+
+ Degenerate units (zero diversity or zero ubiquity) are **pruned
+ automatically** and returned as `NaN`. For very sparse data (e.g.
+ municipal trade), restrict the analysis to the connected core with
+ `ec.eci_pci(mat, dmin=2, umin=2)` or `ec.trim_core(mat, 2, 2)`.
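
The pruning described above can be pictured as an iterative filter on the binary location × activity matrix. The sketch below is an illustrative re-implementation in plain pandas, not the code behind `ec.trim_core`; it assumes that `dmin` and `umin` are minimum diversity and minimum ubiquity, and that pruning repeats until nothing else falls below the thresholds. The function name `trim_core_sketch` and the toy matrix are hypothetical.

```python
import pandas as pd


def trim_core_sketch(mcp: pd.DataFrame, dmin: int = 2, umin: int = 2) -> pd.DataFrame:
    """Drop locations with diversity < dmin and activities with ubiquity < umin,
    repeating until every remaining row and column meets its threshold."""
    core = mcp.copy()
    while True:
        keep_rows = core.sum(axis=1) >= dmin  # diversity of each location
        keep_cols = core.sum(axis=0) >= umin  # ubiquity of each activity
        if keep_rows.all() and keep_cols.all():
            return core
        core = core.loc[keep_rows, keep_cols]


# Hypothetical binary matrix: location D is present only in activities
# that no other location has.
mcp = pd.DataFrame(
    [[1, 1, 1, 0, 0],
     [1, 1, 0, 0, 0],
     [0, 1, 1, 0, 0],
     [0, 0, 0, 1, 1]],
    index=["A", "B", "C", "D"],
    columns=["p1", "p2", "p3", "p4", "p5"],
)
core = trim_core_sketch(mcp, dmin=2, umin=2)
```

In this toy case the first pass removes `p4` and `p5` (ubiquity 1), which leaves `D` with diversity 0, so the second pass removes `D` as well. That cascade is why a single pass is not enough.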
+
+ ### 3. Diversification targets (ECI optimization)
+
+ Requires a panel with at least the periods *t*, *t+τ*, and *t+Δt*:
+
+ ```python
+ # painel, macro and pibpc_atual are user-supplied and not defined in this snippet
+ model = ec.calibrate_steppingstone(painel, "regiao", "setor", "emprego",
+                                    "ano", horizon=10, steppingstone=5)
+
+ portfolio = ec.eci_optimization(mat, model, delta_eci=0.1)
+ # → least-effort set of new activities, per region, that raises ECI by 0.1
+
+ # Growth target: convert 3.5 %/year into an ECI target
+ gm = ec.calibrate_growth_model(macro, "regiao", "ano", "pibpc", "eci")
+ eci_alvo = ec.eci_target_for_growth(gm, 0.035, pibpc_atual)
+ portfolio = ec.eci_optimization(mat, model, target_eci=eci_alvo)
+
+ # When to make the unrelated bet (strategic diffusion)
+ adj = ec.proximity_network(mat)
+ fit = ec.calibrate_contagion(painel, "regiao", "setor", "emprego", "ano",
+                              adjacency=adj)
+ otimo = ec.optimize_sequence(adj, ec.mcp(mat).loc["minha_regiao"],
+                              B=fit["B"], alpha=fit["alpha"])
+ ```
+
+ ## Data format
+
+ The high-level API expects data in **long format** (tidy), with one row per
+ (location, activity[, period]):
+
+ | regiao | setor | emprego | ano |
+ |---|---|---:|---|
+ | SP | cnae_10 | 12345 | 2022 |
+ | SP | cnae_25 | 6789 | 2022 |
+ | RJ | cnae_10 | 9012 | 2022 |
+
+ Requirements: no duplicate (location, activity, period) rows, non-negative
+ values, no `NaN`, a single geographic level, and a single activity
+ classification per analysis. It works with employment, exports, patents,
+ payroll, or any other data shaped as location × activity × value. To
+ experiment without data: `df = ec.make_sample_data(n_locs=50, n_acts=30, seed=42)`.
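
As a rough illustration of what the long format feeds into, the sketch below checks the stated requirements on a single-period table, pivots it to a location × activity matrix, and computes the standard Balassa RCA and a binary presence matrix in plain pandas. It is not the package's implementation: `check_long_format` is a hypothetical helper, the employment figures are made up, and the `RCA >= 1` cutoff is the textbook convention rather than a confirmed econcomplex default.

```python
import pandas as pd


def check_long_format(df, loc, act, val):
    """Raise ValueError if a single-period long table breaks the requirements."""
    if df.duplicated([loc, act]).any():
        raise ValueError("duplicate (location, activity) rows")
    if df[val].isna().any():
        raise ValueError("NaN values")
    if (df[val] < 0).any():
        raise ValueError("negative values")


# Hypothetical single-period data in the column layout shown above.
df = pd.DataFrame({
    "regiao": ["SP", "SP", "RJ", "RJ"],
    "setor": ["cnae_10", "cnae_25", "cnae_10", "cnae_25"],
    "emprego": [100, 300, 300, 300],
})
check_long_format(df, "regiao", "setor", "emprego")

# Long -> wide: one row per location, one column per activity.
mat = df.pivot(index="regiao", columns="setor", values="emprego").fillna(0.0)

# Balassa RCA: (activity share within the location) / (activity share overall).
share_in_loc = mat.div(mat.sum(axis=1), axis=0)
share_overall = mat.sum(axis=0) / mat.to_numpy().sum()
rca = share_in_loc.div(share_overall, axis=1)

# Binary presence matrix under the assumed RCA >= 1 cutoff.
mcp = (rca >= 1).astype(int)
```

Here RJ holds half of its employment in `cnae_10` while the sector accounts for 40 % of the total, so RJ's RCA in `cnae_10` is 0.5 / 0.4 = 1.25 and the activity counts as present.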
+
+ ## Documentation and examples
+
+ - **Technical documentation (PDF)**: formulas, step-by-step usage, an
+   interpretation guide, and the complete API reference:
+   [Português](docs/econcomplex_documentation_pt.pdf) ·
+   [English](docs/econcomplex_documentation_en.pdf)
+   (LaTeX sources in [docs/](docs/))
+ - **Runnable examples**: [examples/basic_usage.py](examples/basic_usage.py)
+   (guided tour of all indicator groups) and
+   [examples/eci_optimization.py](examples/eci_optimization.py)
+   (the optimization layer end to end)
+ - **In-code reference**: every function has a complete NumPy-style
+   docstring, e.g. `help(ec.eci_pci)`
+ - **[CHANGELOG.md](CHANGELOG.md)**: version history
+
+ The API has three layers (detailed map in the PDF): *entry points* such as
+ `eci_pci` and `compute_complexity`; the *advanced implementations* they
+ delegate to (`method_of_reflections`, `fitness_complexity`, …); and short
+ *aliases* bound to the same objects (`density`, `hhi`, `coi`, `pgi`, …).
+
+ ## Validation
+
+ The 81-test suite includes exact checks against the literature: the
+ eigenvector ECI/PCI uses the correct non-symmetric solver; the strategic
+ diffusion module reproduces the closed-form solution of Alshamsi et al.
+ (2018, eq. 2) on the wheel network; relative relatedness follows Pinheiro
+ et al. (2022, eq. 7) exactly; and the 0–1 portfolio program is solved
+ exactly with `scipy.optimize.milp`. On the BACI 2022–2024 trade data, the
+ library recovers the canonical ECI ranking of countries.
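
The remark about the non-symmetric solver can be illustrated with a small NumPy sketch of the eigenvector definition of ECI. The location-location matrix D⁻¹ M U⁻¹ Mᵀ is row-stochastic but generally not symmetric, so a general eigensolver (`numpy.linalg.eig`) is used rather than a symmetric one (`numpy.linalg.eigh`). This is a sketch under stated assumptions and not the package's code: the function name, the standardization, and the sign convention (positive correlation with diversity) are choices made for the example.

```python
import numpy as np


def eci_eigenvector_sketch(mcp):
    """ECI as the standardized eigenvector of the second-largest eigenvalue
    of M~ = D^-1 M U^-1 M^T, for a binary location x activity matrix."""
    mcp = np.asarray(mcp, dtype=float)
    diversity = mcp.sum(axis=1)
    ubiquity = mcp.sum(axis=0)
    if (diversity == 0).any() or (ubiquity == 0).any():
        raise ValueError("prune zero-diversity rows and zero-ubiquity columns first")

    # Row-stochastic and, in general, non-symmetric.
    m_tilde = (mcp / diversity[:, None]) @ (mcp / ubiquity[None, :]).T

    # General eigensolver; eigh would silently assume symmetry.
    eigvals, eigvecs = np.linalg.eig(m_tilde)
    order = np.argsort(-eigvals.real)
    vec = eigvecs[:, order[1]].real

    eci = (vec - vec.mean()) / vec.std()
    # Assumed sign convention: ECI correlates positively with diversity.
    if np.corrcoef(eci, diversity)[0, 1] < 0:
        eci = -eci
    return eci


# Hypothetical nested binary matrix (4 locations x 4 activities).
M = np.array([[1, 1, 1, 1],
              [1, 1, 1, 0],
              [1, 1, 0, 0],
              [1, 0, 0, 0]])
eci = eci_eigenvector_sketch(M)
```

The leading eigenvector of a row-stochastic matrix is constant and carries no information, which is why the second one is taken. The sketch assumes a connected matrix; with disconnected blocks the second eigenvalue is also 1 and the result is not meaningful.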
+
+ ## Citation
+
+ ```bibtex
+ @software{freitas_econcomplex_2026,
+   author = {Freitas, Elton},
+   title = {econcomplex: economic complexity and regional science indicators in Python},
+   year = {2026},
+   version = {1.0.0},
+   url = {https://github.com/eltonfreitas/econcomplex}
+ }
+ ```
+
+ Please also cite the original papers for the indicators you use; the full
+ list is in the PDF documentation. Core references: Hidalgo & Hausmann (2009,
+ *PNAS*); Hidalgo et al. (2007, *Science*); Tacchella et al. (2012, *Sci.
+ Rep.*); Alshamsi, Pinheiro & Hidalgo (2018, *Nat. Commun.*); Pinheiro et al.
+ (2022, *Res. Policy*); Stojkoski & Hidalgo (2026, *Res. Policy*).
+
+ ## License
+
+ MIT. See [LICENSE](LICENSE).