catalogmx 0.3.0__py3-none-any.whl → 0.4.0__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (53)
  1. catalogmx/__init__.py +133 -19
  2. catalogmx/calculators/__init__.py +113 -0
  3. catalogmx/calculators/costo_trabajador.py +213 -0
  4. catalogmx/calculators/impuestos.py +920 -0
  5. catalogmx/calculators/imss.py +370 -0
  6. catalogmx/calculators/isr.py +290 -0
  7. catalogmx/calculators/resico.py +154 -0
  8. catalogmx/catalogs/banxico/__init__.py +29 -3
  9. catalogmx/catalogs/banxico/cetes_sqlite.py +279 -0
  10. catalogmx/catalogs/banxico/inflacion_sqlite.py +302 -0
  11. catalogmx/catalogs/banxico/salarios_minimos_sqlite.py +295 -0
  12. catalogmx/catalogs/banxico/tiie_sqlite.py +279 -0
  13. catalogmx/catalogs/banxico/tipo_cambio_usd_sqlite.py +255 -0
  14. catalogmx/catalogs/banxico/udis_sqlite.py +332 -0
  15. catalogmx/catalogs/cnbv/__init__.py +9 -0
  16. catalogmx/catalogs/cnbv/sectores.py +173 -0
  17. catalogmx/catalogs/conapo/__init__.py +15 -0
  18. catalogmx/catalogs/conapo/sistema_urbano_nacional.py +50 -0
  19. catalogmx/catalogs/conapo/zonas_metropolitanas.py +230 -0
  20. catalogmx/catalogs/ift/__init__.py +1 -1
  21. catalogmx/catalogs/ift/codigos_lada.py +517 -313
  22. catalogmx/catalogs/inegi/__init__.py +17 -0
  23. catalogmx/catalogs/inegi/scian.py +127 -0
  24. catalogmx/catalogs/mexico/__init__.py +2 -0
  25. catalogmx/catalogs/mexico/giros_mercantiles.py +119 -0
  26. catalogmx/catalogs/sat/carta_porte/material_peligroso.py +5 -1
  27. catalogmx/catalogs/sat/cfdi_4/clave_prod_serv.py +78 -0
  28. catalogmx/catalogs/sat/cfdi_4/tasa_o_cuota.py +2 -1
  29. catalogmx/catalogs/sepomex/__init__.py +2 -1
  30. catalogmx/catalogs/sepomex/codigos_postales.py +30 -2
  31. catalogmx/catalogs/sepomex/codigos_postales_completo.py +261 -0
  32. catalogmx/cli.py +12 -9
  33. catalogmx/data/__init__.py +10 -0
  34. catalogmx/data/mexico_dynamic.sqlite3 +0 -0
  35. catalogmx/data/updater.py +362 -0
  36. catalogmx/generators/__init__.py +20 -0
  37. catalogmx/generators/identity.py +582 -0
  38. catalogmx/helpers.py +177 -3
  39. catalogmx/utils/__init__.py +29 -0
  40. catalogmx/utils/clabe_utils.py +417 -0
  41. catalogmx/utils/text.py +7 -1
  42. catalogmx/validators/clabe.py +52 -2
  43. catalogmx/validators/nss.py +32 -27
  44. catalogmx/validators/rfc.py +185 -52
  45. catalogmx-0.4.0.dist-info/METADATA +905 -0
  46. {catalogmx-0.3.0.dist-info → catalogmx-0.4.0.dist-info}/RECORD +51 -25
  47. {catalogmx-0.3.0.dist-info → catalogmx-0.4.0.dist-info}/WHEEL +1 -1
  48. catalogmx/catalogs/banxico/udis.py +0 -279
  49. catalogmx-0.3.0.dist-info/METADATA +0 -644
  50. {catalogmx-0.3.0.dist-info → catalogmx-0.4.0.dist-info}/entry_points.txt +0 -0
  51. {catalogmx-0.3.0.dist-info → catalogmx-0.4.0.dist-info}/licenses/AUTHORS.rst +0 -0
  52. {catalogmx-0.3.0.dist-info → catalogmx-0.4.0.dist-info}/licenses/LICENSE +0 -0
  53. {catalogmx-0.3.0.dist-info → catalogmx-0.4.0.dist-info}/top_level.txt +0 -0
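
A per-file summary like the one above can be approximated locally, because a wheel is an ordinary zip archive. The following is a minimal sketch using only the Python standard library; the function name `wheel_file_changes` is hypothetical and is not part of catalogmx or of any registry tooling. It treats every member as text, so counts for binary members such as `mexico_dynamic.sqlite3` are not meaningful, and it does not pair renamed paths such as the `dist-info` directories.

```python
import difflib
import zipfile


def wheel_file_changes(old_wheel, new_wheel):
    """Return {path: (added, removed)} line counts for members that differ.

    Both arguments may be filesystem paths or binary file-like objects.
    Members are matched by exact path, so a renamed file shows up as one
    removal plus one addition.
    """
    changes = {}
    with zipfile.ZipFile(old_wheel) as old_zip, zipfile.ZipFile(new_wheel) as new_zip:
        old_names = set(old_zip.namelist())
        new_names = set(new_zip.namelist())
        for name in sorted(old_names | new_names):
            old_bytes = old_zip.read(name) if name in old_names else b""
            new_bytes = new_zip.read(name) if name in new_names else b""
            # Skip members present in both archives with identical content.
            if name in old_names and name in new_names and old_bytes == new_bytes:
                continue
            old_lines = old_bytes.decode("utf-8", errors="replace").splitlines()
            new_lines = new_bytes.decode("utf-8", errors="replace").splitlines()
            added = removed = 0
            matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
            for tag, i1, i2, j1, j2 in matcher.get_opcodes():
                if tag != "equal":
                    removed += i2 - i1  # lines only in the old version
                    added += j2 - j1    # lines only in the new version
            changes[name] = (added, removed)
    return changes
```

Calling it with the paths of two downloaded wheel files returns a dictionary keyed by member path. The counts come from `difflib.SequenceMatcher`, so they may differ slightly from the numbers a registry's own diff algorithm reports.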
@@ -1,644 +0,0 @@
1
- Metadata-Version: 2.4
2
- Name: catalogmx
3
- Version: 0.3.0
4
- Summary: Comprehensive Mexican data validators and official catalogs library - 170K+ records
5
- Author-email: Luis Fernando Barrera <luisfernando@informind.com>
6
- Maintainer-email: Luis Fernando Barrera <luisfernando@informind.com>
7
- License: BSD-2-Clause
8
- Project-URL: Homepage, https://github.com/openbancor/catalogmx
9
- Project-URL: Documentation, https://catalogmx.readthedocs.io
10
- Project-URL: Repository, https://github.com/openbancor/catalogmx.git
11
- Project-URL: Issues, https://github.com/openbancor/catalogmx/issues
12
- Project-URL: Changelog, https://github.com/openbancor/catalogmx/blob/main/CHANGELOG.rst
13
- Keywords: mexico,rfc,curp,clabe,nss,sat,cfdi,inegi,sepomex,banxico,validators,catalogs,postal-codes,geographic-data
14
- Classifier: Development Status :: 5 - Production/Stable
15
- Classifier: Intended Audience :: Developers
16
- Classifier: Intended Audience :: Financial and Insurance Industry
17
- Classifier: License :: OSI Approved :: BSD License
18
- Classifier: Operating System :: OS Independent
19
- Classifier: Programming Language :: Python
20
- Classifier: Programming Language :: Python :: 3
21
- Classifier: Programming Language :: Python :: 3.10
22
- Classifier: Programming Language :: Python :: 3.11
23
- Classifier: Programming Language :: Python :: 3.12
24
- Classifier: Programming Language :: Python :: 3.13
25
- Classifier: Programming Language :: Python :: Implementation :: CPython
26
- Classifier: Programming Language :: Python :: Implementation :: PyPy
27
- Classifier: Topic :: Office/Business :: Financial
28
- Classifier: Topic :: Software Development :: Libraries :: Python Modules
29
- Classifier: Topic :: Utilities
30
- Classifier: Typing :: Typed
31
- Requires-Python: >=3.10
32
- Description-Content-Type: text/markdown
33
- License-File: LICENSE
34
- License-File: AUTHORS.rst
35
- Requires-Dist: unidecode>=1.4.0
36
- Requires-Dist: click>=8.0.0
37
- Provides-Extra: dev
38
- Requires-Dist: pytest>=7.4.0; extra == "dev"
39
- Requires-Dist: pytest-cov>=4.1.0; extra == "dev"
40
- Requires-Dist: black>=23.0.0; extra == "dev"
41
- Requires-Dist: ruff>=0.1.0; extra == "dev"
42
- Requires-Dist: mypy>=1.7.0; extra == "dev"
43
- Provides-Extra: docs
44
- Requires-Dist: sphinx>=7.0.0; extra == "docs"
45
- Requires-Dist: sphinx-rtd-theme>=2.0.0; extra == "docs"
46
- Requires-Dist: sphinx-autodoc-typehints>=1.25.0; extra == "docs"
47
- Provides-Extra: all
48
- Requires-Dist: catalogmx[dev,docs]; extra == "all"
49
- Dynamic: license-file
50
-
51
- # catalogmx
52
-
53
- **Comprehensive Mexican Data Validators and Official Catalogs Library**
54
-
55
- A complete multi-language library (Python 3.10+ | TypeScript 5.0+) for validating Mexican identifiers and accessing official catalogs from SAT, Banxico, INEGI, SEPOMEX, and other government agencies.
56
-
57
- [![Python Version](https://img.shields.io/pypi/pyversions/catalogmx)](https://pypi.org/project/catalogmx/)
58
- [![PyPI Version](https://img.shields.io/pypi/v/catalogmx)](https://pypi.org/project/catalogmx/)
59
- [![NPM Version](https://img.shields.io/npm/v/catalogmx)](https://www.npmjs.com/package/catalogmx)
60
- [![Coverage](https://img.shields.io/badge/coverage-93.78%25-brightgreen)](https://github.com/openbancor/catalogmx)
61
- [![Tests](https://img.shields.io/badge/tests-926%20passing-brightgreen)](https://github.com/openbancor/catalogmx)
62
- [![License](https://img.shields.io/badge/license-BSD--2--Clause-blue.svg)](LICENSE)
63
-
64
- **Languages**: [English](#) | [Español](README.es.md)
65
-
66
- ---
67
-
68
- ## Overview
69
-
70
- **catalogmx** provides production-ready tools for Mexican data validation and official catalog access:
71
-
72
- - **4 Validators**: RFC, CURP, CLABE, NSS with complete algorithms
73
- - **58 Official Catalogs**: SAT (CFDI 4.0, Comercio Exterior, Carta Porte, Nómina), INEGI, SEPOMEX, Banxico, IFT, Mexico National
74
- - **170,505+ Records**: Complete databases including 157K postal codes, 2.4K municipalities, 10K+ localities with GPS
75
- - **Economic Indicators**: Salarios Mínimos, UMA, UDI with historical data (2010-2025)
76
- - **Traffic Regulations**: Hoy No Circula CDMX with hologram exemptions and contingency rules
77
- - **SQLite Hybrid Architecture**: 22-59% size reduction for large catalogs with FTS5 full-text search
78
- - **Multi-language Support**: Python and TypeScript with identical APIs
79
- - **Type-Safe**: Full type hints (PEP 604) and TypeScript declarations
80
- - **Production Ready**: Comprehensive test coverage (1,147 tests: 926 Python + 221 TypeScript = 93.78% coverage), fully documented, and actively maintained
81
-
82
- ---
83
-
84
- ## Quick Start
85
-
86
- ### Python
87
-
88
- ```bash
89
- # Using pip
90
- pip install catalogmx
91
-
92
- # Using uv (10-100x faster)
93
- uv pip install catalogmx
94
- ```
95
-
96
- ```python
97
- from catalogmx.validators import rfc, curp
98
- from catalogmx.catalogs.sepomex import CodigosPostales
99
- from catalogmx.catalogs.inegi import LocalidadesCatalog
100
-
101
- # Validate and generate RFC
102
- is_valid = rfc.validate_rfc("XAXX010101000")
103
- rfc_code = rfc.generate_rfc_persona_fisica(
104
-     nombre="Juan",
105
-     apellido_paterno="Pérez",
106
-     apellido_materno="López",
107
-     fecha_nacimiento="1990-01-15"
108
- ) # Returns: "PELJ900115XXX"
109
-
110
- # Generate and validate CURP
111
- curp_code = curp.generate_curp(
112
-     nombre="Juan",
113
-     apellido_paterno="Pérez",
114
-     apellido_materno="García",
115
-     fecha_nacimiento="1990-05-15",
116
-     sexo="H",
117
-     estado="Jalisco"
118
- ) # Returns: "PEGJ900515HJCRRN09"
119
-
120
- # Search postal codes
121
- postal_codes = CodigosPostales.get_by_cp("06700")
122
- print(postal_codes[0]['asentamiento']) # "Roma Norte"
123
-
124
- # Geographic search with GPS coordinates
125
- localities = LocalidadesCatalog.get_by_coordinates(
126
-     lat=19.4326, lon=-99.1332, radio_km=10
127
- )
128
- ```
129
-
130
- ### TypeScript
131
-
132
- ```bash
133
- npm install catalogmx
134
- ```
135
-
136
- ```typescript
137
- import { validateRFC, validateCURP } from 'catalogmx';
138
- import { RegimenFiscalCatalog } from 'catalogmx/catalogs';
139
-
140
- const isValid = validateRFC('XAXX010101000');
141
- const regimen = RegimenFiscalCatalog.getRegimen('605');
142
- ```
143
-
144
- ---
145
-
146
- ## Testing & Quality
147
-
148
- - ✅ **926 Tests** with **93.78% coverage**
149
- - ✅ **50+ modules at 100%** coverage
150
- - ✅ Comprehensive validator tests (CLABE, NSS, RFC, CURP)
151
- - ✅ All critical functionality fully tested
152
- - ✅ CI/CD with GitHub Actions
153
- - ✅ [View Coverage Reports](docs/testing-coverage.md)
154
-
155
- ---
156
-
157
- ## Features
158
-
159
- ### Validators
160
-
161
- **RFC (Registro Federal de Contribuyentes)**
162
- - Persona Física (13 characters) and Persona Moral (12 characters)
163
- - Homoclave calculation using Módulo 11 algorithm
164
- - Check digit validation
165
- - 170+ cacophonic word replacement
166
- - Foreign resident support
167
-
168
- **CURP (Clave Única de Registro de Población)**
169
- - 18-character validation with complete RENAPO algorithm
170
- - **CURP generation** from name, birth date, gender, and state
171
- - Check digit calculation and verification (position 18)
172
- - State code validation (32 Mexican states)
173
- - 70+ inconvenient words handling (Anexo 2)
174
- - Birth date, gender, and state extraction
175
-
176
- **CLABE (Clave Bancaria Estandarizada)**
177
- - 18-digit bank account validation
178
- - Modulo 10 check digit algorithm
179
- - Bank, branch, and account number extraction
180
- - Integration with Banxico bank catalog (110 institutions)
181
-
182
- **NSS (Número de Seguridad Social)**
183
- - 11-digit IMSS number validation
184
- - Modified Luhn algorithm check digit
185
- - Subdelegation, year, and serial extraction
186
-
187
- ### Official Catalogs
188
-
189
- **SAT (Tax Administration Service)** - 31 catalogs
190
- - **CFDI 4.0 Core**: 11 catalogs including tax regimes, CFDI uses, payment methods, product/service keys (52K+ with SQLite hybrid), unit codes
191
- - **Comercio Exterior 2.0**: 8 catalogs including Incoterms, countries, currencies, customs procedures, tax ID registry
192
- - **Carta Porte 3.0**: 7 catalogs including airports, seaports, highways, dangerous materials (UN codes), packaging types
193
- - **Nómina 1.2**: 7 catalogs including payroll types, contracts, work shifts, IMSS risk levels
194
- - **Tax Calculators**: IEPS, ISR (historical tables 2002-2025), IVA, withholdings, local taxes
195
-
196
- **INEGI (Geographic Data)** - 4 catalogs
197
- - **Complete municipalities**: 2,478 records with population data (Census 2020)
198
- - **Localities with GPS**: 10,635 localities (1,000+ inhabitants) with SQLite hybrid architecture
199
- - **States**: Complete 32 Mexican states with geographic codes
200
- - Geographic search by coordinates with radius filtering
201
- - Urban/rural classification
202
-
203
- **SEPOMEX (Postal Service)** - 2 catalogs
204
- - **Complete postal codes**: 157,252 records (largest catalog)
205
- - **Simplified postal codes**: Fast lookup version
206
- - All 32 Mexican states (100% coverage)
207
- - Search by postal code, municipality, or state
208
-
209
- **Banxico (Central Bank)** - 3 catalogs
210
- - **Financial institutions**: 110 banks with SPEI participation
211
- - **Currencies**: ISO 4217 codes with exchange rate availability
212
- - Bank code validation and lookup
213
-
214
- **IFT (Telecommunications)** - 2 catalogs
215
- - **LADA codes**: Mexican area codes with geographic coverage
216
- - **Mobile operators**: Telecom providers and network identifiers
217
-
218
- **Mexico National Catalogs** - 6 catalogs
219
- - **License Plates (Placas)**: 35 official vehicle plate formats by NOM-001-SCT-2-2016 (particular, federal, diplomatic, military, emergency services, etc.)
220
- - **Minimum Wages (Salarios Mínimos)**: Historical minimum wages 2010-2025 (daily, monthly, annual)
221
- - **UMA**: Unidad de Medida y Actualización 2017-2025 (reference unit for fines/taxes)
222
- - **UDI (Banxico)**: Unidades de Inversión with historical values (inflation-indexed investment units)
223
- - **Hoy No Circula CDMX**: Traffic restrictions program for Mexico City and Metro Area
224
- - **Economic Indicators**: Historical data for wages, UMA, and UDI values
225
-
226
- ---
227
-
228
- ## Statistics
229
-
230
- | Catalog Category | Records | Implementation | Size |
231
- |-----------------|---------|----------------|------|
232
- | SEPOMEX Postal Codes | 157,252 | JSON | 41 MB |
233
- | SAT Clave Prod/Serv | 52,063 | **SQLite hybrid** | 13.4 MB (was 18 MB JSON, **26% reduction**) |
234
- | INEGI Localities | 10,635 | **SQLite hybrid** | 2.0 MB (was 4.9 MB JSON, **59% reduction**) |
235
- | INEGI Municipalities | 2,478 | JSON | 0.98 MB |
236
- | SAT CFDI 4.0 | ~30 catalogs | JSON | <1 MB |
237
- | SAT Comercio Exterior | 8 catalogs | JSON | <1 MB |
238
- | SAT Carta Porte | 7 catalogs | JSON | <2 MB |
239
- | SAT Nómina | 7 catalogs | JSON | <1 MB |
240
- | SAT Tax Calculators | 5 calculators | JSON | <1 MB |
241
- | Banxico | 3 catalogs | JSON | 41 KB |
242
- | Banxico UDI | 24 values | JSON | ~2 KB |
243
- | IFT Telecom | 2 catalogs | JSON | 38 KB |
244
- | Mexico National | 6 catalogs | JSON | ~15 KB |
245
- | **TOTAL** | **170,505+ records** | **56 JSON + 2 SQLite** | **~82 MB total** |
246
-
247
- ### Test Coverage (TypeScript)
248
-
249
- | Metric | Coverage | Status |
250
- |--------|----------|--------|
251
- | **Functional Tests** | **220/220 passing** | ✅ **100%** |
252
- | Code Statements | 59.83% | ⚠️ Below 80% threshold |
253
- | Branches | 37.48% | ⚠️ Below 80% threshold |
254
- | Lines | 61.56% | ⚠️ Below 80% threshold |
255
- | Functions | 45.69% | ⚠️ Below 80% threshold |
256
-
257
- *All functional tests pass. Lower code coverage is due to many catalog methods and validators not yet having exhaustive test cases. Core functionality is fully tested.*
258
-
259
- ---
260
-
261
- ## Installation
262
-
263
- ### Python
264
-
265
- #### From PyPI (Recommended)
266
-
267
- ```bash
268
- pip install catalogmx
269
- ```
270
-
271
- #### From Source
272
-
273
- ```bash
274
- git clone https://github.com/openbancor/catalogmx.git
275
- cd catalogmx/packages/python
276
- pip install -e .
277
- ```
278
-
279
- **Requirements**:
280
- - Python 3.10 or higher
281
- - unidecode (for RFC generation)
282
- - click (for CLI)
283
-
284
- ### TypeScript/JavaScript
285
-
286
- #### NPM
287
-
288
- ```bash
289
- npm install catalogmx
290
- ```
291
-
292
- #### Yarn
293
-
294
- ```bash
295
- yarn add catalogmx
296
- ```
297
-
298
- **Requirements**:
299
- - Node.js 16 or higher
300
- - TypeScript 5.0+ (optional, type definitions included)
301
-
302
- ---
303
-
304
- ## Documentation
305
-
306
- ### Getting Started
307
- - [Installation Guide](docs/installation.rst)
308
- - [Quick Start Guide](docs/quickstart.rst)
309
- - [API Reference](docs/api/)
310
-
311
- ### Guides
312
- - [Architecture Guide](docs/guides/architecture.md)
313
- - [Developer's Guide](docs/guides/developers-guide.md)
314
- - [Catalog Updates](docs/guides/catalog-updates.md)
315
- - [CP-Locality Linking](docs/guides/cp-locality-linking.md)
316
-
317
- ### Catalogs
318
- - [Catalog Overview](docs/catalogs/overview.md)
319
- - [SEPOMEX Documentation](docs/catalogs/sepomex.md)
320
- - [INEGI Documentation](docs/catalogs/inegi.md)
321
- - [SAT Documentation](docs/catalogs/sat.md)
322
-
323
- ### Project
324
- - [Roadmap](docs/roadmap.md)
325
- - [Changelog](CHANGELOG.rst)
326
- - [Catalog Changelog](docs/changelog-catalogs.md)
327
- - [Contributing](CONTRIBUTING.rst)
328
-
329
- ---
330
-
331
- ## Usage Examples
332
-
333
- ### Address Validation
334
-
335
- ```python
336
- from catalogmx.catalogs.sepomex import CodigosPostales
337
- from catalogmx.catalogs.inegi import MunicipiosCatalog
338
-
339
- def validate_address(postal_code, municipality_name):
340
-     """Validate Mexican address"""
341
-
342
-     if not CodigosPostales.is_valid(postal_code):
343
-         return False, "Invalid postal code"
344
-
345
-     cp_info = CodigosPostales.get_by_cp(postal_code)[0]
346
-
347
-     if municipality_name.lower() not in cp_info['municipio'].lower():
348
-         return False, f"Postal code {postal_code} does not belong to {municipality_name}"
349
-
350
-     return True, cp_info
351
- ```
352
-
353
- ### Geographic Analysis
354
-
355
- ```python
356
- from catalogmx.catalogs.inegi import LocalidadesCatalog
357
-
358
- # Find localities near a coordinate
359
- nearby = LocalidadesCatalog.get_by_coordinates(
360
-     lat=19.4326, # Mexico City
361
-     lon=-99.1332,
362
-     radio_km=50
363
- )
364
-
365
- for locality in nearby[:5]:
366
-     print(f"{locality['nom_localidad']}: {locality['distancia_km']} km")
367
-     print(f" Population: {locality['poblacion_total']:,}")
368
- ```
369
-
370
- ### CFDI Validation
371
-
372
- ```python
373
- from catalogmx.validators import rfc
374
- from catalogmx.catalogs.sat.cfdi_4 import (
375
-     RegimenFiscalCatalog,
376
-     UsoCFDICatalog,
377
-     FormaPagoCatalog
378
- )
379
-
380
- def validate_cfdi_data(rfc_code, tax_regime, cfdi_use, payment_method):
381
-     """Validate CFDI invoice data"""
382
-
383
-     errors = []
384
-
385
-     if not rfc.validate_rfc(rfc_code):
386
-         errors.append("Invalid RFC")
387
-
388
-     if not RegimenFiscalCatalog.is_valid(tax_regime):
389
-         errors.append(f"Invalid tax regime: {tax_regime}")
390
-
391
-     if not UsoCFDICatalog.is_valid(cfdi_use):
392
-         errors.append(f"Invalid CFDI use: {cfdi_use}")
393
-
394
-     if not FormaPagoCatalog.is_valid(payment_method):
395
-         errors.append(f"Invalid payment method: {payment_method}")
396
-
397
-     return len(errors) == 0, errors
398
- ```
399
-
400
- ---
401
-
402
- ## Roadmap
403
-
404
- ### Version 0.3.0 (Current - November 2025)
405
-
406
- **Completed**:
407
- - ✅ Complete SEPOMEX postal codes (157,252 records)
408
- - ✅ Complete INEGI municipalities (2,478 records)
409
- - ✅ INEGI localities with GPS coordinates (10,635 records)
410
- - ✅ **SQLite hybrid architecture** for large catalogs (22-59% size reduction)
411
- - ✅ FTS5 full-text search with Spanish tokenization
412
- - ✅ IFT telecommunications catalogs (LADA codes, mobile operators)
413
- - ✅ Banxico complete financial catalogs (banks, currencies)
414
- - ✅ SAT tax calculators (IEPS, ISR historical 2002-2025, IVA, withholdings)
415
- - ✅ Geographic search by coordinates with radius filtering
416
- - ✅ Population and housing data (Census 2020)
417
- - ✅ Urban/rural classification
418
- - ✅ **Comprehensive test coverage** (337 tests: 221 TypeScript + 116 Python, all passing)
419
- - ✅ Bilingual documentation
420
-
421
- ### Version 0.4.0 (Planned - Q1 2025)
422
-
423
- **Planned**:
424
- - Geocoding integration (add GPS to postal codes)
425
- - Pre-computed CP-Locality correspondence table
426
- - REST API server examples
427
- - GraphQL API examples
428
- - Improve code coverage to 80%+ threshold
429
- - Python test suite and coverage reporting
430
-
431
- ### Version 0.5.0 (Future - Q2-Q3 2025)
432
-
433
- **Planned**:
434
- - Additional validators (ISAN, license plates, MRZ)
435
- - IMSS (social security) extended catalogs
436
- - TIGIE (customs tariff) catalog
437
- - Historical catalog versions with temporal queries
438
- - ML-based address normalization
439
- - WebAssembly compilation for validators
440
- - Browser-compatible SQLite with sql.js
441
-
442
- **Full Roadmap**: See [docs/roadmap.md](docs/roadmap.md) for detailed roadmap by catalog and implementation strategy.
443
-
444
- ---
445
-
446
- ## SQLite Hybrid Architecture
447
-
448
- For catalogs with >10,000 records, we provide **SQLite hybrid implementation** with automatic backend selection:
449
-
450
- **Benefits** (Proven Results):
451
- - **22-59% smaller file size** (measured on production catalogs)
452
- - **10-100x faster queries** with indexed lookups
453
- - **FTS5 full-text search** with Spanish text tokenization
454
- - **Memory efficient**: Query without loading entire dataset into memory
455
- - **Automatic selection**: Falls back to JSON if SQLite unavailable
456
-
457
- **Current Implementation** (v0.3.0):
458
-
459
- | Catalog | JSON Size | SQLite Size | Size Reduction | Features |
460
- |---------|-----------|-------------|----------------|----------|
461
- | **Clave Prod/Serv** | 18 MB | 13.4 MB | **26%** | FTS5 Spanish search |
462
- | **INEGI Localities** | 4.9 MB | 2.0 MB | **59%** | GPS coordinates indexed |
463
-
464
- **Technical Details**:
465
- - `better-sqlite3` for Node.js (native performance)
466
- - `sql.js` for WebAssembly browser support (planned)
467
- - FTS5 tokenization with Spanish stop words
468
- - Lazy loading with static caching
469
- - Seamless fallback to JSON for compatibility
470
-
471
- ---
472
-
473
- ## Catalog Update Strategy
474
-
475
- ### Update Frequencies
476
-
477
- | Catalog | Frequency | Source | Auto-update |
478
- |---------|-----------|--------|-------------|
479
- | SEPOMEX | Monthly | correosdemexico.gob.mx | Planned (v0.4.0) |
480
- | INEGI | Annually | inegi.org.mx | Manual |
481
- | SAT CFDI | Quarterly | sat.gob.mx | Planned (v0.4.0) |
482
- | Banxico | Quarterly | banxico.org.mx | Planned (v0.4.0) |
483
-
484
- ### Current Process
485
-
486
- ```bash
487
- # Check for updates
488
- python scripts/check_catalog_updates.py
489
-
490
- # Download and process
491
- python scripts/fetch_sat_catalogs.py
492
- python scripts/process_sepomex_file.py
493
- python scripts/process_inegi_municipios.py
494
- ```
495
-
496
- **Automated updates planned for v0.4.0**
497
-
498
- ---
499
-
500
- ## Contributing
501
-
502
- Contributions are welcome! Please see [CONTRIBUTING.rst](CONTRIBUTING.rst) for guidelines.
503
-
504
- ### Development Setup
505
-
506
- ```bash
507
- git clone https://github.com/openbancor/catalogmx.git
508
- cd catalogmx
509
-
510
- # Python
511
- cd packages/python
512
- pip install -e ".[dev]"
513
- pytest
514
-
515
- # TypeScript
516
- cd packages/typescript
517
- npm install
518
- npm test
519
- ```
520
-
521
- ### Adding New Catalogs
522
-
523
- See [Developer's Guide](docs/guides/developers-guide.md) for detailed instructions on:
524
- - Creating catalog JSON files
525
- - Implementing catalog classes
526
- - Writing tests
527
- - Updating documentation
528
-
529
- ---
530
-
531
- ## Project Structure
532
-
533
- ```
534
- catalogmx/
535
- ├── README.md # This file
536
- ├── LICENSE # BSD 2-Clause
537
- ├── CONTRIBUTING.rst # Contribution guidelines
538
- ├── CHANGELOG.rst # Project changelog
539
-
540
- ├── docs/ # Documentation
541
- │ ├── guides/ # Technical guides
542
- │ ├── catalogs/ # Catalog documentation
543
- │ ├── api/ # API reference
544
- │ ├── roadmap.md # Detailed roadmap
545
- │ └── releases/ # Release notes
546
-
547
- ├── packages/
548
- │ ├── python/ # Python implementation
549
- │ │ ├── catalogmx/
550
- │ │ ├── tests/
551
- │ │ ├── pyproject.toml # Modern Python config
552
- │ │ └── requirements.txt
553
- │ │
554
- │ ├── typescript/ # TypeScript implementation
555
- │ │ ├── src/
556
- │ │ ├── tests/
557
- │ │ └── package.json
558
- │ │
559
- │ └── shared-data/ # Catalog JSON data
560
- │ ├── sepomex/ # 157K postal codes
561
- │ ├── inegi/ # Municipalities & localities
562
- │ ├── sat/ # Tax catalogs
563
- │ └── banxico/ # Banking data
564
-
565
- └── scripts/ # Processing scripts
566
- ├── process_sepomex_file.py
567
- ├── process_inegi_municipios.py
568
- └── process_inegi_localidades.py
569
- ```
570
-
571
- ---
572
-
573
- ## License
574
-
575
- BSD 2-Clause License. See [LICENSE](LICENSE) for details.
576
-
577
- ---
578
-
579
- ## Acknowledgments
580
-
581
- ### Official Data Sources
582
-
583
- - **SAT** - Servicio de Administración Tributaria
584
- - **INEGI** - Instituto Nacional de Estadística y Geografía
585
- - **SEPOMEX** - Servicio Postal Mexicano
586
- - **Banxico** - Banco de México
587
- - **RENAPO** - Registro Nacional de Población
588
-
589
- ### Technology Stack
590
-
591
- - Python 3.10+ with modern type hints (PEP 604)
592
- - TypeScript 5.0+
593
- - Zero external dependencies (validators)
594
- - Lazy loading architecture
595
- - JSON-based catalog storage
596
-
597
- ---
598
-
599
- ## Support
600
-
601
- - **Documentation**: [docs/](docs/)
602
- - **Issues**: [GitHub Issues](https://github.com/openbancor/catalogmx/issues)
603
- - **Email**: luisfernando@informind.com
604
-
605
- ---
606
-
607
- ## Project Statistics
608
-
609
- ```
610
- Package Size: ~82 MB (all catalogs + SQLite)
611
- Total Catalogs: 58 (56 JSON + 2 SQLite)
612
- Total Records: 170,505+
613
- Test Coverage: 337/337 tests passing (221 TypeScript + 116 Python)
614
- Code Coverage: ~60% statements, ~37% branches
615
- Population: 126,014,024 (100% coverage)
616
- GPS Localities: 10,635
617
- Municipalities: 2,478
618
- Postal Codes: 157,252
619
- Banks: 110
620
- IFT Operators: Multiple telecom providers
621
- Tax Calculators: 5 (IEPS, ISR, IVA, Withholdings, Local)
622
- Economic Data: UMA (2017-2025), UDI (1995-2025), Salarios Mínimos (2010-2025)
623
- Traffic Rules: Hoy No Circula CDMX (complete program)
624
- License Plates: 35 official formats (NOM-001-SCT-2-2016)
625
- ```
626
-
627
- ### Package Size Breakdown
628
-
629
- ```
630
- Directory Size Description
631
- -----------------------------------------
632
- sepomex/ 41 MB Postal codes (complete)
633
- sat/ 19 MB Tax catalogs (all modules)
634
- sqlite/ 16 MB Hybrid databases (2 files)
635
- inegi/ 5.8 MB Geographic data
636
- banxico/ 41 KB Financial institutions
637
- ift/ 38 KB Telecommunications
638
- misc/ 5.5 KB Supporting data
639
- ```
640
-
641
- ---
642
-
643
- **catalogmx** v0.3.0 | November 2025 | Made for the Mexican developer community
644
-