geo-intel-offline 1.0.1__py3-none-any.whl

@@ -0,0 +1,784 @@
1
+ Metadata-Version: 2.1
2
+ Name: geo-intel-offline
3
+ Version: 1.0.1
4
+ Summary: Production-ready, offline geo-intelligence library for resolving latitude/longitude to country, ISO codes, continent, and timezone information
5
+ Home-page: https://github.com/your-org/geo-intel-offline
6
+ Author: GeoIntelLib Team
7
+ License: MIT
8
+ Project-URL: Homepage, https://github.com/yourusername/geo-intel-offline
9
+ Project-URL: Documentation, https://github.com/yourusername/geo-intel-offline#readme
10
+ Project-URL: Repository, https://github.com/yourusername/geo-intel-offline
11
+ Project-URL: Issues, https://github.com/yourusername/geo-intel-offline/issues
12
+ Project-URL: Bug Tracker, https://github.com/yourusername/geo-intel-offline/issues
13
+ Keywords: geolocation,geography,country,iso,timezone,offline,geohash
14
+ Classifier: Development Status :: 5 - Production/Stable
15
+ Classifier: Intended Audience :: Developers
16
+ Classifier: License :: OSI Approved :: MIT License
17
+ Classifier: Programming Language :: Python :: 3
18
+ Classifier: Programming Language :: Python :: 3.8
19
+ Classifier: Programming Language :: Python :: 3.9
20
+ Classifier: Programming Language :: Python :: 3.10
21
+ Classifier: Programming Language :: Python :: 3.11
22
+ Classifier: Programming Language :: Python :: 3.12
23
+ Classifier: Topic :: Scientific/Engineering :: GIS
24
+ Classifier: Topic :: Software Development :: Libraries :: Python Modules
25
+ Requires-Python: >=3.8
26
+ Description-Content-Type: text/markdown
27
+ License-File: LICENSE
28
+ Provides-Extra: dev
29
+ Requires-Dist: black >=23.0 ; extra == 'dev'
30
+ Requires-Dist: mypy >=1.0 ; extra == 'dev'
31
+ Requires-Dist: pytest-cov >=4.0 ; extra == 'dev'
32
+ Requires-Dist: pytest >=7.0 ; extra == 'dev'
33
+
34
+ # geo-intel-offline
35
+
36
+ [![Python Version](https://img.shields.io/badge/python-3.8%2B-blue.svg)](https://www.python.org/downloads/)
37
+ [![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
38
+ [![Development Status](https://img.shields.io/badge/status-production--ready-brightgreen.svg)](https://pypi.org/project/geo-intel-offline/)
39
+
40
+ **Production-ready, offline geo-intelligence library** for resolving latitude/longitude coordinates to country, ISO codes, continent, timezone, and confidence scores. No API keys, no network requests, 100% deterministic.
41
+
42
+ ## 🌟 Why This Library Exists
43
+
44
+ Every developer working with geolocation has faced the same frustration: you need to know what country a set of coordinates belongs to, but all the solutions either cost money, require API keys, need constant internet connectivity, or have restrictive rate limits. What if you're building an offline application? What if you're processing millions of records and API costs become prohibitive? What if you need deterministic results without external dependencies?
45
+
46
+ **We built `geo-intel-offline` to solve these real-world problems.**
47
+
48
+ This library was born from the need for a **reliable, fast, and completely free** solution that works everywhere, from edge devices in remote locations to high-throughput data processing pipelines. No subscriptions, no rate limits, no vendor lock-in. Just pure Python that does one thing exceptionally well: **tell you where in the world a coordinate belongs.**
49
+
50
+ Whether you're building a mobile app that works offline, processing billions of GPS logs, enriching datasets without external APIs, or creating applications for regions with unreliable internet, this library empowers you to add geo-intelligence to your projects without compromise.
51
+
52
+ ## ✨ Features
53
+
54
+ - 🚀 **Fast**: < 1ms per lookup, < 15MB memory footprint
55
+ - 📦 **Offline**: Zero network dependencies, works completely offline
56
+ - 🎯 **Accurate**: 99.92% accuracy across 258 countries
57
+ - 🔒 **Deterministic**: Same input always produces same output
58
+ - 🗜️ **Optimized**: 66% size reduction with automatic compression
59
+ - 🌍 **Comprehensive**: Supports all countries, continents, and territories
60
+ - 🎨 **Clean API**: Simple, intuitive interface
61
+ - 🔧 **No Dependencies**: Pure Python, no native extensions
62
+ - 💰 **Free Forever**: No API costs, no rate limits, no hidden fees
63
+
64
+ ## 🎯 Where Can You Use This Library?
65
+
66
+ ### Mobile Applications
67
+ **Offline-first apps** that need to identify user location even without internet connectivity. Perfect for travel apps, fitness trackers, or field data collection tools that work in remote areas.
68
+
69
+ ```python
70
+ # Works offline - no internet needed!
71
+ from geo_intel_offline import resolve
72
+
73
+ def identify_user_country(lat, lon):
74
+     result = resolve(lat, lon)
75
+     return result.country  # Works even in airplane mode
76
+ ```
77
+
78
+ ### Data Processing & Analytics
79
+ **Batch processing** of GPS logs, location data, or transaction records. Process millions of coordinates without API rate limits or costs.
80
+
81
+ ```python
82
+ # Process millions of records - no rate limits!
83
+ import pandas as pd
84
+ from geo_intel_offline import resolve
85
+
86
+ df = pd.read_csv('location_data.csv')
87
+ df['country'] = df.apply(
88
+     lambda row: resolve(row['lat'], row['lon']).country,
89
+     axis=1
90
+ )
91
+ ```
92
+
93
+ ### IoT & Edge Devices
94
+ **Edge computing** applications where devices need geo-intelligence without cloud connectivity. Perfect for sensors, trackers, or embedded systems.
95
+
96
+ ```python
97
+ # Runs on Raspberry Pi, microcontrollers, edge devices
98
+ # No cloud dependency, minimal resources
99
+ result = resolve(sensor_lat, sensor_lon)
100
+ if result.iso2 != 'US':  # iso2 holds the two-letter code; country holds the full name
101
+     trigger_alert()
102
+ ```
103
+
104
+ ### API Alternatives & Rate Limit Avoidance
105
+ **Replace expensive APIs** or bypass rate limits. Perfect for applications that need high throughput or want to reduce infrastructure costs. See the [Use Cases](#-use-cases) section below for detailed implementation examples.
106
+
107
+ ```python
108
+ # Instead of: external_api.geocode(lat, lon) # $0.005 per request
109
+ # Use: resolve(lat, lon) # FREE, unlimited, instant
110
+ ```
111
+
112
+ ### Geographic Data Enrichment
113
+ **Enrich datasets** with country information for analysis, visualization, or machine learning. No need to maintain external API connections or handle failures. See the [Use Cases](#-use-cases) section below for pandas DataFrame examples.
114
+
115
+ ```python
116
+ # Enrich logs, events, transactions with country data
117
+ events = load_events_from_database()
118
+ for event in events:
119
+     event['country'] = resolve(event['lat'], event['lon']).iso2
120
+     save_event(event)
121
+ ```
122
+
123
+ ### Location-Based Features
124
+ **Add geo-context** to your applications: content localization, compliance checks, regional restrictions, or timezone-aware scheduling.
125
+
126
+ ```python
127
+ # Content localization based on location
128
+ result = resolve(user_lat, user_lon)
129
+ if result.continent == 'Europe':
130
+     show_gdpr_banner()
131
+ elif result.iso2 == 'US':  # compare the ISO code; country is the full name
132
+     show_us_specific_content()
133
+ ```
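The timezone-aware scheduling case mentioned above could be sketched as follows. This assumes Python 3.9 or later (for the standard-library `zoneinfo` module, while the package itself declares support from 3.8) and that `result.timezone` is an IANA identifier, as the API reference states. The helper name `to_local_time` is hypothetical, and the zone name is hardcoded here instead of coming from a real lookup.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library from Python 3.9


def to_local_time(utc_dt, tz_name):
    """Convert an aware UTC datetime to the given IANA timezone."""
    return utc_dt.astimezone(ZoneInfo(tz_name))


# In real use tz_name would come from resolve(lat, lon).timezone;
# "America/New_York" is the value this README shows for New York City.
utc_noon = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
local = to_local_time(utc_noon, "America/New_York")
print(local.isoformat())
```

Because ocean lookups return `None` for `timezone`, a caller would likely want to check for that before converting.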
134
+
135
+ ### Development & Testing
136
+ **Local development** and testing without needing API keys or internet connectivity. Great for CI/CD pipelines and automated testing.
137
+
138
+ ```python
139
+ # Test with real data - no mocks needed
140
+ def test_geocoding():
141
+     result = resolve(40.7128, -74.0060)
142
+     assert result.country == 'United States of America'
143
+     assert result.iso2 == 'US'
144
+ ```
145
+
146
+ ### Research & Academic Projects
147
+ **Academic research** that requires reproducible results without external API dependencies or costs that might limit research scope.
148
+
149
+ ```python
150
+ # Reproducible research - same results every time
151
+ # No API costs to worry about in grant proposals
152
+ results = [resolve(lat, lon) for lat, lon in research_coordinates]
153
+ ```
154
+
155
+ ## 💡 Benefits
156
+
157
+ ### For Developers
158
+
159
+ #### **Simplicity & Speed**
160
+ - **One-line integration**: `from geo_intel_offline import resolve`
161
+ - **No configuration**: Works out of the box with pre-built data
162
+ - **Lightning fast**: < 1ms per lookup means no performance bottlenecks
163
+ - **Predictable**: Same coordinates always return same results
164
+
165
+ #### **Development Experience**
166
+ - **No API keys needed**: Start coding immediately
167
+ - **Works offline**: Develop and test without internet
168
+ - **No rate limits**: Test with unlimited requests
169
+ - **Pure Python**: Easy to debug, inspect, and modify
170
+ - **Well documented**: Comprehensive examples and API reference
171
+
172
+ #### **Flexibility & Control**
173
+ - **Modular loading**: Load only countries you need (reduce memory)
174
+ - **Custom data**: Build datasets from your own GeoJSON sources
175
+ - **No vendor lock-in**: Your code, your data, your control
176
+ - **Deterministic**: Perfect for testing and reproducible builds
177
+
178
+ ### For Businesses & Organizations
179
+
180
+ #### **Cost Savings**
181
+ - **Zero API costs**: Save thousands on external geocoding services
182
+ - **No infrastructure**: Runs locally, no cloud services needed
183
+ - **No scaling costs**: Handle millions of requests without per-request fees
184
+ - **Predictable expenses**: One-time setup, no ongoing subscription
185
+
186
+ **Example Cost Comparison:**
187
+ - External API: $0.005 per request × 1M requests per month = **$5,000/month**
188
+ - This library: **$0/month** (one-time setup time)
189
+
190
+ #### **Reliability & Performance**
191
+ - **100% uptime**: No external service dependencies to fail
192
+ - **Consistent latency**: < 1ms every time (no network delays)
193
+ - **No rate limits**: Process data at your own pace
194
+ - **Data privacy**: Location data never leaves your infrastructure
195
+
196
+ #### **Scalability**
197
+ - **Handle any volume**: Process billions of coordinates
198
+ - **Edge deployment**: Deploy to edge devices and IoT
199
+ - **Batch processing**: Process large datasets efficiently
200
+ - **Memory efficient**: < 15MB footprint even with all countries
201
+
202
+ #### **Compliance & Security**
203
+ - **GDPR friendly**: No data sent to external services
204
+ - **Offline capable**: Meets requirements for air-gapped systems
205
+ - **Auditable**: You can inspect the exact logic and data
206
+ - **No data sharing**: Complete data sovereignty
207
+
208
+ ### For End Users
209
+
210
+ #### **Privacy**
211
+ - **Data stays local**: Coordinates never sent to external servers
212
+ - **No tracking**: No analytics, no usage monitoring
213
+ - **Transparent**: Open source, you can verify everything
214
+
215
+ #### **Performance**
216
+ - **Instant results**: No network latency
217
+ - **Works offline**: No internet required
218
+ - **Low resource usage**: Runs on modest hardware
219
+
220
+ ## 📦 Installation
221
+
222
+ ### From PyPI (Recommended)
223
+
224
+ ```bash
225
+ pip install geo-intel-offline
226
+ ```
227
+
228
+ ### From uv
229
+
230
+ ```bash
231
+ uv pip install geo-intel-offline
232
+ ```
233
+
234
+ ### From Source
235
+
236
+ ```bash
237
+ git clone https://github.com/yourusername/geo-intel-offline.git
238
+ cd geo-intel-offline
239
+ pip install .
240
+ ```
241
+
242
+ ## 🚀 Quick Start
243
+
244
+ ### Basic Usage
245
+
246
+ ```python
247
+ from geo_intel_offline import resolve
248
+
249
+ # Resolve coordinates to country information
250
+ result = resolve(40.7128, -74.0060) # New York City
251
+
252
+ print(result.country) # "United States of America"
253
+ print(result.iso2) # "US"
254
+ print(result.iso3) # "USA"
255
+ print(result.continent) # "North America"
256
+ print(result.timezone) # "America/New_York"
257
+ print(result.confidence) # 0.98
258
+ ```
259
+
260
+ ### Step-by-Step Guide
261
+
262
+ #### Step 1: Install the Package
263
+
264
+ ```bash
265
+ pip install geo-intel-offline
266
+ ```
267
+
268
+ #### Step 2: Import and Use
269
+
270
+ ```python
271
+ from geo_intel_offline import resolve
272
+
273
+ # Resolve a coordinate
274
+ result = resolve(51.5074, -0.1278) # London, UK
275
+
276
+ # Access results as attributes
277
+ print(f"Country: {result.country}")
278
+ print(f"ISO2 Code: {result.iso2}")
279
+ print(f"ISO3 Code: {result.iso3}")
280
+ print(f"Continent: {result.continent}")
281
+ print(f"Timezone: {result.timezone}")
282
+ print(f"Confidence: {result.confidence:.2f}")
283
+ ```
284
+
285
+ #### Step 3: Handle Edge Cases
286
+
287
+ ```python
288
+ from geo_intel_offline import resolve
289
+
290
+ # Ocean locations (no country)
291
+ result = resolve(0.0, 0.0) # Gulf of Guinea (ocean)
292
+ if result.country is None:
293
+     print("No country found (likely ocean)")
294
+     print(f"Confidence: {result.confidence}")  # Will be 0.0
295
+
296
+ # Border regions (may have lower confidence)
297
+ result = resolve(49.0, 8.2) # Near France-Germany border
298
+ if result.confidence < 0.7:
299
+     print(f"Low confidence: {result.confidence:.2f} (near border)")
300
+ ```
301
+
302
+ ## 📖 Detailed Examples
303
+
304
+ ### Example 1: Resolve Multiple Locations
305
+
306
+ ```python
307
+ from geo_intel_offline import resolve
308
+
309
+ locations = [
310
+     (40.7128, -74.0060, "New York"),
311
+     (51.5074, -0.1278, "London"),
312
+     (35.6762, 139.6503, "Tokyo"),
313
+     (-33.8688, 151.2093, "Sydney"),
314
+     (55.7558, 37.6173, "Moscow"),
315
+ ]
316
+
317
+ for lat, lon, name in locations:
318
+     result = resolve(lat, lon)
319
+     print(f"{name}: {result.country} ({result.iso2}) - Confidence: {result.confidence:.2f}")
320
+ ```
321
+
322
+ **Output:**
323
+ ```
324
+ New York: United States of America (US) - Confidence: 0.98
325
+ London: United Kingdom (GB) - Confidence: 0.93
326
+ Tokyo: Japan (JP) - Confidence: 0.93
327
+ Sydney: Australia (AU) - Confidence: 0.80
328
+ Moscow: Russia (RU) - Confidence: 0.93
329
+ ```
330
+
331
+ ### Example 2: Batch Processing
332
+
333
+ ```python
334
+ from geo_intel_offline import resolve
335
+ import time
336
+
337
+ coordinates = [
338
+     (40.7128, -74.0060),
339
+     (51.5074, -0.1278),
340
+     (35.6762, 139.6503),
341
+     # ... more coordinates
342
+ ]
343
+
344
+ start = time.perf_counter()
345
+ results = [resolve(lat, lon) for lat, lon in coordinates]
346
+ end = time.perf_counter()
347
+
348
+ print(f"Processed {len(coordinates)} coordinates in {(end - start)*1000:.2f}ms")
349
+ print(f"Average: {(end - start)*1000/len(coordinates):.3f}ms per lookup")
350
+ ```
351
+
352
+ ### Example 3: Dictionary Access
353
+
354
+ ```python
355
+ from geo_intel_offline import resolve
356
+
357
+ result = resolve(37.7749, -122.4194) # San Francisco
358
+
359
+ # Access as dictionary
360
+ result_dict = result.to_dict()
361
+ print(result_dict)
362
+ # {
363
+ # 'country': 'United States of America',
364
+ # 'iso2': 'US',
365
+ # 'iso3': 'USA',
366
+ # 'continent': 'North America',
367
+ # 'timezone': 'America/Los_Angeles',
368
+ # 'confidence': 0.95
369
+ # }
370
+
371
+ # Or access as attributes
372
+ print(result.country) # "United States of America"
373
+ print(result.iso2) # "US"
374
+ ```
375
+
376
+ ### Example 4: Filter by Confidence
377
+
378
+ ```python
379
+ from geo_intel_offline import resolve
380
+
381
+ def resolve_with_threshold(lat, lon, min_confidence=0.7):
382
+     """Resolve coordinates with confidence threshold."""
383
+     result = resolve(lat, lon)
384
+     if result.confidence < min_confidence:
385
+         return None, f"Low confidence: {result.confidence:.2f}"
386
+     return result, None
387
+
388
+ result, error = resolve_with_threshold(40.7128, -74.0060, min_confidence=0.9)
389
+ if result:
390
+     print(f"High confidence result: {result.country}")
391
+ else:
392
+     print(f"Rejected: {error}")
393
+ ```
394
+
395
+ ### Example 5: Error Handling
396
+
397
+ ```python
398
+ from geo_intel_offline import resolve
399
+
400
+ def safe_resolve(lat, lon):
401
+     """Safely resolve coordinates with error handling."""
402
+     try:
403
+         result = resolve(lat, lon)
404
+         if result.country is None:
405
+             return {"error": "No country found", "confidence": result.confidence}
406
+         return {
407
+             "country": result.country,
408
+             "iso2": result.iso2,
409
+             "iso3": result.iso3,
410
+             "continent": result.continent,
411
+             "timezone": result.timezone,
412
+             "confidence": result.confidence,
413
+         }
414
+     except ValueError as e:
415
+         return {"error": f"Invalid coordinates: {e}"}
416
+     except FileNotFoundError as e:
417
+         return {"error": f"Data files not found: {e}"}
418
+
419
+ # Usage
420
+ result = safe_resolve(40.7128, -74.0060)
421
+ print(result)
422
+ ```
423
+
424
+ ## 📚 API Reference
425
+
426
+ ### `resolve(lat, lon, data_dir=None, countries=None, continents=None, exclude_countries=None)`
427
+
428
+ Main function to resolve coordinates to geo-intelligence.
429
+
430
+ **Parameters:**
431
+
432
+ - `lat` (float): Latitude (-90.0 to 90.0)
433
+ - `lon` (float): Longitude (-180.0 to 180.0)
434
+ - `data_dir` (str, optional): Custom data directory path
435
+ - `countries` (list[str], optional): List of ISO2 codes to load (modular format only)
436
+ - `continents` (list[str], optional): List of continent names to load (modular format only)
437
+ - `exclude_countries` (list[str], optional): List of ISO2 codes to exclude (modular format only)
438
+
439
+ **Returns:**
440
+
441
+ `GeoIntelResult` object with the following properties:
442
+
443
+ - `country` (str | None): Country name
444
+ - `iso2` (str | None): ISO 3166-1 alpha-2 code
445
+ - `iso3` (str | None): ISO 3166-1 alpha-3 code
446
+ - `continent` (str | None): Continent name
447
+ - `timezone` (str | None): IANA timezone identifier
448
+ - `confidence` (float): Confidence score (0.0 to 1.0)
449
+
450
+ **Methods:**
451
+
452
+ - `to_dict()`: Convert result to dictionary
453
+
454
+ **Raises:**
455
+
456
+ - `ValueError`: If lat/lon are out of valid range
457
+ - `FileNotFoundError`: If data files are missing
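The documented ranges and the `ValueError` case can be illustrated with a small pre-check that a caller might run before a lookup. This is a sketch only: `check_coordinates` is a hypothetical helper, and this README does not say that the library validates in exactly this way (for example, whether it also rejects NaN).

```python
import math


def check_coordinates(lat, lon):
    """Raise ValueError when lat/lon fall outside the documented ranges."""
    # Assumption: non-finite values (NaN, inf) are treated as invalid too.
    if not (math.isfinite(lat) and math.isfinite(lon)):
        raise ValueError(f"Coordinates must be finite numbers: ({lat}, {lon})")
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"Latitude out of range [-90, 90]: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"Longitude out of range [-180, 180]: {lon}")
    return lat, lon


print(check_coordinates(40.7128, -74.0060))
```

A common source of out-of-range errors is swapped arguments: a longitude such as 139.65 passed as `lat` fails the first range check.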
458
+
459
+ ### `GeoIntelResult`
460
+
461
+ Result object returned by `resolve()`.
462
+
463
+ **Properties:**
464
+
465
+ ```python
466
+ result.country # Country name (str | None)
467
+ result.iso2 # ISO2 code (str | None)
468
+ result.iso3 # ISO3 code (str | None)
469
+ result.continent # Continent name (str | None)
470
+ result.timezone # Timezone (str | None)
471
+ result.confidence # Confidence score (float, 0.0-1.0)
472
+ ```
473
+
474
+ **Methods:**
475
+
476
+ ```python
477
+ result.to_dict() # Convert to dictionary
478
+ ```
479
+
480
+ ## 🎯 Use Cases
481
+
482
+ ### 1. Geocoding Service
483
+
484
+ ```python
485
+ from geo_intel_offline import resolve
486
+
487
+ def geocode_location(lat, lon):
488
+     """Geocode a location without external API."""
489
+     result = resolve(lat, lon)
490
+     return {
491
+         "country": result.country,
492
+         "country_code": result.iso2,
493
+         "continent": result.continent,
494
+         "timezone": result.timezone,
495
+     }
496
+
497
+ # Use in your application
498
+ location_info = geocode_location(40.7128, -74.0060)
499
+ ```
500
+
501
+ ### 2. User Location Analysis
502
+
503
+ ```python
504
+ from geo_intel_offline import resolve
505
+
506
+ def analyze_user_locations(locations):
507
+     """Analyze user locations for geographic distribution."""
508
+     countries = {}
509
+     for lat, lon in locations:
510
+         result = resolve(lat, lon)
511
+         if result.country:
512
+             countries[result.country] = countries.get(result.country, 0) + 1
513
+     return countries
514
+ ```
515
+
516
+ ### 3. Data Enrichment
517
+
518
+ ```python
519
+ from geo_intel_offline import resolve
520
+ import pandas as pd
521
+
522
+ # Enrich DataFrame with country information
523
+ df = pd.DataFrame({
524
+     'lat': [40.7128, 51.5074, 35.6762],
525
+     'lon': [-74.0060, -0.1278, 139.6503],
526
+ })
527
+
528
+ df['country'] = df.apply(
529
+     lambda row: resolve(row['lat'], row['lon']).country,
530
+     axis=1
531
+ )
532
+ df['iso2'] = df.apply(
533
+     lambda row: resolve(row['lat'], row['lon']).iso2,
534
+     axis=1
535
+ )
536
+
537
+ print(df)
538
+ ```
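The enrichment example above resolves every row twice, once per output column. A possible variation, sketched here with plain dictionaries and a stand-in resolver so that it runs without the package or pandas, looks each row up once and caches repeated coordinates. The names `make_cached_resolver`, `enrich_rows`, and `fake_resolve` are hypothetical; in real use the resolver argument would be `geo_intel_offline.resolve`.

```python
from functools import lru_cache
from types import SimpleNamespace


def make_cached_resolver(resolver, maxsize=100_000):
    """Wrap a resolve-like callable so repeated coordinates are looked up once."""
    @lru_cache(maxsize=maxsize)
    def cached(lat, lon):
        return resolver(lat, lon)
    return cached


def enrich_rows(rows, resolver):
    """Add 'country' and 'iso2' to each row dict using one lookup per row."""
    cached = make_cached_resolver(resolver)
    for row in rows:
        result = cached(row["lat"], row["lon"])
        row["country"] = result.country
        row["iso2"] = result.iso2
    return rows


# Stand-in resolver for demonstration only; it records each call so the
# effect of the cache is visible.
calls = []


def fake_resolve(lat, lon):
    calls.append((lat, lon))
    return SimpleNamespace(country="United States of America", iso2="US")


rows = [{"lat": 40.7128, "lon": -74.0060}, {"lat": 40.7128, "lon": -74.0060}]
enriched = enrich_rows(rows, fake_resolve)
print(enriched[0]["iso2"], len(calls))
```

The cache only helps when exact coordinate pairs repeat, which is common in logs from fixed devices but rare in raw GPS traces.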
539
+
540
+ ### 4. API Rate Limiting Alternative
541
+
542
+ ```python
543
+ from geo_intel_offline import resolve
544
+
545
+ # Instead of calling external API
546
+ # result = external_api.geocode(lat, lon) # Rate limited!
547
+
548
+ # Use offline resolution
549
+ result = resolve(lat, lon) # No rate limits, always available
550
+ ```
551
+
552
+ ## 🔧 Advanced Usage
553
+
554
+ ### Modular Data Loading
555
+
556
+ For applications that only need specific regions, you can use modular data loading to reduce memory footprint:
557
+
558
+ ```python
559
+ from geo_intel_offline import resolve
560
+
561
+ # Load only specific countries (requires modular data format)
562
+ result = resolve(40.7128, -74.0060, countries=["US", "CA", "MX"])
563
+
564
+ # Load by continent
565
+ result = resolve(51.5074, -0.1278, continents=["Europe"])
566
+
567
+ # Exclude specific countries
568
+ result = resolve(35.6762, 139.6503, exclude_countries=["RU", "CN"])
569
+ ```
570
+
571
+ **Note:** Modular data loading requires building data in modular format. See [Building Custom Data](#building-custom-data) below.
572
+
573
+ ## 🏗️ Building Custom Data (Advanced)
574
+
575
+ ### Prerequisites
576
+
577
+ 1. Download Natural Earth Admin 0 Countries GeoJSON:
578
+    ```bash
579
+    wget https://www.naturalearthdata.com/downloads/10m-cultural-vectors/10m-admin-0-countries/
580
+    # Or use the provided script:
581
+    bash scripts/download_natural_earth.sh
582
+    ```
583
+
584
+ ### Build Full Dataset
585
+
586
+ ```bash
587
+ # Build complete dataset with compression
588
+ python3 -m geo_intel_offline.data_builder \
589
+ data_sources/ne_10m_admin_0_countries.geojson \
590
+ geo_intel_offline/data
591
+
592
+ # Or use automated script
593
+ python3 scripts/prepare_full_data.py
594
+ ```
595
+
596
+ **Note:** The build process automatically compresses data files, reducing size by ~66%.
597
+
598
+ ### Build Modular Dataset
599
+
600
+ ```bash
601
+ # Build modular format (country-wise files)
602
+ python3 -m geo_intel_offline.data_builder_modular \
603
+ data_sources/ne_10m_admin_0_countries.geojson \
604
+ output_directory
605
+
606
+ # Build specific countries only
607
+ python3 -m geo_intel_offline.data_builder_modular \
608
+ --countries US,CA,MX \
609
+ data_sources/ne_10m_admin_0_countries.geojson \
610
+ output_directory
611
+
612
+ # Build by continent
613
+ python3 -m geo_intel_offline.data_builder_modular \
614
+ --continents "North America,Europe" \
615
+ data_sources/ne_10m_admin_0_countries.geojson \
616
+ output_directory
617
+ ```
618
+
619
+ ## ⚡ Performance
620
+
621
+ ### Benchmarks
622
+
623
+ - **Lookup Speed**: < 1ms per resolution
624
+ - **Memory Footprint**: < 15 MB (all data in memory)
625
+ - **Cold Start**: ~100ms (initial data load)
626
+ - **Accuracy**: 99.92% across 258 countries
627
+ - **Data Size**: ~4 MB compressed (66% reduction)
628
+
629
+ ### Performance Test
630
+
631
+ ```python
632
+ from geo_intel_offline import resolve
633
+ import time
634
+
635
+ test_points = [
636
+     (40.7128, -74.0060),  # NYC
637
+     (51.5074, -0.1278),   # London
638
+     (35.6762, 139.6503),  # Tokyo
639
+     # ... more points
640
+ ]
641
+
642
+ start = time.perf_counter()
643
+ for _ in range(100):
644
+     for lat, lon in test_points:
645
+         resolve(lat, lon)
646
+ end = time.perf_counter()
647
+
648
+ avg_time = ((end - start) / (100 * len(test_points))) * 1000
649
+ print(f"Average lookup time: {avg_time:.3f}ms")
650
+ ```
651
+
652
+ ## 🔍 Understanding Confidence Scores
653
+
654
+ Confidence scores range from 0.0 to 1.0:
655
+
656
+ - **0.9 - 1.0**: High confidence (well within country boundaries)
657
+ - **0.7 - 0.9**: Good confidence (inside country, may be near border)
658
+ - **0.5 - 0.7**: Moderate confidence (near border or ambiguous region)
659
+ - **0.0 - 0.5**: Low confidence (likely ocean or disputed territory)
660
+
661
+ ```python
662
+ from geo_intel_offline import resolve
663
+
664
+ result = resolve(40.7128, -74.0060) # New York City
665
+ print(f"Confidence: {result.confidence:.2f}") # ~0.98
666
+
667
+ result = resolve(49.0, 8.2) # Near France-Germany border
668
+ print(f"Confidence: {result.confidence:.2f}") # ~0.65-0.75
669
+
670
+ result = resolve(0.0, 0.0) # Ocean
671
+ print(f"Confidence: {result.confidence:.2f}") # 0.0
672
+ ```
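The four bands listed above can be turned into a small labeling helper. This is a sketch with a hypothetical name, `confidence_label`. The listed ranges share their endpoints (0.9, 0.7, 0.5), so the code assumes each boundary value belongs to the higher band; the README does not specify this.

```python
def confidence_label(score):
    """Map a confidence score in [0.0, 1.0] to one of the four bands."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"Confidence must be between 0.0 and 1.0: {score}")
    # Assumption: a score exactly on a boundary falls in the higher band.
    if score >= 0.9:
        return "high"
    if score >= 0.7:
        return "good"
    if score >= 0.5:
        return "moderate"
    return "low"


for score in (0.98, 0.80, 0.65, 0.0):
    print(score, confidence_label(score))
```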
673
+
674
+ ## ❓ Troubleshooting
675
+
676
+ ### Issue: "Data file not found"
677
+
678
+ **Solution:** Ensure data files are present in the package installation directory, or build custom data:
679
+
680
+ ```bash
681
+ # Check if data files exist
682
+ ls geo_intel_offline/data/*.json.gz
683
+
684
+ # If missing, rebuild data
685
+ python3 -m geo_intel_offline.data_builder \
686
+ path/to/geojson \
687
+ geo_intel_offline/data
688
+ ```
689
+
690
+ ### Issue: Low accuracy for specific locations
691
+
692
+ **Possible causes:**
693
+ - Location is in ocean (no country)
694
+ - Location is on border (ambiguous)
695
+ - Location is in disputed territory
696
+
697
+ **Solution:** Check confidence score and handle edge cases:
698
+
699
+ ```python
700
+ result = resolve(lat, lon)
701
+ if result.confidence < 0.5:
702
+     print("Low confidence - may be ocean or border region")
703
+ ```
704
+
705
+ ### Issue: Memory usage higher than expected
706
+
707
+ **Solution:** Use modular data loading to load only needed countries:
708
+
709
+ ```python
710
+ # Instead of loading all countries
711
+ result = resolve(lat, lon)
712
+
713
+ # Load only needed countries
714
+ result = resolve(lat, lon, countries=["US", "CA"])
715
+ ```
716
+
717
+ ## 📊 Test Results
718
+
719
+ Comprehensive testing across 258 countries:
720
+
721
+ - **Overall Accuracy**: 99.92%
722
+ - **Countries Tested**: 258
723
+ - **Total Test Points**: 2,513
724
+ - **Countries with 100% Accuracy**: 256 (99.2%)
725
+ - **Countries with 90%+ Accuracy**: 257 (99.6%)
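The percentages above can be cross-checked with a few lines of arithmetic. The per-country shares follow directly from the listed counts. The number of misclassified points is not stated in this README; the value below is an inference from 99.92% of 2,513 points.

```python
countries_tested = 258
total_points = 2513

# Share of countries at each accuracy level, from the counts listed above.
pct_perfect = 256 / countries_tested * 100
pct_90_plus = 257 / countries_tested * 100

# Inference (not stated in the README): the nearest whole number of
# correct points that yields 99.92% overall accuracy.
implied_correct = round(0.9992 * total_points)
implied_accuracy = implied_correct / total_points * 100

print(f"{pct_perfect:.1f}% {pct_90_plus:.1f}%")
print(f"{implied_correct} of {total_points} = {implied_accuracy:.2f}%")
```

Under that inference, roughly two of the 2,513 test points were resolved incorrectly.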
726
+
727
+ See [TEST_RESULTS.md](TEST_RESULTS.md) for detailed country-wise results.
728
+
729
+ ## 🏗️ Architecture
730
+
731
+ The library uses a hybrid three-stage resolution pipeline:
732
+
733
+ 1. **Geohash Indexing**: Fast spatial filtering to candidate countries
734
+ 2. **Point-in-Polygon**: Accurate geometric verification using ray casting
735
+ 3. **Confidence Scoring**: Distance-to-border calculation for certainty assessment
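The first two stages can be sketched with textbook versions of the techniques they name: standard geohash encoding and a ray-casting point-in-polygon test. This is a generic illustration, not the library's code. The function names, the geohash precision, and the ring representation (a list of `(lon, lat)` pairs, the coordinate order GeoJSON uses) are assumptions. Confidence scoring is left out because the README does not describe how distance to the border is turned into a score.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=5):
    """Standard geohash: interleave longitude and latitude bisection bits."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars, bits, bit_count, use_lon = [], 0, 0, True
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def point_in_polygon(lat, lon, ring):
    """Ray casting: toggle 'inside' at each edge crossed by a ray cast
    from the point toward increasing longitude."""
    inside = False
    n = len(ring)
    for i in range(n):
        lon1, lat1 = ring[i]
        lon2, lat2 = ring[(i + 1) % n]
        if (lat1 > lat) != (lat2 > lat):  # edge straddles the point's latitude
            cross_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < cross_lon:
                inside = not inside
    return inside


square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]  # (lon, lat)
print(geohash_encode(57.64911, 10.40744))
print(point_in_polygon(5.0, 5.0, square))
```

In a pipeline of this shape, the geohash prefix would narrow the search to a few candidate polygons, so the more expensive ray-casting test runs only on those.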
736
+
737
+ For detailed architecture documentation, see [ARCHITECTURE.md](ARCHITECTURE.md).
738
+
739
+ ## 📝 License
740
+
741
+ MIT License - see [LICENSE](LICENSE) file for details.
742
+
743
+ ## 🤝 Contributing
744
+
745
+ Contributions are welcome! Please feel free to submit a Pull Request.
746
+
747
+ 1. Fork the repository
748
+ 2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
749
+ 3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
750
+ 4. Push to the branch (`git push origin feature/AmazingFeature`)
751
+ 5. Open a Pull Request
752
+
753
+ See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.
754
+
755
+ ## 📚 Additional Documentation
756
+
757
+ - **[ARCHITECTURE.md](ARCHITECTURE.md)** - Internal design and architecture details
758
+ - **[TEST_RESULTS.md](TEST_RESULTS.md)** - Comprehensive test results and benchmarks
759
+ - **[QUICK_START.md](QUICK_START.md)** - Quick start guide for new users
760
+
761
+ ## 🔗 Links
762
+
763
+ - **PyPI**: https://pypi.org/project/geo-intel-offline/
764
+ - **GitHub**: https://github.com/yourusername/geo-intel-offline
765
+ - **Issues**: https://github.com/yourusername/geo-intel-offline/issues
766
+
767
+ ## 🙏 Acknowledgments
768
+
769
+ - Data source: [Natural Earth](https://www.naturalearthdata.com/)
770
+ - Geohash implementation: Based on standard geohash algorithm
771
+ - Point-in-Polygon: Ray casting algorithm
772
+
773
+ ---
774
+
775
+ ## 👨‍💻 Author
776
+
777
+ **Rakesh Ranjan Jena**
778
+
779
+ - 🌐 **Blog**: [https://www.rrjprince.com/](https://www.rrjprince.com/)
780
+ - 💼 **LinkedIn**: [https://www.linkedin.com/in/rrjprince/](https://www.linkedin.com/in/rrjprince/)
781
+
782
+ ---
783
+
784
+ **Made with ❤️ for the Python community**