open-civics 0.1.1
- package/LICENSE +21 -0
- package/README.md +172 -0
- package/data/sc/state.json +1907 -0
- package/package.json +25 -0
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 Tim Simpson

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

package/README.md
ADDED
@@ -0,0 +1,172 @@
# open-civics

Structured contact data for US elected officials (state legislators, county councils, and city councils), scraped weekly and published as npm packages.

South Carolina is fully covered: 170 state legislators and 96 local jurisdictions (every county and incorporated municipality).

## Packages

Install via npm:

```bash
npm install open-civics             # contact data
npm install open-civics-boundaries  # district boundary GeoJSON
```

| Package | What's in it |
|---------|-------------|
| `open-civics` | Names, titles, emails, phones, districts for state and local reps |
| `open-civics-boundaries` | GeoJSON district boundaries for client-side point-in-polygon matching |

## What the data looks like

Each local jurisdiction file has a `meta` block and a `members` array:

```json
{
  "meta": {
    "state": "SC",
    "level": "local",
    "jurisdiction": "county:greenville",
    "label": "Greenville County Council",
    "lastUpdated": "2026-03-14"
  },
  "members": [
    {
      "name": "Benton Blount",
      "title": "Chairman, District 19",
      "email": "BBlount@greenvillecounty.org",
      "phone": "(864) 483-2474"
    }
  ]
}
```
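
As a rough sketch of how a consumer might work with this shape, the snippet below inlines the sample object from the example above instead of importing the package. The helper names `describeJurisdiction` and `membersWithEmail` are hypothetical and not part of `open-civics`. It also assumes that a member without a published email has the `email` field missing or empty, which the README does not state.

```js
// Sample data copied from the README example; a real app would import the JSON file instead.
const sampleCouncil = {
  meta: {
    state: 'SC',
    level: 'local',
    jurisdiction: 'county:greenville',
    label: 'Greenville County Council',
    lastUpdated: '2026-03-14',
  },
  members: [
    {
      name: 'Benton Blount',
      title: 'Chairman, District 19',
      email: 'BBlount@greenvillecounty.org',
      phone: '(864) 483-2474',
    },
  ],
};

// Hypothetical helper: build a one-line summary from the meta block.
function describeJurisdiction(file) {
  const { label, state, lastUpdated } = file.meta;
  return `${label} (${state}), ${file.members.length} member(s), updated ${lastUpdated}`;
}

// Hypothetical helper: keep only members that have a non-empty email string.
// Assumption: a missing or empty field means no email was published.
function membersWithEmail(file) {
  return file.members.filter(
    (member) => typeof member.email === 'string' && member.email.trim() !== ''
  );
}

console.log(describeJurisdiction(sampleCouncil));
console.log(membersWithEmail(sampleCouncil).map((member) => member.name));
```

Filtering defensively like this keeps downstream code (mail links, contact forms) from breaking on incomplete records.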

State legislator files are keyed by district number with senate/house chambers:

```json
{
  "meta": { "state": "SC", "level": "state" },
  "senate": {
    "1": { "name": "...", "district": "1", "party": "R", "email": "...", "phone": "..." }
  },
  "house": {
    "1": { "name": "...", "district": "1", "party": "D", "email": "...", "phone": "..." }
  }
}
```
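
Because the district keys are strings, a small lookup wrapper can save callers from mixing up `1` and `"1"`. The following is a minimal sketch, not an API shipped by the package: `getLegislator` is a hypothetical helper, and the sample object uses made-up placeholder names and contact details in the documented shape.

```js
// Placeholder data in the documented shape; all names and contact details are invented.
const sampleState = {
  meta: { state: 'SC', level: 'state' },
  senate: {
    '1': { name: 'Example Senator', district: '1', party: 'R', email: 'senator@example.org', phone: '555-0100' },
  },
  house: {
    '1': { name: 'Example Representative', district: '1', party: 'D', email: 'rep@example.org', phone: '555-0101' },
  },
};

// Hypothetical helper: look up a legislator by chamber and district.
// Returns null when the chamber has no entry for that district.
function getLegislator(stateFile, chamber, district) {
  if (chamber !== 'senate' && chamber !== 'house') {
    throw new RangeError(`Unknown chamber: ${chamber}`);
  }
  const seats = stateFile[chamber] ?? {};
  const key = String(district); // JSON keys are strings, so normalize numeric input
  return Object.prototype.hasOwnProperty.call(seats, key) ? seats[key] : null;
}

console.log(getLegislator(sampleState, 'senate', 1));
console.log(getLegislator(sampleState, 'house', '99'));
```

Returning `null` for an unknown district, rather than `undefined`, makes the "no such seat" case explicit for callers.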

## Usage

```js
// State legislators
import scState from 'open-civics/sc/state.json';
const senator = scState.senate["1"];

// Local councils
import greenvilleCounty from 'open-civics/sc/local/county-greenville.json';
const members = greenvilleCounty.members;

// District boundaries (GeoJSON FeatureCollection)
import senateBoundaries from 'open-civics-boundaries/sc/boundaries/sldu.json';
import houseBoundaries from 'open-civics-boundaries/sc/boundaries/sldl.json';
import countyBoundaries from 'open-civics-boundaries/sc/boundaries/county-greenville.json';
```

Boundary files are standard GeoJSON FeatureCollections. Each feature has a `properties.district` field matching the district keys in the contact data. Use any point-in-polygon library (Turf.js, Mapbox, etc.) to find which district a user's address falls in.
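
The lookup described above can be sketched without any dependency using a plain ray-casting test. This is an illustration under stated assumptions, not the package's own code: the function names are hypothetical, the sample FeatureCollection is two made-up unit squares rather than real district shapes, and the point is assumed to be an already geocoded `[longitude, latitude]` pair (GeoJSON coordinate order). In a real app a library call such as Turf.js `booleanPointInPolygon` would normally replace the hand-written geometry.

```js
// Ray casting: is [x, y] inside a single linear ring (array of [x, y] pairs)?
function pointInRing(point, ring) {
  const [x, y] = point;
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    // Toggle when a horizontal ray from the point crosses the edge (j -> i).
    const crosses = (yi > y) !== (yj > y) && x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// GeoJSON polygon coordinates are [outerRing, ...holes].
function pointInPolygonCoords(point, rings) {
  if (rings.length === 0 || !pointInRing(point, rings[0])) return false;
  return !rings.slice(1).some((hole) => pointInRing(point, hole));
}

// Returns properties.district of the first feature containing the point, or null.
function findDistrict(featureCollection, point) {
  for (const feature of featureCollection.features) {
    const { type, coordinates } = feature.geometry;
    const polygons =
      type === 'Polygon' ? [coordinates] : type === 'MultiPolygon' ? coordinates : [];
    if (polygons.some((rings) => pointInPolygonCoords(point, rings))) {
      return feature.properties.district;
    }
  }
  return null;
}

// Made-up boundaries: two adjacent unit squares standing in for districts "1" and "2".
const sampleBoundaries = {
  type: 'FeatureCollection',
  features: [
    {
      type: 'Feature',
      properties: { district: '1' },
      geometry: { type: 'Polygon', coordinates: [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]] },
    },
    {
      type: 'Feature',
      properties: { district: '2' },
      geometry: { type: 'Polygon', coordinates: [[[1, 0], [2, 0], [2, 1], [1, 1], [1, 0]]] },
    },
  ],
};

console.log(findDistrict(sampleBoundaries, [1.5, 0.5]));
```

The returned district string can then be used directly as a key into the contact data, for example `scState.senate[district]`.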

## Data structure

```
data/
  sc/
    state.json                  # State legislators (senate + house + governor)
    local/
      county-greenville.json    # Greenville County Council
      place-greenville.json     # Greenville City Council
      ...                       # 96 jurisdiction files total
    boundaries/
      sldu.json                 # State senate district boundaries
      sldl.json                 # State house district boundaries
      county-greenville.json    # County council district boundaries
      place-greenville.json     # City council district boundaries (where available)
      ...
```
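
The layout above suggests a simple naming rule linking a `meta.jurisdiction` id such as `county:greenville` to its file name. The sketch below encodes that rule as a guess: `localFilePath` is a hypothetical helper, the `place:` id prefix is inferred from the `place-greenville.json` file name rather than documented, and the replace-colon-with-hyphen convention is an assumption that should be checked against the actual files.

```js
// Hypothetical helper: map a jurisdiction id to the package-relative path
// used in the import examples (e.g. "sc/local/county-greenville.json").
// Assumption: the file name is the id with ":" replaced by "-".
function localFilePath(state, jurisdiction) {
  const match = /^(county|place):([a-z0-9-]+)$/.exec(jurisdiction);
  if (!match) {
    throw new RangeError(`Unrecognized jurisdiction id: ${jurisdiction}`);
  }
  const [, kind, slug] = match;
  return `${state.toLowerCase()}/local/${kind}-${slug}.json`;
}

console.log(localFilePath('SC', 'county:greenville'));
```

Rejecting ids that do not match the expected pattern keeps a typo from silently producing a path to a file that does not exist.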

## How scraping works

Data is scraped from government websites using Python adapters, one per site pattern. Five shared adapters handle the most common CMS platforms:

| Adapter | Sites | How it works |
|---------|-------|-------------|
| Revize | ~30 cities | Parses bold name / mailto / phone patterns |
| CivicPlus | ~14 counties | Parses staff directory tables with JS-obfuscated emails |
| TableAdapter | ~10 jurisdictions | Auto-detects HTML tables with name/email/phone columns |
| GenericMailto | ~15 cities | Finds mailto links in WordPress/Drupal content areas |
| DrupalViews | ~3 counties | Parses Drupal views-row and person-item patterns |

The remaining jurisdictions use bespoke adapters, MASC (Municipal Association of SC), or SCAC (SC Association of Counties) as data sources.

Boundary data comes from the US Census TIGER/Line shapefiles (state districts) and ArcGIS REST services (local districts).
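
To give a feel for the mailto-based approach listed in the adapter table, here is a small JavaScript illustration of the general idea. It is not the project's code: the real adapters are written in Python and are not shown in this README, `extractMailtoContacts` is a hypothetical name, and the HTML sample is invented. A regular expression is used only to keep the sketch dependency-free; a production scraper would more likely use a proper HTML parser.

```js
// Illustrative only: extract { name, email } pairs from anchors whose href starts with "mailto:".
function extractMailtoContacts(html) {
  // Group 1: address up to any "?subject=..." suffix. Group 2: the link's inner HTML.
  const pattern = /<a\b[^>]*href=["']mailto:([^"'?]+)[^"']*["'][^>]*>([\s\S]*?)<\/a>/gi;
  const contacts = [];
  for (const match of html.matchAll(pattern)) {
    const email = decodeURIComponent(match[1]).trim();
    // Strip nested tags such as <strong> and collapse whitespace to get the display name.
    const name = match[2].replace(/<[^>]+>/g, '').replace(/\s+/g, ' ').trim();
    contacts.push({ name, email });
  }
  return contacts;
}

// Invented markup loosely resembling a CMS content area.
const sampleHtml = `
  <div class="entry-content">
    <p><a href="mailto:jane.doe@example.gov">Jane Doe</a>, District 1</p>
    <p><a href="mailto:john.roe@example.gov?subject=Hello"><strong>John Roe</strong></a>, District 2</p>
    <p><a href="https://example.gov/contact">Contact page</a></p>
  </div>`;

console.log(extractMailtoContacts(sampleHtml));
```

Non-mailto links are skipped, and any query string after the address (such as a prefilled subject) is dropped before the email is recorded.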

## Running the scrapers

Requires Python 3.12+.

```bash
pip install -r requirements.txt

# Scrape everything for a state
python -m scrapers --state SC

# Scrape only state legislators
python -m scrapers --state SC --state-only

# Scrape only local councils
python -m scrapers --state SC --local-only

# Scrape state + local, skip boundaries (faster)
python -m scrapers --state SC --skip-boundaries

# Preview what would run without scraping
python -m scrapers --dry-run

# Validate all data files
python validate.py
```

## Automation

Four GitHub Actions workflows keep data fresh:

- **Weekly scrape** (Mondays 10am ET): Runs state + local scrapers, opens a PR with changes
- **Monthly scrape** (1st of month 10am ET): Full scrape including boundary rebuilds
- **Validation**: Runs on every PR touching `data/` and auto-merges `data-update/*` branches if validation passes
- **Publish**: Weekly npm publish if data changed since last release

## Running tests

```bash
pip install -r requirements-dev.txt

# Unit tests (fast, no network)
pytest tests/unit/ -v

# All tests including integration
pytest -v

# Refresh integration test snapshots from live sites
python scripts/refresh_snapshots.py
```

## Adding a new state

1. Add a state block to `registry.json` under `states.XX`
2. Add email format rules to `scrapers/state_email_rules.py`
3. Run `python -m scrapers --state XX --state-only` to pull legislators
4. Add local jurisdiction adapters as needed (see `CLAUDE.md` for the adapter selection checklist)
5. Add boundary sources to `registry.json` and run `python -m scrapers --state XX --boundaries-only`

## License

See [LICENSE](LICENSE) for details.