@yottagraph-app/data-model-skill 0.0.36 → 0.0.37
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/package.json +1 -1
- package/skill/fips/DATA_DICTIONARY.md +135 -0
- package/skill/fips/schema.yaml +77 -0
package/package.json
CHANGED
|
@@ -0,0 +1,135 @@
|
|
|
1
|
+
# Data Dictionary: FIPS Codes (FCC mirror)
|
|
2
|
+
|
|
3
|
+
## Source Overview
|
|
4
|
+
|
|
5
|
+
Federal Information Processing Standard (FIPS) codes for U.S. states and
|
|
6
|
+
counties — short numeric identifiers issued by the federal government
|
|
7
|
+
(Census Bureau / NIST historically) that uniquely tag every U.S. state,
|
|
8
|
+
the District of Columbia, and every county-or-equivalent (boroughs,
|
|
9
|
+
parishes, independent cities, census areas).
|
|
10
|
+
|
|
11
|
+
- **Publisher (mirror):** U.S. Federal Communications Commission, Office
|
|
12
|
+
of Engineering and Technology
|
|
13
|
+
- **URL:** https://transition.fcc.gov/oet/info/maps/census/fips/fips.txt
|
|
14
|
+
- **Format:** Plain-text fixed-column ASCII, ~3,200 lines total
|
|
15
|
+
- **Cadence:** Effectively static. The federal codes change only when
|
|
16
|
+
jurisdictions are created, dissolved, or renamed (rare — once every
|
|
17
|
+
several years).
|
|
18
|
+
- **Source name:** `fips`
|
|
19
|
+
|
|
20
|
+
The file contains two sections:
|
|
21
|
+
|
|
22
|
+
1. A 51-row table of **state-level FIPS codes** (50 states + DC), one
|
|
23
|
+
line per state with a 2-digit code.
|
|
24
|
+
2. A 3,140+ row table of **county-level FIPS codes**, one line per county
|
|
25
|
+
(or county-equivalent) with a 5-digit code. The leading 2 digits are
|
|
26
|
+
the parent state's FIPS code; the trailing 3 digits identify the
|
|
27
|
+
county within the state. The county-level table is grouped by state
|
|
28
|
+
and prefaced with a header row of the form `XX000 StateName`.
|
|
29
|
+
|
|
30
|
+
**Limitations:**
|
|
31
|
+
- The FCC mirror is a republication of the older Census-published list;
|
|
32
|
+
it does not include FIPS *places* (cities/towns), MSAs, or U.S.
|
|
33
|
+
territories (Puerto Rico, Guam, U.S. Virgin Islands, etc.).
|
|
34
|
+
- A small number of county lines have parenthesized historical
|
|
35
|
+
annotations (`(created after 1990)`, `(1990 Census Area)`,
|
|
36
|
+
`(After 1990, part of Halifax County)`). These are stripped from the
|
|
37
|
+
emitted county name.
|
|
38
|
+
- The `XX000` state header rows in the county section are duplicates of
|
|
39
|
+
the state-level table and are skipped during atomization (one record
|
|
40
|
+
per state, not two).
|
|
41
|
+
|
|
42
|
+
---
|
|
43
|
+
|
|
44
|
+
## Entity Types
|
|
45
|
+
|
|
46
|
+
### `location`
|
|
47
|
+
|
|
48
|
+
Used for both U.S. states (and DC) and U.S. counties (and county-
|
|
49
|
+
equivalents like Alaska boroughs, Louisiana parishes, and Virginia
|
|
50
|
+
independent cities). The level of geography is distinguished by which
|
|
51
|
+
strong-ID property is set (`fips_state` for state-level, `fips_county`
|
|
52
|
+
for county-level) and by the `administrative_level` property.
|
|
53
|
+
|
|
54
|
+
- **Primary key (state-level):** the 2-digit FIPS state code, exposed as
|
|
55
|
+
the `fips_state` strong-ID property and as a property atom.
|
|
56
|
+
- **Primary key (county-level):** the 5-digit FIPS county code, exposed
|
|
57
|
+
as the `fips_county` strong-ID property and as a property atom.
|
|
58
|
+
- **Entity resolver:** named entity, **MERGEABLE**. State and county FIPS
|
|
59
|
+
codes are stable, official identifiers and merging across sources
|
|
60
|
+
(Census, FRED, sanctions data, etc.) is desired. Disambiguation
|
|
61
|
+
snippets include the formatted name (e.g. `"Autauga County, Alabama"`).
|
|
62
|
+
- **Name format:**
|
|
63
|
+
- State-level: `Title-cased state name` (e.g. `"Alabama"`,
|
|
64
|
+
`"District of Columbia"`).
|
|
65
|
+
- County-level: `"{County} County, {State}"` for the common case;
|
|
66
|
+
parishes / boroughs / cities / census areas keep their original
|
|
67
|
+
suffix (e.g. `"East Baton Rouge Parish, Louisiana"`,
|
|
68
|
+
`"Aleutians East Borough, Alaska"`,
|
|
69
|
+
`"Baltimore city, Maryland"`).
|
|
70
|
+
|
|
71
|
+
---
|
|
72
|
+
|
|
73
|
+
## Properties
|
|
74
|
+
|
|
75
|
+
The dataset uses the DataSchema `namespace: fips`. Atom property keys
|
|
76
|
+
are `fips::<local_name>` for source-specific properties. Identity
|
|
77
|
+
properties also used for resolver strong IDs are `fips_state` and
|
|
78
|
+
`fips_county`.
|
|
79
|
+
|
|
80
|
+
### Common Properties (states and counties)
|
|
81
|
+
|
|
82
|
+
* `fips::administrative_level`
|
|
83
|
+
* Definition: Granularity of the geographic entity within the U.S.
|
|
84
|
+
federal hierarchy.
|
|
85
|
+
* Examples: `"state"`, `"county"`
|
|
86
|
+
* Derivation: Set to `"state"` for entries from the state-level table,
|
|
87
|
+
`"county"` for entries from the county-level table.
|
|
88
|
+
|
|
89
|
+
* `fips::official_name`
|
|
90
|
+
* Definition: Verbatim place name as it appears in the FCC mirror,
|
|
91
|
+
upper-cased for states and mixed-case for counties.
|
|
92
|
+
* Examples: `"ALABAMA"`, `"Autauga County"`,
|
|
93
|
+
`"Aleutians East Borough"`, `"East Baton Rouge Parish"`,
|
|
94
|
+
`"Baltimore city"`
|
|
95
|
+
* Derivation: The "place name" column of the source file, with the
|
|
96
|
+
parenthesized historical annotation (when present) stripped.
|
|
97
|
+
|
|
98
|
+
### State Properties
|
|
99
|
+
|
|
100
|
+
* `fips_state`
|
|
101
|
+
* Definition: Two-digit Federal Information Processing Standard code
|
|
102
|
+
that uniquely identifies a U.S. state or the District of Columbia.
|
|
103
|
+
* Examples: `"01"` (Alabama), `"06"` (California), `"11"` (DC)
|
|
104
|
+
* Derivation: Verbatim from the state-level table's first column,
|
|
105
|
+
zero-padded to two digits.
|
|
106
|
+
* Note: Also used as the strong ID on the state's `location` entity.
|
|
107
|
+
|
|
108
|
+
### County Properties
|
|
109
|
+
|
|
110
|
+
* `fips_county`
|
|
111
|
+
* Definition: Five-digit Federal Information Processing Standard code
|
|
112
|
+
that uniquely identifies a U.S. county or county-equivalent. The
|
|
113
|
+
leading two digits are the parent state's `fips_state` code; the
|
|
114
|
+
trailing three digits identify the county within the state.
|
|
115
|
+
* Examples: `"01001"` (Autauga County, Alabama), `"06037"` (Los
|
|
116
|
+
Angeles County, California), `"22033"` (East Baton Rouge Parish,
|
|
117
|
+
Louisiana)
|
|
118
|
+
* Derivation: Verbatim from the county-level table's first column,
|
|
119
|
+
zero-padded to five digits.
|
|
120
|
+
* Note: Also used as the strong ID on the county's `location` entity.
|
|
121
|
+
|
|
122
|
+
---
|
|
123
|
+
|
|
124
|
+
## Entity Relationships Summary
|
|
125
|
+
|
|
126
|
+
```
|
|
127
|
+
location (county) ──[located_in]──→ location (state)
|
|
128
|
+
```
|
|
129
|
+
|
|
130
|
+
- `located_in`: Each county-level `location` is linked to its parent
|
|
131
|
+
state-level `location` via the leading two digits of the county FIPS
|
|
132
|
+
code. Both sides carry strong IDs (`fips_county` and `fips_state`),
|
|
133
|
+
which guarantees resolver merging into a single state node across all
|
|
134
|
+
county-→-state edges and across other datasets that emit the same
|
|
135
|
+
state codes.
|
|
@@ -0,0 +1,77 @@
|
|
|
1
|
+
# Dataset schema for the FIPS Codes (Federal Information Processing
|
|
2
|
+
# Standard) source — the FCC mirror of the Census Bureau's two-digit
|
|
3
|
+
# state and five-digit county FIPS code list.
|
|
4
|
+
#
|
|
5
|
+
# Source: https://transition.fcc.gov/oet/info/maps/census/fips/fips.txt
|
|
6
|
+
# Cadence: effectively static (codes change once every several years).
|
|
7
|
+
#
|
|
8
|
+
# This schema describes U.S. states and counties as `location` entities
|
|
9
|
+
# identified by their FIPS strong IDs, along with a containment
|
|
10
|
+
# relationship from each county to its parent state.
|
|
11
|
+
name: "fips"
|
|
12
|
+
description: "U.S. state and county Federal Information Processing Standard (FIPS) codes from the FCC mirror of the Census Bureau list, modelled as `location` entities with FIPS strong IDs"
|
|
13
|
+
|
|
14
|
+
extraction:
|
|
15
|
+
flavors: closed
|
|
16
|
+
properties: closed
|
|
17
|
+
relationships: closed
|
|
18
|
+
attributes: closed
|
|
19
|
+
events: closed
|
|
20
|
+
|
|
21
|
+
flavors:
|
|
22
|
+
- name: "location"
|
|
23
|
+
description: "A specific named geographic location such as a city, country, region, or landmark"
|
|
24
|
+
display_name: "Location"
|
|
25
|
+
mergeability: not_mergeable
|
|
26
|
+
strong_id_properties: ["fips_state", "fips_county"]
|
|
27
|
+
examples: ["New York City", "San Francisco", "North America", "Bakery Square"]
|
|
28
|
+
passive: true
|
|
29
|
+
|
|
30
|
+
properties:
|
|
31
|
+
- name: "fips_state"
|
|
32
|
+
type: string
|
|
33
|
+
description: "Two-digit Federal Information Processing Standard code that uniquely identifies a U.S. state or the District of Columbia"
|
|
34
|
+
display_name: "State FIPS Code"
|
|
35
|
+
mergeability: not_mergeable
|
|
36
|
+
domain_flavors: ["location"]
|
|
37
|
+
examples: ["01", "06", "11", "36"]
|
|
38
|
+
passive: true
|
|
39
|
+
|
|
40
|
+
- name: "fips_county"
|
|
41
|
+
type: string
|
|
42
|
+
description: "Five-digit Federal Information Processing Standard code that uniquely identifies a U.S. county or county-equivalent; the leading two digits are the parent state's FIPS code"
|
|
43
|
+
display_name: "County FIPS Code"
|
|
44
|
+
mergeability: not_mergeable
|
|
45
|
+
domain_flavors: ["location"]
|
|
46
|
+
examples: ["01001", "06037", "22033", "51790"]
|
|
47
|
+
passive: true
|
|
48
|
+
|
|
49
|
+
- name: "administrative_level"
|
|
50
|
+
namespace: "fips"
|
|
51
|
+
type: string
|
|
52
|
+
description: "Granularity of the geographic entity within the U.S. federal hierarchy of states and counties"
|
|
53
|
+
display_name: "Administrative Level"
|
|
54
|
+
mergeability: not_mergeable
|
|
55
|
+
domain_flavors: ["location"]
|
|
56
|
+
examples: ["state", "county"]
|
|
57
|
+
passive: true
|
|
58
|
+
|
|
59
|
+
- name: "official_name"
|
|
60
|
+
namespace: "fips"
|
|
61
|
+
type: string
|
|
62
|
+
description: "Verbatim place name as published in the FCC mirror of the FIPS code list, with parenthesized historical annotations stripped"
|
|
63
|
+
display_name: "Official FIPS Name"
|
|
64
|
+
mergeability: not_mergeable
|
|
65
|
+
domain_flavors: ["location"]
|
|
66
|
+
examples: ["ALABAMA", "Autauga County", "Aleutians East Borough", "East Baton Rouge Parish"]
|
|
67
|
+
passive: true
|
|
68
|
+
|
|
69
|
+
relationships:
|
|
70
|
+
- name: "located_in"
|
|
71
|
+
description: "Administrative territory or location the entity is situated in (Wikidata P131, P276)"
|
|
72
|
+
display_name: "Located In"
|
|
73
|
+
mergeability: not_mergeable
|
|
74
|
+
domain_flavors: ["location"]
|
|
75
|
+
target_flavors: ["location"]
|
|
76
|
+
examples: ["Autauga County is located in Alabama", "Los Angeles County is located in California"]
|
|
77
|
+
passive: true
|