@yottagraph-app/data-model-skill 0.0.18 → 0.0.20

package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@yottagraph-app/data-model-skill",
3
- "version": "0.0.18",
3
+ "version": "0.0.20",
4
4
  "description": "Data model skill documentation for AI agents - entity types, properties, and schemas from Lovelace fetch sources",
5
5
  "repository": {
6
6
  "type": "git",
@@ -0,0 +1,157 @@
1
+ # Data Dictionary: DOT Census
2
+
3
+ ## Source Overview
4
+
5
+ The FMCSA Company Census File is a daily snapshot of all motor carriers, brokers, shippers, and hazmat carriers registered with the Federal Motor Carrier Safety Administration (FMCSA), part of the U.S. Department of Transportation.
6
+
7
+ - **Publisher:** FMCSA / U.S. Department of Transportation
8
+ - **URL:** https://data.transportation.gov/Trucking-and-Motorcoaches/Company-Census-File/az4n-8mr2
9
+ - **Format:** CSV (500K+ rows)
10
+ - **Cadence:** Updated daily from a 24-hour-old FMCSA database snapshot
11
+ - **Source name:** `dotcensus`
12
+
13
+ Each row represents one registered entity identified by a unique USDOT number. The file contains identity, physical location, fleet composition, carrier operation type, and safety ratings.
14
+
15
+ **Limitations:** The census reflects registration data, not real-time operational status. Fleet sizes and driver counts are self-reported by carriers during MCS-150 filings and may be stale. Safety ratings are assigned only after compliance reviews and are absent for most carriers.
16
+
17
+ ---
18
+
19
+ ## Entity Types
20
+
21
+ ### `organization`
22
+
23
+ A motor carrier, broker, shipper, or other entity registered with FMCSA.
24
+
25
+ - Primary key: `usdot_number` (USDOT Number, unique per registrant)
26
+ - Entity resolver: named entity, MERGEABLE. Strong ID = `usdot_number`. Disambiguation snippet includes the physical address when available.
27
+ - Name: uses `DBA_NAME` when it differs from `LEGAL_NAME`; otherwise uses `LEGAL_NAME`.
28
+
29
+ ### `legal_entity`
30
+
31
+ The legal owner of a carrier when `LEGAL_NAME` differs from `DBA_NAME`. Represents the person or organization that legally owns the business operating under the DBA name.
32
+
33
+ - Primary key: none (resolved by name)
34
+ - Entity resolver: named entity, MERGEABLE. Disambiguation snippet includes the entity name.
35
+ - Only produced when `DBA_NAME` is present and differs from `LEGAL_NAME`.
36
+
37
+ ### `location`
38
+
39
+ The physical location (city + state) of a registered carrier.
40
+
41
+ - Primary key: none (resolved by name)
42
+ - Entity resolver: named entity, MERGEABLE. Disambiguation snippet includes the formatted location name.
43
+ - Name format: `"{city}, {state}"` or `"{city}, {state}, {country}"` when country is present.
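The naming rules for the three entity types above can be sketched as follows. This is a minimal illustration, not the package's implementation: the function names are hypothetical, and it assumes names are compared after trimming whitespace and that comparison is case-sensitive, neither of which the dictionary states.

```python
from typing import Optional


def organization_name(legal_name: str, dba_name: Optional[str]) -> str:
    """Use DBA_NAME when it is present and differs from LEGAL_NAME."""
    legal = legal_name.strip()
    dba = (dba_name or "").strip()
    # Assumption: comparison is exact (case-sensitive) after trimming.
    if dba and dba != legal:
        return dba
    return legal


def legal_entity_name(legal_name: str, dba_name: Optional[str]) -> Optional[str]:
    """Return the legal owner's name only when a distinct DBA name exists."""
    legal = legal_name.strip()
    dba = (dba_name or "").strip()
    if legal and dba and dba != legal:
        return legal
    return None  # no legal_entity is produced for this row


def location_name(city: str, state: str, country: Optional[str] = None) -> str:
    """Format "{city}, {state}", appending the country when present."""
    parts = [city.strip(), state.strip()]
    if country and country.strip():
        parts.append(country.strip())
    return ", ".join(parts)
```

For a row whose legal name is `LUIS GALVAN` and whose DBA name is `G & G TRANSPORTATION CO`, the organization takes the DBA name and a separate `legal_entity` carries the legal name.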
44
+
45
+ ---
46
+
47
+ ## Properties
48
+
49
+ ### Organization Properties
50
+
51
+ #### Identity and Registration
52
+
53
+ * `usdot_number`
54
+ * Definition: USDOT number assigned by FMCSA, unique per registered entity.
55
+ * Examples: `"12345"`, `"99999"`
56
+ * Derivation: `DOT_NUMBER` column. Emitted as both a strong ID for entity resolution and as a property atom on every record.
57
+
58
+ * `address`
59
+ * Definition: Physical street address of the carrier.
60
+ * Examples: `"21154 HWY EAST, SILOAM SPRINGS, AR 72761"`, `"100 Main St, Dallas, TX 75201"`
61
+ * Derivation: Concatenation of `PHY_STREET`, `PHY_CITY`, `PHY_STATE`, and `PHY_ZIP` columns.
62
+
63
+ * `dot_phone_number`
64
+ * Definition: Primary phone number on file with FMCSA.
65
+ * Examples: `"5551234567"`
66
+ * Derivation: `PHONE` column.
67
+
68
+ * `dot_business_org_type`
69
+ * Definition: Business organization type of the registrant.
70
+ * Examples: `"Individual"`, `"Partnership"`, `"Corporation"`
71
+ * Derivation: `BUSINESS_ORG_ID` column decoded from numeric codes (1=Individual, 2=Partnership, 3=Corporation).
72
+
73
+ * `dot_entity_type`
74
+ * Definition: FMCSA entity type indicating the registrant's role in freight transportation.
75
+ * Examples: `"Carrier"`, `"Carrier; Broker"`, `"Carrier; Shipper"`
76
+ * Derivation: `CARSHIP` column. Semicolon-delimited code list decoded from single-letter codes (C=Carrier, B=Broker, S=Shipper, T=Cargo Tank, R=Registrant).
77
+
78
+ * `dot_operating_authority`
79
+ * Definition: Operating authority classification describing the carrier's authorization type.
80
+ * Examples: `"AUTHORIZED FOR HIRE"`, `"PRIVATE PROPERTY"`, `"EXEMPT FOR HIRE"`
81
+ * Derivation: `CLASSDEF` column, passed through as-is.
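The derivations in this group can be sketched as below. The code tables come from the entries above; everything else is an assumption made for illustration: the function names, passing unknown `CARSHIP` codes through unchanged, returning `None` for unknown `BUSINESS_ORG_ID` values, and skipping blank address parts.

```python
from typing import Optional

ENTITY_TYPE_CODES = {
    "C": "Carrier",
    "B": "Broker",
    "S": "Shipper",
    "T": "Cargo Tank",
    "R": "Registrant",
}
BUSINESS_ORG_CODES = {"1": "Individual", "2": "Partnership", "3": "Corporation"}


def decode_entity_type(carship: Optional[str]) -> Optional[str]:
    """Decode a semicolon-delimited CARSHIP code list, e.g. "C;B"."""
    labels = []
    for code in (carship or "").split(";"):
        code = code.strip().upper()
        if not code:
            continue
        # Assumption: codes missing from the table are kept as-is.
        labels.append(ENTITY_TYPE_CODES.get(code, code))
    return "; ".join(labels) if labels else None


def decode_business_org_type(business_org_id: Optional[str]) -> Optional[str]:
    """Decode BUSINESS_ORG_ID; unknown or blank values yield None (assumed)."""
    return BUSINESS_ORG_CODES.get((business_org_id or "").strip())


def format_address(street: str, city: str, state: str, zip_code: str) -> str:
    """Join PHY_STREET, PHY_CITY, PHY_STATE and PHY_ZIP as "street, city, ST zip"."""
    state_zip = " ".join(p for p in (state.strip(), zip_code.strip()) if p)
    return ", ".join(p for p in (street.strip(), city.strip(), state_zip) if p)
```

The address layout is inferred from the listed examples such as `"21154 HWY EAST, SILOAM SPRINGS, AR 72761"`.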
82
+
83
+ #### Carrier Status and Operations
84
+
85
+ * `dot_carrier_status`
86
+ * Definition: FMCSA carrier registration status.
87
+ * Examples: `"A (Active)"`, `"I (Inactive)"`, `"N (Not Authorized)"`
88
+ * Derivation: `STATUS_CODE` column decoded from single-letter codes.
89
+
90
+ * `dot_carrier_operation_type`
91
+ * Definition: Type of carrier operation based on registration.
92
+ * Examples: `"A (Interstate)"`, `"B (Intrastate Hazmat)"`, `"C (Intrastate Non-Hazmat)"`
93
+ * Derivation: `CARRIER_OPERATION` column decoded from single-letter codes.
94
+ * Note: The B/C distinction reflects the carrier's registration type, not actual hazmat activity. A carrier registered as C (non-hazmat) can still transport hazmat (see `dot_hazmat_indicator`).
95
+
96
+ * `dot_hazmat_indicator`
97
+ * Definition: Whether the carrier transports hazardous materials.
98
+ * Examples: `1.0` (yes), `0.0` (no)
99
+ * Derivation: `HM_Ind` column. `"Y"` → 1.0, `"N"` → 0.0. Omitted when blank.
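A minimal sketch of the status, operation-type, and hazmat derivations described above. The label tables contain only the codes that appear in the examples, so the real tables may be larger. The helper names, the case-insensitive matching, and dropping unknown codes are assumptions.

```python
from typing import Dict, Optional

# Only the codes shown in the examples above; other codes may exist upstream.
STATUS_LABELS = {"A": "Active", "I": "Inactive", "N": "Not Authorized"}
OPERATION_LABELS = {
    "A": "Interstate",
    "B": "Intrastate Hazmat",
    "C": "Intrastate Non-Hazmat",
}


def decode_labeled(code: Optional[str], labels: Dict[str, str]) -> Optional[str]:
    """Render a single-letter code as "X (Label)", or None if unrecognized."""
    key = (code or "").strip().upper()
    label = labels.get(key)
    return f"{key} ({label})" if label else None


def hazmat_indicator(hm_ind: Optional[str]) -> Optional[float]:
    """Map HM_Ind "Y" to 1.0 and "N" to 0.0; blank values are omitted (None)."""
    value = (hm_ind or "").strip().upper()
    if value == "Y":
        return 1.0
    if value == "N":
        return 0.0
    return None
```

Because the registration type and the hazmat flag are derived independently, a row can decode to `"C (Intrastate Non-Hazmat)"` and still carry `dot_hazmat_indicator = 1.0`.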
100
+
101
+ #### Fleet Composition
102
+
103
+ * `dot_total_drivers`
104
+ * Definition: Total number of drivers reported by the carrier.
105
+ * Examples: `10.0`, `250.0`
106
+ * Derivation: `TOTAL_DRIVERS` column, parsed as float. Omitted when zero or blank.
107
+
108
+ * `dot_total_power_units`
109
+ * Definition: Total number of power units (trucks, tractors) operated by the carrier.
110
+ * Examples: `5.0`, `1200.0`
111
+ * Derivation: `POWER_UNITS` column, parsed as float. Omitted when zero or blank.
112
+
113
+ * `dot_total_bus_units`
114
+ * Definition: Total number of bus units operated by the carrier.
115
+ * Examples: `2.0`, `50.0`
116
+ * Derivation: `BUS_UNITS` column, parsed as float. Omitted when zero or blank.
117
+
118
+ * `dot_fleet_size_category`
119
+ * Definition: Fleet size category assigned by FMCSA based on total power units.
120
+ * Examples: `"A (1-6 power units)"`, `"D (20-100 power units)"`, `"F (1000+ power units)"`
121
+ * Derivation: `FLEETSIZE` column decoded from single-letter codes. Omitted when `"0"` (none reported).
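The "parsed as float, omitted when zero or blank" rule for the three count columns might look like the sketch below. The function names are hypothetical, and treating unparseable or non-finite values as omitted is an assumption the dictionary does not confirm.

```python
import math
from typing import Dict, Optional

# Property name -> source column, as listed in the entries above.
COUNT_COLUMNS = {
    "dot_total_drivers": "TOTAL_DRIVERS",
    "dot_total_power_units": "POWER_UNITS",
    "dot_total_bus_units": "BUS_UNITS",
}


def parse_count(raw: Optional[str]) -> Optional[float]:
    """Parse a count column as float; return None when blank or zero."""
    text = (raw or "").strip()
    if not text:
        return None
    try:
        value = float(text)
    except ValueError:
        return None  # assumption: unparseable values are omitted
    if not math.isfinite(value) or value == 0:
        return None
    return value


def fleet_properties(row: Dict[str, str]) -> Dict[str, float]:
    """Collect only the fleet counts that survive the omission rule."""
    props = {}
    for prop, column in COUNT_COLUMNS.items():
        value = parse_count(row.get(column))
        if value is not None:
            props[prop] = value
    return props
```

Under this rule a missing property cannot distinguish "reported as zero" from "not reported", which matters when interpreting absent fleet counts.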
122
+
123
+ #### Safety
124
+
125
+ * `dot_safety_rating`
126
+ * Definition: FMCSA safety rating assigned during a compliance review.
127
+ * Examples: `"Satisfactory"`, `"Conditional"`, `"Unsatisfactory"`
128
+ * Derivation: `SAFETY_RATING` column decoded from single-letter codes (S, C, U).
129
+ * Note: Most carriers have no safety rating because one is assigned only after a compliance review.
130
+
131
+ * `dot_safety_rating_date`
132
+ * Definition: Date the safety rating was assigned.
133
+ * Examples: `"2024-01-01"`
134
+ * Derivation: `SAFETY_RATING_DATE` column parsed from `YYYYMMDD` format, output as `YYYY-MM-DD`.
135
+
136
+ #### Filing Dates
137
+
138
+ * `dot_last_mcs150_filing_date`
139
+ * Definition: Date the carrier last filed or updated their MCS-150 Motor Carrier Identification Report.
140
+ * Examples: `"2024-01-01"`
141
+ * Derivation: `MCS150_DATE` column parsed from `YYYYMMDD HHMM` format, output as `YYYY-MM-DD`.
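Both date properties normalize a compact source format to `YYYY-MM-DD`. A minimal sketch using the standard library follows; the function names are hypothetical, and returning `None` for blank or malformed input is an assumption.

```python
from datetime import datetime
from typing import Optional


def _to_iso_date(raw: Optional[str], fmt: str, length: int) -> Optional[str]:
    """Parse raw with fmt and return YYYY-MM-DD, or None if it does not fit."""
    text = (raw or "").strip()
    # Length guard: strptime alone accepts some shortened digit runs.
    if len(text) != length:
        return None
    try:
        return datetime.strptime(text, fmt).date().isoformat()
    except ValueError:
        return None


def parse_safety_rating_date(raw: Optional[str]) -> Optional[str]:
    """SAFETY_RATING_DATE: YYYYMMDD -> YYYY-MM-DD."""
    return _to_iso_date(raw, "%Y%m%d", 8)


def parse_mcs150_date(raw: Optional[str]) -> Optional[str]:
    """MCS150_DATE: "YYYYMMDD HHMM" -> YYYY-MM-DD (time of day is discarded)."""
    return _to_iso_date(raw, "%Y%m%d %H%M", 13)
```

For example, `parse_mcs150_date("20240101 0930")` yields `"2024-01-01"`.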
142
+
143
+ ### Timestamp Derivation
144
+
145
+ Record timestamps are derived in priority order: `ADD_DATE` (carrier registration date), then `MCS150_DATE` (last MCS-150 filing). Rows where neither date parses successfully are dropped.
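The priority order above can be sketched as a fallback chain. The source does not state the `ADD_DATE` format, so `YYYYMMDD` is assumed here. The function names and the row-as-dict representation are also hypothetical.

```python
from datetime import datetime
from typing import Dict, Iterable, Iterator, Optional, Tuple

# Columns in priority order. The ADD_DATE format is an assumption.
TIMESTAMP_SOURCES = (
    ("ADD_DATE", "%Y%m%d"),
    ("MCS150_DATE", "%Y%m%d %H%M"),
)


def record_timestamp(row: Dict[str, str]) -> Optional[datetime]:
    """Return the first timestamp that parses, trying ADD_DATE before MCS150_DATE."""
    for column, fmt in TIMESTAMP_SOURCES:
        raw = (row.get(column) or "").strip()
        if not raw:
            continue
        try:
            return datetime.strptime(raw, fmt)
        except ValueError:
            continue  # fall through to the next candidate column
    return None


def rows_with_timestamps(
    rows: Iterable[Dict[str, str]],
) -> Iterator[Tuple[Dict[str, str], datetime]]:
    """Yield (row, timestamp) pairs, dropping rows where neither date parses."""
    for row in rows:
        timestamp = record_timestamp(row)
        if timestamp is not None:
            yield row, timestamp
```

A row with an unparseable `ADD_DATE` but a valid `MCS150_DATE` is kept with the filing date as its timestamp. Only rows where both fail are dropped.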
146
+
147
+ ---
148
+
149
+ ## Entity Relationships Summary
150
+
151
+ ```
152
+ legal_entity ──[doing_business_as]──→ organization
153
+ organization ──[is_located_at]──────→ location
154
+ ```
155
+
156
+ - `doing_business_as`: Links the legal owner (`LEGAL_NAME`) to the business operating name (`DBA_NAME`). Only produced when the two names differ.
157
+ - `is_located_at`: Links the carrier to its physical location. Only produced when both `PHY_CITY` and `PHY_STATE` are present.
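The two emission conditions can be sketched as a function from one census row to a list of edges. The `(source, relationship, target)` triple shape and the function name are hypothetical. The sketch also assumes the location name omits the country and that names are compared after trimming whitespace.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, str]  # (source name, relationship, target name)


def build_relationships(row: Dict[str, str]) -> List[Edge]:
    """Emit doing_business_as and is_located_at edges for one census row."""
    legal = (row.get("LEGAL_NAME") or "").strip()
    dba = (row.get("DBA_NAME") or "").strip()
    city = (row.get("PHY_CITY") or "").strip()
    state = (row.get("PHY_STATE") or "").strip()

    has_distinct_dba = bool(legal and dba and dba != legal)
    org_name = dba if has_distinct_dba else (legal or dba)

    edges: List[Edge] = []
    if has_distinct_dba:
        # Legal owner -> operating name, only when the two names differ.
        edges.append((legal, "doing_business_as", dba))
    if org_name and city and state:
        # Requires both PHY_CITY and PHY_STATE to be present.
        edges.append((org_name, "is_located_at", f"{city}, {state}"))
    return edges
```

A row with identical legal and DBA names and no city produces no edges at all, so the organization appears in the graph with properties only.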
@@ -0,0 +1,196 @@
1
+ # Dataset schema for FMCSA Company Census File (DOT Motor Carrier Census).
2
+ #
3
+ # Source: https://data.transportation.gov/Trucking-and-Motorcoaches/Company-Census-File/az4n-8mr2
4
+ # Updated daily by FMCSA from a 24-hour-old database snapshot.
5
+ #
6
+ # This schema describes motor carrier entities registered with the Federal
7
+ # Motor Carrier Safety Administration, including identity, physical location,
8
+ # fleet composition, and safety ratings.
9
+ name: "dotcensus"
10
+ description: "FMCSA motor carrier company census data including registration, fleet size, carrier operations, and safety ratings from the DOT Company Census File"
11
+
12
+ extraction:
13
+ flavors: closed
14
+ properties: closed
15
+ relationships: closed
16
+ attributes: closed
17
+ events: closed
18
+
19
+ flavors:
20
+ - name: "organization"
21
+ description: "A particular business, institution, or organization such as a corporation, university, government agency, or non-profit"
22
+ display_name: "Organization"
23
+ mergeability: not_mergeable
24
+ strong_id_properties: ["usdot_number"]
25
+ passive: true
26
+
27
+ - name: "person"
28
+ description: "A named individual such as a business owner, executive, or public figure"
29
+ display_name: "Person"
30
+ mergeability: not_mergeable
31
+ passive: true
32
+
33
+ - name: "location"
34
+ description: "A specific named geographic location such as a city, country, region, or landmark"
35
+ display_name: "Location"
36
+ mergeability: not_mergeable
37
+ examples: ["New York City", "San Francisco", "North America", "Bakery Square"]
38
+ passive: true
39
+
40
+ properties:
41
+ - name: "usdot_number"
42
+ type: string
43
+ description: "USDOT number assigned by FMCSA, uniquely identifying a registered motor carrier, broker, or shipper"
44
+ display_name: "USDOT Number"
45
+ mergeability: not_mergeable
46
+ domain_flavors: ["organization"]
47
+ examples: ["12345", "99999"]
48
+ passive: true
49
+
50
+ - name: "address"
51
+ type: string
52
+ description: "Physical street address of the entity"
53
+ display_name: "Address"
54
+ mergeability: not_mergeable
55
+ domain_flavors: ["organization"]
56
+ examples: ["21154 HWY EAST, SILOAM SPRINGS, AR 72761"]
57
+ passive: true
58
+
59
+ - name: "dot_carrier_status"
60
+ type: string
61
+ description: "FMCSA carrier registration status"
62
+ display_name: "Carrier Status"
63
+ mergeability: not_mergeable
64
+ domain_flavors: ["organization"]
65
+ examples: ["A (Active)", "I (Inactive)", "N (Not Authorized)"]
66
+ passive: true
67
+
68
+ - name: "dot_phone_number"
69
+ type: string
70
+ description: "Primary phone number of the carrier"
71
+ display_name: "Phone Number"
72
+ mergeability: not_mergeable
73
+ domain_flavors: ["organization"]
74
+ examples: ["5551234567"]
75
+ passive: true
76
+
77
+ - name: "dot_carrier_operation_type"
78
+ type: string
79
+ description: "Type of carrier operation"
80
+ display_name: "Carrier Operation Type"
81
+ mergeability: not_mergeable
82
+ domain_flavors: ["organization"]
83
+ examples: ["A (Interstate)", "B (Intrastate Hazmat)", "C (Intrastate Non-Hazmat)"]
84
+ passive: true
85
+
86
+ - name: "dot_safety_rating"
87
+ type: string
88
+ description: "FMCSA safety rating assigned during compliance review"
89
+ display_name: "Safety Rating"
90
+ mergeability: not_mergeable
91
+ domain_flavors: ["organization"]
92
+ examples: ["Satisfactory", "Conditional", "Unsatisfactory"]
93
+ passive: true
94
+
95
+ - name: "dot_safety_rating_date"
96
+ type: string
97
+ description: "Date the safety rating was assigned, derived from the SAFETY_RATING_DATE field"
98
+ display_name: "Safety Rating Date"
99
+ mergeability: not_mergeable
100
+ domain_flavors: ["organization"]
101
+ passive: true
102
+
103
+ - name: "dot_last_mcs150_filing_date"
104
+ type: string
105
+ description: "Date the carrier last filed or updated their MCS-150 Motor Carrier Identification Report with FMCSA"
106
+ display_name: "Last MCS-150 Filing Date"
107
+ mergeability: not_mergeable
108
+ domain_flavors: ["organization"]
109
+ passive: true
110
+
111
+ - name: "dot_hazmat_indicator"
112
+ type: float
113
+ description: "Whether the carrier transports hazardous materials (1.0 = yes, 0.0 = no), derived from the HM_Ind field"
114
+ display_name: "Hazmat Indicator"
115
+ mergeability: not_mergeable
116
+ domain_flavors: ["organization"]
117
+ passive: true
118
+
119
+ - name: "dot_total_drivers"
120
+ type: float
121
+ description: "Total number of drivers reported by the carrier"
122
+ display_name: "Total Drivers"
123
+ mergeability: not_mergeable
124
+ domain_flavors: ["organization"]
125
+ passive: true
126
+
127
+ - name: "dot_total_power_units"
128
+ type: float
129
+ description: "Total number of power units (trucks, tractors) operated by the carrier"
130
+ display_name: "Total Power Units"
131
+ mergeability: not_mergeable
132
+ domain_flavors: ["organization"]
133
+ passive: true
134
+
135
+ - name: "dot_total_bus_units"
136
+ type: float
137
+ description: "Total number of bus units operated by the carrier"
138
+ display_name: "Total Bus Units"
139
+ mergeability: not_mergeable
140
+ domain_flavors: ["organization"]
141
+ passive: true
142
+
143
+ - name: "dot_fleet_size_category"
144
+ type: string
145
+ description: "Fleet size category assigned by FMCSA based on the number of power units"
146
+ display_name: "Fleet Size Category"
147
+ mergeability: not_mergeable
148
+ domain_flavors: ["organization"]
149
+ examples: ["A (1-6 power units)", "D (20-100 power units)", "F (1000+ power units)"]
150
+ passive: true
151
+
152
+ - name: "dot_business_org_type"
153
+ type: string
154
+ description: "Business organization type of the carrier"
155
+ display_name: "Business Organization Type"
156
+ mergeability: not_mergeable
157
+ domain_flavors: ["organization"]
158
+ examples: ["Individual", "Partnership", "Corporation"]
159
+ passive: true
160
+
161
+ - name: "dot_entity_type"
162
+ type: string
163
+ description: "FMCSA entity type indicating whether the entity operates as a carrier, broker, shipper, or combination"
164
+ display_name: "Entity Type"
165
+ mergeability: not_mergeable
166
+ domain_flavors: ["organization"]
167
+ examples: ["Carrier", "Carrier; Broker", "Carrier; Shipper"]
168
+ passive: true
169
+
170
+ - name: "dot_operating_authority"
171
+ type: string
172
+ description: "Operating authority classification describing the carrier's authorization type"
173
+ display_name: "Operating Authority"
174
+ mergeability: not_mergeable
175
+ domain_flavors: ["organization"]
176
+ examples: ["AUTHORIZED FOR HIRE", "PRIVATE PROPERTY", "EXEMPT FOR HIRE"]
177
+ passive: true
178
+
179
+ relationships:
180
+ - name: "doing_business_as"
181
+ description: "A legal entity is doing business as (DBA) an organization"
182
+ display_name: "Doing Business As"
183
+ mergeability: not_mergeable
184
+ domain_flavors: ["person", "organization"]
185
+ target_flavors: ["organization"]
186
+ examples: ["LUIS GALVAN doing business as G & G TRANSPORTATION CO"]
187
+ passive: true
188
+
189
+ - name: "is_located_at"
190
+ description: "An entity is located at, operates in, resides in, is headquartered in, was born in, visits, or died in a location"
191
+ display_name: "Located At"
192
+ mergeability: not_mergeable
193
+ domain_flavors: ["organization"]
194
+ target_flavors: ["location"]
195
+ examples: ["DAYSPRING CARDS INC is located at SILOAM SPRINGS, AR"]
196
+ passive: true
@@ -2763,6 +2763,33 @@ relationships:
2763
2763
  target_flavors: ["organization"]
2764
2764
  passive: true
2765
2765
 
2766
+ - name: "has_administrator"
2767
+ description: "Link from a fund series to its fund administrator (N-CEN). Sub-admins are excluded (mirrors the custodied_by sub-custodian filter); the is_admin_affiliated attribute distinguishes in-family from third-party admin relationships."
2768
+ display_name: "Has Administrator"
2769
+ mergeability: not_mergeable
2770
+ domain_flavors: ["financial_instrument"]
2771
+ target_flavors: ["organization"]
2772
+ passive: true
2773
+
2774
+ # has_broker (N-CEN Item E.1, agency-execution brokers) is intentionally
2775
+ # NOT modelled. A day of evals showed it produced ~6.5k edges — 5.8x the
2776
+ # volume of advised_by — and the source XML reports broker names in ALL
2777
+ # CAPS while every other section uses mixed case, so the same firm gets
2778
+ # fragmented into two organization entities. High volume + bad merge
2779
+ # quality + low query value (top-10 brokers are the same big banks for
2780
+ # every active equity fund) → the data isn't worth surfacing in the graph.
2781
+ # has_broker_dealer (Item E.2, principal trades) is kept because
2782
+ # principal-trade counterparties are economically distinct, the lists are
2783
+ # short, the names dedup cleanly, and the captive-distributor pattern is
2784
+ # genuinely informative.
2785
+ - name: "has_broker_dealer"
2786
+ description: "Link from a fund series to a broker-dealer that the fund traded with on a principal basis during the reporting period (N-CEN Item E.2). Broker-dealers act as principals in the trade — they take the other side of it — so this captures market-making and captive-distributor relationships rather than mere agency execution."
2787
+ display_name: "Has Broker-Dealer"
2788
+ mergeability: not_mergeable
2789
+ domain_flavors: ["financial_instrument"]
2790
+ target_flavors: ["organization"]
2791
+ passive: true
2792
+
2766
2793
  - name: "has_master_fund"
2767
2794
  description: "Link from a feeder fund to its master fund in a master-feeder structure (N-CEN Item C.3). The feeder invests substantially all assets in the master, so resolving this edge is required to see the feeder's true portfolio (which lives on the master's N-PORT)."
2768
2795
  display_name: "Has Master Fund"
@@ -3064,6 +3091,24 @@ attributes:
3064
3091
  display_name: "Sub-Adviser Affiliated"
3065
3092
  mergeability: not_mergeable
3066
3093
 
3094
+ # ── has_administrator relationship attributes (N-CEN) ──
3095
+
3096
+ - property: "has_administrator"
3097
+ name: "is_admin_affiliated"
3098
+ type: string
3099
+ description: "Whether the fund administrator is affiliated with the registrant (Y or N) per N-CEN"
3100
+ display_name: "Admin Affiliated"
3101
+ mergeability: not_mergeable
3102
+
3103
+ # ── has_broker_dealer relationship attributes (N-CEN Item E.2) ──
3104
+
3105
+ - property: "has_broker_dealer"
3106
+ name: "broker_dealer_commission"
3107
+ type: string
3108
+ description: "Commission paid to this broker-dealer for principal trades during the reporting period (USD) per N-CEN"
3109
+ display_name: "Broker-Dealer Commission"
3110
+ mergeability: not_mergeable
3111
+
3067
3112
  # ── is_director relationship attributes (N-CEN) ──
3068
3113
 
3069
3114
  - property: "is_director"