@yottagraph-app/data-model-skill 0.0.36 → 0.0.38

package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@yottagraph-app/data-model-skill",
-   "version": "0.0.36",
+   "version": "0.0.38",
    "description": "Data model skill documentation for AI agents - entity types, properties, and schemas from Lovelace fetch sources",
    "repository": {
      "type": "git",
@@ -0,0 +1,135 @@
+ # Data Dictionary: FIPS Codes (FCC mirror)
+
+ ## Source Overview
+
+ Federal Information Processing Standard (FIPS) codes for U.S. states and
+ counties — short numeric identifiers issued by the federal government
+ (Census Bureau / NIST historically) that uniquely tag every U.S. state,
+ the District of Columbia, and every county-or-equivalent (boroughs,
+ parishes, independent cities, census areas).
+
+ - **Publisher (mirror):** U.S. Federal Communications Commission, Office
+   of Engineering and Technology
+ - **URL:** https://transition.fcc.gov/oet/info/maps/census/fips/fips.txt
+ - **Format:** Plain-text fixed-column ASCII, ~3,200 lines total
+ - **Cadence:** Effectively static. The federal codes change only when
+   jurisdictions are created, dissolved, or renamed (rare — once every
+   several years).
+ - **Source name:** `fips`
+
+ The file contains two sections:
+
+ 1. A 51-row table of **state-level FIPS codes** (50 states + DC), one
+    line per state with a 2-digit code.
+ 2. A 3,140+ row table of **county-level FIPS codes**, one line per county
+    (or county-equivalent) with a 5-digit code. The leading 2 digits are
+    the parent state's FIPS code; the trailing 3 digits identify the
+    county within the state. The county-level table is grouped by state
+    and prefaced with a header row of the form `XX000 StateName`.
+
+ **Limitations:**
+ - The FCC mirror is a republication of the older Census-published list;
+   it does not include FIPS *places* (cities/towns), MSAs, or U.S.
+   territories (Puerto Rico, Guam, U.S. Virgin Islands, etc.).
+ - A small number of county lines have parenthesized historical
+   annotations (`(created after 1990)`, `(1990 Census Area)`,
+   `(After 1990, part of Halifax County)`). These are stripped from the
+   emitted county name.
+ - The `XX000` state header rows in the county section are duplicates of
+   the state-level table and are skipped during atomization (one record
+   per state, not two).
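
The row-handling rules described in the Source Overview can be sketched in a few lines of Python. This is an illustrative sketch, not the package's parser: it assumes each data row is a 2- or 5-digit code followed by whitespace and a name, and `parse_row` and both regular expressions are hypothetical names introduced here.

```python
import re

# Assumed row shape: optional indent, a 2- or 5-digit code, whitespace, a name.
ROW = re.compile(r"^\s*(\d{2}|\d{5})\s+(\S.*?)\s*$")
# Trailing parenthesized historical note, e.g. "(created after 1990)".
ANNOTATION = re.compile(r"\s*\([^)]*\)\s*$")


def parse_row(line):
    """Return (level, code, name), or None for rows that are not emitted."""
    match = ROW.match(line)
    if match is None:
        return None  # prose, column headers, blank lines
    code, name = match.groups()
    if len(code) == 2:
        return ("state", code, name)
    if code.endswith("000"):
        return None  # XX000 header row duplicates the state-level table
    return ("county", code, ANNOTATION.sub("", name))
```

A caller would run `parse_row` over every line of the file and keep the non-`None` results; rows that fail the assumed shape are silently ignored, which a production parser would probably want to log instead.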
+
+ ---
+
+ ## Entity Types
+
+ ### `location`
+
+ Used for both U.S. states (and DC) and U.S. counties (and county-
+ equivalents like Alaska boroughs, Louisiana parishes, and Virginia
+ independent cities). The level of geography is distinguished by which
+ strong-ID property is set (`fips_state` for state-level, `fips_county`
+ for county-level) and by the `administrative_level` property.
+
+ - **Primary key (state-level):** the 2-digit FIPS state code, exposed as
+   the `fips_state` strong-ID property and as a property atom.
+ - **Primary key (county-level):** the 5-digit FIPS county code, exposed
+   as the `fips_county` strong-ID property and as a property atom.
+ - **Entity resolver:** named entity, **MERGEABLE**. State and county FIPS
+   codes are stable, official identifiers and merging across sources
+   (Census, FRED, sanctions data, etc.) is desired. Disambiguation
+   snippets include the formatted name (e.g. `"Autauga County, Alabama"`).
+ - **Name format:**
+   - State-level: `Title-cased state name` (e.g. `"Alabama"`,
+     `"District of Columbia"`).
+   - County-level: `"{County} County, {State}"` for the common case;
+     parishes / boroughs / cities / census areas keep their original
+     suffix (e.g. `"East Baton Rouge Parish, Louisiana"`,
+     `"Aleutians East Borough, Alaska"`,
+     `"Baltimore city, Maryland"`).
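
One way to produce the display names listed under "Name format" is sketched below. The helper names and the `SMALL_WORDS` set are assumptions made for illustration; the package may title-case differently.

```python
# Words kept lower-case inside a state name. This set is an assumption chosen
# so that "DISTRICT OF COLUMBIA" becomes "District of Columbia".
SMALL_WORDS = {"of"}


def state_display_name(official_name):
    """Title-case an upper-cased state name from the state-level table."""
    words = official_name.lower().split()
    return " ".join(
        word if (index > 0 and word in SMALL_WORDS) else word.capitalize()
        for index, word in enumerate(words)
    )


def county_display_name(official_name, state_official_name):
    """Join the county's own name (suffix included) with its state's name."""
    return f"{official_name}, {state_display_name(state_official_name)}"
```

Because the county's official name already carries its suffix ("County", "Parish", "Borough", "city"), the county helper only appends the state and never rewrites the suffix.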
+
+ ---
+
+ ## Properties
+
+ The dataset uses the DataSchema `namespace: fips`. Atom property keys
+ are `fips::<local_name>` for source-specific properties. Identity
+ properties also used for resolver strong IDs are `fips_state` and
+ `fips_county`.
+
+ ### Common Properties (states and counties)
+
+ * `fips::administrative_level`
+   * Definition: Granularity of the geographic entity within the U.S.
+     federal hierarchy.
+   * Examples: `"state"`, `"county"`
+   * Derivation: Set to `"state"` for entries from the state-level table,
+     `"county"` for entries from the county-level table.
+
+ * `fips::official_name`
+   * Definition: Verbatim place name as it appears in the FCC mirror,
+     upper-cased for states and mixed-case for counties.
+   * Examples: `"ALABAMA"`, `"Autauga County"`,
+     `"Aleutians East Borough"`, `"East Baton Rouge Parish"`,
+     `"Baltimore city"`
+   * Derivation: The "place name" column of the source file, with the
+     parenthesized historical annotation (when present) stripped.
+
+ ### State Properties
+
+ * `fips_state`
+   * Definition: Two-digit Federal Information Processing Standard code
+     that uniquely identifies a U.S. state or the District of Columbia.
+   * Examples: `"01"` (Alabama), `"06"` (California), `"11"` (DC)
+   * Derivation: Verbatim from the state-level table's first column,
+     zero-padded to two digits.
+   * Note: Also used as the strong ID on the state's `location` entity.
+
+ ### County Properties
+
+ * `fips_county`
+   * Definition: Five-digit Federal Information Processing Standard code
+     that uniquely identifies a U.S. county or county-equivalent. The
+     leading two digits are the parent state's `fips_state` code; the
+     trailing three digits identify the county within the state.
+   * Examples: `"01001"` (Autauga County, Alabama), `"06037"` (Los
+     Angeles County, California), `"22033"` (East Baton Rouge Parish,
+     Louisiana)
+   * Derivation: Verbatim from the county-level table's first column,
+     zero-padded to five digits.
+   * Note: Also used as the strong ID on the county's `location` entity.
+
+ ---
+
+ ## Entity Relationships Summary
+
+ ```
+ location (county) ──[located_in]──→ location (state)
+ ```
+
+ - `located_in`: Each county-level `location` is linked to its parent
+   state-level `location` via the leading two digits of the county FIPS
+   code. Both sides carry strong IDs (`fips_county` and `fips_state`),
+   which guarantees resolver merging into a single state node across all
+   county-→-state edges and across other datasets that emit the same
+   state codes.
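
The `located_in` link follows mechanically from the code layout: the parent state's strong ID is the first two digits of the county code. A minimal sketch, with hypothetical function names and an assumed tuple shape for the edge:

```python
def parent_state_code(fips_county):
    """Return the fips_state strong ID of a county's parent state."""
    code = str(fips_county).zfill(5)  # zero-pad to five digits
    if len(code) != 5 or not code.isdigit():
        raise ValueError(f"not a 5-digit county FIPS code: {fips_county!r}")
    return code[:2]


def located_in_edge(fips_county):
    """Build a (source, relationship, target) triple keyed by strong IDs."""
    code = str(fips_county).zfill(5)
    return (
        ("fips_county", code),
        "located_in",
        ("fips_state", parent_state_code(code)),
    )
```

Since every county of a state yields the same `("fips_state", ...)` target, all of those edges point at a single state node once the resolver keys on the strong ID.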
@@ -0,0 +1,77 @@
+ # Dataset schema for the FIPS Codes (Federal Information Processing
+ # Standard) source — the FCC mirror of the Census Bureau's two-digit
+ # state and five-digit county FIPS code list.
+ #
+ # Source: https://transition.fcc.gov/oet/info/maps/census/fips/fips.txt
+ # Cadence: effectively static (codes change once every several years).
+ #
+ # This schema describes U.S. states and counties as `location` entities
+ # identified by their FIPS strong IDs, along with a containment
+ # relationship from each county to its parent state.
+ name: "fips"
+ description: "U.S. state and county Federal Information Processing Standard (FIPS) codes from the FCC mirror of the Census Bureau list, modelled as `location` entities with FIPS strong IDs"
+
+ extraction:
+   flavors: closed
+   properties: closed
+   relationships: closed
+   attributes: closed
+   events: closed
+
+ flavors:
+   - name: "location"
+     description: "A specific named geographic location such as a city, country, region, or landmark"
+     display_name: "Location"
+     mergeability: not_mergeable
+     strong_id_properties: ["fips_state", "fips_county"]
+     examples: ["New York City", "San Francisco", "North America", "Bakery Square"]
+     passive: true
+
+ properties:
+   - name: "fips_state"
+     type: string
+     description: "Two-digit Federal Information Processing Standard code that uniquely identifies a U.S. state or the District of Columbia"
+     display_name: "State FIPS Code"
+     mergeability: not_mergeable
+     domain_flavors: ["location"]
+     examples: ["01", "06", "11", "36"]
+     passive: true
+
+   - name: "fips_county"
+     type: string
+     description: "Five-digit Federal Information Processing Standard code that uniquely identifies a U.S. county or county-equivalent; the leading two digits are the parent state's FIPS code"
+     display_name: "County FIPS Code"
+     mergeability: not_mergeable
+     domain_flavors: ["location"]
+     examples: ["01001", "06037", "22033", "51790"]
+     passive: true
+
+   - name: "administrative_level"
+     namespace: "fips"
+     type: string
+     description: "Granularity of the geographic entity within the U.S. federal hierarchy of states and counties"
+     display_name: "Administrative Level"
+     mergeability: not_mergeable
+     domain_flavors: ["location"]
+     examples: ["state", "county"]
+     passive: true
+
+   - name: "official_name"
+     namespace: "fips"
+     type: string
+     description: "Verbatim place name as published in the FCC mirror of the FIPS code list, with parenthesized historical annotations stripped"
+     display_name: "Official FIPS Name"
+     mergeability: not_mergeable
+     domain_flavors: ["location"]
+     examples: ["ALABAMA", "Autauga County", "Aleutians East Borough", "East Baton Rouge Parish"]
+     passive: true
+
+ relationships:
+   - name: "located_in"
+     description: "Administrative territory or location the entity is situated in (Wikidata P131, P276)"
+     display_name: "Located In"
+     mergeability: not_mergeable
+     domain_flavors: ["location"]
+     target_flavors: ["location"]
+     examples: ["Autauga County is located in Alabama", "Los Angeles County is located in California"]
+     passive: true
@@ -0,0 +1,481 @@
+ # IAPD Data Dictionary
+
+ ## 1. Purpose / Source Overview
+
+ The IAPD dataset publishes the SEC's Investment Adviser Public Disclosure
+ data — the canonical federal+state registry of investment advisory firms
+ in the United States. It is sourced from two daily-refreshed XML
+ compilation feeds:
+
+ - `IA_FIRM_SEC_Feed_*.xml.gz` — SEC-registered Investment Advisers (RIAs)
+   and Exempt Reporting Advisers (ERAs). ~23K firms.
+ - `IA_FIRM_STATE_Feed_*.xml.gz` — State-registered Investment Advisers
+   (IAs). ~22K firms.
+
+ Each `<Firm>` element corresponds to a single advisory firm's most recent
+ **Form ADV Part 1A** filing on file with the SEC's IARD system. Part 1A
+ captures firm identity, contact info, regulatory status, ownership form,
+ employees, clients, assets under management (AUM), services offered, and
+ disciplinary Y/N flags. The bulk feed does **not** include Form ADV
+ Part 1B (state-specific addenda), Form ADV Part 2 (firm brochure PDFs),
+ or Form ADV Part 3 (CRS Form). Those would require per-firm PDF fetches
+ and are deferred to a future iteration.
+
+ Cadence: nightly. Volume: ~45K firm records per snapshot. Each daily
+ snapshot is a full re-publish — there is no incremental "diffs only"
+ feed.
+
+ Two `Record.Source` values are emitted, so downstream consumers can
+ distinguish the firms by their regulatory domicile:
+
+ | Record.Source | Coverage |
+ |---|---|
+ | `iapd_sec` | SEC-registered RIAs + ERAs (`IA_FIRM_SEC_Feed_*.xml.gz`) |
+ | `iapd_state` | State-registered IAs (`IA_FIRM_STATE_Feed_*.xml.gz`) |
+
+ The two streams share entity types, properties, and the strong-id
+ scheme, so a firm that transitions between SEC-registered and
+ state-registered status will resolve to the same `organization` entity
+ over time.
+
+ ## 2. Entity Types
+
+ ### `organization`
+
+ A registered investment adviser firm — i.e. an entity registered with
+ the SEC and/or one or more state securities regulators to provide
+ investment advice. The bulk feed represents only the firm; the
+ individual investment-adviser representatives are not in this feed.
+
+ - **Strong-ID properties:** `crd_number`, `sec_file_number`,
+   `company_cik`, `lei`.
+   - `crd_number` is the primary strong-ID — FINRA-assigned, present on
+     every firm in both feeds, stable across registration-status
+     changes. Property name is intentionally unprefixed (not
+     `organization_crd_number`) so it matches the same-named strong-ID
+     on `edgar`'s `organization` flavor — an IAPD firm and an EDGAR
+     registrant with the same CRD resolve to the same entity.
+   - `sec_file_number` (`801-…` / `802-…`) is a secondary strong-ID
+     emitted only for firms in the SEC feed where `Info@SECNb` is
+     non-empty.
+   - `company_cik` and `lei` are declared strong-ID slots but **not
+     populated** from the bulk Part 1A XML — neither field is in that
+     feed. The slots exist so downstream cross-walks (Form ADV
+     Schedule R, third-party CRD↔LEI / CRD↔CIK mappings) can populate
+     them without a breaking schema change. Adding a strong-ID property
+     after the schema is in production is a breaking change; adding
+     *values* into an existing slot is not.
+
+ ### `location`
+
+ A named geographic place where the firm has a main office or mailing
+ address. Resolved by name (`City, ST` for US; `City, Country` for
+ non-US).
+
+ ## 3. Properties
+
+ ### Identity and Registration (Organization)
+
+ * `crd_number` *(primary strong-ID)*
+   * Definition: FINRA-assigned firm CRD (Central Registration
+     Depository) number. The canonical unique identifier for both
+     SEC-registered and state-registered investment advisers; stable
+     across registration status changes.
+   * Examples: `283882`, `312360`, `324069`
+   * Derivation: `Info@FirmCrdNb` attribute on each `<Firm>` element of
+     the daily IAPD XML feed; identical attribute name on SEC and state
+     feeds. Firms with empty `Info@FirmCrdNb` are skipped by the
+     streamer (they would have no usable strong-ID).
+   * Cross-source: matches `edgar.organization.crd_number` —
+     purposefully unprefixed.
+
+ * `sec_file_number` *(secondary strong-ID, SEC feed only)*
+   * Definition: SEC-issued file number for the firm. Prefix encodes the
+     registration kind: `801-XXXXXX` for SEC-registered RIAs, `802-XXXXXX`
+     for Exempt Reporting Advisers (ERAs).
+   * Examples: `801-135399`, `802-120553`
+   * Derivation: `Info@SECNb` attribute on `<Firm>` in the SEC feed.
+     Not present on state-feed firms; emitted as a strong-ID only when
+     non-empty.
+   * Cross-source: matches `edgar.organization.sec_file_number` for
+     firms that also file via EDGAR.
+
+ * `company_cik` *(strong-ID slot — not populated by IAPD)*
+   * Definition: SEC Central Index Key — EDGAR's per-filer numeric ID.
+   * Examples: `1234567`
+   * Derivation: not present in the bulk Part 1A XML. Slot exists so
+     Form ADV Schedule R or third-party CRD↔CIK cross-walks can populate
+     it without a breaking schema change. Property name matches
+     `edgar.organization.company_cik` so cross-source ER works the
+     moment values appear.
+
+ * `lei` *(strong-ID slot — not populated by IAPD)*
+   * Definition: Legal Entity Identifier — 20-character ISO 17442 code.
+   * Examples: `549300LQQAVPLATTSU38`
+   * Derivation: not present in the bulk Part 1A XML. Slot exists so
+     per-firm IAPD detail (Schedule R) or third-party CRD↔LEI
+     cross-walks can populate it without a breaking schema change.
+     Property name matches `gleif.organization.lei` and
+     `edgar.organization.lei`.
+
+ * `primary_business_name`
+   * Definition: Primary name under which the firm conducts advisory
+     business. May differ from legal name when the firm operates under a
+     trade name / DBA.
+   * Examples: `RABENOLD ADVISORS, INC.`, `MK CAPITAL`
+   * Derivation: `Info@BusNm` attribute on `<Firm>`.
+
+ * `legal_name`
+   * Definition: Firm's full legal name as registered with the SEC and/or
+     state regulators.
+   * Examples: `RABENOLD ADVISORS, INC.`, `MK CAPITAL COMPANY`
+   * Derivation: `Info@LegalNm` attribute on `<Firm>`. Also passed as an
+     alias on the entity for entity resolution.
+
+ * `is_umbrella_registration`
+   * Definition: String-encoded boolean (`"true"`/`"false"`) indicating
+     whether this filing represents an umbrella registration covering
+     multiple filing-adviser/relying-adviser entities under a single
+     Form ADV.
+   * Examples: `true`, `false`
+   * Derivation: `Info@UmbrRgstn` attribute on `<Firm>` (`Y`/`N`),
+     normalized to `true`/`false`.
+
+ * `sec_region_code`
+   * Definition: SEC supervisory regional office code that has
+     jurisdiction over this firm (SEC-registered firms only).
+   * Examples: `NYRO`, `CHRO`, `LARO`
+   * Derivation: `Info@SECRgnCD` attribute. Omitted on state-feed firms
+     (where the SEC has no supervisory role).
+
+ * `firm_registration_type`
+   * Definition: Firm's registration disposition with the SEC at the time
+     of the most recent filing. One of `Registered` (full SEC-registered
+     RIA), `ERA` (Exempt Reporting Adviser — files but is not fully
+     registered), or other values the SEC may emit.
+   * Examples: `Registered`, `ERA`
+   * Derivation: `Rgstn@FirmType` attribute on `<Firm>` (SEC feed). State
+     feed instead carries a `<StateRgstn>` block; the equivalent on the
+     state feed is derivable from the presence of `<StateRgstn>` plus
+     individual regulator codes.
+
+ * `registration_status`
+   * Definition: Status of the firm's registration with its primary
+     regulator. For SEC-registered firms this is `Rgstn@St` (e.g.
+     `APPROVED`); for state-registered firms this is derived from the
+     first `<Rgltr>` element under `<StateRgstn>`.
+   * Examples: `APPROVED`, `ACTIVE`, `PENDING`, `TERMINATED`
+   * Derivation: `Rgstn@St` (SEC feed) or `StateRgstn/Rgltrs/Rgltr@St`
+     (state feed, first element).
+
+ * `registration_date`
+   * Definition: Date the firm was approved/registered by its primary
+     regulator, formatted YYYY-MM-DD.
+   * Examples: `2026-02-24`, `2021-02-16`
+   * Derivation: `Rgstn@Dt` (SEC) or first `Rgltr@Dt` under
+     `StateRgstn/Rgltrs` (state).
+
+ * `notice_filed_state_count`
+   * Definition: Number of US states where the firm has made notice
+     filings (SEC-registered firms). Notice filings are required of
+     SEC-registered firms in states where they have a place of business
+     or sufficient clients. Emitted as a float for numeric queryability.
+   * Examples: `0`, `1`, `50`
+   * Derivation: count of `<States>` elements under `<NoticeFiled>`
+     (SEC feed only).
+
+ * `state_registration_count`
+   * Definition: Number of US state and territorial securities regulators
+     with which the firm is registered (state-registered firms). Emitted
+     as a float.
+   * Examples: `1`, `5`, `30`
+   * Derivation: count of `<Rgltr>` elements under `<StateRgstn>/<Rgltrs>`
+     (state feed only).
+
+ * `latest_filing_date`
+   * Definition: Date of the most recent Form ADV filing represented in
+     this snapshot, YYYY-MM-DD.
+   * Examples: `2026-03-04`, `2025-08-01`
+   * Derivation: `Filing@Dt` attribute on `<Firm>`. Also used as the
+     record-level `Timestamp` because it bounds the freshness of the
+     Form ADV data.
+
+ * `form_adv_version`
+   * Definition: Version label of the Form ADV form template used for
+     this filing.
+   * Examples: `10/2021`
+   * Derivation: `Filing@FormVrsn` attribute on `<Firm>`.
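
A few of the derivations above can be illustrated with Python's standard `xml.etree.ElementTree`. This sketch reads the `X@Y` notation as "attribute `Y` on child element `X` of `<Firm>`", which is an assumption about the feed layout; the function name and the sample XML in the usage note are made up for illustration.

```python
import xml.etree.ElementTree as ET


def identity_and_registration(firm):
    """Derive a few identity/registration properties from one <Firm> element."""
    info = firm.find("Info")
    crd = (info.get("FirmCrdNb") or "").strip() if info is not None else ""
    if not crd:
        return None  # no usable strong ID, so the firm is skipped

    props = {"crd_number": crd}
    sec_nb = (info.get("SECNb") or "").strip()
    if sec_nb:
        props["sec_file_number"] = sec_nb  # secondary strong ID, SEC feed only

    sec_rgstn = firm.find("Rgstn")
    regulators = firm.findall("StateRgstn/Rgltrs/Rgltr")
    if sec_rgstn is not None:
        props["registration_status"] = sec_rgstn.get("St")
    elif regulators:
        # State feed: status comes from the first regulator entry.
        props["registration_status"] = regulators[0].get("St")
        props["state_registration_count"] = float(len(regulators))
    return props
```

For example, parsing `'<Firm><Info FirmCrdNb="283882" SECNb="801-135399"/><Rgstn St="APPROVED"/></Firm>'` with `ET.fromstring` and passing the result in yields a dict with `crd_number`, `sec_file_number`, and `registration_status`; a firm whose `FirmCrdNb` is empty yields `None`.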
+
+ ### Address (Organization)
+
+ * `physical_address`
+   * Definition: Firm's main office street address, formatted
+     `"Street1, Street2, City, ST ZIP, Country"`.
+   * Examples: `5930 MAIN STREET, SUITE 400, WILLIAMSVILLE, NY 14221,
+     United States`
+   * Derivation: assembled from `MainAddr@Strt1`, `MainAddr@Strt2`,
+     `MainAddr@City`, `MainAddr@State`, `MainAddr@PostlCd`,
+     `MainAddr@Cntry` on `<Firm>`.
+
+ * `mailing_address`
+   * Definition: Firm's mailing address when distinct from the main
+     office address.
+   * Examples: `PO BOX 1234, ALBANY, NY 12201, United States`
+   * Derivation: same assembly applied to `<MailingAddr>`. Omitted when
+     the element is empty (the common case).
+
+ * `main_phone_number`
+   * Definition: Main-office phone number as published by the firm. Not
+     normalized; the SEC publishes whatever the firm entered.
+   * Examples: `716-568-8790`, `6033037688`
+   * Derivation: `MainAddr@PhNb`.
+
+ * `main_fax_number`
+   * Definition: Main-office fax number, if any.
+   * Examples: `716-568-8791`
+   * Derivation: `MainAddr@FaxNb`. Often missing.
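
The address assembly can be sketched as a small pure function. The parameter names mirror the `MainAddr` attribute names, and the rule that empty parts are dropped is an assumption inferred from the two examples above; `format_address` is a hypothetical helper.

```python
def format_address(strt1="", strt2="", city="", state="", postl_cd="", cntry=""):
    """Join non-empty parts as 'Street1, Street2, City, ST ZIP, Country'."""
    # State and postal code share one comma-separated slot: "NY 14221".
    state_zip = " ".join(p.strip() for p in (state, postl_cd) if p and p.strip())
    parts = [strt1, strt2, city, state_zip, cntry]
    cleaned = [p.strip() for p in parts if p and p.strip()]
    return ", ".join(cleaned) or None  # None when the element is empty
```

Returning `None` for an all-empty element matches the note that `mailing_address` is omitted when `<MailingAddr>` carries no data.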
+
+ ### Web Presence (Organization)
+
+ * `website`
+   * Definition: Firm's website URL. The bulk feed allows multiple web
+     addresses per firm; we emit one `website` atom per `<WebAddr>`
+     element (i.e., the property can be multi-valued for a single firm).
+   * Examples: `http://www.rabenoldadvisors.com`,
+     `https://www.mkcapital.com/`
+   * Derivation: each `<WebAddr>` inside
+     `Part1A/Item1/WebAddrs` on `<Firm>`.
+
+ ### Organization Form (Organization)
+
+ * `organization_form`
+   * Definition: Legal organization form of the firm (Item 3A of Form
+     ADV).
+   * Examples: `Corporation`, `Limited Partnership`, `Limited Liability
+     Company`, `Sole Proprietorship`
+   * Derivation: `Item3A@OrgFormNm` under `Part1A` on `<Firm>`.
+
+ * `fiscal_year_end_month`
+   * Definition: Month in which the firm's fiscal year ends (Item 3B of
+     Form ADV).
+   * Examples: `DECEMBER`, `JUNE`
+   * Derivation: `Item3B@Q3B`.
+
+ * `state_of_formation`
+   * Definition: US state or country in which the firm was organized
+     (Item 3C of Form ADV). Stored as the 2-letter US state code when
+     inside the US; otherwise the country name.
+   * Examples: `NY`, `DE`, `IL`
+   * Derivation: `Item3C@StateCD` (US) or `Item3C@CntryNm` (non-US).
+
+ * `country_of_formation`
+   * Definition: Country in which the firm was organized.
+   * Examples: `United States`, `Cayman Islands`
+   * Derivation: `Item3C@CntryNm`.
+
+ ### Employees and Clients (Organization)
+
+ * `total_employees`
+   * Definition: Total number of employees worldwide as of the firm's
+     most recent fiscal year-end (Item 5A). Float for numeric
+     queryability.
+   * Examples: `4`, `150`, `12000`
+   * Derivation: `Item5A@TtlEmp`.
+
+ * `employees_providing_investment_advice`
+   * Definition: Number of employees who perform investment advisory
+     functions including research (Item 5B(1)).
+   * Examples: `1`, `30`
+   * Derivation: `Item5B@Q5B1`.
+
+ * `client_count_band`
+   * Definition: Approximate band for total number of advisory clients
+     (Item 5H). The SEC publishes this as a coarse band string rather
+     than an exact count for privacy.
+   * Examples: `0`, `1-10`, `11-25`, `26-100`, `101-250`, `251-500`,
+     `51-100`, `More than 500`
+   * Derivation: `Item5H@Q5H`.
+
+ ### Assets Under Management (Organization)
+
+ * `assets_under_management`
+   * Definition: Total regulatory assets under management (RAUM)
+     reported on Form ADV, in USD. Sum of discretionary and
+     non-discretionary RAUM. Item 5F(2)(c).
+   * Examples: `35557038`, `15000000000`
+   * Derivation: `Item5F@Q5F2C` (already in USD; passed through as a
+     float).
+
+ * `discretionary_assets_under_management`
+   * Definition: Regulatory AUM where the firm has discretionary
+     authority, in USD. Item 5F(2)(a).
+   * Examples: `35557038`, `0`
+   * Derivation: `Item5F@Q5F2A`.
+
+ * `non_discretionary_assets_under_management`
+   * Definition: Regulatory AUM where the firm advises without
+     discretionary authority, in USD. Item 5F(2)(b).
+   * Examples: `0`, `8000000`
+   * Derivation: `Item5F@Q5F2B`.
+
+ * `non_us_assets_under_management`
+   * Definition: Portion of RAUM attributable to non-US clients, in USD.
+     Item 5F(3).
+   * Examples: `0`, `1200000`
+   * Derivation: `Item5F@Q5F3`.
+
+ * `discretionary_account_count`
+   * Definition: Number of discretionary advisory accounts. Item 5F(2)(d).
+   * Examples: `117`, `5000`
+   * Derivation: `Item5F@Q5F2D`.
+
+ * `non_discretionary_account_count`
+   * Definition: Number of non-discretionary advisory accounts. Item
+     5F(2)(e).
+   * Examples: `0`, `12`
+   * Derivation: `Item5F@Q5F2E`.
+
+ * `total_account_count`
+   * Definition: Total number of advisory accounts. Item 5F(2)(f).
+   * Examples: `117`, `5012`
+   * Derivation: `Item5F@Q5F2F`.
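
Because total RAUM is defined as the sum of the discretionary and non-discretionary figures, a consumer can sanity-check the three values against each other. The helper below is a hypothetical consumer-side check with an assumed tolerance; nothing in the dictionary says the pipeline performs it.

```python
def raum_components_match(discretionary, non_discretionary, total, tolerance=1.0):
    """True when Q5F2A + Q5F2B agrees with Q5F2C to within `tolerance` USD."""
    return abs((discretionary + non_discretionary) - total) <= tolerance
```

With the example values above, `raum_components_match(35557038.0, 0.0, 35557038.0)` holds, since the firm reports all of its RAUM as discretionary.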
+
+ ### Advisory Services Offered (Organization)
+
+ Each service is emitted as a string atom with value `"true"` only when
+ the firm answered `Y` on the relevant Item 5G subfield. Absent atoms
+ mean the firm answered `N` or left the field blank — they are
+ intentionally not emitted as `"false"` to keep the atom count down at
+ production scale (~45K firms × ~10 services would add ~450K atoms with
+ no semantic gain).
+
+ * `provides_financial_planning_services`
+   * Definition: Firm provides financial planning services (Item 5G(1)).
+   * Examples: `true`
+   * Derivation: `Item5G@Q5G1=="Y"` — emitted only when true.
+
+ * `provides_individual_portfolio_management`
+   * Definition: Firm manages portfolios for individuals (Item 5G(2)).
+   * Examples: `true`
+   * Derivation: `Item5G@Q5G2=="Y"`.
+
+ * `provides_institutional_portfolio_management`
+   * Definition: Firm manages portfolios for institutions (Item 5G(5)).
+   * Examples: `true`
+   * Derivation: `Item5G@Q5G5=="Y"`.
+
+ * `provides_pooled_vehicle_portfolio_management`
+   * Definition: Firm manages portfolios for pooled investment vehicles
+     (Item 5G(3) and 5G(4) — covers both registered and unregistered
+     pooled vehicles).
+   * Examples: `true`
+   * Derivation: `Item5G@Q5G3=="Y" || Item5G@Q5G4=="Y"`.
+
+ * `provides_pension_consulting_services`
+   * Definition: Firm provides pension consulting services (Item 5G(8)).
+   * Examples: `true`
+   * Derivation: `Item5G@Q5G8=="Y"`.
+
+ * `provides_selection_of_other_advisers`
+   * Definition: Firm selects other advisers on behalf of clients
+     (including via wrap programs). Item 5G(9).
+   * Examples: `true`
+   * Derivation: `Item5G@Q5G9=="Y"`.
+
+ * `provides_market_timing_services`
+   * Definition: Firm offers market-timing services. Item 5G(10).
+   * Examples: `true`
+   * Derivation: `Item5G@Q5G10=="Y"`.
+
+ * `provides_security_ratings_services`
+   * Definition: Firm provides securities ratings or pricing services.
+     Item 5G(11).
+   * Examples: `true`
+   * Derivation: `Item5G@Q5G11=="Y"`.
+
+ * `provides_other_advisory_services`
+   * Definition: Firm provides other advisory services not listed
+     elsewhere on Item 5G; the free-form description (if any) lives in
+     `other_advisory_services_description`.
+   * Examples: `true`
+   * Derivation: `Item5G@Q5G12=="Y"`.
+
+ * `other_advisory_services_description`
+   * Definition: Free-text description of the "other" advisory services
+     the firm offers, when `provides_other_advisory_services` is true.
+   * Examples: `Investment research and securities analysis`
+   * Derivation: `Item5G@Q5G12Oth`. Omitted when blank.
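
The "emit only when true" rule for Item 5G maps naturally onto a lookup table. In this sketch the attribute-to-property mapping is copied from the derivations above, while `SERVICE_FLAGS`, `service_atoms`, and the plain-dict input (for instance an element's attribute mapping) are assumptions made for illustration.

```python
# Property name -> Item 5G attributes that switch it on.
SERVICE_FLAGS = {
    "provides_financial_planning_services": ("Q5G1",),
    "provides_individual_portfolio_management": ("Q5G2",),
    "provides_pooled_vehicle_portfolio_management": ("Q5G3", "Q5G4"),
    "provides_institutional_portfolio_management": ("Q5G5",),
    "provides_pension_consulting_services": ("Q5G8",),
    "provides_selection_of_other_advisers": ("Q5G9",),
    "provides_market_timing_services": ("Q5G10",),
    "provides_security_ratings_services": ("Q5G11",),
    "provides_other_advisory_services": ("Q5G12",),
}


def service_atoms(item5g):
    """Map Item 5G attributes to property atoms, emitting only the true ones."""
    atoms = {
        name: "true"
        for name, keys in SERVICE_FLAGS.items()
        if any(item5g.get(key) == "Y" for key in keys)
    }
    description = (item5g.get("Q5G12Oth") or "").strip()
    if description:
        atoms["other_advisory_services_description"] = description
    return atoms
```

A firm that answered `N` everywhere produces an empty dict, so a consumer has to read a missing key as "not offered or not answered" and cannot expect an explicit `"false"`.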
+
+ ### Wrap Fee Programs (Organization)
+
+ * `is_wrap_fee_program_sponsor`
+   * Definition: String-encoded boolean indicating whether the firm
+     sponsors a wrap fee program (Item 5I(1)).
+   * Examples: `true`, `false`
+   * Derivation: `Item5I@Q5I1`, normalized from `Y`/`N`.
+
+ * `wrap_fee_sponsor_assets`
+   * Definition: Total assets in wrap fee programs the firm sponsors,
+     in USD. Item 5I(2)(a).
+   * Examples: `12500000000`
+   * Derivation: `Item5I@Q5I2A`. Omitted when 0 or missing.
+
+ * `wrap_fee_portfolio_assets`
+   * Definition: Total assets in wrap fee programs the firm acts as a
+     portfolio manager for, in USD. Item 5I(2)(b).
+   * Examples: `8000000`
+   * Derivation: `Item5I@Q5I2B`.
+
+ ### Disciplinary Disclosures (Organization)
+
+ Item 11 of Form ADV asks the firm to disclose criminal, regulatory,
+ civil, and bankruptcy history. The bulk XML feed exposes only the
+ Yes/No flags from Items 11A–H — the full DRP (Disclosure Reporting
+ Page) detail lives in the per-firm PDF brochures and is out of scope
+ for v0. We emit ONLY a single rollup flag (`has_disciplinary_disclosure`)
+ plus the individual flags that are most operationally useful
+ (criminal, regulatory action). The rest can be added incrementally
+ later.
+
+ * `has_disciplinary_disclosure`
+   * Definition: String-encoded boolean indicating that the firm or any
+     of its advisory affiliates has answered "Yes" to at least one
+     question in Item 11 (criminal, regulatory, civil, or bankruptcy).
+     A rollup; `false` means clean across all Item 11 subquestions.
+   * Examples: `true`, `false`
+   * Derivation: `Item11@Q11`, normalized from `Y`/`N`.
+
+ * `has_criminal_disclosure`
+   * Definition: String-encoded boolean rollup of Item 11A: any criminal
+     conviction or pending charge against the firm or an advisory
+     affiliate.
+   * Examples: `true`, `false`
+   * Derivation: `Item11A@Q11A1=="Y" || Item11A@Q11A2=="Y"`,
+     normalized.
+
+ * `has_regulatory_action_disclosure`
+   * Definition: String-encoded boolean rollup of Item 11B–E: any
+     regulatory action by the SEC, CFTC, other federal regulator, state
+     regulator, foreign regulator, or self-regulatory organization
+     against the firm or an advisory affiliate.
+   * Examples: `true`, `false`
+   * Derivation: logical-OR across all `Item11B..E@Q*` flags,
+     normalized.
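
The three disciplinary rollups can be sketched as below. The input shape (element name mapped to a dict of its attributes) and the function names are assumptions for illustration. So are the specific `Item11B` through `Item11E` attribute names, which the dictionary only describes as `Q*`, and the choice to treat a missing flag as `"false"`.

```python
def _flag(is_yes):
    """Encode a Python bool as the string-encoded boolean used by the atoms."""
    return "true" if is_yes else "false"


def disciplinary_atoms(items):
    """items maps element names ('Item11', 'Item11A', ...) to attribute dicts."""
    item11 = items.get("Item11", {})
    item11a = items.get("Item11A", {})
    # Logical OR across every Q* attribute of Items 11B through 11E.
    regulatory = any(
        value == "Y"
        for name in ("Item11B", "Item11C", "Item11D", "Item11E")
        for key, value in items.get(name, {}).items()
        if key.startswith("Q")
    )
    return {
        "has_disciplinary_disclosure": _flag(item11.get("Q11") == "Y"),
        "has_criminal_disclosure": _flag(
            item11a.get("Q11A1") == "Y" or item11a.get("Q11A2") == "Y"
        ),
        "has_regulatory_action_disclosure": _flag(regulatory),
    }
```

Unlike the Item 5G service flags, these atoms are always emitted, so both `"true"` and `"false"` appear in the output.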
+
+ ## 4. Entity Relationships Summary
+
+ ```
+ organization ──[is_located_at]──→ location (main office)
+ organization ──[is_located_at]──→ location (mailing address, when distinct)
+ ```
+
+ There are no inter-firm relationships in the bulk Form ADV feed —
+ ownership and control-person disclosures (Schedule A/B/C, Item 7) are
+ present at the Y/N flag level but the underlying detail rows are not in
+ the daily XML. Future iterations could add `controlled_by` /
+ `affiliated_with` relationships when those schedules are wired in.
+
+ ## 5. Attributes
+
+ None. All atoms carry standard citations; no source-specific
+ attributes are emitted.
@@ -0,0 +1,515 @@
+ # Dataset schema for IAPD (SEC Investment Adviser Public Disclosure).
+ #
+ # This schema describes entity types, properties, and relationships produced
+ # by atomizing the IAPD Form ADV Part 1A daily XML compilation feeds:
+ #
+ #   - IA_FIRM_SEC_Feed_*.xml.gz — SEC-registered RIAs + Exempt Reporting
+ #     Advisers. Source name: "iapd_sec".
+ #   - IA_FIRM_STATE_Feed_*.xml.gz — State-registered Investment Advisers.
+ #     Source name: "iapd_state".
+ #
+ # Both feeds emit the same entity types. crd_number is the canonical
+ # strong ID — chosen to align with the cross-source convention (edgar
+ # also uses an unprefixed crd_number on its organization flavor), so
+ # IAPD firms and EDGAR registrants collapse to one entity per CRD.
+ # sec_file_number is a secondary strong ID for SEC-feed firms.
+ # company_cik and lei are declared as strong-ID slots so the slots
+ # exist when downstream cross-walks (Form ADV Schedule R, third-party
+ # CRD↔LEI mappings) start populating them — adding strong-ID slots
+ # later is a breaking schema change.
+ #
+ # Scope: Form ADV Part 1A only. Part 1B (state addenda), Part 2 (brochures),
+ # and Part 3 (CRS) are per-firm PDFs not in the bulk feed; they are
+ # deferred to a future iteration.
+ name: "iapd"
+ description: "SEC Investment Adviser Public Disclosure (IAPD) Form ADV Part 1A data for SEC-registered Registered Investment Advisers, Exempt Reporting Advisers, and state-registered Investment Advisers, sourced from the daily IAPD compilation XML feeds"
+
+ extraction:
+   flavors: closed
+   properties: closed
+   relationships: closed
+   attributes: closed
+   events: closed
+
+ flavors:
+   - name: "organization"
+     description: "A particular business, institution, or organization such as a corporation, university, government agency, or non-profit"
+     display_name: "Organization"
+     mergeability: not_mergeable
+     # Property-name alignment matters: strong-ID matches key on
+     # (flavor, property, value), so crd_number / sec_file_number /
+     # company_cik / lei here match the same-named strong-IDs on
+     # edgar's organization flavor and gleif's organization flavor.
+     strong_id_properties: ["crd_number", "sec_file_number", "company_cik", "lei"]
+     passive: true
+
+   - name: "location"
+     description: "A specific named geographic location such as a city, country, region, or landmark"
48
+ display_name: "Location"
49
+ mergeability: not_mergeable
50
+ examples: ["New York City", "San Francisco", "North America", "Bakery Square"]
51
+ passive: true
52
+
53
+ properties:
54
+ # --- Identity and Registration ---
55
+
56
+ - name: "crd_number"
57
+ type: string
58
+ description: "FINRA-assigned firm CRD (Central Registration Depository) number, the canonical unique identifier for SEC-registered and state-registered investment advisory firms; stable across registration status changes. Derived from the FirmCrdNb attribute on the IAPD <Firm> element. Property name is unprefixed to match edgar's organization.crd_number so an iapd firm and an edgar registrant with the same CRD resolve to the same entity"
59
+ display_name: "CRD Number"
60
+ mergeability: not_mergeable
61
+ domain_flavors: ["organization"]
62
+ examples: ["283882", "312360", "324069"]
63
+ passive: true
64
+
65
+ - name: "sec_file_number"
66
+ type: string
67
+ description: "SEC file number — registrant identifier assigned by the SEC (e.g. 801-XXXXX for advisers, 084-XXXXX for funds). For IAPD firms the 801-prefix denotes a fully SEC-registered Registered Investment Adviser and the 802-prefix denotes an Exempt Reporting Adviser; derived from the SECNb attribute on the IAPD <Firm> element (SEC feed only). Promoted to a strong-ID on organization so iapd firms cross-resolve with edgar registrants by SEC file number when CRD is missing on one side"
68
+ display_name: "SEC File Number"
69
+ mergeability: not_mergeable
70
+ domain_flavors: ["organization"]
71
+ examples: ["801-135399", "802-120553"]
72
+ passive: true
73
+
74
+ - name: "company_cik"
75
+ type: string
76
+ description: "SEC Central Index Key (CIK) assigned by EDGAR to entities that file with the SEC. Property name matches edgar's organization.company_cik so iapd firms that also file via EDGAR (e.g. publicly-traded asset managers, fund advisers) cross-resolve to the same organization entity. Not present in the IAPD bulk XML — the slot is declared so downstream CRD↔CIK cross-walks (Form ADV Schedule R or third-party feeds) can populate it without a breaking schema change"
77
+ display_name: "Company CIK"
78
+ mergeability: not_mergeable
79
+ domain_flavors: ["organization"]
80
+ examples: ["1234567"]
81
+ passive: true
82
+
83
+ - name: "lei"
84
+ type: string
85
+ description: "Legal Entity Identifier — 20-character ISO 17442 code globally identifying a legal entity. Property name matches gleif's organization.lei and edgar's organization.lei so iapd firms cross-resolve to GLEIF and EDGAR records by LEI. Not present in the IAPD bulk Part 1A XML — the slot is declared so per-firm IAPD detail XML (Schedule R) or third-party cross-walks can populate it without a breaking schema change"
86
+ display_name: "LEI"
87
+ mergeability: not_mergeable
88
+ domain_flavors: ["organization"]
89
+ examples: ["549300LQQAVPLATTSU38"]
90
+ passive: true
91
+
92
+ - name: "primary_business_name"
93
+ type: string
94
+ description: "Primary business name under which the firm conducts advisory business; may differ from the legal name when the firm operates under a trade name or DBA. Derived from the BusNm attribute on the IAPD <Firm> element"
95
+ display_name: "Primary Business Name"
96
+ mergeability: not_mergeable
97
+ domain_flavors: ["organization"]
98
+ examples: ["RABENOLD ADVISORS, INC.", "MK CAPITAL"]
99
+ passive: true
100
+
101
+ - name: "legal_name"
102
+ type: string
103
+ description: "Firm's full legal name as registered with regulators; derived from the LegalNm attribute on the IAPD <Firm> element"
104
+ display_name: "Legal Name"
105
+ mergeability: not_mergeable
106
+ domain_flavors: ["organization"]
107
+ examples: ["RABENOLD ADVISORS, INC.", "MK CAPITAL COMPANY"]
108
+ passive: true
109
+
110
+ - name: "is_umbrella_registration"
111
+ type: string
112
+ description: "String-encoded boolean (\"true\"/\"false\") indicating whether this Form ADV filing represents an umbrella registration covering multiple filing-adviser/relying-adviser entities under a single filing. Derived from the UmbrRgstn attribute on the IAPD <Firm> element"
113
+ display_name: "Umbrella Registration"
114
+ mergeability: not_mergeable
115
+ domain_flavors: ["organization"]
116
+ examples: ["true", "false"]
117
+ passive: true
118
+
119
+ - name: "sec_region_code"
120
+ type: string
121
+ description: "SEC regional office code with supervisory jurisdiction over the firm. Derived from the SECRgnCD attribute on the IAPD <Firm> element (SEC-registered firms only)"
122
+ display_name: "SEC Region Code"
123
+ mergeability: not_mergeable
124
+ domain_flavors: ["organization"]
125
+ examples: ["NYRO", "CHRO", "LARO"]
126
+ passive: true
127
+
128
+ - name: "firm_registration_type"
129
+ type: string
130
+ description: "Firm's registration disposition with the SEC. \"Registered\" denotes a fully SEC-registered Registered Investment Adviser; \"ERA\" denotes an Exempt Reporting Adviser that files but is not fully registered. Derived from the Rgstn FirmType attribute on the IAPD <Firm> element (SEC feed only)"
131
+ display_name: "Firm Registration Type"
132
+ mergeability: not_mergeable
133
+ domain_flavors: ["organization"]
134
+ examples: ["Registered", "ERA"]
135
+ passive: true
136
+
137
+ - name: "firm_registration_status"
138
+ type: string
139
+ description: "Status of the firm's registration with its primary securities regulator (SEC or a US state); distinct from LEI registration status. For SEC-registered firms, derived from the Rgstn St attribute. For state-registered firms, derived from the first state regulator's status under StateRgstn/Rgltrs"
140
+ display_name: "Firm Registration Status"
141
+ mergeability: not_mergeable
142
+ domain_flavors: ["organization"]
143
+ examples: ["APPROVED", "ACTIVE", "PENDING", "TERMINATED"]
144
+ passive: true
145
+
146
+ - name: "firm_registration_date"
147
+ type: string
148
+ description: "Date the firm was approved or registered by its primary securities regulator (SEC or a US state), formatted YYYY-MM-DD. Derived from the Rgstn Dt attribute (SEC feed) or the first state regulator's Dt under StateRgstn/Rgltrs (state feed)"
149
+ display_name: "Firm Registration Date"
150
+ mergeability: not_mergeable
151
+ domain_flavors: ["organization"]
152
+ examples: ["2026-02-24", "2021-02-16"]
153
+ passive: true
154
+
155
+ - name: "notice_filed_state_count"
156
+ type: float
157
+ description: "Number of US states where the SEC-registered firm has filed notice filings; counts <States> elements under <NoticeFiled> on the IAPD <Firm> element (SEC feed only)"
158
+ display_name: "Notice-Filed State Count"
159
+ mergeability: not_mergeable
160
+ domain_flavors: ["organization"]
161
+ passive: true
162
+
163
+ - name: "state_registration_count"
164
+ type: float
165
+ description: "Number of US state and territorial securities regulators with which the firm is state-registered; counts <Rgltr> elements under StateRgstn/Rgltrs on the IAPD <Firm> element (state feed only)"
166
+ display_name: "State Registration Count"
167
+ mergeability: not_mergeable
168
+ domain_flavors: ["organization"]
169
+ passive: true
170
+
171
+ - name: "latest_filing_date"
172
+ type: string
173
+ description: "Date of the most recent Form ADV filing represented in this IAPD snapshot, formatted YYYY-MM-DD. Derived from the Filing Dt attribute on the <Firm> element; also used as the record-level timestamp because it bounds the freshness of the Form ADV data"
174
+ display_name: "Latest Form ADV Filing Date"
175
+ mergeability: not_mergeable
176
+ domain_flavors: ["organization"]
177
+ examples: ["2026-03-04", "2025-08-01"]
178
+ passive: true
179
+
180
+ - name: "form_adv_version"
181
+ type: string
182
+ description: "Version label of the Form ADV template used for the firm's most recent filing. Derived from the Filing FormVrsn attribute on the <Firm> element"
183
+ display_name: "Form ADV Version"
184
+ mergeability: not_mergeable
185
+ domain_flavors: ["organization"]
186
+ examples: ["10/2021"]
187
+ passive: true
188
+
189
+ # --- Address and Contact ---
190
+
191
+ - name: "physical_address"
192
+ type: string
193
+ description: "Physical street address of the entity"
194
+ display_name: "Physical Address"
195
+ mergeability: not_mergeable
196
+ domain_flavors: ["organization"]
197
+ examples: ["5930 MAIN STREET, SUITE 400, WILLIAMSVILLE, NY 14221, United States"]
198
+ passive: true
199
+
200
+ - name: "mailing_address"
201
+ type: string
202
+ description: "Mailing address as formatted string"
203
+ display_name: "Mailing Address"
204
+ mergeability: not_mergeable
205
+ domain_flavors: ["organization"]
206
+ examples: ["PO BOX 1234, ALBANY, NY 12201, United States"]
207
+ passive: true
208
+
209
+ - name: "main_phone_number"
210
+ type: string
211
+ description: "Main office phone number as published by the firm; not normalized. Derived from the MainAddr PhNb attribute on the <Firm> element"
212
+ display_name: "Main Phone Number"
213
+ mergeability: not_mergeable
214
+ domain_flavors: ["organization"]
215
+ examples: ["716-568-8790", "6033037688"]
216
+ passive: true
217
+
218
+ - name: "main_fax_number"
219
+ type: string
220
+ description: "Main office fax number as published by the firm. Derived from the MainAddr FaxNb attribute on the <Firm> element; often missing"
221
+ display_name: "Main Fax Number"
222
+ mergeability: not_mergeable
223
+ domain_flavors: ["organization"]
224
+ examples: ["716-568-8791"]
225
+ passive: true
226
+
227
+ - name: "website"
228
+ type: string
229
+ description: "Primary website URL of the institution, derived from the WEBADDR field; omitted when blank"
230
+ display_name: "Website"
231
+ mergeability: not_mergeable
232
+ domain_flavors: ["organization"]
233
+ examples: ["http://www.rabenoldadvisors.com", "https://www.mkcapital.com/"]
234
+ passive: true
235
+
236
+ # --- Organization Form ---
237
+
238
+ - name: "organization_form"
239
+ type: string
240
+ description: "Legal organization form of the firm as reported on Form ADV Item 3A. Derived from the Item3A OrgFormNm attribute"
241
+ display_name: "Organization Form"
242
+ mergeability: not_mergeable
243
+ domain_flavors: ["organization"]
244
+ examples: ["Corporation", "Limited Partnership", "Limited Liability Company", "Sole Proprietorship"]
245
+ passive: true
246
+
247
+ - name: "fiscal_year_end_month"
248
+ type: string
249
+ description: "Month in which the firm's fiscal year ends, as reported on Form ADV Item 3B. Derived from the Item3B Q3B attribute"
250
+ display_name: "Fiscal Year End Month"
251
+ mergeability: not_mergeable
252
+ domain_flavors: ["organization"]
253
+ examples: ["DECEMBER", "JUNE"]
254
+ passive: true
255
+
256
+ - name: "state_of_formation"
257
+ type: string
258
+ description: "US state (2-letter code) or non-US country in which the firm was organized, as reported on Form ADV Item 3C. Derived from Item3C StateCD when inside the US, otherwise CntryNm"
259
+ display_name: "State of Formation"
260
+ mergeability: not_mergeable
261
+ domain_flavors: ["organization"]
262
+ examples: ["NY", "DE", "IL"]
263
+ passive: true
264
+
265
+ - name: "country_of_formation"
266
+ type: string
267
+ description: "Country in which the firm was organized, as reported on Form ADV Item 3C. Derived from the Item3C CntryNm attribute"
268
+ display_name: "Country of Formation"
269
+ mergeability: not_mergeable
270
+ domain_flavors: ["organization"]
271
+ examples: ["United States", "Cayman Islands"]
272
+ passive: true
273
+
274
+ # --- Employees and Clients ---
275
+
276
+ - name: "total_employees"
277
+ type: float
278
+ description: "Total number of employees worldwide as of the firm's most recent fiscal year-end; from Form ADV Item 5A. Derived from the Item5A TtlEmp attribute"
279
+ display_name: "Total Employees"
280
+ mergeability: not_mergeable
281
+ domain_flavors: ["organization"]
282
+ passive: true
283
+
284
+ - name: "employees_providing_investment_advice"
285
+ type: float
286
+ description: "Number of employees who perform investment-advisory functions including research; from Form ADV Item 5B(1). Derived from the Item5B Q5B1 attribute"
287
+ display_name: "Employees Providing Investment Advice"
288
+ mergeability: not_mergeable
289
+ domain_flavors: ["organization"]
290
+ passive: true
291
+
292
+ - name: "client_count_band"
293
+ type: string
294
+ description: "Coarse band for the firm's total number of advisory clients; the SEC publishes a band rather than an exact count for privacy. From Form ADV Item 5H. Derived from the Item5H Q5H attribute"
295
+ display_name: "Client Count Band"
296
+ mergeability: not_mergeable
297
+ domain_flavors: ["organization"]
298
+ examples: ["0", "1-10", "11-25", "26-50", "51-100", "101-250", "251-500", "More than 500"]
299
+ passive: true
300
+
301
+ # --- Assets Under Management ---
302
+
303
+ - name: "assets_under_management"
304
+ type: float
305
+ description: "Total regulatory assets under management (RAUM) reported on Form ADV Item 5F(2)(c), in USD; the sum of discretionary and non-discretionary RAUM. Derived from the Item5F Q5F2C attribute"
306
+ display_name: "Assets Under Management"
307
+ mergeability: not_mergeable
308
+ domain_flavors: ["organization"]
309
+ passive: true
310
+
311
+ - name: "discretionary_assets_under_management"
312
+ type: float
313
+ description: "Regulatory assets under management for which the firm has discretionary authority, in USD; from Form ADV Item 5F(2)(a). Derived from the Item5F Q5F2A attribute"
314
+ display_name: "Discretionary Assets Under Management"
315
+ mergeability: not_mergeable
316
+ domain_flavors: ["organization"]
317
+ passive: true
318
+
319
+ - name: "non_discretionary_assets_under_management"
320
+ type: float
321
+ description: "Regulatory assets under management for which the firm advises without discretionary authority, in USD; from Form ADV Item 5F(2)(b). Derived from the Item5F Q5F2B attribute"
322
+ display_name: "Non-Discretionary Assets Under Management"
323
+ mergeability: not_mergeable
324
+ domain_flavors: ["organization"]
325
+ passive: true
326
+
327
+ - name: "non_us_assets_under_management"
328
+ type: float
329
+ description: "Portion of regulatory assets under management attributable to non-US clients, in USD; from Form ADV Item 5F(3). Derived from the Item5F Q5F3 attribute"
330
+ display_name: "Non-US Assets Under Management"
331
+ mergeability: not_mergeable
332
+ domain_flavors: ["organization"]
333
+ passive: true
334
+
335
+ - name: "discretionary_account_count"
336
+ type: float
337
+ description: "Number of discretionary advisory accounts; from Form ADV Item 5F(2)(d). Derived from the Item5F Q5F2D attribute"
338
+ display_name: "Discretionary Account Count"
339
+ mergeability: not_mergeable
340
+ domain_flavors: ["organization"]
341
+ passive: true
342
+
343
+ - name: "non_discretionary_account_count"
344
+ type: float
345
+ description: "Number of non-discretionary advisory accounts; from Form ADV Item 5F(2)(e). Derived from the Item5F Q5F2E attribute"
346
+ display_name: "Non-Discretionary Account Count"
347
+ mergeability: not_mergeable
348
+ domain_flavors: ["organization"]
349
+ passive: true
350
+
351
+ - name: "total_account_count"
352
+ type: float
353
+ description: "Total number of advisory accounts; from Form ADV Item 5F(2)(f). Derived from the Item5F Q5F2F attribute"
354
+ display_name: "Total Account Count"
355
+ mergeability: not_mergeable
356
+ domain_flavors: ["organization"]
357
+ passive: true
358
+
359
+ # --- Advisory Services (emitted only when firm answered Y on Form ADV Item 5G) ---
360
+
361
+ - name: "provides_financial_planning_services"
362
+ type: string
363
+ description: "String-encoded boolean (\"true\") indicating that the firm provides financial planning services; from Form ADV Item 5G(1). The atom is emitted ONLY when the firm answered Y (i.e. value is always \"true\" when present); an absent atom means the firm answered N or left it blank. Derived from the Item5G Q5G1 attribute"
364
+ display_name: "Provides Financial Planning Services"
365
+ mergeability: not_mergeable
366
+ domain_flavors: ["organization"]
367
+ examples: ["true"]
368
+ passive: true
369
+
370
+ - name: "provides_individual_portfolio_management"
371
+ type: string
372
+ description: "String-encoded boolean (\"true\") indicating that the firm manages portfolios for individuals (other than high-net-worth individuals); from Form ADV Item 5G(2). Emitted only when true. Derived from the Item5G Q5G2 attribute"
373
+ display_name: "Provides Individual Portfolio Management"
374
+ mergeability: not_mergeable
375
+ domain_flavors: ["organization"]
376
+ examples: ["true"]
377
+ passive: true
378
+
379
+ - name: "provides_institutional_portfolio_management"
380
+ type: string
381
+ description: "String-encoded boolean (\"true\") indicating that the firm manages portfolios for institutional clients; from Form ADV Item 5G(5). Emitted only when true. Derived from the Item5G Q5G5 attribute"
382
+ display_name: "Provides Institutional Portfolio Management"
383
+ mergeability: not_mergeable
384
+ domain_flavors: ["organization"]
385
+ examples: ["true"]
386
+ passive: true
387
+
388
+ - name: "provides_pooled_vehicle_portfolio_management"
389
+ type: string
390
+ description: "String-encoded boolean (\"true\") indicating that the firm manages portfolios for pooled investment vehicles (registered investment companies and/or unregistered pooled vehicles such as hedge funds); from Form ADV Items 5G(3) and 5G(4). Emitted only when true on either subitem. Derived from the Item5G Q5G3 or Q5G4 attribute"
391
+ display_name: "Provides Pooled Vehicle Portfolio Management"
392
+ mergeability: not_mergeable
393
+ domain_flavors: ["organization"]
394
+ examples: ["true"]
395
+ passive: true
396
+
397
+ - name: "provides_pension_consulting_services"
398
+ type: string
399
+ description: "String-encoded boolean (\"true\") indicating that the firm provides pension consulting services; from Form ADV Item 5G(8). Emitted only when true. Derived from the Item5G Q5G8 attribute"
400
+ display_name: "Provides Pension Consulting Services"
401
+ mergeability: not_mergeable
402
+ domain_flavors: ["organization"]
403
+ examples: ["true"]
404
+ passive: true
405
+
406
+ - name: "provides_selection_of_other_advisers"
407
+ type: string
408
+ description: "String-encoded boolean (\"true\") indicating that the firm selects other investment advisers on behalf of its clients (including via wrap programs); from Form ADV Item 5G(9). Emitted only when true. Derived from the Item5G Q5G9 attribute"
409
+ display_name: "Provides Selection of Other Advisers"
410
+ mergeability: not_mergeable
411
+ domain_flavors: ["organization"]
412
+ examples: ["true"]
413
+ passive: true
414
+
415
+ - name: "provides_market_timing_services"
416
+ type: string
417
+ description: "String-encoded boolean (\"true\") indicating that the firm offers market-timing services; from Form ADV Item 5G(10). Emitted only when true. Derived from the Item5G Q5G10 attribute"
418
+ display_name: "Provides Market Timing Services"
419
+ mergeability: not_mergeable
420
+ domain_flavors: ["organization"]
421
+ examples: ["true"]
422
+ passive: true
423
+
424
+ - name: "provides_security_ratings_services"
425
+ type: string
426
+ description: "String-encoded boolean (\"true\") indicating that the firm provides securities ratings or pricing services; from Form ADV Item 5G(11). Emitted only when true. Derived from the Item5G Q5G11 attribute"
427
+ display_name: "Provides Security Ratings Services"
428
+ mergeability: not_mergeable
429
+ domain_flavors: ["organization"]
430
+ examples: ["true"]
431
+ passive: true
432
+
433
+ - name: "provides_other_advisory_services"
434
+ type: string
435
+ description: "String-encoded boolean (\"true\") indicating that the firm provides other advisory services not specifically enumerated on Form ADV Item 5G; from Item 5G(12). Emitted only when true. Derived from the Item5G Q5G12 attribute. The free-form description (when present) is captured in the other_advisory_services_description property"
436
+ display_name: "Provides Other Advisory Services"
437
+ mergeability: not_mergeable
438
+ domain_flavors: ["organization"]
439
+ examples: ["true"]
440
+ passive: true
441
+
442
+ - name: "other_advisory_services_description"
443
+ type: string
444
+ description: "Free-text description of the \"other\" advisory services the firm offers when provides_other_advisory_services is true; from Form ADV Item 5G(12) freeform field. Derived from the Item5G Q5G12Oth attribute"
445
+ display_name: "Other Advisory Services Description"
446
+ mergeability: not_mergeable
447
+ domain_flavors: ["organization"]
448
+ examples: ["NEGOTIATED FINANCIAL PLANNING FEE", "INVESTMENT RESEARCH AND SECURITIES ANALYSIS"]
449
+ passive: true
450
+
451
+ # --- Wrap Fee Programs ---
452
+
453
+ - name: "is_wrap_fee_program_sponsor"
454
+ type: string
455
+ description: "String-encoded boolean (\"true\"/\"false\") indicating whether the firm sponsors a wrap fee program (a program that bundles advisory, brokerage, custody, and other services for a single fee); from Form ADV Item 5I(1). Derived from the Item5I Q5I1 attribute"
456
+ display_name: "Wrap Fee Program Sponsor"
457
+ mergeability: not_mergeable
458
+ domain_flavors: ["organization"]
459
+ examples: ["true", "false"]
460
+ passive: true
461
+
462
+ - name: "wrap_fee_sponsor_assets"
463
+ type: float
464
+ description: "Total assets in wrap fee programs the firm sponsors, in USD; from Form ADV Item 5I(2)(a). Derived from the Item5I Q5I2A attribute"
465
+ display_name: "Wrap Fee Sponsor Assets"
466
+ mergeability: not_mergeable
467
+ domain_flavors: ["organization"]
468
+ passive: true
469
+
470
+ - name: "wrap_fee_portfolio_assets"
471
+ type: float
472
+ description: "Total assets in wrap fee programs for which the firm acts as portfolio manager, in USD; from Form ADV Item 5I(2)(b). Derived from the Item5I Q5I2B attribute"
473
+ display_name: "Wrap Fee Portfolio Assets"
474
+ mergeability: not_mergeable
475
+ domain_flavors: ["organization"]
476
+ passive: true
477
+
478
+ # --- Disciplinary Disclosures (rollups only; per-item DRP detail not in bulk feed) ---
479
+
480
+ - name: "has_disciplinary_disclosure"
481
+ type: string
482
+ description: "String-encoded boolean rollup (\"true\"/\"false\") indicating whether the firm or any of its advisory affiliates answered Yes to at least one question in Form ADV Item 11 (criminal, regulatory, civil, or bankruptcy). \"false\" means no Yes answer on any Item 11 subquestion. Derived from the Item11 Q11 attribute"
483
+ display_name: "Has Disciplinary Disclosure"
484
+ mergeability: not_mergeable
485
+ domain_flavors: ["organization"]
486
+ examples: ["true", "false"]
487
+ passive: true
488
+
489
+ - name: "has_criminal_disclosure"
490
+ type: string
491
+ description: "String-encoded boolean rollup (\"true\"/\"false\") of Form ADV Item 11A indicating whether the firm or an advisory affiliate has a criminal conviction or pending criminal charge. Derived from logical-OR of the Item11A Q11A1 and Q11A2 attributes"
492
+ display_name: "Has Criminal Disclosure"
493
+ mergeability: not_mergeable
494
+ domain_flavors: ["organization"]
495
+ examples: ["true", "false"]
496
+ passive: true
497
+
498
+ - name: "has_regulatory_action_disclosure"
499
+ type: string
500
+ description: "String-encoded boolean rollup (\"true\"/\"false\") of Form ADV Items 11B-11E indicating whether the firm or an advisory affiliate has been the subject of any regulatory action by the SEC, CFTC, another federal regulator, a state regulator, a foreign regulator, or a self-regulatory organization. Derived from logical-OR of all Q11B*/Q11C*/Q11D*/Q11E* attributes"
501
+ display_name: "Has Regulatory Action Disclosure"
502
+ mergeability: not_mergeable
503
+ domain_flavors: ["organization"]
504
+ examples: ["true", "false"]
505
+ passive: true
506
+
507
+ relationships:
508
+ - name: "is_located_at"
509
+ description: "An entity is located at, operates in, resides in, is headquartered in, was born in, visits, or died in a location"
510
+ display_name: "Located At"
511
+ mergeability: not_mergeable
512
+ domain_flavors: ["organization"]
513
+ target_flavors: ["location"]
514
+ examples: ["Rabenold Advisors, Inc. is located at Williamsville, NY"]
515
+ passive: true