snf-peirce 0.1.0__tar.gz

@@ -0,0 +1,28 @@
+ MIT License
+
+ Copyright (c) 2026 peirce-lang
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
+
+ ---
+
+ SNF (Semantic Normalized Form) specification and Peirce Query Language
+ specification are original works. Attribution in source code and documentation
+ is required. See project licensing documentation for details on Portolan and
+ Reckoner licensing.
@@ -0,0 +1,374 @@
+ Metadata-Version: 2.4
+ Name: snf-peirce
+ Version: 0.1.0
+ Summary: Python implementation of the SNF stack — Peirce parser, lens authoring, substrate compilation, and query execution
+ License-Expression: MIT
+ Project-URL: Homepage, https://github.com/peirce-lang/snf-peirce
+ Project-URL: Repository, https://github.com/peirce-lang/snf-peirce
+ Requires-Python: >=3.9
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+ Requires-Dist: pandas>=1.5
+ Requires-Dist: duckdb>=0.9
+ Provides-Extra: roaring
+ Requires-Dist: pyroaring>=0.4; extra == "roaring"
+ Provides-Extra: dev
+ Requires-Dist: pytest>=7.0; extra == "dev"
+ Requires-Dist: jupyter; extra == "dev"
+ Dynamic: license-file
+
+ # snf-peirce
+
+ Python implementation of the SNF (Semantic Normalized Form) stack.
+
+ Query any dataset by meaning, not by schema. Author a lens once, query semantically forever.
+
+ ```python
+ from snf_peirce import suggest, compile_data, query
+ import pandas as pd
+
+ df = pd.read_csv("my_collection.csv")
+ draft = suggest(df)
+ draft.map("Artist", "who", "artist").nucleus("release_id", prefix="discogs:release")
+ lens = draft.to_lens(lens_id="discogs_v1", authority="me")
+ compiled = compile_data(df, lens)
+
+ query(compiled, 'WHO.artist = "Miles Davis" AND WHEN.released = "1959"')
+ ```
+
+ ---
+
+ ## What is SNF?
+
+ SNF (Semantic Normalized Form) is a data model and query protocol built around six universal dimensions:
+
+ | Dimension | Meaning | Examples |
+ |---|---|---|
+ | **WHO** | People, organisations, roles | author, publisher, attorney, artist |
+ | **WHAT** | Things, topics, identifiers | title, subject, ISBN, genre |
+ | **WHEN** | Dates, years, time periods | publication_date, year, date_added |
+ | **WHERE** | Places, locations, regions | publication_place, office, territory |
+ | **WHY** | Reasons, types, purposes | matter_type, audience, format_legal |
+ | **HOW** | Methods, formats, measurements | carrier_type, cmc, media_type |
+
+ Every fact in your data maps to one of these dimensions. Once mapped, you query by meaning — not by column name, table structure, or JOIN logic.
+
+ **Peirce** is the query language for SNF. Named after Charles Sanders Peirce, whose triadic sign relation maps directly onto the SNF record structure: Dimension → SemanticKey → Value.
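The Dimension → SemanticKey → Value structure can be pictured with a small plain-Python sketch. Everything named here (`Fact`, `entities_matching`, the entity id strings, lower-casing the dimension) is a hypothetical choice made for illustration and is not taken from the snf_peirce API.

```python
from collections import namedtuple

# Hypothetical record shape: one row per fact about one entity.
Fact = namedtuple("Fact", ["entity_id", "dimension", "semantic_key", "value"])

facts = [
    Fact("discogs:release:1", "who", "artist", "Miles Davis"),
    Fact("discogs:release:1", "when", "released", "1959"),
    Fact("discogs:release:2", "who", "artist", "John Coltrane"),
    Fact("discogs:release:2", "when", "released", "1964"),
]


def entities_matching(facts, dimension, semantic_key, value):
    """Return the set of entity ids that hold an exactly matching fact."""
    return {
        f.entity_id
        for f in facts
        # Assumption: dimensions are stored lower-case and compared case-insensitively.
        if f.dimension == dimension.lower()
        and f.semantic_key == semantic_key
        and f.value == value
    }


miles = entities_matching(facts, "WHO", "artist", "Miles Davis")
print(miles)
```

The point of the sketch is only that a query names a dimension, a semantic key, and a value, never a source column.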
+
+ ---
+
+ ## Install
+
+ ```bash
+ pip install snf-peirce
+ ```
+
+ Or clone and use directly:
+
+ ```bash
+ git clone https://github.com/peirce-lang/snf-peirce
+ cd snf-peirce
+ pip install pandas duckdb requests
+ ```
+
+ ---
+
+ ## Quick start
+
+ ### From a CSV
+
+ ```python
+ import pandas as pd
+ from snf_peirce import suggest, compile_data, query
+
+ df = pd.read_csv("matters.csv")
+ draft = suggest(df)
+ print(draft) # renders as a table in Jupyter
+
+ draft.map("attorney_name", "who", "attorney")
+ draft.map("matter_type", "why", "matter_type")
+ draft.map("fiscal_year", "when", "year")
+ draft.map("office", "where", "office")
+ draft.nucleus_composite(["client_id", "matter_id"],
+                         separator="-", prefix="legal:matter")
+
+ lens = draft.to_lens(lens_id="legal_v1", authority="firm")
+ compiled = compile_data(df, lens)
+
+ query(compiled, 'WHO.attorney = "Smith" AND WHERE.office = "Seattle"')
+ query(compiled, 'WHEN.year BETWEEN "2022" AND "2024"')
+ query(compiled, 'WHO.attorney = "Smith" OR WHO.attorney = "Jones"')
+ ```
+
+ ### From an existing lens-tool lens
+
+ Lenses created by the JavaScript lens-tool are directly compatible:
+
+ ```python
+ from snf_peirce import load, compile_data, query
+ import pandas as pd
+
+ lens = load("discogs_community_v1.json")
+ df = pd.read_csv("discogs_sample.csv")
+ compiled = compile_data(df, lens)
+
+ query(compiled, 'WHO.author = "Miles Davis"')
+ query(compiled, 'WHEN.publication_date BETWEEN "1955" AND "1965"')
+ ```
+
+ ---
+
+ ## Interactive shell
+
+ ```bash
+ python shell.py csv://my_spoke_dir
+ ```
+
+ ```
+ peirce> WHO.author = "Miles Davis"
+ peirce> WHEN.publication_date BETWEEN "1955" AND "1965"
+ peirce> WHO.author = "Miles Davis" AND WHEN.publication_date = "1959"
+ peirce> \schema — show all dimensions and fields
+ peirce> \schema WHO — show fields in WHO with counts
+ peirce> \explain — show execution plan for last query
+ peirce> WHO.<TAB> — TAB-completes field names from substrate
+ peirce> \pivot — toggle wide table view
+ peirce> exit
+ ```
+
+ Shell features beyond the JS version:
+ - **TAB completion** — field names from the actual substrate
+ - **`\explain`** — execution plan with cardinality bars
+ - **`\schema`** — dimensions, fields, entity and value counts
+ - **Discovery expressions** — `WHO|*`, `WHAT|genre|*` work inline
+
+ ---
+
+ ## Guided setup (no coding required)
+
+ ```bash
+ python guided_ingest.py
+ python guided_ingest.py mydata.csv
+ ```
+
+ Walks through CSV → lens authoring → compilation → shell with prompts. No code required.
+
+ ---
+
+ ## Fetch from public APIs
+
+ ### Scryfall (Magic: The Gathering)
+
+ ```bash
+ pip install requests
+ python fetch_scryfall.py # Guilds of Ravnica (default)
+ python fetch_scryfall.py war # War of the Spark
+ python fetch_scryfall.py --list # see available sets
+ ```
+
+ ```
+ peirce> WHAT.guild = "Dimir"
+ peirce> WHAT.color = "Blue" AND WHAT.color = "Black"
+ peirce> WHAT.card_type = "Creature" AND HOW.cmc BETWEEN "1" AND "3"
+ peirce> WHAT.keyword = "Surveil"
+ peirce> WHO.artist = "Seb McKinnon"
+ peirce> WHAT|guild|*
+ ```
+
+ ### Library of Congress catalog
+
+ ```bash
+ python fetch_loc.py # default: jazz music
+ python fetch_loc.py "toni morrison" # keyword search
+ python fetch_loc.py --subject "cooking" # subject search
+ python fetch_loc.py --author "hemingway" # author search
+ python fetch_loc.py --marc-file catalog.mrc # from a .mrc file
+ ```
+
+ ```
+ peirce> WHO.author CONTAINS "Morrison"
+ peirce> WHAT.subject_topic CONTAINS "Jazz"
+ peirce> WHEN.publication_date BETWEEN "1950" AND "1970"
+ peirce> WHERE.publication_place = "New York"
+ peirce> WHAT|subject_topic|*
+ ```
+
+ ### Build your own fetcher
+
+ ```python
+ from base_fetcher import SNFFetcher, fact, facts, facts_from_list
+
+ class MyAPIFetcher(SNFFetcher):
+     lens_id = "myapi_v1"
+     set_name = "My Dataset"
+     spoke_dir = "myapi_spoke"
+
+     def fetch(self):
+         import requests
+         return requests.get("https://api.example.com/data").json()["items"]
+
+     def entity_id(self, item):
+         return f"myapi:{item['id']}"
+
+     def translate(self, item):
+         eid = self.entity_id(item)
+         return [
+             *facts(
+                 (eid, "what", "title", item.get("title")),
+                 (eid, "who", "author", item.get("author")),
+                 (eid, "when", "year", item.get("year")),
+             ),
+             *facts_from_list(eid, "what", "genre", item.get("genres", [])),
+         ]
+
+ if __name__ == "__main__":
+     MyAPIFetcher().run()
+ ```
+
+ ---
+
+ ## Peirce query syntax
+
+ ```
+ WHO.artist = "Miles Davis"                                  equality
+ WHO.artist != "Miles Davis"                                 not equal
+ WHEN.released > "1960"                                      comparison
+ WHEN.released BETWEEN "1955" AND "1965"                     range (inclusive)
+ WHAT.title CONTAINS "Blue"                                  substring
+ WHO.artist PREFIX "Miles"                                   starts with
+ NOT WHERE.office = "Seattle"                                negation
+ WHO.artist = "Miles Davis" AND WHEN.released = "1959"       AND (intersection)
+ WHO.artist = "Miles Davis" OR WHO.artist = "Coltrane"       OR (union)
+ (WHO.artist = "Miles Davis" AND WHEN.released = "1959")
+   OR
+ (WHO.artist = "John Coltrane" AND WHEN.released = "1964")   DNF
+
+ # Discovery (shell only — shows schema)
+ *                                                           all dimensions
+ WHO|*                                                       all fields in WHO
+ WHO|artist|*                                                all values for WHO.artist
+ ```
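The "intersection" and "union" annotations in the listing above suggest set semantics over entity ids. The following is a rough sketch of that reading, not the package's execution engine: `index`, `eq`, and `between` are hypothetical names, the entity ids are made up, and `between` compares values as plain strings, which may differ from how snf-peirce compares them.

```python
# Hypothetical index: (dimension, semantic_key, value) -> set of entity ids.
index = {
    ("WHO", "artist", "Miles Davis"): {"r1", "r2"},
    ("WHO", "artist", "John Coltrane"): {"r3"},
    ("WHEN", "released", "1959"): {"r1", "r4"},
    ("WHEN", "released", "1964"): {"r3"},
}


def eq(dimension, key, value):
    """Entity ids whose fact equals the given value (empty set if none)."""
    return index.get((dimension, key, value), set())


def between(dimension, key, low, high):
    """Entity ids whose value lies in [low, high], both ends inclusive."""
    result = set()
    for (d, k, v), ids in index.items():
        # Assumption: values are compared as strings.
        if d == dimension and k == key and low <= v <= high:
            result |= ids
    return result


miles = eq("WHO", "artist", "Miles Davis")
coltrane = eq("WHO", "artist", "John Coltrane")

and_result = miles & eq("WHEN", "released", "1959")      # AND as intersection
or_result = miles | coltrane                             # OR as union
dnf_result = (miles & eq("WHEN", "released", "1959")) | (
    coltrane & eq("WHEN", "released", "1964")
)                                                        # OR of AND groups (DNF)
range_result = between("WHEN", "released", "1959", "1964")
```

Under this reading a DNF query is a union of intersections, so each parenthesised group can be evaluated on its own.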
+
+ ---
+
+ ## MARC support
+
+ snf-peirce ships with the MARC Bibliographic Lens v1.0 — a complete
+ field mapping from MARC21 tags to SNF dimensions. Python port of
+ `MARCTranslator_v3.js`. No extra dependencies required.
+
+ ```python
+ from parse_marc import parse_mrc
+ from marc_translator import MARCTranslator
+
+ records = parse_mrc("catalog.mrc")
+ translator = MARCTranslator(source_id="loc")
+
+ for record in records:
+     facts = translator.translate_record(record)
+ ```
+
+ Key field mappings:
+
+ | MARC tag | → | Dimension | Semantic key |
+ |---|---|---|---|
+ | 100$a | → | WHO | author |
+ | 245$a+$b | → | WHAT | title |
+ | 260$b / 264$b | → | WHO | publisher |
+ | 260$a / 264$a | → | WHERE | publication_place |
+ | 260$c / 264$c | → | WHEN | publication_date |
+ | 650$a | → | WHAT | subject_topic |
+ | 651$a | → | WHERE | subject_place |
+ | 600$a | → | WHO | subject_person |
+ | 655$a | → | WHAT | genre |
+ | 020$a | → | WHAT | isbn (nucleus) |
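As an illustration of how such a mapping table might be applied, here is a small lookup-based sketch. The rows are copied from the table above; the dictionary layout, the `translate_fields` helper, and the `(tag, subfield, value)` input shape are assumptions made for this example and do not describe the internals of `marc_translator.py`. The 245$a+$b row is left out because it combines two subfields.

```python
# (MARC tag, subfield code) -> (SNF dimension, semantic key), from the table above.
KEY_MAPPINGS = {
    ("100", "a"): ("WHO", "author"),
    ("260", "b"): ("WHO", "publisher"),
    ("264", "b"): ("WHO", "publisher"),
    ("260", "a"): ("WHERE", "publication_place"),
    ("264", "a"): ("WHERE", "publication_place"),
    ("260", "c"): ("WHEN", "publication_date"),
    ("264", "c"): ("WHEN", "publication_date"),
    ("650", "a"): ("WHAT", "subject_topic"),
    ("651", "a"): ("WHERE", "subject_place"),
    ("600", "a"): ("WHO", "subject_person"),
    ("655", "a"): ("WHAT", "genre"),
    ("020", "a"): ("WHAT", "isbn"),
}


def translate_fields(fields):
    """Map (tag, subfield, value) triples to (dimension, key, value) triples.

    Fields without an entry in KEY_MAPPINGS are skipped.
    """
    translated = []
    for tag, code, value in fields:
        mapping = KEY_MAPPINGS.get((tag, code))
        if mapping is not None:
            dimension, key = mapping
            translated.append((dimension, key, value))
    return translated


# Hypothetical sample record; "999" is a local tag with no mapping.
sample = [
    ("100", "a", "Morrison, Toni"),
    ("650", "a", "Jazz"),
    ("999", "a", "local note"),
]
translated = translate_fields(sample)
```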
+
+ ---
+
+ ## Jupyter workflow
+
+ Results render as tables inline. Output is pandas.
+
+ ```python
+ import pandas as pd
+ from snf_peirce import suggest, compile_data, query
+
+ df = pd.read_csv("my_collection.csv")
+ draft = suggest(df) # renders as mapping table
+ lens = draft.to_lens(lens_id="discogs_v1", authority="me") # after mapping, as in the first example
+ compiled = compile_data(df, lens) # renders substrate summary
+
+ query(compiled, 'WHO.artist = "Miles Davis"') # renders result table inline
+
+ result = query(compiled, 'WHEN.released BETWEEN "1955" AND "1965"', limit=None)
+ df_result = result.to_dataframe() # pandas DataFrame — use anything
+ df_result.groupby("semantic_key")["value"].value_counts()
+ ```
+
+ ---
+
+ ## File inventory
+
+ | File | Purpose |
+ |---|---|
+ | `parser.py` | Peirce query parser — conformant with JS reference |
+ | `lens.py` | Lens authoring: `suggest()`, `LensDraft`, `load()`, `save()` |
+ | `compile.py` | Data compilation: `compile_data()`, `Substrate` |
+ | `peirce.py` | Query execution: `query()`, `execute()`, `ResultSet` |
+ | `shell.py` | Interactive query shell |
+ | `guided_ingest.py` | Guided setup script — no coding required |
+ | `base_fetcher.py` | Base class for API fetchers |
+ | `fetch_scryfall.py` | Scryfall / Magic: The Gathering fetcher |
+ | `fetch_loc.py` | Library of Congress catalog fetcher |
+ | `marc_translator.py` | MARC Bibliographic Lens v1.0 |
+ | `parse_marc.py` | Pure Python binary MARC / MARCXML parser |
+
+ ---
+
+ ## Running the tests
+
+ ```bash
+ python -m pytest test_parser.py -v # 59 tests — parser conformance
+ python test_lens.py # 69 tests — lens authoring
+ python test_compile.py # 57 tests — compilation and queries
+ python test_peirce.py # 57 tests — end-to-end query
+ python test_conformance.py # 37 tests — cross-language proof
+ ```
+
+ ---
+
+ ## Architecture
+
+ ```
+ SNF / Peirce specification    open protocol (MIT)
+
+ snf-peirce                    Python runtime / engine  ← this package
+
+ Reckoner                      visual application for non-technical users
+ ```
+
+ snf-peirce is Reckoner's data engine. It is also usable standalone
+ for data practitioners who want SNF in Python and Jupyter workflows.
+
+ ---
+
+ ## Note on Portolan
+
+ The `\explain` shell command displays a simplified execution plan
+ showing constraints ordered by estimated cardinality. This implements
+ the same ordering heuristic as Portolan's I1 algorithm for display
+ purposes.
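Ordering constraints by estimated cardinality can be sketched in a few lines. This is a generic illustration and not Portolan's I1 algorithm or the shell's actual code: the helper names are hypothetical, the cardinality is taken to be the exact size of each matching set rather than an estimate, and ascending order (most selective constraint first) is an assumption, since the text does not state the direction.

```python
def order_by_cardinality(constraints):
    """Sort (label, matching_ids) pairs so the smallest set comes first."""
    return sorted(constraints, key=lambda pair: len(pair[1]))


def intersect_in_order(constraints):
    """Return (plan, result): labels in evaluation order and the final id set."""
    ordered = order_by_cardinality(constraints)
    if not ordered:
        return [], set()
    result = set(ordered[0][1])
    for _, ids in ordered[1:]:
        result &= ids
        if not result:
            break  # nothing left to narrow down
    return [label for label, _ in ordered], result


# Hypothetical constraints with made-up entity ids.
constraints = [
    ('WHEN.released = "1959"', {"r1", "r4", "r7", "r9"}),
    ('WHO.artist = "Miles Davis"', {"r1", "r2"}),
    ('WHAT.genre = "Jazz"', {"r1", "r2", "r4", "r5", "r7", "r9"}),
]
plan, result = intersect_in_order(constraints)
```

Starting from the smallest set keeps every intermediate result no larger than the most selective constraint, which is the usual motivation for this kind of ordering.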
+
+ Full Portolan — schema validation, type checking, query rejection,
+ composite constraint reasoning — is a separate licensed component
+ not included in this package.
+
+ ---
+
+ ## License
+
+ MIT. See LICENSE file.
+
+ SNF specification, Peirce query language, and MARC Bibliographic
+ Lens v1.0 are original works. Attribution required in source code
+ and documentation. See project licensing documentation for details
+ on Portolan and Reckoner licensing.