retold-facto 0.1.0 → 0.1.1
- package/README.md +142 -0
- package/docs/.nojekyll +0 -0
- package/docs/README.md +73 -0
- package/docs/_cover.md +18 -0
- package/docs/_sidebar.md +30 -0
- package/docs/_topbar.md +13 -0
- package/docs/_version.json +7 -0
- package/docs/api-reference.md +245 -0
- package/docs/architecture.md +411 -0
- package/docs/css/docuserve.css +327 -0
- package/docs/index.html +39 -0
- package/docs/quickstart.md +254 -0
- package/docs/retold-catalog.json +130 -0
- package/docs/retold-keyword-index.json +11123 -0
- package/docs/subsystems/audit.md +190 -0
- package/docs/subsystems/connection.md +281 -0
- package/docs/subsystems/mapping.md +218 -0
- package/docs/subsystems/projection.md +351 -0
- package/docs/subsystems/recordset.md +282 -0
- package/docs/ultravisor-integration.md +335 -0
- package/package.json +53 -13
package/README.md
ADDED
|
@@ -0,0 +1,142 @@
|
|
|
1
|
+
# Retold Facto
|
|
2
|
+
|
|
3
|
+
[MIT License](https://opensource.org/licenses/MIT)
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
A data warehouse and knowledge graph for the Retold ecosystem. Facto ingests records from arbitrary sources, tracks their provenance and certainty, compiles them into projections via declarative mappings, and deploys those projections to any Meadow-supported backend. It runs as a standalone REST server, a Pict web application, and -- optionally -- as an [Ultravisor](https://github.com/stevenvelozo/ultravisor) beacon exposing its ingest / transform / deploy operations as workflow capabilities.
|
|
8
|
+
|
|
9
|
+
## Features
|
|
10
|
+
|
|
11
|
+
- **Records + Certainty** -- Every ingested record carries source provenance, schema version, ingest-job lineage, and a configurable `CertaintyIndex` entry so downstream queries can filter on confidence
|
|
12
|
+
- **Ingest Engine** -- Batch ingests from CSV, JSON, folder scans, or direct API calls; tracks `IngestJob` status, dedupes with content signatures, and auto-increments dataset versions
|
|
13
|
+
- **Projection Engine** -- Compiles raw records into flat, denormalized projections using declarative `Mappings` JSON and five built-in merge strategies (WriteAll, FirstWriteWins, ReliabilityOverwrite, MergeAndReinforce, FieldFillOnly)
|
|
14
|
+
- **Connection Manager** -- First-class support for SQLite, MySQL, PostgreSQL, and MSSQL projection targets via Meadow connectors; passwords are masked in API responses
|
|
15
|
+
- **Mapping DSL** -- `Entity + GUIDTemplate + Mappings` JSON descriptors drive `meadow-integration`'s `TabularTransform` for flattening, comprehension, and de-duplication
|
|
16
|
+
- **Source Catalog** -- Research-grade catalog with `SourceCatalogEntry`, `CatalogDatasetDefinition`, and a folder scanner that discovers datasets from `README.md` files
|
|
17
|
+
- **Multi-Entity Web UI** -- Two Pict browser applications (`pict-app` and `pict-app-full`) provide source, dataset, record, projection, mapping, and connection management
|
|
18
|
+
- **Ultravisor Beacon** -- Optional beacon mode exposes three capabilities (`FactoData`, `FactoTransform`, `FactoDeploy`) so workflows can orchestrate Facto remotely
|
|
19
|
+
- **Meadow Native** -- Schema is defined in a single stricture JSON file; REST endpoints for every entity come for free via `meadow-endpoints`
|
|
20
|
+
- **Orator Native** -- Built on the standard Retold Orator + Restify stack; every subsystem exposes its own REST surface
|
|
21
|
+
|
|
22
|
+
## Installation
|
|
23
|
+
|
|
24
|
+
```bash
|
|
25
|
+
npm install retold-facto
|
|
26
|
+
# or globally for the CLI
|
|
27
|
+
npm install -g retold-facto
|
|
28
|
+
```
|
|
29
|
+
|
|
30
|
+
## Quick Start
|
|
31
|
+
|
|
32
|
+
```bash
|
|
33
|
+
# Initialize the default SQLite schema
|
|
34
|
+
retold-facto init
|
|
35
|
+
|
|
36
|
+
# Start the server (default :8386)
|
|
37
|
+
retold-facto serve
|
|
38
|
+
|
|
39
|
+
# Start on a custom port with a custom database
|
|
40
|
+
retold-facto serve --port 9000 --db /var/data/facto.sqlite
|
|
41
|
+
|
|
42
|
+
# Scan a folder of README-based dataset definitions
|
|
43
|
+
retold-facto scan ./my-data
|
|
44
|
+
```
|
|
45
|
+
|
|
46
|
+
Open `http://localhost:8386/` for the web UI, or hit the REST API at `/1.0/*` (auto-generated Meadow CRUD) and `/facto/*` (subsystem endpoints).
|
|
47
|
+
|
|
48
|
+
## CLI
|
|
49
|
+
|
|
50
|
+
```bash
|
|
51
|
+
retold-facto <command> [options]
|
|
52
|
+
|
|
53
|
+
Commands:
|
|
54
|
+
serve [default] Start the REST API + Pict web UI
|
|
55
|
+
init Create the default schema (21 tables)
|
|
56
|
+
ingest <file> [dataset-id] [source-id] [type]
|
|
57
|
+
Parse and ingest a CSV/JSON file
|
|
58
|
+
source list | add Source CRUD shortcuts
|
|
59
|
+
dataset list | add Dataset CRUD shortcuts
|
|
60
|
+
scan <folder> Discover datasets from README-annotated folders
|
|
61
|
+
scan provision <folder> Provision discovered datasets into Facto
|
|
62
|
+
scan ingest <folder> Ingest discovered datasets end-to-end
|
|
63
|
+
|
|
64
|
+
Options:
|
|
65
|
+
-c, --config <file> JSON configuration file
|
|
66
|
+
-p, --port <port> API server port (default 8386)
|
|
67
|
+
-d, --db <path> SQLite database path (default ./data/facto.sqlite)
|
|
68
|
+
-s, --scan-path <path> Add a scan path (repeatable)
|
|
69
|
+
-l, --log [path] Write a log file
|
|
70
|
+
```
|
|
71
|
+
|
|
72
|
+
## Subsystems
|
|
73
|
+
|
|
74
|
+
Facto is composed of twelve service managers layered over Meadow. Each one owns a subset of the schema and a subset of the REST surface:
|
|
75
|
+
|
|
76
|
+
| Subsystem | Purpose | Docs |
|
|
77
|
+
|---|---|---|
|
|
78
|
+
| **Recordset** | Records, CertaintyIndex, IngestJob lifecycle | [docs/subsystems/recordset.md](docs/subsystems/recordset.md) |
|
|
79
|
+
| **Projection** | MultiSetProjection, ProjectionStore, merge strategies, deployment | [docs/subsystems/projection.md](docs/subsystems/projection.md) |
|
|
80
|
+
| **Mapping** | `Entity + GUIDTemplate + Mappings` transform descriptors | [docs/subsystems/mapping.md](docs/subsystems/mapping.md) |
|
|
81
|
+
| **Connection** | External database connections for projection targets | [docs/subsystems/connection.md](docs/subsystems/connection.md) |
|
|
82
|
+
| **Audit** | Timestamped CRUD columns, ingest job logs, certainty logs | [docs/subsystems/audit.md](docs/subsystems/audit.md) |
|
|
83
|
+
|
|
84
|
+
## Ultravisor Integration
|
|
85
|
+
|
|
86
|
+
Facto can register as an [Ultravisor](https://github.com/stevenvelozo/ultravisor) beacon and expose its operations as workflow capabilities:
|
|
87
|
+
|
|
88
|
+
- **`FactoData`** -- Source / Dataset / Record / IngestJob / ProjectionStore CRUD
|
|
89
|
+
- **`FactoTransform`** -- Apply a mapping to a batch of records (pure function, no side effects)
|
|
90
|
+
- **`FactoDeploy`** -- Deploy a projection schema to an external store
|
|
91
|
+
|
|
92
|
+
In a typical Retold deployment, Ultravisor orchestrates pipelines that dispatch these capabilities to one or more Facto beacons running close to their data sources. See [docs/ultravisor-integration.md](docs/ultravisor-integration.md) for the full beacon contract and workflow patterns.
|
|
93
|
+
|
|
94
|
+
Facto also runs perfectly well without Ultravisor -- beacon mode is optional.
|
|
95
|
+
|
|
96
|
+
## Documentation
|
|
97
|
+
|
|
98
|
+
- [Overview](docs/README.md)
|
|
99
|
+
- [Quick Start](docs/quickstart.md)
|
|
100
|
+
- [Architecture](docs/architecture.md)
|
|
101
|
+
- [API Reference](docs/api-reference.md)
|
|
102
|
+
- [Ultravisor Integration](docs/ultravisor-integration.md)
|
|
103
|
+
- **Subsystems**
|
|
104
|
+
- [Recordset](docs/subsystems/recordset.md)
|
|
105
|
+
- [Projection](docs/subsystems/projection.md)
|
|
106
|
+
- [Connection](docs/subsystems/connection.md)
|
|
107
|
+
- [Mapping](docs/subsystems/mapping.md)
|
|
108
|
+
- [Audit](docs/subsystems/audit.md)
|
|
109
|
+
|
|
110
|
+
## Testing
|
|
111
|
+
|
|
112
|
+
```bash
|
|
113
|
+
npm test # Mocha TDD unit tests
|
|
114
|
+
npm run test-browser # Puppeteer headless browser tests
|
|
115
|
+
```
|
|
116
|
+
|
|
117
|
+
## Building
|
|
118
|
+
|
|
119
|
+
```bash
|
|
120
|
+
npm run build
|
|
121
|
+
npm run build-codemirror
|
|
122
|
+
```
|
|
123
|
+
|
|
124
|
+
## Related Packages
|
|
125
|
+
|
|
126
|
+
- [meadow](https://github.com/stevenvelozo/meadow) -- ORM / query DSL
|
|
127
|
+
- [meadow-endpoints](https://github.com/stevenvelozo/meadow-endpoints) -- auto-generated REST CRUD
|
|
128
|
+
- [meadow-integration](https://github.com/stevenvelozo/meadow-integration) -- `TabularTransform` and `CertaintyAccumulator` used by the projection engine
|
|
129
|
+
- [orator](https://github.com/stevenvelozo/orator) -- REST server framework
|
|
130
|
+
- [pict](https://github.com/stevenvelozo/pict) -- MVC framework for the web UI
|
|
131
|
+
- [stricture](https://github.com/stevenvelozo/stricture) -- schema definition language (MicroDDL)
|
|
132
|
+
- [ultravisor](https://github.com/stevenvelozo/ultravisor) -- workflow orchestrator (optional beacon target)
|
|
133
|
+
- [ultravisor-beacon](https://github.com/stevenvelozo/ultravisor-beacon) -- beacon protocol client
|
|
134
|
+
- [bibliograph](https://github.com/stevenvelozo/bibliograph) -- (dependency; reserved for richer audit logging)
|
|
135
|
+
|
|
136
|
+
## License
|
|
137
|
+
|
|
138
|
+
MIT
|
|
139
|
+
|
|
140
|
+
## Contributing
|
|
141
|
+
|
|
142
|
+
Pull requests welcome. See the [Retold Contributing Guide](https://github.com/stevenvelozo/retold/blob/main/docs/contributing.md) for the code of conduct, contribution process, and testing requirements.
|
package/docs/.nojekyll
ADDED
|
File without changes
|
package/docs/README.md
ADDED
|
@@ -0,0 +1,73 @@
|
|
|
1
|
+
# Retold Facto
|
|
2
|
+
|
|
3
|
+
> A data warehouse and knowledge graph for the Retold ecosystem
|
|
4
|
+
|
|
5
|
+
Retold Facto is the persistence and transform layer that sits between raw data sources and the analytical stores, search indexes, and reporting systems that consume them. It is built as a Fable service that hosts an Orator REST server, a Meadow-backed data access layer, a pair of Pict web applications for browser-based management, and an optional Ultravisor beacon for workflow-driven orchestration.
|
|
6
|
+
|
|
7
|
+
Facto is the answer to a specific problem: **you have a lot of data from a lot of sources with different schemas, different reliability, different update cadences, and different downstream consumers, and you need a single place to store it, audit its provenance, transform it, and publish it.** The core design decisions follow from that problem:
|
|
8
|
+
|
|
9
|
+
- **Raw before refined.** Every record lands in the warehouse verbatim, tagged with the source it came from, the ingest job that produced it, and a certainty value. Projections are computed downstream from the raw records, not in place of them.
|
|
10
|
+
- **Mappings as data.** How you flatten raw records into an analytical shape is itself a first-class entity -- a row in the `ProjectionMapping` table with a `MappingConfiguration` JSON blob. You can edit mappings through the web UI, version them, export them, and re-run them.
|
|
11
|
+
- **Everything is Meadow.** The schema is a stricture JSON file. Every entity gets automatic CRUD endpoints, queryable filters, and connector flexibility. Swap SQLite for MySQL or PostgreSQL without touching the business logic.
|
|
12
|
+
- **Workflows, not cron jobs.** Facto can register as an Ultravisor beacon and expose its ingest / transform / deploy operations as workflow capabilities. Ultravisor handles the scheduling, retry, and distribution; Facto handles the work.
|
|
13
|
+
|
|
14
|
+
The twelve service managers inside Facto each own a small, focused piece of the system:
|
|
15
|
+
|
|
16
|
+
1. **RetoldFactoSourceManager** -- source CRUD and activation
|
|
17
|
+
2. **RetoldFactoDatasetManager** -- dataset CRUD and type filtering
|
|
18
|
+
3. **RetoldFactoRecordManager** -- record ingest and certainty assignment
|
|
19
|
+
4. **RetoldFactoIngestEngine** -- file-based and batch-based ingest with job tracking
|
|
20
|
+
5. **RetoldFactoProjectionEngine** -- projection compilation, deployment, and querying (the largest subsystem)
|
|
21
|
+
6. **RetoldFactoCatalogManager** -- catalog entries and provisioning helpers
|
|
22
|
+
7. **RetoldFactoStoreConnectionManager** -- external database connections with masked passwords
|
|
23
|
+
8. **RetoldFactoSchemaManager** -- schema versioning and MicroDDL compilation
|
|
24
|
+
9. **RetoldFactoSourceFolderScanner** -- README-driven folder discovery
|
|
25
|
+
10. **RetoldFactoDataLakeService** -- offline data lake primitives
|
|
26
|
+
11. **ThroughputMonitor** -- per-stage performance metrics
|
|
27
|
+
12. **RetoldFactoBeaconProvider** -- Ultravisor beacon registration (optional)
|
|
28
|
+
|
|
29
|
+
## Features
|
|
30
|
+
|
|
31
|
+
- **Raw Warehouse** -- 21 tables covering sources, datasets, records, ingest jobs, binary attachments, and certainty metadata
|
|
32
|
+
- **Batch Ingest** -- `IngestEngine` with file scanning, job tracking, content-signature dedupe, and automatic dataset versioning
|
|
33
|
+
- **Projection Compilation** -- `ProjectionEngine` applies `Mappings` JSON to raw records and produces de-duped comprehensions
|
|
34
|
+
- **Merge Strategies** -- Five built-in strategies (`WriteAll`, `FirstWriteWins`, `ReliabilityOverwrite`, `MergeAndReinforce`, `FieldFillOnly`) for combining records from multiple sources
|
|
35
|
+
- **Projection Deployment** -- Flat output materialized to any Meadow-supported backend (SQLite, MySQL, PostgreSQL, MSSQL)
|
|
36
|
+
- **Web UI** -- Two Pict browser applications: a compact `pict-app` and a full-featured `pict-app-full` with per-entity editors and projection flow diagrams
|
|
37
|
+
- **Source Catalog** -- Research-grade catalog entries, folder scanning via README conventions, and automatic provisioning
|
|
38
|
+
- **Optional Beacon** -- Register as an Ultravisor beacon and expose `FactoData`, `FactoTransform`, and `FactoDeploy` capabilities
|
|
39
|
+
- **Path Safety** -- All filesystem operations use sanitized paths relative to the configured data directory
|
|
40
|
+
|
|
41
|
+
## When to Use It
|
|
42
|
+
|
|
43
|
+
Reach for Facto when you need:
|
|
44
|
+
|
|
45
|
+
- A single warehouse for records from heterogeneous sources with provenance tracking
|
|
46
|
+
- A declarative transform pipeline that produces analytical projections without writing ETL code
|
|
47
|
+
- A REST surface and a web UI for managing sources, datasets, records, projections, and mappings
|
|
48
|
+
- A target that integrates with Ultravisor for workflow-driven data operations
|
|
49
|
+
- A place to experiment with multi-source merge strategies and certainty-weighted joins
|
|
50
|
+
|
|
51
|
+
Skip it if your data pipeline is a single-source ETL from one CSV to one database -- you can do that with Meadow directly and avoid the projection, certainty, and catalog overhead.
|
|
52
|
+
|
|
53
|
+
## Relationship with Ultravisor
|
|
54
|
+
|
|
55
|
+
Facto and Ultravisor are designed to work together, but neither requires the other:
|
|
56
|
+
|
|
57
|
+
- **Facto standalone** -- `retold-facto serve` starts the REST server, loads the schema, and opens the web UI. Ingest runs when you POST to it, projections compile when you invoke the compile endpoint, and deployment happens on demand.
|
|
58
|
+
- **Facto + Ultravisor** -- `retold-facto serve --beacon <url>` additionally registers the beacon with an Ultravisor coordinator. Workflow operations can then dispatch `FactoData`, `FactoTransform`, and `FactoDeploy` capabilities to the beacon instead of calling REST endpoints directly.
|
|
59
|
+
|
|
60
|
+
The beacon mode is most useful in distributed deployments where several Facto instances run close to their data sources and an Ultravisor coordinator orchestrates the pipeline that flows through them. See [Ultravisor Integration](ultravisor-integration.md) for the full contract.
|
|
61
|
+
|
|
62
|
+
## Learn More
|
|
63
|
+
|
|
64
|
+
- [Quick Start](quickstart.md) -- install, init, serve, and run your first ingest
|
|
65
|
+
- [Architecture](architecture.md) -- process layout, class hierarchy, and data-flow diagrams
|
|
66
|
+
- [API Reference](api-reference.md) -- every REST endpoint, grouped by subsystem
|
|
67
|
+
- [Ultravisor Integration](ultravisor-integration.md) -- beacon capabilities and workflow patterns
|
|
68
|
+
- **Subsystem guides:**
|
|
69
|
+
- [Recordset](subsystems/recordset.md) -- records, certainty, and ingest jobs
|
|
70
|
+
- [Projection](subsystems/projection.md) -- compilation, merge strategies, deployment
|
|
71
|
+
- [Mapping](subsystems/mapping.md) -- the transform DSL and how to write one
|
|
72
|
+
- [Connection](subsystems/connection.md) -- external projection store management
|
|
73
|
+
- [Audit](subsystems/audit.md) -- what is tracked and how to query it
|
package/docs/_cover.md
ADDED
|
@@ -0,0 +1,18 @@
|
|
|
1
|
+
# Retold Facto
|
|
2
|
+
|
|
3
|
+
> A data warehouse and knowledge graph for the Retold ecosystem
|
|
4
|
+
|
|
5
|
+
Ingest records from anywhere, track their provenance and certainty, compile them into projections via declarative mappings, and deploy those projections to any Meadow-supported backend. Run it standalone, as a Pict web application, or as an Ultravisor beacon exposing ingest / transform / deploy as workflow capabilities.
|
|
6
|
+
|
|
7
|
+
- **Records + Certainty** -- Every record carries source provenance, ingest-job lineage, and a configurable certainty index
|
|
8
|
+
- **Ingest Engine** -- CSV, JSON, folder-scan, and API ingest with dedupe, versioning, and job tracking
|
|
9
|
+
- **Projection Engine** -- Compile raw records into flat projections via declarative mappings and five merge strategies
|
|
10
|
+
- **Connection Manager** -- SQLite, MySQL, PostgreSQL, and MSSQL projection targets via Meadow connectors
|
|
11
|
+
- **Source Catalog** -- Research-grade catalog with README-based folder discovery
|
|
12
|
+
- **Ultravisor Beacon** -- Optional beacon mode exposes FactoData, FactoTransform, and FactoDeploy as workflow capabilities
|
|
13
|
+
|
|
14
|
+
[Overview](README.md)
|
|
15
|
+
[Quick Start](quickstart.md)
|
|
16
|
+
[Architecture](architecture.md)
|
|
17
|
+
[Ultravisor Integration](ultravisor-integration.md)
|
|
18
|
+
[GitHub](https://github.com/stevenvelozo/retold-facto)
|
package/docs/_sidebar.md
ADDED
|
@@ -0,0 +1,30 @@
|
|
|
1
|
+
- Getting Started
|
|
2
|
+
|
|
3
|
+
- [Overview](README.md)
|
|
4
|
+
- [Quick Start](quickstart.md)
|
|
5
|
+
|
|
6
|
+
- Reference
|
|
7
|
+
|
|
8
|
+
- [Architecture](architecture.md)
|
|
9
|
+
- [API Reference](api-reference.md)
|
|
10
|
+
- [Ultravisor Integration](ultravisor-integration.md)
|
|
11
|
+
|
|
12
|
+
- Subsystems
|
|
13
|
+
|
|
14
|
+
- [Recordset](subsystems/recordset.md)
|
|
15
|
+
- [Projection](subsystems/projection.md)
|
|
16
|
+
- [Mapping](subsystems/mapping.md)
|
|
17
|
+
- [Connection](subsystems/connection.md)
|
|
18
|
+
- [Audit](subsystems/audit.md)
|
|
19
|
+
|
|
20
|
+
- Retold Ecosystem
|
|
21
|
+
|
|
22
|
+
- [Meadow](/meadow/meadow/)
|
|
23
|
+
- [Meadow Endpoints](/meadow/meadow-endpoints/)
|
|
24
|
+
- [Meadow Integration](/meadow/meadow-integration/)
|
|
25
|
+
- [Orator](/orator/orator/)
|
|
26
|
+
- [Pict](/pict/pict/)
|
|
27
|
+
- [Stricture](/utility/stricture/)
|
|
28
|
+
- [Ultravisor](/apps/ultravisor/)
|
|
29
|
+
- [Ultravisor Beacon](/utility/ultravisor-beacon/)
|
|
30
|
+
- [Fable](/fable/fable/)
|
package/docs/_topbar.md
ADDED
|
@@ -0,0 +1,13 @@
|
|
|
1
|
+
# Retold Facto
|
|
2
|
+
|
|
3
|
+
- [Overview](README.md)
|
|
4
|
+
- [Quick Start](quickstart.md)
|
|
5
|
+
- [Architecture](architecture.md)
|
|
6
|
+
- [API Reference](api-reference.md)
|
|
7
|
+
- [Ultravisor](ultravisor-integration.md)
|
|
8
|
+
- [Recordset](subsystems/recordset.md)
|
|
9
|
+
- [Projection](subsystems/projection.md)
|
|
10
|
+
- [Mapping](subsystems/mapping.md)
|
|
11
|
+
- [Connection](subsystems/connection.md)
|
|
12
|
+
- [Audit](subsystems/audit.md)
|
|
13
|
+
- [GitHub](https://github.com/stevenvelozo/retold-facto)
|
package/docs/api-reference.md
ADDED
|
@@ -0,0 +1,245 @@
|
|
|
1
|
+
# API Reference
|
|
2
|
+
|
|
3
|
+
Facto exposes two parallel REST surfaces:
|
|
4
|
+
|
|
5
|
+
1. **`/1.0/<Entity>`** -- auto-generated Meadow CRUD endpoints for every entity in the schema (created by `meadow-endpoints` from the stricture JSON)
|
|
6
|
+
2. **`/facto/*`** -- subsystem-specific endpoints registered by each service manager
|
|
7
|
+
|
|
8
|
+
This reference groups endpoints by subsystem. For the Meadow CRUD conventions (filtering, sorting, bulk operations, page size, etc.) see the [meadow-endpoints](https://github.com/stevenvelozo/meadow-endpoints) documentation.
|
|
9
|
+
|
|
10
|
+
## Meadow CRUD
|
|
11
|
+
|
|
12
|
+
For every entity in the schema, `meadow-endpoints` generates:
|
|
13
|
+
|
|
14
|
+
| Method | Path | Purpose |
|
|
15
|
+
|---|---|---|
|
|
16
|
+
| `GET` | `/1.0/<Entity>` | List (supports `FBV~Column~Op~Value` filter expressions, page / pageSize, sort) |
|
|
17
|
+
| `GET` | `/1.0/<Entity>/:ID` | Read a single row |
|
|
18
|
+
| `POST` | `/1.0/<Entity>` | Create a row |
|
|
19
|
+
| `PUT` | `/1.0/<Entity>/:ID` | Update a row |
|
|
20
|
+
| `DELETE` | `/1.0/<Entity>/:ID` | Soft-delete a row |
|
|
21
|
+
| `GET` | `/1.0/<Entity>/Count` | Row count (with optional filter) |
|
|
22
|
+
|
|
23
|
+
Entities: `Source`, `SourceDocumentation`, `Dataset`, `DatasetSource`, `Record`, `RecordBinary`, `CertaintyIndex`, `IngestJob`, `SourceCatalogEntry`, `CatalogDatasetDefinition`, `MultiSetProjection`, `ProjectionStore`, `ProjectionMapping`, `ProjectionCertaintyLog`, `StoreConnection`, `FactoSchema`, `SchemaDocumentation`, `SchemaVersion`, `ThroughputEvent`.
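
The CRUD table above maps onto a small path builder. The sketch below is illustrative only: `crudPaths` and `filterStanza` are hypothetical client-side helpers, not part of `retold-facto`, and the `EQ` operator is an assumption. How a filter stanza is attached to the list request is left to the meadow-endpoints documentation.

```javascript
// Sketch only: builds the Meadow CRUD routes listed in the table above.
// API_PREFIX and both helper names are illustrative, not retold-facto exports.
const API_PREFIX = '/1.0';

function crudPaths(entity, id) {
  const base = `${API_PREFIX}/${entity}`;
  return {
    list: { method: 'GET', path: base },
    count: { method: 'GET', path: `${base}/Count` },
    read: { method: 'GET', path: `${base}/${id}` },
    create: { method: 'POST', path: base },
    update: { method: 'PUT', path: `${base}/${id}` },
    remove: { method: 'DELETE', path: `${base}/${id}` },
  };
}

// Assemble one filter stanza in the FBV~Column~Op~Value form.
// The operator string (for example 'EQ') is an assumption; check the
// meadow-endpoints documentation for the operators it accepts.
function filterStanza(column, operator, value) {
  return ['FBV', column, operator, encodeURIComponent(String(value))].join('~');
}

const paths = crudPaths('Source', 12);
console.log(paths.update);
console.log(filterStanza('Name', 'EQ', 'Census 2020'));
```

Keeping the route table in one place means a renamed entity or a changed prefix only needs one edit on the client side.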
|
|
24
|
+
|
|
25
|
+
## Source Manager (`/facto/source/*`)
|
|
26
|
+
|
|
27
|
+
| Method | Path | Purpose |
|
|
28
|
+
|---|---|---|
|
|
29
|
+
| `GET` | `/facto/source/by-hash/:Hash` | Look up a source by its human-readable hash |
|
|
30
|
+
| `GET` | `/facto/sources/active` | List only active (not soft-deleted, not deactivated) sources |
|
|
31
|
+
| `PUT` | `/facto/source/:IDSource/activate` | Mark a source as active |
|
|
32
|
+
| `PUT` | `/facto/source/:IDSource/deactivate` | Mark a source as inactive |
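
The `/facto/*` routes in this reference use `:Param` placeholders. As a rough illustration of turning a documented route into a concrete path, the sketch below uses a hypothetical `fillRoute` helper (not a `retold-facto` export); the ID and hash values are made up.

```javascript
// Fill `:Param` placeholders in a route template taken from this reference.
// fillRoute is a hypothetical client-side helper, not a retold-facto export.
function fillRoute(template, params) {
  return template.replace(/:([A-Za-z]+)/g, (match, name) => {
    if (!(name in params)) {
      throw new Error(`Missing route parameter: ${name}`);
    }
    // Encode each value so that it cannot break out of its path segment.
    return encodeURIComponent(String(params[name]));
  });
}

console.log(fillRoute('/facto/source/:IDSource/activate', { IDSource: 7 }));
console.log(fillRoute('/facto/source/by-hash/:Hash', { Hash: 'census-2020' }));
```

Throwing on a missing parameter surfaces a typo in the caller immediately instead of sending a request to a path that still contains a literal `:IDSource`.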
|
|
33
|
+
|
|
34
|
+
## Dataset Manager (`/facto/dataset/*`)
|
|
35
|
+
|
|
36
|
+
| Method | Path | Purpose |
|
|
37
|
+
|---|---|---|
|
|
38
|
+
| `GET` | `/facto/dataset/:IDDataset` | Fetch a dataset with its full metadata |
|
|
39
|
+
| `GET` | `/facto/dataset/by-hash/:Hash` | Look up a dataset by hash |
|
|
40
|
+
| `GET` | `/facto/datasets/by-type/:Type` | List datasets filtered by type (`Raw`, `Compositional`, `Projection`, `Derived`) |
|
|
41
|
+
| `GET` | `/facto/datasets/types` | List the allowed dataset types |
|
|
42
|
+
|
|
43
|
+
## Record Manager (`/facto/record/*`)
|
|
44
|
+
|
|
45
|
+
| Method | Path | Purpose |
|
|
46
|
+
|---|---|---|
|
|
47
|
+
| `POST` | `/facto/record/ingest` | Batch-ingest an array of records into a dataset. Body: `{ Records, IDDataset, IDSource, Type, DefaultCertainty }`. Returns `{ Ingested, Errors }`. |
|
|
48
|
+
|
|
49
|
+
See the [Recordset subsystem](subsystems/recordset.md) for the record shape and certainty semantics.
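
A batch ingest call might be assembled along these lines. This is a minimal sketch, not the project's client: the body keys come from the table above, while the base URL, the record fields, the `'JSON'` type value, the `0.8` certainty, and the JSON content type are placeholders or assumptions.

```javascript
// Build the request for POST /facto/record/ingest.
// Body keys follow the table above; every value used below is a placeholder.
function buildIngestRequest(baseUrl, records, options) {
  const body = {
    Records: records,
    IDDataset: options.IDDataset,
    IDSource: options.IDSource,
    Type: options.Type,
    DefaultCertainty: options.DefaultCertainty,
  };
  return {
    url: new URL('/facto/record/ingest', baseUrl).toString(),
    init: {
      method: 'POST',
      // Assumes the endpoint accepts a JSON body.
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(body),
    },
  };
}

// Send the request with the global fetch available in Node 18+.
// Not called here, so the sketch runs without a live server.
async function ingestRecords(baseUrl, records, options) {
  const { url, init } = buildIngestRequest(baseUrl, records, options);
  const response = await fetch(url, init);
  return response.json(); // resolves to the parsed JSON payload
}

const request = buildIngestRequest(
  'http://localhost:8386',
  [{ Name: 'Example row' }],
  { IDDataset: 1, IDSource: 1, Type: 'JSON', DefaultCertainty: 0.8 }
);
console.log(request.url);
```

Separating request construction from the network call keeps the payload easy to inspect and to unit-test.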
|
|
50
|
+
|
|
51
|
+
## Ingest Engine (`/facto/ingest/*`)
|
|
52
|
+
|
|
53
|
+
| Method | Path | Purpose |
|
|
54
|
+
|---|---|---|
|
|
55
|
+
| `GET` | `/facto/ingest/jobs` | List all ingest jobs, optionally filtered by dataset, source, or status |
|
|
56
|
+
| `GET` | `/facto/ingest/job/:IDIngestJob` | Fetch a job's full metadata and log |
|
|
57
|
+
| `GET` | `/facto/ingest/statuses` | List the allowed ingest statuses |
|
|
58
|
+
| `POST` | `/facto/ingest/file` | Ingest a file, either uploaded as `multipart/form-data` or referenced by a server-side path |
|
|
59
|
+
| `POST` | `/facto/ingest/job` | Create a new empty ingest job (for subsequent record-level posts) |
|
|
60
|
+
|
|
61
|
+
## Projection Engine (`/facto/projection*` and `/facto/projections*`)
|
|
62
|
+
|
|
63
|
+
The projection engine has the largest REST surface because it owns compilation, mapping CRUD, deployment, and querying.
|
|
64
|
+
|
|
65
|
+
### Discovery and Summary
|
|
66
|
+
|
|
67
|
+
| Method | Path | Purpose |
|
|
68
|
+
|---|---|---|
|
|
69
|
+
| `GET` | `/facto/projections` | List every dataset of type `Projection` |
|
|
70
|
+
| `GET` | `/facto/projections/summary` | Aggregate counts (projections, mappings, stores) |
|
|
71
|
+
| `GET` | `/facto/datasets/by-type/Projection` | Same as `/facto/projections` |
|
|
72
|
+
|
|
73
|
+
### Query
|
|
74
|
+
|
|
75
|
+
| Method | Path | Purpose |
|
|
76
|
+
|---|---|---|
|
|
77
|
+
| `POST` | `/facto/projections/query` | Run a parameterized query against a projection store |
|
|
78
|
+
| `POST` | `/facto/projections/aggregate` | Run an aggregation query |
|
|
79
|
+
| `POST` | `/facto/projections/certainty` | Analyze certainty distribution for a projection |
|
|
80
|
+
| `POST` | `/facto/projections/compare` | Compare two projections side-by-side |
|
|
81
|
+
|
|
82
|
+
### Schema and Store
|
|
83
|
+
|
|
84
|
+
| Method | Path | Purpose |
|
|
85
|
+
|---|---|---|
|
|
86
|
+
| `GET` | `/facto/projection/:IDDataset/schema` | Fetch the current schema for a projection |
|
|
87
|
+
| `POST` | `/facto/projection/:IDDataset/save-schema` | Save a schema edit |
|
|
88
|
+
| `GET` | `/facto/projection/:IDDataset/stores` | List projection stores configured for a dataset |
|
|
89
|
+
| `POST` | `/facto/projection/:IDDataset/deploy` | Deploy the projection to an external store. Body: `{ IDStoreConnection, TargetTableName }` |
|
|
90
|
+
|
|
91
|
+
### Mappings
|
|
92
|
+
|
|
93
|
+
| Method | Path | Purpose |
|
|
94
|
+
|---|---|---|
|
|
95
|
+
| `GET` | `/facto/projection/:IDDataset/mappings` | List mappings for a dataset |
|
|
96
|
+
| `GET` | `/facto/projection/mapping/:IDProjectionMapping` | Fetch a single mapping |
|
|
97
|
+
| `POST` | `/facto/projection/:IDDataset/mapping` | Create a new mapping |
|
|
98
|
+
| `POST` | `/facto/projection/mapping/:IDProjectionMapping/update` | Update an existing mapping |
|
|
99
|
+
| `POST` | `/facto/projection/:IDDataset/discover-fields` | Auto-discover the projection schema from a sample of records |
|
|
100
|
+
| `POST` | `/facto/projection/:IDDataset/import` | Import a mapping configuration from JSON |
|
|
101
|
+
|
|
102
|
+
### Comprehensions
|
|
103
|
+
|
|
104
|
+
| Method | Path | Purpose |
|
|
105
|
+
|---|---|---|
|
|
106
|
+
| `GET` | `/facto/projection/:IDDataset/comprehensions` | List the compiled comprehensions for a dataset |
|
|
107
|
+
| `GET` | `/facto/projection/:IDDataset/comprehension/:Filename` | Fetch a single compiled comprehension |
|
|
108
|
+
|
|
109
|
+
See the [Projection subsystem](subsystems/projection.md) and [Mapping subsystem](subsystems/mapping.md) for the semantics.
|
|
110
|
+
|
|
111
|
+
## Catalog Manager (`/facto/catalog/*`)
|
|
112
|
+
|
|
113
|
+
| Method | Path | Purpose |
|
|
114
|
+
|---|---|---|
|
|
115
|
+
| `GET` | `/facto/catalog/entries` | List catalog entries |
|
|
116
|
+
| `GET` | `/facto/catalog/entry/:ID` | Fetch a single catalog entry |
|
|
117
|
+
| `GET` | `/facto/catalog/export` | Export the full catalog as JSON |
|
|
118
|
+
| `GET` | `/facto/catalog/search` | Search catalog entries |
|
|
119
|
+
| `GET` | `/facto/catalog/source-links` | List source relationships for a catalog entry |
|
|
120
|
+
| `POST` | `/facto/catalog/entry` | Create a catalog entry |
|
|
121
|
+
| `POST` | `/facto/catalog/entry/:ID/update` | Update an existing catalog entry |
|
|
122
|
+
| `POST` | `/facto/catalog/dataset/:IDEntry` | Provision the dataset described by a catalog entry |
|
|
123
|
+
| `POST` | `/facto/catalog/import` | Import a catalog from JSON |
|
|
124
|
+
|
|
125
|
+
## Store Connection Manager (`/facto/connection*`)
|
|
126
|
+
|
|
127
|
+
| Method | Path | Purpose |
|
|
128
|
+
|---|---|---|
|
|
129
|
+
| `GET` | `/facto/connections` | List all store connections (passwords masked) |
|
|
130
|
+
| `GET` | `/facto/connection/:IDStoreConnection` | Fetch a single connection |
|
|
131
|
+
| `GET` | `/facto/connection/available-types` | List the supported connector types |
|
|
132
|
+
| `POST` | `/facto/connection` | Create a new connection |
|
|
133
|
+
| `POST` | `/facto/connection/test` | Test connectivity against an existing or prospective connection |
|
|
134
|
+
| `POST` | `/facto/connection/:IDStoreConnection` | Update a connection (password preserved if client sends `***`) |
|
|
135
|
+
| `DELETE` | `/facto/connection/:IDStoreConnection` | Soft-delete a connection |
|
|
136
|
+
|
|
137
|
+
See the [Connection subsystem](subsystems/connection.md) for the full configuration shape and security model.
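
The masked-password round trip described in the table can be pictured roughly as follows. This is a sketch of the idea, not the manager's actual code: the `Password` field name, the helper names, and the sample row are assumptions.

```javascript
const PASSWORD_MASK = '***';

// Hide the password before a connection row leaves the server.
// The `Password` field name is an assumption for this sketch.
function maskConnection(connection) {
  return { ...connection, Password: PASSWORD_MASK };
}

// Merge an update into the stored row. A password equal to the mask means
// "unchanged", so the stored value is kept.
function applyConnectionUpdate(stored, incoming) {
  const merged = { ...stored, ...incoming };
  if (incoming.Password === PASSWORD_MASK) {
    merged.Password = stored.Password;
  }
  return merged;
}

const stored = { IDStoreConnection: 3, Name: 'warehouse', Password: 's3cret' };
const sentToClient = maskConnection(stored);
const updated = applyConnectionUpdate(stored, { ...sentToClient, Name: 'warehouse-eu' });
console.log(sentToClient.Password); // ***
console.log(updated.Password);      // s3cret
```

With this pattern a client can edit and resubmit a connection it fetched without ever seeing, or accidentally overwriting, the real password.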
|
|
138
|
+
|
|
139
|
+
## Source Folder Scanner (`/facto/scanner/*`)
|
|
140
|
+
|
|
141
|
+
| Method | Path | Purpose |
|
|
142
|
+
|---|---|---|
|
|
143
|
+
| `GET` | `/facto/scanner/paths` | List configured scan paths |
|
|
144
|
+
| `GET` | `/facto/scanner/datasets` | List datasets discovered by scanning |
|
|
145
|
+
| `GET` | `/facto/scanner/dataset/:Name` | Fetch a specific discovered dataset |
|
|
146
|
+
| `POST` | `/facto/scanner/dataset/:Name/provision` | Provision a discovered dataset (create Source and Dataset rows) |
|
|
147
|
+
| `POST` | `/facto/scanner/dataset/:Name/ingest` | Provision and run the full ingest pipeline for a discovered dataset |
|
|
148
|
+
|
|
149
|
+
## Schema Manager (`/facto/schema*`)
|
|
150
|
+
|
|
151
|
+
| Method | Path | Purpose |
|
|
152
|
+
|---|---|---|
|
|
153
|
+
| `GET` | `/facto/schemas/active` | List active schemas |
|
|
154
|
+
| `GET` | `/facto/schema/:IDSchema` | Fetch a schema |
|
|
155
|
+
| `GET` | `/facto/schema/:IDSchema/documentation` | Fetch schema documentation |
|
|
156
|
+
| `POST` | `/facto/schema` | Create a schema |
|
|
157
|
+
| `POST` | `/facto/schema/:IDSchema/update` | Update a schema |
|
|
158
|
+
| `POST` | `/facto/schema/:IDSchema/documentation` | Attach documentation to a schema |
|
|
159
|
+
| `POST` | `/facto/schema/compile` | Compile MicroDDL text into a Meadow schema |
|
|
160
|
+
| `POST` | `/facto/schema/preview-sql` | Preview the DDL that would be produced for a given Meadow schema |
|
|
161
|
+
|
|
162
|
+
## Application Class (`RetoldFacto`)
|
|
163
|
+
|
|
164
|
+
`RetoldFacto` extends `fable-serviceproviderbase` and exposes the following developer-facing methods:
|
|
165
|
+
|
|
166
|
+
### Model Management
|
|
167
|
+
|
|
168
|
+
| Method | Purpose |
|
|
169
|
+
|---|---|
|
|
170
|
+
| `loadModel(pModelName, pModelObject, pStorageProvider, fCallback)` | Load a schema model from a JavaScript object |
|
|
171
|
+
| `loadModelFromFile(pModelName, pModelPath, pModelFilename, fCallback)` | Load a schema model from a stricture JSON file |
|
|
172
|
+
| `rebuildFullModel()` | Rebuild the merged schema after a model load |
|
|
173
|
+
|
|
174
|
+
### Schema
|
|
175
|
+
|
|
176
|
+
| Method | Purpose |
|
|
177
|
+
|---|---|
|
|
178
|
+
| `createSchema(fCallback)` | Execute `FACTO_SCHEMA_SQL` against the current database |
|
|
179
|
+
| `isEndpointGroupEnabled(pGroupName)` | Returns `true` if a named endpoint group is enabled in `options.Endpoints` |
|
|
180
|
+
|
|
181
|
+
### Lifecycle
|
|
182
|
+
|
|
183
|
+
| Method | Purpose |
|
|
184
|
+
|---|---|
|
|
185
|
+
| `onBeforeInitialize(fCallback)` | Lifecycle hook, inherited from Fable |
|
|
186
|
+
| `onInitialize(fCallback)` | Lifecycle hook, inherited from Fable |
|
|
187
|
+
| `onAfterInitialize(fCallback)` | Lifecycle hook, inherited from Fable |
|
|
188
|
+
| `initializeService(fCallback)` | Initialize Meadow, Orator, and all subsystem managers |
|
|
189
|
+
| `initializePersistenceEngine(fCallback)` | Initialize Meadow connectors and load the schema |
|
|
190
|
+
| `startService(fCallback)` | Start the Orator HTTP server |
|
|
191
|
+
| `stopService(fCallback)` | Stop the Orator HTTP server and disconnect the beacon (if registered) |
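
The callback-style lifecycle methods above can be chained with `util.promisify`. The sketch below runs against a stand-in object rather than a real `RetoldFacto` instance, and it assumes each method takes a Node-style, error-first callback; both the stub and that callback convention should be checked against the actual class.

```javascript
const { promisify } = require('node:util');

// Run the documented start-up sequence, then return a function that stops it.
// Assumes each method takes a Node-style callback (error first).
async function startFacto(facto) {
  await promisify(facto.initializeService.bind(facto))();
  await promisify(facto.startService.bind(facto))();
  return () => promisify(facto.stopService.bind(facto))();
}

// Stand-in object with the same method names, used only to show the call order.
function createStubFacto(calls) {
  const step = (name) => (fCallback) => {
    calls.push(name);
    setImmediate(fCallback);
  };
  return {
    initializeService: step('initializeService'),
    startService: step('startService'),
    stopService: step('stopService'),
  };
}

const calls = [];
const lifecycleDone = startFacto(createStubFacto(calls))
  .then((stop) => stop())
  .then(() => {
    console.log(calls.join(' -> '));
    return calls;
  });
```

Returning the stop function from `startFacto` keeps start-up and shutdown paired, which is convenient in tests and in signal handlers.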
|
|
192
|
+
|
|
193
|
+
### Utility
|
|
194
|
+
|
|
195
|
+
| Method | Purpose |
|
|
196
|
+
|---|---|
|
|
197
|
+
| `generateHash(pInput)` | Generate a stable slug from a string |
|
|
198
|
+
|
|
199
|
+
## Subsystem Manager Methods
|
|
200
|
+
|
|
201
|
+
Most subsystem managers expose only `connectRoutes(pOratorServiceServer)` as their public API (the routes themselves are the interface). Four of them also expose internal methods worth calling from code:
|
|
202
|
+
|
|
203
|
+
### `RetoldFactoRecordManager`
|
|
204
|
+
|
|
205
|
+
- `ingestSingleRecord(pRecordData, fCallback)` -- create a single `Record` with its `CertaintyIndex` entry
|
|
206
|
+
|
|
207
|
+
### `RetoldFactoIngestEngine`
|
|
208
|
+
|
|
209
|
+
- `ingestFile(pFilePath, pIDDataset, pIDSource, pOptions, fCallback)`
|
|
210
|
+
- `ingestJSON(pData, pIDDataset, pIDSource, pOptions, fCallback)`
|
|
211
|
+
- `ingestCSV(pCSVContent, pIDDataset, pIDSource, pOptions, fCallback)`
|
|
212
|
+
- `getNextDatasetVersion(pIDDataset, fCallback)`
|
|
213
|
+
- `computeContentSignature(pContent)` -- SHA-256 of ingested content used for dedupe
|
|
214
|
+
- `checkDuplicateSignature(pIDDataset, pSignature, fCallback)`
|
|
215
|
+
- `appendJobLog(pIDIngestJob, pMessage, fCallback)`
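
To make the signature-based dedupe concrete, here is a minimal sketch of the idea behind `computeContentSignature` and `checkDuplicateSignature`. It is not the engine's implementation: the hex encoding, the handling of non-string input, and the in-memory index are all assumptions, and the function names are illustrative.

```javascript
const crypto = require('node:crypto');

// SHA-256 of the content, hex-encoded. The hex encoding and the JSON
// serialisation of non-string input are assumptions for this sketch.
function contentSignature(content) {
  const text = typeof content === 'string' ? content : JSON.stringify(content);
  return crypto.createHash('sha256').update(text, 'utf8').digest('hex');
}

// In-memory stand-in for the duplicate check, keyed per dataset.
// The real engine presumably looks signatures up in the database.
function createSignatureIndex() {
  const seen = new Set();
  return {
    isDuplicate(idDataset, signature) {
      const key = `${idDataset}:${signature}`;
      if (seen.has(key)) {
        return true;
      }
      seen.add(key);
      return false;
    },
  };
}

const index = createSignatureIndex();
const signature = contentSignature('id,name\n1,Ada\n');
console.log(index.isDuplicate(1, signature)); // false: first time seen
console.log(index.isDuplicate(1, signature)); // true: same content, same dataset
console.log(index.isDuplicate(2, signature)); // false: different dataset
```

Scoping the check to a dataset lets the same file be ingested into two datasets while still blocking an accidental re-ingest into one.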
|
|
216
|
+
|
|
217
|
+
### `RetoldFactoProjectionEngine`
|
|
218
|
+
|
|
219
|
+
- `compileProjection(pIDDataset, pConfig, fCallback)`
|
|
220
|
+
- `deploySchema(pIDDataset, pIDStoreConnection, pTargetTableName, fCallback)`
|
|
221
|
+
- `discoverProjectionFields(pIDDataset, fCallback)`
|
|
222
|
+
- `importProjectionMapping(pIDDataset, pMappingJSON, fCallback)`
|
|
223
|
+
|
|
224
|
+
### `RetoldFactoBeaconProvider`
|
|
225
|
+
|
|
226
|
+
- `connectBeacon(pBeaconConfig, fCallback)` -- register with an Ultravisor coordinator
|
|
227
|
+
- `disconnectBeacon(fCallback)` -- unregister cleanly
|
|
228
|
+
|
|
229
|
+
See [Ultravisor Integration](ultravisor-integration.md) for the beacon config shape and the capabilities registered.
|
|
230
|
+
|
|
231
|
+
## Response Conventions
|
|
232
|
+
|
|
233
|
+
Every Facto subsystem endpoint follows the same response shape:
|
|
234
|
+
|
|
235
|
+
```json
|
|
236
|
+
{ "Success": true, "Data": { /* ... */ } }
|
|
237
|
+
```
|
|
238
|
+
|
|
239
|
+
Or on failure:
|
|
240
|
+
|
|
241
|
+
```json
|
|
242
|
+
{ "Success": false, "Error": "Human-readable message" }
|
|
243
|
+
```
|
|
244
|
+
|
|
245
|
+
Meadow CRUD endpoints use their own (longer-standing) conventions -- see the meadow-endpoints docs for the full reference.
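
A client can handle the subsystem response envelope in one place. The helper below is a hypothetical sketch (`unwrapFactoResponse` is not part of `retold-facto`), and the fallback error message is made up for the example.

```javascript
// Unwrap the { Success, Data } / { Success, Error } envelope used by the
// /facto/* subsystem endpoints. Returns Data on success, throws otherwise.
function unwrapFactoResponse(payload) {
  if (payload && payload.Success === true) {
    return payload.Data;
  }
  const message = payload && payload.Error ? payload.Error : 'Unknown Facto error';
  throw new Error(message);
}

console.log(unwrapFactoResponse({ Success: true, Data: { Ingested: 2 } }));

try {
  unwrapFactoResponse({ Success: false, Error: 'Dataset not found' });
} catch (error) {
  console.log(error.message); // Dataset not found
}
```

This helper applies only to the `/facto/*` endpoints; responses from the Meadow CRUD routes follow their own conventions and should not be passed through it.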
|