meadow-integration 1.0.4 → 1.0.6
- package/.dockerignore +11 -0
- package/Docker-Build.sh +2 -0
- package/Docker-Compose.sh +2 -0
- package/Docker-Push.sh +2 -0
- package/Docker-Tag.sh +2 -0
- package/Dockerfile +28 -0
- package/Dockerfile_LUXURYCode +23 -0
- package/README.md +139 -25
- package/docker-compose.yml +16 -0
- package/docs/README.md +65 -18
- package/docs/{cover.md → _cover.md} +3 -2
- package/docs/_sidebar.md +52 -7
- package/docs/_topbar.md +2 -0
- package/docs/api/clone-rest-client.md +278 -0
- package/docs/api/connection-manager.md +179 -0
- package/docs/api/guid-map.md +234 -0
- package/docs/api/integration-adapter.md +283 -0
- package/docs/api/operation.md +241 -0
- package/docs/api/sync-entity-initial.md +227 -0
- package/docs/api/sync-entity-ongoing.md +244 -0
- package/docs/api/sync.md +213 -0
- package/docs/api/tabular-check.md +213 -0
- package/docs/api/tabular-transform.md +316 -0
- package/docs/architecture.md +423 -0
- package/docs/cli/comprehensionarray.md +111 -0
- package/docs/cli/comprehensionintersect.md +132 -0
- package/docs/cli/csvcheck.md +111 -0
- package/docs/cli/csvtransform.md +170 -0
- package/docs/cli/data-clone.md +277 -0
- package/docs/cli/jsonarraytransform.md +166 -0
- package/docs/cli/load-comprehension.md +129 -0
- package/docs/cli/objectarraytocsv.md +159 -0
- package/docs/cli/overview.md +96 -0
- package/docs/cli/serve.md +102 -0
- package/docs/cli/tsvtransform.md +144 -0
- package/docs/data-clone/configuration.md +357 -0
- package/docs/data-clone/connection-manager.md +206 -0
- package/docs/data-clone/docker.md +290 -0
- package/docs/data-clone/overview.md +173 -0
- package/docs/data-clone/sync-modes.md +186 -0
- package/docs/implementation-reference.md +311 -0
- package/docs/overview.md +156 -0
- package/docs/quickstart.md +233 -0
- package/docs/rest/comprehension-push.md +209 -0
- package/docs/rest/comprehension.md +506 -0
- package/docs/rest/csv.md +255 -0
- package/docs/rest/entity-generation.md +158 -0
- package/docs/rest/json-array.md +243 -0
- package/docs/rest/overview.md +120 -0
- package/docs/rest/status.md +63 -0
- package/docs/rest/tsv.md +241 -0
- package/docs/retold-catalog.json +93 -3
- package/docs/retold-keyword-index.json +23683 -1901
- package/package.json +13 -10
- package/scripts/run.sh +18 -0
- package/source/Meadow-Integration.js +15 -1
- package/source/cli/Default-Meadow-Integration-Configuration.json +37 -2
- package/source/cli/Meadow-Integration-CLI-Program.js +4 -1
- package/source/cli/commands/Meadow-Integration-Command-DataClone.js +284 -0
- package/source/services/clone/Meadow-Service-ConnectionManager.js +251 -0
- package/source/services/clone/Meadow-Service-Operation.js +196 -0
- package/source/services/clone/Meadow-Service-RestClient.js +364 -0
- package/source/services/clone/Meadow-Service-Sync-Entity-Initial.js +367 -0
- package/source/services/clone/Meadow-Service-Sync-Entity-Ongoing.js +457 -0
- package/source/services/clone/Meadow-Service-Sync.js +142 -0
- /package/docs/examples/bookstore/{mapping_books_Author.json → mapping_books_author.json} +0 -0
- /package/docs/examples/bookstore/{mapping_books_Book.json → mapping_books_book.json} +0 -0
package/docs/implementation-reference.md ADDED
@@ -0,0 +1,311 @@
# Implementation Reference

This document covers the internal structure of Meadow Integration, including module layout, service patterns, configuration resolution, error handling, and extension points.

## Module Structure

```
meadow-integration/
├── package.json
├── Dockerfile
├── docker-compose.yml
├── scripts/
│   └── run.sh
├── source/
│   ├── Meadow-Integration.js  # Main export (programmatic API)
│   ├── Meadow-Service-Integration-Adapter.js  # IntegrationAdapter service
│   ├── Meadow-Service-Integration-GUIDMap.js  # GUIDMap service
│   ├── cli/
│   │   ├── Meadow-Integration-CLI-Program.js  # CLI program setup
│   │   ├── Meadow-Integration-CLI-Run.js  # CLI entry point
│   │   ├── Default-Meadow-Integration-Configuration.json
│   │   └── commands/
│   │       ├── Meadow-Integration-Command-CSVCheck.js
│   │       ├── Meadow-Integration-Command-CSVTransform.js
│   │       ├── Meadow-Integration-Command-TSVCheck.js
│   │       ├── Meadow-Integration-Command-TSVTransform.js
│   │       ├── Meadow-Integration-Command-JSONArrayTransform.js
│   │       ├── Meadow-Integration-Command-ComprehensionIntersect.js
│   │       ├── Meadow-Integration-Command-ComprehensionArray.js
│   │       ├── Meadow-Integration-Command-ComprehensionPush.js
│   │       ├── Meadow-Integration-Command-ObjectArrayToCSV.js
│   │       ├── Meadow-Integration-Command-EntityFromTabularFolder.js
│   │       ├── Meadow-Integration-Command-DataClone.js
│   │       ├── Meadow-Integration-Command-ConvertXLSMToXLSX.js
│   │       └── Meadow-Integration-Command-Serve.js
│   ├── restserver/
│   │   ├── Meadow-Integration-Server.js  # REST server class
│   │   ├── Meadow-Integration-Server-Endpoints.js  # Endpoint registration
│   │   └── endpoints/
│   │       ├── Endpoint-CSVCheck.js
│   │       ├── Endpoint-CSVTransform.js
│   │       ├── Endpoint-TSVCheck.js
│   │       ├── Endpoint-TSVTransform.js
│   │       ├── Endpoint-JSONArrayTransform.js
│   │       ├── Endpoint-ComprehensionIntersect.js
│   │       ├── Endpoint-ComprehensionArray.js
│   │       ├── Endpoint-ComprehensionPush.js
│   │       ├── Endpoint-ObjectArrayToCSV.js
│   │       └── Endpoint-EntityFromTabularFolder.js
│   └── services/
│       ├── clone/
│       │   ├── Meadow-Service-ConnectionManager.js  # Database connection pooling
│       │   ├── Meadow-Service-RestClient.js  # Authenticated REST client
│       │   ├── Meadow-Service-Sync.js  # Sync orchestrator
│       │   ├── Meadow-Service-Sync-Entity-Initial.js  # Initial sync strategy
│       │   ├── Meadow-Service-Sync-Entity-Ongoing.js  # Ongoing sync strategy
│       │   └── Meadow-Service-Operation.js  # Timing and progress tracking
│       └── tabular/
│           ├── Service-TabularCheck.js  # Statistics collection
│           └── Service-TabularTransform.js  # Record transformation
├── test/
├── examples/
└── docs/
```

## Service Registration Patterns

All services in Meadow Integration extend `fable-serviceproviderbase`. They follow the Fable service provider pattern for registration and instantiation.

### Registration and Instantiation

Services are registered with a service type name and then instantiated. This gives them access to `this.fable` (the Fable instance), `this.options` (merged configuration), and `this.log` (the Fable logger).

```javascript
// Register the service type
this.fable.serviceManager.addServiceType('MeadowConnectionManager', libMeadowConnectionManager);

// Instantiate with options and an optional service hash (name)
this.fable.serviceManager.instantiateServiceProvider('MeadowConnectionManager', tmpConfig.Destination);
```

After instantiation, the service is available on the Fable instance by its type name:

```javascript
this.fable.MeadowConnectionManager.connect(fCallback);
```

### Lazy Service Initialization

Several services check whether a dependency is already registered before adding it. This prevents duplicate registration when multiple services share a dependency.

```javascript
if (!this.fable.hasOwnProperty('MeadowGUIDMap'))
{
    this.fable.addAndInstantiateServiceType('MeadowGUIDMap', libGUIDMap);
}
```

### Named Service Instances

When multiple instances of the same service type are needed (e.g., one `SyncEntity` per table), the third parameter to `instantiateServiceProvider` acts as a service hash:

```javascript
this.fable.serviceManager.instantiateServiceProvider(
    'MeadowSyncEntityInitial',
    tmpSyncEntityOptions,
    `SyncEntity-${tmpEntitySchema.TableName}`
);
```

These are stored in `this.fable.servicesMap.MeadowSyncEntityInitial` keyed by the hash.

## Configuration Resolution Order

### CLI Program Configuration

The CLI program (`pict-service-commandlineutility`) resolves configuration in this order, with later layers overriding earlier ones:

1. **Default** -- `Default-Meadow-Integration-Configuration.json` baked into the module
2. **File** -- `.meadow.config.json` in the current working directory (auto-gathered by the CLI framework)
3. **Command-line** -- Flags passed to the specific command (e.g., `--api_server`, `--db_host`)

The result is available as `this.fable.ProgramConfiguration`.

### Data Clone Configuration

The `data-clone` command has its own `_resolveConfig()` method that builds the final configuration. It deep-copies `ProgramConfiguration` and then applies command-line overrides (excerpt):

```javascript
const tmpConfig = JSON.parse(JSON.stringify(this.fable.ProgramConfiguration));

// Apply CLI overrides for Source
if (this.CommandOptions.api_server) tmpConfig.Source.ServerURL = this.CommandOptions.api_server;

// Apply CLI overrides for Destination
// (tmpProvider is not shown in this excerpt; presumably the configured provider name, e.g. 'MySQL')
if (this.CommandOptions.db_host) tmpConfig.Destination[tmpProvider].server = this.CommandOptions.db_host;
```
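
The layering described above can be sketched in plain JavaScript. This is a minimal illustration, not the package's code: `resolveCloneConfig`, the sample values, the shallow top-level merge of the first two layers, and the lookup of the provider name from `Destination.Provider` are all assumptions made for the example.

```javascript
// Hypothetical helper: layers defaults, file config, then CLI flags.
function resolveCloneConfig(pDefaults, pFileConfig, pCommandOptions) {
    // Layers 1 and 2: defaults, then the config file (shallow top-level merge assumed).
    const tmpMerged = Object.assign({}, pDefaults, pFileConfig);
    // Deep copy so that applying overrides never mutates the inputs.
    const tmpConfig = JSON.parse(JSON.stringify(tmpMerged));

    // Layer 3: command-line flags win when present.
    if (pCommandOptions.api_server) {
        tmpConfig.Source.ServerURL = pCommandOptions.api_server;
    }
    if (pCommandOptions.db_host) {
        // Assumption: the provider name is read from Destination.Provider.
        const tmpProvider = tmpConfig.Destination.Provider;
        tmpConfig.Destination[tmpProvider].server = pCommandOptions.db_host;
    }
    return tmpConfig;
}

const tmpDefaults = {
    Source: { ServerURL: 'http://localhost:8086/1.0/' },
    Destination: { Provider: 'MySQL', MySQL: { server: '127.0.0.1', port: 3306 } }
};
const tmpFileConfig = { Source: { ServerURL: 'https://api.example.com/1.0/' } };
const tmpResolved = resolveCloneConfig(tmpDefaults, tmpFileConfig, { db_host: 'db.internal' });

console.log(tmpResolved.Source.ServerURL);          // https://api.example.com/1.0/
console.log(tmpResolved.Destination.MySQL.server);  // db.internal
console.log(tmpDefaults.Destination.MySQL.server);  // 127.0.0.1 (inputs untouched)
```

The JSON round-trip gives the overrides a private copy to write to, so a flag passed to one command cannot leak into the shared program configuration.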

### Transformation Mapping Configuration

The `TabularTransform` service uses a three-layer cascade for mapping configuration:

1. **Implicit** -- Auto-generated from the first incoming record. Entity name is derived from the filename, GUID template uses the first column, and all columns are mapped 1:1.
2. **Explicit** -- Loaded from a mapping file (JSON). Specifies entity name, GUID template, and selective column mappings with Pict template expressions.
3. **User** -- CLI flags (`-e`, `-g`, `-n`, `-c`) override individual values.

The layers merge with `Object.assign`:

```javascript
tmpMappingOutcome.Configuration = Object.assign(
    {},
    tmpMappingOutcome.ImplicitConfiguration,
    tmpMappingOutcome.ExplicitConfiguration,
    tmpMappingOutcome.UserConfiguration
);
```
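
One consequence worth seeing in isolation: `Object.assign` copies top-level keys only. If the layers are merged exactly as shown, a `Mappings` object supplied by a later layer replaces the earlier one wholesale rather than being merged key by key. The three sample layers below are made up for illustration.

```javascript
// Sample layers (hypothetical values) mirroring Implicit -> Explicit -> User.
const tmpImplicit = {
    Entity: 'books',
    GUIDTemplate: 'books_{~D:Record.id~}',
    Mappings: { id: '{~D:Record.id~}', title: '{~D:Record.title~}' }
};
const tmpExplicit = {
    Entity: 'Book',
    Mappings: { Title: '{~D:Record.title~}' }
};
const tmpUser = { GUIDTemplate: 'Book_{~D:Record.isbn~}' };

// Same call shape as the snippet above: later arguments win per top-level key.
const tmpConfiguration = Object.assign({}, tmpImplicit, tmpExplicit, tmpUser);

console.log(tmpConfiguration.Entity);                 // Book
console.log(tmpConfiguration.GUIDTemplate);           // Book_{~D:Record.isbn~}
console.log(Object.keys(tmpConfiguration.Mappings));  // [ 'Title' ]
```

This is why an explicit mapping file yields only the columns it lists: the implicit 1:1 mappings are dropped as a whole once an explicit `Mappings` key is present.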

## Error Handling Patterns

### Retry with Threshold

The `IntegrationAdapter` uses a retry pattern for upsert operations. If a push fails, it retries up to `RecordPushRetryThreshold` times (default: 5, hard cap: 50):

```javascript
upsertSingleRecord(fCallback, pRecordGUID, pRetryCount)
{
    let tmpRetryCount = (typeof(pRetryCount) === 'undefined') ? 0 : pRetryCount;

    if ((tmpRetryCount > this._RecordPushRetryThreshold) || (tmpRetryCount > 50))
    {
        this.log.error(`Retry threshold reached for ${this.Entity}.${pRecordGUID}`);
        return fCallback();
    }

    // ... attempt upsert, on failure:
    // Pass the incremented count. A post-increment (tmpRetryCount++) would pass
    // the old value, so the threshold would never be reached.
    return this.upsertSingleRecord(fCallback, pRecordGUID, tmpRetryCount + 1);
}
```
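
The same pattern can be exercised outside the adapter. The sketch below is a standalone illustration under stated assumptions: `pushWithRetry` and `flakyPush` are hypothetical names, the callbacks are synchronous for simplicity, and unlike the excerpt above it reports exhaustion to the callback as an error instead of only logging it.

```javascript
// Hypothetical standalone version of the retry-with-threshold pattern.
function pushWithRetry(fAttempt, pRetryThreshold, fCallback, pRetryCount) {
    const tmpRetryCount = (typeof pRetryCount === 'undefined') ? 0 : pRetryCount;

    // Same guard as the adapter: configured threshold plus a hard cap of 50.
    if (tmpRetryCount > pRetryThreshold || tmpRetryCount > 50) {
        return fCallback(new Error(`Retry threshold reached after ${tmpRetryCount} attempts`));
    }

    fAttempt((pError, pResult) => {
        if (pError) {
            // Recurse with an incremented count so the guard eventually trips.
            return pushWithRetry(fAttempt, pRetryThreshold, fCallback, tmpRetryCount + 1);
        }
        return fCallback(null, pResult);
    });
}

// Simulated push that fails twice, then succeeds.
let tmpAttempts = 0;
const flakyPush = (fDone) => {
    tmpAttempts++;
    return (tmpAttempts < 3) ? fDone(new Error('temporary failure')) : fDone(null, 'ok');
};

let tmpOutcome = null;
pushWithRetry(flakyPush, 5, (pError, pResult) => {
    tmpOutcome = pError ? pError.message : pResult;
});
console.log(tmpAttempts, tmpOutcome); // 3 ok
```

With a threshold of `N`, the guard allows the first attempt plus `N` retries before giving up.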

### Non-Fatal Continuation

Sync operations log errors but continue processing remaining records and entities. This prevents a single bad record from stopping an entire clone operation:

```javascript
if (pCreateError)
{
    this.log.error(`Error creating record ${this.EntitySchema.TableName}: ${pCreateError}`);
    return fEntitySyncComplete(); // Continue to next record
}
```

### Connection Failure

The `ConnectionManager` wraps database driver loading in try/catch blocks to handle cases where the driver package is not installed:

```javascript
try
{
    const libMeadowConnectionMySQL = require('meadow-connection-mysql');
    // ... connect
}
catch (pError)
{
    this.log.error(`Failed to load MySQL provider. Ensure meadow-connection-mysql is installed.`);
    return fCallback(pError);
}
```
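
A small variation on this pattern separates "the package could not be resolved" from "the package threw while loading", which can make the log message more precise. This is a sketch only: `loadOptionalModule` is a hypothetical helper, not part of the package, and it relies on Node's CommonJS `require` setting `code` to `MODULE_NOT_FOUND` on resolution failures.

```javascript
// Hypothetical helper: load an optional dependency without crashing the caller.
function loadOptionalModule(pModuleName) {
    try {
        return { module: require(pModuleName), notInstalled: false, error: null };
    } catch (pError) {
        return {
            module: null,
            // Node sets this code when a module cannot be resolved.
            notInstalled: pError.code === 'MODULE_NOT_FOUND',
            error: pError
        };
    }
}

// A module name that is assumed not to exist, and a Node built-in that always does.
const tmpMissing = loadOptionalModule('hypothetical-driver-that-is-not-installed');
const tmpPresent = loadOptionalModule('path');

console.log(tmpMissing.notInstalled);         // true
console.log(typeof tmpPresent.module.join);   // function
```

Note that `MODULE_NOT_FOUND` is also raised when an installed driver is itself missing one of its dependencies, so the original error is kept on the result for logging.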

### Validation at Construction

Sync entity services validate their options in the constructor, throwing errors for missing or invalid configuration rather than failing silently later:

```javascript
if (!this.options.hasOwnProperty('MeadowEntitySchema'))
{
    throw new Error('MeadowSyncEntityInitial requires a valid MeadowEntitySchema option.');
}
```

## Extension Points

### Custom Record Marshaling

The `IntegrationAdapter.integrateRecords` method accepts an optional `fMarshalExtraData` callback that runs after each record is marshaled. This allows injecting additional data or performing side effects:

```javascript
adapter.integrateRecords(fCallback,
    (pSourceRecord, pMarshaledRecord) =>
    {
        // Add computed fields, log specific records, etc.
        pMarshaledRecord.ComputedField = computeValue(pSourceRecord);
    });
```

### Custom REST Client Preparation

The `CloneRestClient` exposes a `prepareRequestOptions` method that can be overridden to add custom headers, authentication tokens, or request modifications:

```javascript
cloneClient.prepareRequestOptions = (pOptions) =>
{
    // Merge rather than replace, so headers already set on the request are kept
    pOptions.headers = Object.assign({}, pOptions.headers, { 'X-Custom-Header': 'value' });
    return pOptions;
};
```

### Custom Sync Entity Lists

The `Sync` service accepts a `SyncEntityList` array in its options or via `ProgramConfiguration`. When provided, only the listed entities are synced, in the specified order. When empty, all entities in the loaded schema are synced:

```json
{
    "SyncEntityList": ["User", "Role", "Permission"]
}
```

### Per-Entity Sync Options

The `SyncEntityOptions` configuration key allows per-entity overrides:

```json
{
    "SyncEntityOptions": {
        "AuditLog": { "PageSize": 500 }
    }
}
```

### Solver-Based Multi-Entity Extraction

The `TabularTransform` service supports Solvers in mapping configurations. Solvers use the Fable Expression Parser to run arbitrary expressions during transformation, enabling extraction of multiple entity types from a single source row:

```json
{
    "Entity": "BookAuthor",
    "MultipleGUIDUniqueness": true,
    "GUIDTemplate": "BookAuthor_{~D:Record.book_id~}_{~D:_GUIDUniqueness~}",
    "Solvers": [
        "// Custom solver expressions that populate NewRecordsGUIDUniqueness"
    ]
}
```

### MappingOutcome Lifecycle Hooks

The `TabularTransform` service provides overridable hooks for the mapping outcome lifecycle:

- `onBeforeInitializeMappingOutcomeObject(pMappingOutcomeObject)` -- Called before the mapping configuration is resolved
- `onAfterInitializeMappingOutcomeObject(pMappingOutcomeObject)` -- Called after configuration is merged and the comprehension entity container is created

These can be overridden in a subclass to inject custom initialization logic.

## Testing Approach

Tests are written with Mocha in TDD style and run via Quackage:

```shell
npm test          # Run tests
npm run coverage  # Run with nyc coverage
```

Test files are located in the `test/` directory. The test suite covers:

- **Tabular parsing** -- Verifying CSV, TSV, and JSON array parsing with known input files
- **Transformation** -- Testing mapping configurations, GUID generation, and comprehension output
- **Comprehension operations** -- Intersect, array conversion, and CSV export
- **Integration Adapter** -- Record marshaling, GUID prefixing, and cross-entity ID resolution
- **REST endpoints** -- Starting the server and exercising each endpoint with HTTP requests
- **Data Clone services** -- ConnectionManager, CloneRestClient, and Sync service integration

Test data files are stored in the `debug/testdata/` directory and include sample CSV, TSV, and JSON files along with mapping configurations and expected outputs.
package/docs/overview.md ADDED
@@ -0,0 +1,156 @@
# Overview

Meadow Integration is a toolkit for bridging external data sources and the Meadow data access layer. It handles two complementary workflows: **data transformation** (getting data _into_ Meadow) and **data synchronization** (replicating data _from_ a Meadow API to a local database).

## Why Meadow Integration Exists

Applications built on the Retold stack use Meadow as their data access layer, with entities defined by Stricture schemas and exposed through auto-generated REST APIs. Meadow Integration fills the gaps around that core:

- **Ingestion** -- External data (CSV exports, TSV dumps, JSON feeds) needs to be converted into the entity format that Meadow expects before it can be loaded.
- **Merging** -- Data often comes from multiple sources that describe the same entities. These sources need to be reconciled by shared keys before loading.
- **Cloning** -- Teams running analytics, reporting, or development work need local copies of production data without direct database access. Data Clone pulls entity records through the Meadow REST API and writes them into a local MySQL or MSSQL database.

## Three Modes of Operation

Meadow Integration exposes its functionality through three interfaces. All three share the same underlying service implementations.

### CLI

The `meadow-integration` command-line program (also available as `mdwint` when installed globally) provides commands for every operation. It is built on `pict-service-commandlineutility` and reads configuration from a `.meadow.config.json` file if one exists in the working directory.

```shell
npx meadow-integration csvtransform ./books.csv -m mapping.json -o books.json
npx meadow-integration data-clone --api_server https://api.example.com/1.0/ --schema_path ./schema.json
```

### REST API

The `serve` command starts an HTTP server (powered by Orator and Restify) that exposes every transformation and merge operation as a POST endpoint. This lets other services call Meadow Integration over the network.

```shell
npx meadow-integration serve -p 8086
```

Endpoints are grouped under `/1.0/` and include CSV, TSV, JSON Array, and Comprehension operations.

### Programmatic API

The module exports its services directly for use in your own Node.js code:

```javascript
const MeadowIntegration = require('meadow-integration');

// TabularCheck, IntegrationServer, ConnectionManager,
// CloneRestClient, Sync, SyncEntityInitial,
// SyncEntityOngoing, Operation
```

Services follow the Fable service provider pattern: instantiate them with a Fable instance and they get access to logging, configuration, and dependency injection automatically.

## Data Transformation Pipeline

The transformation pipeline converts external tabular data into Meadow-compatible entity records through a format called a **Comprehension**.

```
External Data (CSV / TSV / JSON Array)
        |
        v
TabularTransform Service
  (column mapping via Pict templates)
        |
        v
Comprehension Object
  (entity records keyed by GUID)
        |
        v
Integration Adapter
  (marshal to Meadow schema, resolve cross-entity GUIDs)
        |
        v
GUID Map
  (track external <-> Meadow IDs)
        |
        v
Meadow REST API
  (single upsert or bulk upsert)
```

**Key steps:**

1. **Parse** -- The source file is parsed into individual records (rows).
2. **Map** -- Each record is transformed using a mapping configuration. Pict template expressions (`{~D:Record.column_name~}`) resolve column values. A GUID is generated for each record from a template.
3. **Collect** -- Transformed records are gathered into a Comprehension object, keyed by their generated GUID. Duplicate GUIDs are merged automatically.
4. **Marshal** -- The Integration Adapter converts comprehension records into Meadow entity format, prefixing GUIDs and resolving cross-entity ID references through the GUID Map.
5. **Push** -- Records are upserted to the target Meadow REST API, either individually or in bulk batches (configurable threshold).
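
The Map and Collect steps can be sketched without any dependencies. This is an illustration under several assumptions, not the `TabularTransform` implementation: `resolveTemplate` is a stand-in that understands only the `{~D:Record.column~}` form (Pict's template engine does far more), the GUID field is assumed to be named `GUID` plus the entity name, and duplicate GUIDs are assumed to merge with later values winning.

```javascript
// Stand-in for Pict templates: resolves only {~D:Record.column~} expressions.
function resolveTemplate(pTemplate, pRecord) {
    return pTemplate.replace(/\{~D:Record\.([A-Za-z0-9_]+)~\}/g,
        (pMatch, pColumn) => (pColumn in pRecord) ? String(pRecord[pColumn]) : '');
}

// Map each row through the mapping, then collect the results keyed by GUID.
function collectRows(pRows, pMapping, pComprehension = {}) {
    const tmpGUIDName = `GUID${pMapping.Entity}`;
    const tmpEntityMap = pComprehension[pMapping.Entity] || (pComprehension[pMapping.Entity] = {});

    for (const tmpRow of pRows) {
        const tmpGUID = resolveTemplate(pMapping.GUIDTemplate, tmpRow);
        const tmpRecord = { [tmpGUIDName]: tmpGUID };
        for (const [tmpField, tmpTemplate] of Object.entries(pMapping.Mappings)) {
            tmpRecord[tmpField] = resolveTemplate(tmpTemplate, tmpRow);
        }
        // Duplicate GUIDs fold into the existing record (later values win here).
        tmpEntityMap[tmpGUID] = Object.assign(tmpEntityMap[tmpGUID] || {}, tmpRecord);
    }
    return pComprehension;
}

const tmpBookMapping = {
    Entity: 'Book',
    GUIDTemplate: 'Book_{~D:Record.id~}',
    Mappings: { Title: '{~D:Record.title~}' }
};
const tmpBookRows = [
    { id: '1', title: 'The Hunger Games' },
    { id: '2', title: 'Harry Potter' },
    { id: '1', title: 'The Hunger Games (reissue)' }
];
const tmpBookComprehension = collectRows(tmpBookRows, tmpBookMapping);

console.log(Object.keys(tmpBookComprehension.Book));  // [ 'Book_1', 'Book_2' ]
console.log(tmpBookComprehension.Book.Book_1.Title);  // The Hunger Games (reissue)
```

Because the GUID is derived from the row itself, running the same input twice produces the same keys, which is what makes the later upsert step repeatable.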

## Data Synchronization Pipeline

The Data Clone pipeline replicates entities from a remote Meadow REST API into a local relational database.

```
Remote Meadow REST API
        |
        v
CloneRestClient
  (authenticated HTTP, session management, caching)
        |
        v
Sync Service
  (orchestrates per-entity sync in schema order)
        |
        v
SyncEntity (Initial or Ongoing)
  (paginated download, record comparison, create/update)
        |
        v
ConnectionManager
  (MySQL or MSSQL connection pool)
        |
        v
Local Database
  (tables auto-created from Meadow schema, indexes on GUID and Deleted columns)
```

**Key steps:**

1. **Connect** -- The ConnectionManager establishes a connection pool to the local database. The CloneRestClient authenticates with the remote API (if credentials are configured).
2. **Schema** -- The Meadow extended schema JSON is loaded. Tables are created locally if they do not exist. Indexes are added on GUID and Deleted columns.
3. **Compare** -- For each entity, the sync service compares local and remote max IDs and record counts to determine what needs to be synchronized.
4. **Download** -- Records are fetched from the remote API in configurable page sizes using filtered, sorted endpoint calls.
5. **Write** -- Each downloaded record is marshaled into the local schema format and either created or updated in the local database. Identity insert is enabled so that primary keys match the remote system.

Two sync modes are available:

- **Initial** -- Downloads all records with IDs greater than the local maximum. Intended for first-time clones or catch-up operations.
- **Ongoing** -- Compares `UpdateDate` timestamps between local and remote records and updates any that have changed. Handles both new records and modifications.
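
The difference between the two modes comes down to how records are selected. The sketch below works on in-memory arrays purely for illustration; the real services page through the remote API and query the local database. The function names, the `IDBook` column, and the rule that any differing `UpdateDate` counts as changed are assumptions of this example.

```javascript
// Initial mode: everything with an ID above the local maximum.
function selectForInitialSync(pRemoteRecords, pLocalMaxID, pIDColumn) {
    return pRemoteRecords.filter((pRecord) => pRecord[pIDColumn] > pLocalMaxID);
}

// Ongoing mode: records missing locally, or whose UpdateDate differs.
function selectForOngoingSync(pRemoteRecords, pLocalRecords, pIDColumn) {
    const tmpLocalByID = new Map(pLocalRecords.map((pRecord) => [pRecord[pIDColumn], pRecord]));
    return pRemoteRecords.filter((pRemote) => {
        const tmpLocal = tmpLocalByID.get(pRemote[pIDColumn]);
        if (!tmpLocal) {
            return true; // new record
        }
        return Date.parse(pRemote.UpdateDate) !== Date.parse(tmpLocal.UpdateDate);
    });
}

const tmpRemoteBooks = [
    { IDBook: 1, UpdateDate: '2024-01-01T00:00:00Z' },
    { IDBook: 2, UpdateDate: '2024-03-01T00:00:00Z' },
    { IDBook: 3, UpdateDate: '2024-03-05T00:00:00Z' }
];
const tmpLocalBooks = [
    { IDBook: 1, UpdateDate: '2024-01-01T00:00:00Z' },
    { IDBook: 2, UpdateDate: '2024-02-01T00:00:00Z' }
];

const tmpInitialIDs = selectForInitialSync(tmpRemoteBooks, 2, 'IDBook').map((pRecord) => pRecord.IDBook);
const tmpOngoingIDs = selectForOngoingSync(tmpRemoteBooks, tmpLocalBooks, 'IDBook').map((pRecord) => pRecord.IDBook);

console.log(tmpInitialIDs);  // [ 3 ]
console.log(tmpOngoingIDs);  // [ 2, 3 ]
```

In this sample, Initial mode never revisits record 2 even though it changed remotely, which is why Ongoing mode exists for keeping an existing clone current.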

## Key Concepts

### Comprehension

A JSON object that stores entity records keyed by their GUID. A single comprehension can hold multiple entity types. This is the intermediate data format that connects all transformation operations.

### GUID

A deterministic, template-generated identifier that uniquely identifies a record across systems. GUIDs are composed from a configurable template (e.g., `Book_{~D:Record.isbn~}`), ensuring the same source data always produces the same key.

### Entity

A named type of record corresponding to a Meadow schema table (e.g., `Book`, `Author`, `Airport`). Entities have a standard identity column (`IDEntityName`), a GUID column (`GUIDEntityName`), and timestamping columns (`CreateDate`, `UpdateDate`).

### Mapping

A configuration object that describes how to transform source columns into entity fields. It includes the entity name, a GUID template, and a dictionary of field-to-template mappings. Optional Solvers allow multi-entity extraction from a single source row.

## When to Use Which Tool

| Scenario | Tool |
|----------|------|
| One-off CSV/TSV analysis | `csvcheck` CLI command |
| Transform a file into entity records | `csvtransform` / `tsvtransform` / `jsonarraytransform` CLI commands |
| Combine multiple data sources | `comprehensionintersect` CLI command |
| Push transformed data to a Meadow API | `load_comprehension` CLI command or Comprehension Push REST endpoint |
| Expose transformations as a service | `serve` CLI command (REST API) |
| Replicate a Meadow API to a local database | `data-clone` CLI command |
| Run data clone in a container | Docker deployment with `.meadow.config.json` |
| Use services in your own code | Programmatic API via `require('meadow-integration')` |
package/docs/quickstart.md ADDED
@@ -0,0 +1,233 @@
# Quick Start

This guide walks through the core workflows of Meadow Integration, from analyzing a CSV file to cloning a remote Meadow API into a local database.

## Installation

Install the package from npm:

```shell
npm install meadow-integration
```

For CLI usage without a global install:

```shell
npx meadow-integration --help
```

If you install globally (or add it to your project), the `mdwint` shorthand is also available:

```shell
npm install -g meadow-integration
mdwint --help
```

## Your First CSV Analysis

Start by inspecting a CSV file to understand its structure. The `csvcheck` command produces statistics about rows, columns, and per-column data quality.

```shell
npx meadow-integration csvcheck ./data/books.csv -o book-stats.json
```

This writes a JSON file containing:

- Row and column counts
- Header names
- First and last rows as sample data
- Per-column statistics: total count, empty count, numeric count, first value, and last value

Use this to verify your data before transforming it.
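
To make the per-column statistics concrete, here is a rough sketch of how such counts could be gathered from already-parsed rows. It is not the `TabularCheck` implementation: the function name, the output field names (`Count`, `EmptyCount`, `NumericCount`, `FirstValue`, `LastValue`), and the rule for what counts as numeric are assumptions for illustration, and the actual output of `csvcheck` may be shaped differently.

```javascript
// Hypothetical per-column statistics over parsed rows (objects keyed by header).
function collectColumnStatistics(pRows) {
    const tmpStatistics = {};
    for (const tmpRow of pRows) {
        for (const [tmpColumn, tmpValue] of Object.entries(tmpRow)) {
            const tmpColumnStats = tmpStatistics[tmpColumn] || (tmpStatistics[tmpColumn] =
                { Count: 0, EmptyCount: 0, NumericCount: 0, FirstValue: null, LastValue: null });
            const tmpText = String(tmpValue ?? '').trim();

            tmpColumnStats.Count++;
            if (tmpText === '') {
                tmpColumnStats.EmptyCount++;
            } else if (Number.isFinite(Number(tmpText))) {
                tmpColumnStats.NumericCount++;
            }
            if (tmpColumnStats.Count === 1) {
                tmpColumnStats.FirstValue = tmpValue;
            }
            tmpColumnStats.LastValue = tmpValue;
        }
    }
    return tmpStatistics;
}

const tmpSampleRows = [
    { id: '1', title: 'The Hunger Games', average_rating: '4.34' },
    { id: '2', title: '', average_rating: 'n/a' }
];
const tmpColumnStatistics = collectColumnStatistics(tmpSampleRows);

console.log(tmpColumnStatistics.id.NumericCount);              // 2
console.log(tmpColumnStatistics.title.EmptyCount);             // 1
console.log(tmpColumnStatistics.average_rating.NumericCount);  // 1
```

A column whose numeric count is lower than its total count, like `average_rating` here, is the kind of inconsistency worth fixing before running a transform.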

## Your First CSV Transformation

Transform a CSV into a Comprehension -- the intermediate JSON format used across all Meadow Integration operations.

```shell
npx meadow-integration csvtransform ./data/books.csv \
  -e "Book" \
  -n "GUIDBook" \
  -g "Book_{~D:Record.id~}" \
  -o books-comprehension.json
```

This produces a JSON file where each record is keyed by its generated GUID:

```json
{
    "Book": {
        "Book_1": { "GUIDBook": "Book_1", "id": "1", "title": "The Hunger Games", ... },
        "Book_2": { "GUIDBook": "Book_2", "id": "2", "title": "Harry Potter", ... }
    }
}
```

**Flags explained:**

- `-e Book` -- The entity name in the comprehension
- `-n GUIDBook` -- The name of the GUID column on each record
- `-g "Book_{~D:Record.id~}"` -- A Pict template that generates each record's GUID from its `id` column
- `-o books-comprehension.json` -- Where to write the output

## Using Mapping Files

For production use, mapping files give you precise control over which columns map to which fields and how GUIDs are generated.

Create a file called `mapping_Book.json`:

```json
{
    "Entity": "Book",
    "GUIDTemplate": "Book_{~D:Record.id~}",
    "Mappings": {
        "Title": "{~D:Record.title~}",
        "Language": "{~D:Record.language_code~}",
        "ISBN": "{~D:Record.isbn~}",
        "AverageRating": "{~D:Record.average_rating~}"
    }
}
```

Then run the transform with the mapping file:

```shell
npx meadow-integration csvtransform ./data/books.csv \
  -m mapping_Book.json \
  -o books-comprehension.json
```

The mapping file acts as the "Explicit" configuration layer. Any CLI flags you pass (like `-e` or `-g`) override the mapping file as the "User" layer.

## Merging Comprehensions

When data comes from multiple files or sources, merge them with `comprehensionintersect`. Records with the same GUID are combined, with later values overwriting earlier ones.

First, create a second comprehension from a different source:

```shell
npx meadow-integration csvtransform ./data/book-ratings.csv \
  -m mapping_BookRatings.json \
  -o ratings-comprehension.json
```

Then merge them:

```shell
npx meadow-integration comprehensionintersect books-comprehension.json \
  -i ratings-comprehension.json \
  -e Book \
  -o merged-books.json
```

You can also merge during transformation by passing an existing comprehension with `-i`:

```shell
npx meadow-integration csvtransform ./data/authors.csv \
  -m mapping_Author.json \
  -i books-comprehension.json \
  -o full-comprehension.json
```

This adds `Author` records to the same comprehension that already contains `Book` records.
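
To make the merge semantics in this section concrete, here is a minimal sketch. It assumes a comprehension is an object keyed by entity name and then by GUID; that shape, the `mergeComprehensions` helper, and the sample records are all illustrative assumptions rather than the tool's own code.

```javascript
// Assumed comprehension shape: { EntityName: { GUID: record } }.
// Hypothetical helper; it only models the behavior described in the text.
function mergeComprehensions(base, incoming)
{
  const merged = {};
  for (const source of [base, incoming])
  {
    for (const [entity, records] of Object.entries(source))
    {
      merged[entity] = merged[entity] || {};
      for (const [guid, record] of Object.entries(records))
      {
        // Same GUID: combine fields, later values overwrite earlier ones.
        merged[entity][guid] = { ...merged[entity][guid], ...record };
      }
    }
  }
  return merged;
}

// Made-up sample data for illustration only.
const books = { Book: { Book_1: { GUIDBook: 'Book_1', Title: 'Dune', AverageRating: '4.0' } } };
const ratings = { Book: { Book_1: { GUIDBook: 'Book_1', AverageRating: '4.3' } } };
const authors = { Author: { Author_7: { GUIDAuthor: 'Author_7', Name: 'Frank Herbert' } } };

const mergedBooks = mergeComprehensions(books, ratings);
const full = mergeComprehensions(mergedBooks, authors);
console.log(full.Book.Book_1.Title);         // Dune
console.log(full.Book.Book_1.AverageRating); // 4.3
console.log(Object.keys(full));              // [ 'Book', 'Author' ]
```

The first merge mirrors combining two `Book` sources by GUID; the second mirrors adding a new entity type to an existing comprehension.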

## Pushing to a Meadow API

Once your comprehension is ready, push it to a running Meadow REST API:

```shell
npx meadow-integration load_comprehension ./full-comprehension.json \
  -p "IMPORT-2024"
```

This creates an Integration Adapter for each entity type in the comprehension. For each record, the adapter:

1. Generates a Meadow-compatible GUID using the configured prefix
2. Resolves cross-entity ID references through the GUID Map
3. Upserts the record to the Meadow API (single or bulk, based on record count)

The `-p` flag sets the adapter set GUID marshal prefix, which namespaces the import.
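
Step 2 above is the least obvious of the three, so here is a rough sketch of what resolving a cross-entity reference could look like. Everything in it is an assumption made for illustration: the `guidMap` structure, the `resolveReferences` helper, the numeric ID, and the convention that a `GUID<Entity>` field on a record is replaced by an `ID<Entity>` field. The real GUID Map API may differ.

```javascript
// Hypothetical GUID map: entity name -> (external GUID -> numeric server ID).
const guidMap = new Map([
  ['Author', new Map([['Author_7', 1042]])]
]);

// Replace GUID<Entity> references with ID<Entity> values when the map knows them.
// Illustrative only; this is not the Integration Adapter's real code.
function resolveReferences(record, ownEntity, map)
{
  const resolved = { ...record };
  for (const [field, value] of Object.entries(record))
  {
    if (!field.startsWith('GUID')) continue;
    const referencedEntity = field.slice('GUID'.length);
    if (referencedEntity === ownEntity) continue; // the record's own GUID stays as-is
    const entityMap = map.get(referencedEntity);
    if (entityMap && entityMap.has(value))
    {
      resolved[`ID${referencedEntity}`] = entityMap.get(value);
      delete resolved[field];
    }
  }
  return resolved;
}

const book = { GUIDBook: 'Book_1', Title: 'Dune', GUIDAuthor: 'Author_7' };
const resolvedBook = resolveReferences(book, 'Book', guidMap);
console.log(resolvedBook);
// { GUIDBook: 'Book_1', Title: 'Dune', IDAuthor: 1042 }
```

References the map does not know are left untouched in this sketch, so an unresolved GUID stays visible instead of being silently dropped.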

## Setting Up Data Clone

Data Clone replicates a remote Meadow API into a local database. Configuration is stored in a `.meadow.config.json` file in your working directory.

Create `.meadow.config.json`:

```json
{
  "Source": {
    "ServerURL": "https://api.example.com/1.0/",
    "UserID": "sync-user",
    "Password": "sync-password"
  },
  "Destination": {
    "Provider": "MySQL",
    "MySQL": {
      "server": "127.0.0.1",
      "port": 3306,
      "user": "root",
      "password": "localpass",
      "database": "meadow_clone",
      "connectionLimit": 20
    }
  },
  "SchemaPath": "./schema/Model-Extended.json",
  "Sync": {
    "DefaultSyncMode": "Initial",
    "PageSize": 100,
    "SyncEntityList": [],
    "SyncEntityOptions": {}
  }
}
```

**Configuration sections:**

- **Source** -- The remote Meadow REST API to read from. `UserID` and `Password` are optional; if omitted, authentication is skipped.
- **Destination** -- The local database to write to. Supports `MySQL` and `MSSQL` providers.
- **SchemaPath** -- Path to the Meadow extended schema JSON (generated by Stricture). This defines which tables and columns to create locally.
- **Sync** -- Controls the sync behavior. Leave `SyncEntityList` empty to sync all entities in the schema, or list specific entity names to sync a subset.
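
Before running a clone, a small pre-flight check can catch obvious mistakes in this file. The sketch below is not part of meadow-integration; `checkCloneConfig` is a hypothetical helper, and which fields it treats as mandatory (and that the `MSSQL` settings block is keyed `MSSQL`) are assumptions drawn from the sections listed above.

```javascript
const SUPPORTED_PROVIDERS = ['MySQL', 'MSSQL'];

// Hypothetical pre-flight check; returns a list of human-readable problems.
function checkCloneConfig(config)
{
  const problems = [];
  if (!config.Source || !config.Source.ServerURL)
    problems.push('Source.ServerURL is missing');

  const destination = config.Destination || {};
  const providerSettings = destination[destination.Provider];
  if (!SUPPORTED_PROVIDERS.includes(destination.Provider))
    problems.push(`Destination.Provider must be one of: ${SUPPORTED_PROVIDERS.join(', ')}`);
  else if (providerSettings === null || typeof providerSettings !== 'object')
    problems.push(`Destination.${destination.Provider} settings block is missing`);

  if (!config.SchemaPath)
    problems.push('SchemaPath is missing');

  const sync = config.Sync || {};
  if (sync.SyncEntityList !== undefined && !Array.isArray(sync.SyncEntityList))
    problems.push('Sync.SyncEntityList must be an array');
  return problems;
}

const goodConfig = {
  Source: { ServerURL: 'https://api.example.com/1.0/' }, // credentials are optional
  Destination: { Provider: 'MySQL', MySQL: { server: '127.0.0.1', database: 'meadow_clone' } },
  SchemaPath: './schema/Model-Extended.json',
  Sync: { DefaultSyncMode: 'Initial', PageSize: 100, SyncEntityList: [] }
};
const badConfig = { Source: {}, Destination: { Provider: 'SQLite' } };

console.log(checkCloneConfig(goodConfig)); // []
console.log(checkCloneConfig(badConfig).length); // 3
```

To check a real file, you could pass `JSON.parse(fs.readFileSync('.meadow.config.json', 'utf8'))` to the same function.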

## Running a Data Clone

With the configuration in place, run the clone:

```shell
npx meadow-integration data-clone
```

This will:

1. Authenticate with the source API (if credentials are configured)
2. Connect to the local database
3. Load the Meadow schema and create any missing tables
4. Add indexes on GUID and Deleted columns
5. Download and insert all records for each entity
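
Step 5 together with the `PageSize` setting suggests a paged download loop. The following is only a sketch of that idea under stated assumptions: `downloadAllRecords` and `fakeFetchPage` are hypothetical, the in-memory array stands in for the remote API, and the tool's real paging and stop condition may work differently.

```javascript
// Hypothetical paged download: keep requesting pages until a short page arrives.
async function downloadAllRecords(fetchPage, pageSize)
{
  const records = [];
  for (let offset = 0; ; offset += pageSize)
  {
    const page = await fetchPage(offset, pageSize);
    records.push(...page);
    if (page.length < pageSize) break; // a short (or empty) page means we are done
  }
  return records;
}

// Stand-in for the remote API: 250 made-up Book records held in memory.
const remoteBooks = Array.from({ length: 250 }, (_, index) => ({ IDBook: index + 1 }));
const requestedOffsets = [];
const fakeFetchPage = async (offset, size) =>
{
  requestedOffsets.push(offset);
  return remoteBooks.slice(offset, offset + size);
};

const downloadPromise = downloadAllRecords(fakeFetchPage, 100);
downloadPromise.then((records) => console.log(records.length, requestedOffsets));
// 250 [ 0, 100, 200 ]
```

With a page size of 100, the 250 sample records arrive in three requests; the third page is short, which ends the loop.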

For subsequent runs, switch to Ongoing mode to only sync changes:

```shell
npx meadow-integration data-clone -s Ongoing
```

You can also override configuration from the command line:

```shell
npx meadow-integration data-clone \
  --api_server https://api.example.com/1.0/ \
  --db_host 127.0.0.1 \
  --db_name meadow_clone \
  --schema_path ./schema/Model-Extended.json \
  --sync_mode Initial
```

## Next Steps

- [Overview](overview.md) -- Full feature overview and when to use each tool
- [Architecture](architecture.md) -- System design with diagrams
- [Mapping Files](mapping-files.md) -- Detailed mapping configuration reference
- [Comprehensions](comprehensions.md) -- The comprehension data format in depth
- [CLI Reference](cli-reference.md) -- All commands and their options
- [REST API Reference](rest-api-reference.md) -- All REST endpoints
- [Examples](examples-walkthrough.md) -- Walkthrough of all runnable examples