forgecraft-mcp 1.2.0 → 1.3.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (136)
  1. package/README.md +525 -525
  2. package/dist/cli/help.js +44 -44
  3. package/dist/registry/renderer-skeletons.js +92 -92
  4. package/dist/shared/gs-score-logger.js +6 -6
  5. package/dist/tools/add-module.js +123 -123
  6. package/dist/tools/advice-registry.js +18 -18
  7. package/dist/tools/check-cascade-report.js +64 -64
  8. package/dist/tools/configure-mcp.d.ts +3 -0
  9. package/dist/tools/configure-mcp.d.ts.map +1 -1
  10. package/dist/tools/configure-mcp.js +10 -0
  11. package/dist/tools/configure-mcp.js.map +1 -1
  12. package/dist/tools/forgecraft-dispatch.d.ts.map +1 -1
  13. package/dist/tools/forgecraft-dispatch.js +3 -0
  14. package/dist/tools/forgecraft-dispatch.js.map +1 -1
  15. package/dist/tools/forgecraft-schema-params.d.ts +9 -0
  16. package/dist/tools/forgecraft-schema-params.d.ts.map +1 -1
  17. package/dist/tools/forgecraft-schema-params.js +21 -0
  18. package/dist/tools/forgecraft-schema-params.js.map +1 -1
  19. package/dist/tools/forgecraft-schema.d.ts +9 -0
  20. package/dist/tools/forgecraft-schema.d.ts.map +1 -1
  21. package/dist/tools/refresh-output.js +14 -14
  22. package/dist/tools/scaffold-spec-stubs.js +115 -115
  23. package/dist/tools/scaffold-templates.js +62 -62
  24. package/dist/tools/setup-artifact-writers.d.ts +30 -0
  25. package/dist/tools/setup-artifact-writers.d.ts.map +1 -1
  26. package/dist/tools/setup-artifact-writers.js +120 -8
  27. package/dist/tools/setup-artifact-writers.js.map +1 -1
  28. package/dist/tools/setup-phase1.d.ts +3 -0
  29. package/dist/tools/setup-phase1.d.ts.map +1 -1
  30. package/dist/tools/setup-phase1.js +79 -35
  31. package/dist/tools/setup-phase1.js.map +1 -1
  32. package/dist/tools/setup-phase2.d.ts +2 -0
  33. package/dist/tools/setup-phase2.d.ts.map +1 -1
  34. package/dist/tools/setup-phase2.js +10 -1
  35. package/dist/tools/setup-phase2.js.map +1 -1
  36. package/dist/tools/setup-project.d.ts +18 -0
  37. package/dist/tools/setup-project.d.ts.map +1 -1
  38. package/dist/tools/setup-project.js +77 -1
  39. package/dist/tools/setup-project.js.map +1 -1
  40. package/dist/tools/spec-parser-tags.d.ts +9 -0
  41. package/dist/tools/spec-parser-tags.d.ts.map +1 -1
  42. package/dist/tools/spec-parser-tags.js +92 -0
  43. package/dist/tools/spec-parser-tags.js.map +1 -1
  44. package/package.json +89 -86
  45. package/templates/analytics/instructions.yaml +37 -37
  46. package/templates/analytics/mcp-servers.yaml +11 -11
  47. package/templates/analytics/structure.yaml +25 -25
  48. package/templates/api/instructions.yaml +231 -231
  49. package/templates/api/mcp-servers.yaml +22 -13
  50. package/templates/api/nfr.yaml +23 -23
  51. package/templates/api/review.yaml +103 -103
  52. package/templates/api/structure.yaml +34 -34
  53. package/templates/api/verification.yaml +132 -132
  54. package/templates/cli/instructions.yaml +31 -31
  55. package/templates/cli/mcp-servers.yaml +11 -11
  56. package/templates/cli/review.yaml +53 -53
  57. package/templates/cli/structure.yaml +16 -16
  58. package/templates/data-lineage/instructions.yaml +28 -28
  59. package/templates/data-lineage/mcp-servers.yaml +22 -22
  60. package/templates/data-pipeline/instructions.yaml +84 -84
  61. package/templates/data-pipeline/mcp-servers.yaml +13 -13
  62. package/templates/data-pipeline/nfr.yaml +39 -39
  63. package/templates/data-pipeline/structure.yaml +23 -23
  64. package/templates/fintech/hooks.yaml +55 -55
  65. package/templates/fintech/instructions.yaml +112 -112
  66. package/templates/fintech/mcp-servers.yaml +13 -13
  67. package/templates/fintech/nfr.yaml +46 -46
  68. package/templates/fintech/playbook.yaml +210 -210
  69. package/templates/fintech/verification.yaml +239 -239
  70. package/templates/game/instructions.yaml +289 -289
  71. package/templates/game/mcp-servers.yaml +38 -38
  72. package/templates/game/nfr.yaml +64 -64
  73. package/templates/game/playbook.yaml +214 -214
  74. package/templates/game/review.yaml +97 -97
  75. package/templates/game/structure.yaml +67 -67
  76. package/templates/game/verification.yaml +174 -174
  77. package/templates/healthcare/instructions.yaml +42 -42
  78. package/templates/healthcare/mcp-servers.yaml +13 -13
  79. package/templates/healthcare/nfr.yaml +47 -47
  80. package/templates/hipaa/instructions.yaml +41 -41
  81. package/templates/hipaa/mcp-servers.yaml +13 -13
  82. package/templates/infra/instructions.yaml +104 -104
  83. package/templates/infra/mcp-servers.yaml +20 -20
  84. package/templates/infra/nfr.yaml +46 -46
  85. package/templates/infra/review.yaml +65 -65
  86. package/templates/infra/structure.yaml +25 -25
  87. package/templates/library/instructions.yaml +36 -36
  88. package/templates/library/mcp-servers.yaml +20 -20
  89. package/templates/library/review.yaml +56 -56
  90. package/templates/library/structure.yaml +19 -19
  91. package/templates/medallion-architecture/instructions.yaml +41 -41
  92. package/templates/medallion-architecture/mcp-servers.yaml +22 -22
  93. package/templates/ml/instructions.yaml +85 -85
  94. package/templates/ml/mcp-servers.yaml +11 -11
  95. package/templates/ml/nfr.yaml +39 -39
  96. package/templates/ml/structure.yaml +25 -25
  97. package/templates/ml/verification.yaml +156 -156
  98. package/templates/mobile/instructions.yaml +44 -44
  99. package/templates/mobile/mcp-servers.yaml +11 -11
  100. package/templates/mobile/nfr.yaml +49 -49
  101. package/templates/mobile/structure.yaml +27 -27
  102. package/templates/mobile/verification.yaml +121 -121
  103. package/templates/observability-xray/instructions.yaml +40 -40
  104. package/templates/observability-xray/mcp-servers.yaml +15 -15
  105. package/templates/realtime/instructions.yaml +42 -42
  106. package/templates/realtime/mcp-servers.yaml +13 -13
  107. package/templates/soc2/instructions.yaml +41 -41
  108. package/templates/soc2/mcp-servers.yaml +24 -24
  109. package/templates/social/instructions.yaml +43 -43
  110. package/templates/social/mcp-servers.yaml +24 -24
  111. package/templates/state-machine/instructions.yaml +42 -42
  112. package/templates/state-machine/mcp-servers.yaml +11 -11
  113. package/templates/tools-registry.yaml +164 -164
  114. package/templates/universal/hooks.yaml +531 -531
  115. package/templates/universal/instructions.yaml +1692 -1692
  116. package/templates/universal/mcp-servers.yaml +50 -50
  117. package/templates/universal/nfr.yaml +197 -197
  118. package/templates/universal/reference.yaml +326 -326
  119. package/templates/universal/review.yaml +204 -204
  120. package/templates/universal/skills.yaml +262 -262
  121. package/templates/universal/structure.yaml +67 -67
  122. package/templates/universal/verification.yaml +416 -416
  123. package/templates/web-react/hooks.yaml +44 -44
  124. package/templates/web-react/instructions.yaml +207 -207
  125. package/templates/web-react/mcp-servers.yaml +20 -20
  126. package/templates/web-react/nfr.yaml +27 -27
  127. package/templates/web-react/review.yaml +94 -94
  128. package/templates/web-react/structure.yaml +46 -46
  129. package/templates/web-react/verification.yaml +126 -126
  130. package/templates/web-static/instructions.yaml +115 -115
  131. package/templates/web-static/mcp-servers.yaml +20 -20
  132. package/templates/web3/instructions.yaml +44 -44
  133. package/templates/web3/mcp-servers.yaml +11 -11
  134. package/templates/web3/verification.yaml +159 -159
  135. package/templates/zero-trust/instructions.yaml +41 -41
  136. package/templates/zero-trust/mcp-servers.yaml +15 -15
@@ -1,25 +1,25 @@
-tag: INFRA
-section: structure
-entries:
-  - path: terraform/
-    description: "Terraform root — one directory per stack/environment"
-  - path: terraform/modules/
-    description: "Reusable Terraform modules (VPC, ECS, RDS, etc.)"
-  - path: terraform/environments/
-    description: "Per-environment tfvars: dev.tfvars, staging.tfvars, prod.tfvars"
-  - path: cdk/
-    description: "CDK app (if using CDK instead of Terraform)"
-  - path: cdk/lib/
-    description: "CDK stack definitions — one file per stack"
-  - path: cdk/bin/
-    description: "CDK app entry point"
-  - path: docker/
-    description: "Dockerfiles and docker-compose for local development"
-  - path: docker/docker-compose.yml
-    description: "Local development stack: app + backing services"
-  - path: scripts/
-    description: "Operational scripts: deploy, migrate, seed, rotate-secrets"
-  - path: .github/workflows/
-    description: "CI/CD pipeline definitions (GitHub Actions)"
-  - path: docs/runbooks/
-    description: "Operational runbooks: incident response, failover, scaling"
+tag: INFRA
+section: structure
+entries:
+  - path: terraform/
+    description: "Terraform root — one directory per stack/environment"
+  - path: terraform/modules/
+    description: "Reusable Terraform modules (VPC, ECS, RDS, etc.)"
+  - path: terraform/environments/
+    description: "Per-environment tfvars: dev.tfvars, staging.tfvars, prod.tfvars"
+  - path: cdk/
+    description: "CDK app (if using CDK instead of Terraform)"
+  - path: cdk/lib/
+    description: "CDK stack definitions — one file per stack"
+  - path: cdk/bin/
+    description: "CDK app entry point"
+  - path: docker/
+    description: "Dockerfiles and docker-compose for local development"
+  - path: docker/docker-compose.yml
+    description: "Local development stack: app + backing services"
+  - path: scripts/
+    description: "Operational scripts: deploy, migrate, seed, rotate-secrets"
+  - path: .github/workflows/
+    description: "CI/CD pipeline definitions (GitHub Actions)"
+  - path: docs/runbooks/
+    description: "Operational runbooks: incident response, failover, scaling"
@@ -1,36 +1,36 @@
-tag: LIBRARY
-section: instructions
-blocks:
-  - id: library-standards
-    tier: recommended
-    title: "Library Standards"
-    content: |
-      ## Library / Package Standards
-
-      ### Public API
-      - Clear, minimal public API surface. Export only what consumers need.
-      - Barrel file (index.ts / __init__.py) defines the public API explicitly.
-      - Internal modules prefixed with underscore or in internal/ directory.
-      - Every public API has JSDoc/docstring with examples.
-
-      ### Versioning & Compatibility
-      - Semantic versioning: MAJOR.MINOR.PATCH.
-      - MAJOR: breaking API changes. MINOR: new features, backward compatible. PATCH: bug fixes.
-      - CHANGELOG.md maintained with every release.
-      - Deprecation warnings before removal (minimum 1 minor version).
-
-      ### Distribution
-      - Package includes only dist/ and necessary runtime files.
-      - Types included (declaration files for TypeScript).
-      - Peer dependencies used for framework integrations.
-      - Minimize runtime dependencies — every dep is a risk.
-
-      ### Testing
-      - Test against the public API, not internals.
-      - Test with multiple versions of peer dependencies.
-      - Integration tests simulate real consumer usage patterns.
-
-      ### Documentation
-      - README with: install, quick start, API reference, examples.
-      - Usage examples for every major feature.
-      - Migration guide for every major version bump.
+tag: LIBRARY
+section: instructions
+blocks:
+  - id: library-standards
+    tier: recommended
+    title: "Library Standards"
+    content: |
+      ## Library / Package Standards
+
+      ### Public API
+      - Clear, minimal public API surface. Export only what consumers need.
+      - Barrel file (index.ts / __init__.py) defines the public API explicitly.
+      - Internal modules prefixed with underscore or in internal/ directory.
+      - Every public API has JSDoc/docstring with examples.
+
+      ### Versioning & Compatibility
+      - Semantic versioning: MAJOR.MINOR.PATCH.
+      - MAJOR: breaking API changes. MINOR: new features, backward compatible. PATCH: bug fixes.
+      - CHANGELOG.md maintained with every release.
+      - Deprecation warnings before removal (minimum 1 minor version).
+
+      ### Distribution
+      - Package includes only dist/ and necessary runtime files.
+      - Types included (declaration files for TypeScript).
+      - Peer dependencies used for framework integrations.
+      - Minimize runtime dependencies — every dep is a risk.
+
+      ### Testing
+      - Test against the public API, not internals.
+      - Test with multiple versions of peer dependencies.
+      - Integration tests simulate real consumer usage patterns.
+
+      ### Documentation
+      - README with: install, quick start, API reference, examples.
+      - Usage examples for every major feature.
+      - Migration guide for every major version bump.
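The semver rules in the block above (MAJOR for breaking changes, MINOR for backward-compatible features, PATCH for fixes) can be sketched as a tiny release classifier. This is a hypothetical illustration, not code from this package; the `Change` shape and `classifyBump` name are invented for the example:

```typescript
// Minimal changelog-entry shape, invented for this sketch.
type Change = { breaking: boolean; feature: boolean };

// Classify the next release bump from a list of changes, per semver:
// any breaking change forces a major bump, otherwise any new feature
// yields a minor bump, otherwise the release is a patch.
function classifyBump(changes: Change[]): "major" | "minor" | "patch" {
  if (changes.some((c) => c.breaking)) return "major";
  if (changes.some((c) => c.feature)) return "minor";
  return "patch";
}
```

A release containing only backward-compatible features classifies as "minor", consistent with the 1.2.0 → 1.3.2 range this diff covers.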
@@ -1,20 +1,20 @@
-tag: LIBRARY
-section: mcp-servers
-servers:
-  - name: filesystem
-    description: "Secure file operations — read, write, search, and directory management with configurable access controls"
-    command: npx
-    args: ["-y", "@modelcontextprotocol/server-filesystem", "/"]
-    tags: [CLI, LIBRARY]
-    category: general
-    tier: recommended
-    url: "https://github.com/modelcontextprotocol/servers/tree/main/src/filesystem"
-
-  - name: npm-search
-    description: "Search npm registry for packages, check versions, and inspect dependencies"
-    command: npx
-    args: ["-y", "mcp-server-npm-search"]
-    tags: [LIBRARY]
-    category: documentation
-    tier: optional
-    url: "https://github.com/nicholasgriffintn/npm-search-mcp-server"
+tag: LIBRARY
+section: mcp-servers
+servers:
+  - name: filesystem
+    description: "Secure file operations — read, write, search, and directory management with configurable access controls"
+    command: npx
+    args: ["-y", "@modelcontextprotocol/server-filesystem", "/"]
+    tags: [CLI, LIBRARY]
+    category: general
+    tier: recommended
+    url: "https://github.com/modelcontextprotocol/servers/tree/main/src/filesystem"
+
+  - name: npm-search
+    description: "Search npm registry for packages, check versions, and inspect dependencies"
+    command: npx
+    args: ["-y", "mcp-server-npm-search"]
+    tags: [LIBRARY]
+    category: documentation
+    tier: optional
+    url: "https://github.com/nicholasgriffintn/npm-search-mcp-server"
@@ -1,56 +1,56 @@
-tag: LIBRARY
-section: review
-blocks:
-  - id: library-architecture-review
-    tier: recommended
-    dimension: architecture
-    title: "Library Architecture Review"
-    description: |
-      Evaluate public API surface, backward compatibility, and consumer ergonomics.
-    checklist:
-      - id: public-api-surface
-        description: "Public API is minimal and intentional. Every export has a reason. Internal modules are not leaked."
-        severity: critical
-      - id: backward-compatibility
-        description: "Semantic versioning enforced. Breaking changes gated behind major version bumps."
-        severity: critical
-      - id: tree-shaking
-        description: "Package supports tree-shaking: named exports, sideEffects: false, ESM output."
-        severity: important
-      - id: type-exports
-        description: "TypeScript types exported and documented. Declaration files (.d.ts) ship with the package."
-        severity: important
-
-  - id: library-code-quality-review
-    tier: recommended
-    dimension: code-quality
-    title: "Library Code Quality Review"
-    description: |
-      Evaluate API consistency, documentation, and error messages for consumers.
-    checklist:
-      - id: api-consistency
-        description: "Consistent naming conventions and parameter ordering across all public methods."
-        severity: important
-      - id: jsdoc-coverage
-        description: "100% JSDoc coverage on all public APIs with typed params, returns, and examples."
-        severity: critical
-      - id: error-messages
-        description: "Error messages are actionable for consumers: what went wrong, what they should do."
-        severity: important
-
-  - id: library-test-review
-    tier: recommended
-    dimension: tests
-    title: "Library Test Review"
-    description: |
-      Evaluate public API contract tests and cross-environment compatibility.
-    checklist:
-      - id: contract-tests
-        description: "Every public API method has contract tests verifying inputs, outputs, and error cases."
-        severity: critical
-      - id: cross-env-testing
-        description: "Tests run in target environments (Node versions, browser if applicable)."
-        severity: important
-      - id: example-tests
-        description: "README examples are executable tests that stay in sync with the API."
-        severity: nice-to-have
+tag: LIBRARY
+section: review
+blocks:
+  - id: library-architecture-review
+    tier: recommended
+    dimension: architecture
+    title: "Library Architecture Review"
+    description: |
+      Evaluate public API surface, backward compatibility, and consumer ergonomics.
+    checklist:
+      - id: public-api-surface
+        description: "Public API is minimal and intentional. Every export has a reason. Internal modules are not leaked."
+        severity: critical
+      - id: backward-compatibility
+        description: "Semantic versioning enforced. Breaking changes gated behind major version bumps."
+        severity: critical
+      - id: tree-shaking
+        description: "Package supports tree-shaking: named exports, sideEffects: false, ESM output."
+        severity: important
+      - id: type-exports
+        description: "TypeScript types exported and documented. Declaration files (.d.ts) ship with the package."
+        severity: important
+
+  - id: library-code-quality-review
+    tier: recommended
+    dimension: code-quality
+    title: "Library Code Quality Review"
+    description: |
+      Evaluate API consistency, documentation, and error messages for consumers.
+    checklist:
+      - id: api-consistency
+        description: "Consistent naming conventions and parameter ordering across all public methods."
+        severity: important
+      - id: jsdoc-coverage
+        description: "100% JSDoc coverage on all public APIs with typed params, returns, and examples."
+        severity: critical
+      - id: error-messages
+        description: "Error messages are actionable for consumers: what went wrong, what they should do."
+        severity: important
+
+  - id: library-test-review
+    tier: recommended
+    dimension: tests
+    title: "Library Test Review"
+    description: |
+      Evaluate public API contract tests and cross-environment compatibility.
+    checklist:
+      - id: contract-tests
+        description: "Every public API method has contract tests verifying inputs, outputs, and error cases."
+        severity: critical
+      - id: cross-env-testing
+        description: "Tests run in target environments (Node versions, browser if applicable)."
+        severity: important
+      - id: example-tests
+        description: "README examples are executable tests that stay in sync with the API."
+        severity: nice-to-have
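The `contract-tests` and `error-messages` checklist items above combine naturally: a contract test exercises a public method's inputs, outputs, and error cases, and the error case asserts the message is actionable. A minimal sketch, assuming a hypothetical `parseVersion` helper (the package's real public API is not shown in this diff):

```typescript
// Hypothetical public API method, invented for this sketch.
function parseVersion(v: string): { major: number; minor: number; patch: number } {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(v);
  if (m === null) {
    // Actionable message: what went wrong and what the consumer should do.
    throw new Error(`Invalid version "${v}": expected MAJOR.MINOR.PATCH, e.g. "1.3.2"`);
  }
  return { major: Number(m[1]), minor: Number(m[2]), patch: Number(m[3]) };
}

// Contract test against the public API, not internals.
console.assert(parseVersion("1.3.2").patch === 2);
let threw = false;
try {
  parseVersion("not-a-version");
} catch (e) {
  threw = e instanceof Error && e.message.includes("MAJOR.MINOR.PATCH");
}
console.assert(threw);
```

The error branch is part of the contract: the test pins not just that a bad input throws, but that the message tells the consumer what shape was expected.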
@@ -1,19 +1,19 @@
-tag: LIBRARY
-section: structure
-language: typescript
-entries:
-  - path: src/index.ts
-    type: file
-    description: "Public API barrel — exports only what consumers need"
-  - path: src/shared
-    type: directory
-    description: "Shared internal utilities"
-  - path: dist
-    type: directory
-    description: "Compiled output (not committed)"
-  - path: README.md
-    type: file
-    description: "Install, quick start, API reference"
-  - path: CHANGELOG.md
-    type: file
-    description: "Release history"
+tag: LIBRARY
+section: structure
+language: typescript
+entries:
+  - path: src/index.ts
+    type: file
+    description: "Public API barrel — exports only what consumers need"
+  - path: src/shared
+    type: directory
+    description: "Shared internal utilities"
+  - path: dist
+    type: directory
+    description: "Compiled output (not committed)"
+  - path: README.md
+    type: file
+    description: "Install, quick start, API reference"
+  - path: CHANGELOG.md
+    type: file
+    description: "Release history"
@@ -1,41 +1,41 @@
-tag: MEDALLION-ARCHITECTURE
-section: instructions
-blocks:
-  - id: bronze-layer
-    tier: recommended
-    title: "Bronze Layer — Immutable Raw Ingestion"
-    content: |
-      ## Bronze Layer — Immutable Raw Ingestion
-
-      - Bronze is the landing zone. Data arrives exactly as received from source systems — no transformations, no filtering, no deduplication.
-      - Store Bronze data in append-only, immutable formats (Parquet, Delta Lake, Iceberg). Never overwrite or delete Bronze records.
-      - Include ingestion metadata on every record: source_system, ingested_at, batch_id, file_origin, raw_checksum.
-      - Partition Bronze tables by ingestion date for efficient reprocessing. Never partition by business keys at this layer.
-      - Implement schema-on-read: Bronze accepts any schema from source. Schema validation happens at the Silver promotion step.
-      - Retain Bronze data indefinitely (or per legal retention policy). Bronze is the recovery point for all downstream reprocessing.
-
-  - id: silver-layer
-    tier: recommended
-    title: "Silver Layer — Validated & Conformed"
-    content: |
-      ## Silver Layer — Validated & Conformed
-
-      - Silver applies data quality rules: type validation, null checks, range constraints, referential integrity, deduplication.
-      - Every record in Silver must pass defined quality gates. Records failing validation are routed to a quarantine table, not dropped.
-      - Conform naming conventions: standardize column names, date formats, currency codes, and enum values across all Silver tables.
-      - Track record lineage: every Silver record links back to its Bronze source via source_record_id and bronze_batch_id.
-      - Apply slowly changing dimension (SCD) Type 2 for reference data: maintain history with effective_from and effective_to timestamps.
-      - Silver tables define explicit schemas enforced on write. Schema changes require a migration plan and backward compatibility check.
-
-  - id: gold-layer
-    tier: recommended
-    title: "Gold Layer — Aggregated & Business-Ready"
-    content: |
-      ## Gold Layer — Aggregated & Business-Ready
-
-      - Gold tables are purpose-built for specific business use cases: dashboards, ML features, API responses, reports.
-      - Apply business logic at the Gold layer: aggregations, joins, calculated fields, business rules, KPI definitions.
-      - Optimize Gold tables for read performance: pre-aggregate, denormalize, partition by access patterns.
-      - Document every Gold table with its business purpose, refresh cadence, source Silver tables, and SLA.
-      - Version Gold table definitions: changes to aggregation logic or business rules are tracked, reviewed, and deployed like code.
-      - Implement data contracts for Gold consumers: define expected schema, update frequency, and quality guarantees.
+tag: MEDALLION-ARCHITECTURE
+section: instructions
+blocks:
+  - id: bronze-layer
+    tier: recommended
+    title: "Bronze Layer — Immutable Raw Ingestion"
+    content: |
+      ## Bronze Layer — Immutable Raw Ingestion
+
+      - Bronze is the landing zone. Data arrives exactly as received from source systems — no transformations, no filtering, no deduplication.
+      - Store Bronze data in append-only, immutable formats (Parquet, Delta Lake, Iceberg). Never overwrite or delete Bronze records.
+      - Include ingestion metadata on every record: source_system, ingested_at, batch_id, file_origin, raw_checksum.
+      - Partition Bronze tables by ingestion date for efficient reprocessing. Never partition by business keys at this layer.
+      - Implement schema-on-read: Bronze accepts any schema from source. Schema validation happens at the Silver promotion step.
+      - Retain Bronze data indefinitely (or per legal retention policy). Bronze is the recovery point for all downstream reprocessing.
+
+  - id: silver-layer
+    tier: recommended
+    title: "Silver Layer — Validated & Conformed"
+    content: |
+      ## Silver Layer — Validated & Conformed
+
+      - Silver applies data quality rules: type validation, null checks, range constraints, referential integrity, deduplication.
+      - Every record in Silver must pass defined quality gates. Records failing validation are routed to a quarantine table, not dropped.
+      - Conform naming conventions: standardize column names, date formats, currency codes, and enum values across all Silver tables.
+      - Track record lineage: every Silver record links back to its Bronze source via source_record_id and bronze_batch_id.
+      - Apply slowly changing dimension (SCD) Type 2 for reference data: maintain history with effective_from and effective_to timestamps.
+      - Silver tables define explicit schemas enforced on write. Schema changes require a migration plan and backward compatibility check.
+
+  - id: gold-layer
+    tier: recommended
+    title: "Gold Layer — Aggregated & Business-Ready"
+    content: |
+      ## Gold Layer — Aggregated & Business-Ready
+
+      - Gold tables are purpose-built for specific business use cases: dashboards, ML features, API responses, reports.
+      - Apply business logic at the Gold layer: aggregations, joins, calculated fields, business rules, KPI definitions.
+      - Optimize Gold tables for read performance: pre-aggregate, denormalize, partition by access patterns.
+      - Document every Gold table with its business purpose, refresh cadence, source Silver tables, and SLA.
+      - Version Gold table definitions: changes to aggregation logic or business rules are tracked, reviewed, and deployed like code.
+      - Implement data contracts for Gold consumers: define expected schema, update frequency, and quality guarantees.
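The Silver promotion gate described above routes records that fail validation to a quarantine table rather than dropping them, and every promoted record keeps its Bronze lineage fields. A minimal sketch, with the record shape and the single quality rule invented for illustration:

```typescript
// Bronze record shape, invented for this sketch; the lineage fields mirror
// the source_record_id / bronze_batch_id convention described above.
type BronzeRecord = {
  source_record_id: string;
  bronze_batch_id: string;
  amount: number | null;
};

function promoteToSilver(records: BronzeRecord[]): {
  silver: BronzeRecord[];
  quarantine: BronzeRecord[];
} {
  const silver: BronzeRecord[] = [];
  const quarantine: BronzeRecord[] = [];
  for (const r of records) {
    // Quality gate: required field present and within range.
    const valid = r.amount !== null && r.amount >= 0;
    // Failing records are quarantined, never dropped.
    (valid ? silver : quarantine).push(r);
  }
  return { silver, quarantine };
}
```

Because quarantined records retain their lineage fields, they can be repaired and re-promoted later without re-reading Bronze.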
@@ -1,22 +1,22 @@
-tag: MEDALLION-ARCHITECTURE
-section: mcp-servers
-servers:
-  - name: postgres
-    description: "PostgreSQL for medallion layer metadata, quality gate results, and lineage tracking"
-    command: npx
-    args: ["-y", "@modelcontextprotocol/server-postgres"]
-    tags: [MEDALLION-ARCHITECTURE, DATA-PIPELINE]
-    category: database
-    tier: recommended
-    env:
-      POSTGRES_CONNECTION_STRING: ""
-    url: "https://github.com/modelcontextprotocol/servers/tree/main/src/postgres"
-
-  - name: filesystem
-    description: "Filesystem access for Bronze/Silver/Gold layer data files and pipeline configs"
-    command: npx
-    args: ["-y", "@modelcontextprotocol/server-filesystem"]
-    tags: [MEDALLION-ARCHITECTURE, DATA-PIPELINE]
-    category: filesystem
-    tier: optional
-    url: "https://github.com/modelcontextprotocol/servers/tree/main/src/filesystem"
+tag: MEDALLION-ARCHITECTURE
+section: mcp-servers
+servers:
+  - name: postgres
+    description: "PostgreSQL for medallion layer metadata, quality gate results, and lineage tracking"
+    command: npx
+    args: ["-y", "@modelcontextprotocol/server-postgres"]
+    tags: [MEDALLION-ARCHITECTURE, DATA-PIPELINE]
+    category: database
+    tier: recommended
+    env:
+      POSTGRES_CONNECTION_STRING: ""
+    url: "https://github.com/modelcontextprotocol/servers/tree/main/src/postgres"
+
+  - name: filesystem
+    description: "Filesystem access for Bronze/Silver/Gold layer data files and pipeline configs"
+    command: npx
+    args: ["-y", "@modelcontextprotocol/server-filesystem"]
+    tags: [MEDALLION-ARCHITECTURE, DATA-PIPELINE]
+    category: filesystem
+    tier: optional
+    url: "https://github.com/modelcontextprotocol/servers/tree/main/src/filesystem"
@@ -1,85 +1,85 @@
-tag: ML
-section: instructions
-blocks:
-  - id: training-pipelines
-    tier: recommended
-    title: "Model Training & Experiment Tracking"
-    content: |
-      ## Model Training & Experiment Tracking
-
-      - Track every experiment with a tool like MLflow, Weights & Biases, or DVC. Log hyperparameters, metrics, dataset versions, and the git commit hash for full reproducibility.
-      - Pin all dependencies (Python version, library versions, CUDA version) in a lockfile or Docker image. A training run from six months ago must be reproducible today.
-      - Separate data preprocessing, feature engineering, model training, and evaluation into distinct pipeline stages with clear interfaces.
-      - Use configuration files (YAML/TOML) for hyperparameters rather than hard-coding them. Support overrides via CLI arguments for sweep runs.
-      - Always hold out a test set that is never used during training or hyperparameter tuning. Report final metrics exclusively on this set.
-      - Version datasets alongside code. Use content-addressable storage or DVC to track dataset lineage and detect drift between training runs.
-      - Set random seeds for all sources of non-determinism (NumPy, PyTorch, TensorFlow, data shuffling) to ensure reproducible results on the same hardware.
-
-  - id: feature-engineering
-    tier: recommended
-    title: "Feature Engineering & Data Preparation"
-    content: |
-      ## Feature Engineering & Data Preparation
-
-      - Build a centralized feature store (or at minimum a shared feature module) to avoid duplicating feature logic across training and serving.
-      - Document every feature: its definition, data source, expected range, update frequency, and any known caveats or biases.
-      - Compute feature statistics (mean, std, min, max, null rate) on the training set and persist them. Apply the same transformations at inference time using these saved statistics—never recompute from inference data.
-      - Handle missing values explicitly. Choose an imputation strategy per feature (mean, median, mode, sentinel value, or model-based) and document the rationale.
-      - Detect and handle data leakage: ensure no future information leaks into features, and that train/validation/test splits respect temporal or entity boundaries.
-      - Monitor feature distributions in production. Alert when serving-time distributions diverge significantly from training-time distributions (data/concept drift).
-
-  - id: model-deployment
-    tier: recommended
-    title: "Model Versioning & Deployment"
-    content: |
-      ## Model Versioning & Serving
-
-      - Store trained model artifacts in a model registry with semantic versioning. Tag each artifact with the training run ID, dataset version, and evaluation metrics.
-      - Serve models behind a versioned API endpoint. Support A/B testing and canary rollouts by routing a percentage of traffic to a new model version.
-      - Define minimum performance thresholds (accuracy, latency p99, throughput) as promotion gates. A model must pass automated evaluation before being promoted to production.
-      - Implement shadow mode: run a new model in parallel with the current production model, compare outputs, and alert on significant divergence before cutting over.
-      - Log all predictions with input features, model version, and timestamp. This enables debugging, bias auditing, and retraining on production data.
-      - Set up automated retraining triggers based on performance degradation, data drift detection, or a fixed schedule. Ensure the retraining pipeline is fully automated end-to-end.
-
-  - id: ml-testing
-    tier: recommended
-    title: "ML-Specific Testing Requirements"
-    content: |
-      ## ML-Specific Testing Requirements
-
-      ### Data Distribution Tests
-      Assert dataset integrity before any model trains on it:
-      - Train/val/test split ratios are within configured tolerances.
-      - Class balance in classification tasks is within acceptable imbalance threshold (document the threshold and the chosen mitigation — oversampling, class weights, threshold tuning).
-      - No temporal leakage: all records in the validation set postdate all records in the training set for time-series problems.
-      - Feature null rates are within threshold per column. Schema types match declared types.
-      These assertions run as a pre-training gate, not as an afterthought during debugging.
-
-      ### Model Regression Tests
-      A model deployment must not silently degrade the evaluation metric baseline:
-      - Persist the baseline metric (accuracy, F1, AUC, RMSE, etc.) for the current production model in the model registry.
-      - On each candidate model: evaluate on the held-out test set; compare to baseline.
-      - Define the minimum acceptable delta (e.g., F1 must not drop more than 0.01 from baseline).
-      - A regression beyond the threshold is a failing CI gate. It blocks deployment unless an ADR explicitly accepts the tradeoff.
-
-      ### Inference Latency Tests
-      Model serving has an SLA. Test it:
-      - At staging: send representative input at the defined peak request rate.
-      - Assert p95 latency ≤ SLA threshold (document the threshold).
-      - Assert throughput ≥ minimum requests/second.
-      - Assert memory footprint under sustained load stays within spec.
-      Latency tests run post-deploy to staging, not pre-deploy. They are a promotion gate, not a build gate.
-
-      ### Adversarial Input Tests
-      Models fail in unexpected ways on distribution-edge inputs:
-      - **Out-of-distribution inputs**: feature values well outside training range; entirely unseen categories.
-      - **Corrupted inputs**: nulls in required fields, wrong types, empty sequences, extreme numeric values.
-      - **Edge-case inputs**: minimum/maximum sequence length, single-token inputs, very long inputs if applicable.
-      Assert: the model does not crash; output is within plausible range; uncertainty (if modeled) is high for OOD inputs.
-
-      ### Bias / Fairness Assertions
-      Where the model's output affects people and protected attributes are present in the data:
82
- - Define the protected attributes and the fairness metric (demographic parity, equalized odds, calibration by group).
83
- - Evaluate the fairness metric on the test set per protected-attribute group.
84
- - Assert the fairness gap is within the documented acceptable threshold.
85
- - If no protected attributes are present but the output has downstream human impact, document this determination and its rationale in an ADR.
1
+ tag: ML
+ section: instructions
+ blocks:
+ - id: training-pipelines
+ tier: recommended
+ title: "Model Training & Experiment Tracking"
+ content: |
+ ## Model Training & Experiment Tracking
+
+ - Track every experiment with a tool like MLflow, Weights & Biases, or DVC. Log hyperparameters, metrics, dataset versions, and the git commit hash for full reproducibility.
+ - Pin all dependencies (Python version, library versions, CUDA version) in a lockfile or Docker image. A training run from six months ago must be reproducible today.
+ - Separate data preprocessing, feature engineering, model training, and evaluation into distinct pipeline stages with clear interfaces.
+ - Use configuration files (YAML/TOML) for hyperparameters rather than hard-coding them. Support overrides via CLI arguments for sweep runs.
+ - Always hold out a test set that is never used during training or hyperparameter tuning. Report final metrics exclusively on this set.
+ - Version datasets alongside code. Use content-addressable storage or DVC to track dataset lineage and detect drift between training runs.
+ - Set random seeds for all sources of non-determinism (NumPy, PyTorch, TensorFlow, data shuffling) to ensure reproducible results on the same hardware.
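The seed-setting bullet above can be sketched as follows, assuming only the stdlib and NumPy; framework-specific calls (e.g. `torch.manual_seed`, `tf.random.set_seed`) would be added per project:

```python
import os
import random

import numpy as np


def set_seed(seed: int) -> None:
    # Seed the common sources of non-determinism. Framework RNGs
    # (PyTorch, TensorFlow) need their own seeding calls in addition.
    random.seed(seed)
    np.random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)


set_seed(123)
first = np.random.rand(3)
set_seed(123)
second = np.random.rand(3)
assert np.allclose(first, second)  # same seed, same hardware, same draws
```

Note that bit-exact reproducibility across different hardware or CUDA versions is a stronger guarantee than seeding alone provides, which is why the bullet scopes the claim to the same hardware.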
+
+ - id: feature-engineering
+ tier: recommended
+ title: "Feature Engineering & Data Preparation"
+ content: |
+ ## Feature Engineering & Data Preparation
+
+ - Build a centralized feature store (or at minimum a shared feature module) to avoid duplicating feature logic across training and serving.
+ - Document every feature: its definition, data source, expected range, update frequency, and any known caveats or biases.
+ - Compute feature statistics (mean, std, min, max, null rate) on the training set and persist them. Apply the same transformations at inference time using these saved statistics—never recompute from inference data.
+ - Handle missing values explicitly. Choose an imputation strategy per feature (mean, median, mode, sentinel value, or model-based) and document the rationale.
+ - Detect and handle data leakage: ensure no future information leaks into features, and that train/validation/test splits respect temporal or entity boundaries.
+ - Monitor feature distributions in production. Alert when serving-time distributions diverge significantly from training-time distributions (data/concept drift).
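A minimal sketch of the "persist training statistics, reuse at inference" rule, using only the stdlib; the function names and the single-feature scope are illustrative:

```python
import json
import statistics


def fit_stats(values):
    # Compute persistable train-time statistics for one numeric feature.
    return {"mean": statistics.mean(values), "std": statistics.pstdev(values)}


def transform(value, stats):
    # Standardize an inference-time value using the SAVED training stats,
    # never statistics recomputed from the inference data itself.
    return (value - stats["mean"]) / (stats["std"] or 1.0)


train_ages = [20, 30, 40, 50]
stats = fit_stats(train_ages)   # fit on training data only
blob = json.dumps(stats)        # persist alongside the model artifact
loaded = json.loads(blob)       # reload at serving time
print(transform(35, loaded))    # 0.0 (35 is exactly the training mean)
```

In a real system the persisted blob would live next to the model artifact in the registry, so training and serving can never drift apart silently.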
+
+ - id: model-deployment
+ tier: recommended
+ title: "Model Versioning & Deployment"
+ content: |
+ ## Model Versioning & Serving
+
+ - Store trained model artifacts in a model registry with semantic versioning. Tag each artifact with the training run ID, dataset version, and evaluation metrics.
+ - Serve models behind a versioned API endpoint. Support A/B testing and canary rollouts by routing a percentage of traffic to a new model version.
+ - Define minimum performance thresholds (accuracy, latency p99, throughput) as promotion gates. A model must pass automated evaluation before being promoted to production.
+ - Implement shadow mode: run a new model in parallel with the current production model, compare outputs, and alert on significant divergence before cutting over.
+ - Log all predictions with input features, model version, and timestamp. This enables debugging, bias auditing, and retraining on production data.
+ - Set up automated retraining triggers based on performance degradation, data drift detection, or a fixed schedule. Ensure the retraining pipeline is fully automated end-to-end.
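The promotion-gate bullet can be sketched as a plain threshold check; the gate names, directions, and threshold values here are hypothetical, not a fixed schema:

```python
def failed_promotion_gates(metrics, gates):
    # Return the names of failed gates; an empty list means "promote".
    failures = []
    for name, (direction, threshold) in gates.items():
        value = metrics[name]
        ok = value >= threshold if direction == "min" else value <= threshold
        if not ok:
            failures.append(name)
    return failures


gates = {
    "accuracy": ("min", 0.90),
    "latency_p99_ms": ("max", 50.0),
    "throughput_rps": ("min", 200.0),
}
candidate = {"accuracy": 0.93, "latency_p99_ms": 64.0, "throughput_rps": 310.0}
print(failed_promotion_gates(candidate, gates))  # ['latency_p99_ms']
```

A CI or CD job would fail when the returned list is non-empty, keeping the candidate out of production until every documented gate passes.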
+
+ - id: ml-testing
+ tier: recommended
+ title: "ML-Specific Testing Requirements"
+ content: |
+ ## ML-Specific Testing Requirements
+
+ ### Data Distribution Tests
+ Assert dataset integrity before any model trains on it:
+ - Train/val/test split ratios are within configured tolerances.
+ - Class balance in classification tasks is within acceptable imbalance threshold (document the threshold and the chosen mitigation — oversampling, class weights, threshold tuning).
+ - No temporal leakage: all records in the validation set postdate all records in the training set for time-series problems.
+ - Feature null rates are within threshold per column. Schema types match declared types.
+ These assertions run as a pre-training gate, not as an afterthought during debugging.
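A minimal sketch of such a pre-training gate, stdlib only; the tolerances and thresholds are illustrative and should come from project config:

```python
from collections import Counter


def check_splits(n_train, n_val, n_test, target=(0.8, 0.1, 0.1), tol=0.02):
    # Split ratios must be within tolerance of the configured targets.
    total = n_train + n_val + n_test
    actual = (n_train / total, n_val / total, n_test / total)
    return all(abs(a - t) <= tol for a, t in zip(actual, target))


def check_class_balance(labels, max_ratio=10.0):
    # Majority/minority class ratio must stay under the documented bound.
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values()) <= max_ratio


def check_no_temporal_leakage(train_timestamps, val_timestamps):
    # Every validation record must postdate every training record.
    return max(train_timestamps) < min(val_timestamps)


def check_null_rate(column, threshold=0.05):
    nulls = sum(1 for v in column if v is None)
    return nulls / len(column) <= threshold


assert check_splits(800, 100, 100)
assert check_class_balance([0] * 90 + [1] * 10)        # 9:1 within 10:1
assert check_no_temporal_leakage([1, 2, 3], [4, 5])
assert check_null_rate([1, 2, None, 4], threshold=0.3)  # 25% nulls, allowed
```

Wiring these as assertions at the top of the training entrypoint (or as a pytest suite the pipeline runs first) is what makes them a gate rather than a debugging afterthought.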
+
+ ### Model Regression Tests
+ A model deployment must not silently degrade the evaluation metric baseline:
+ - Persist the baseline metric (accuracy, F1, AUC, RMSE, etc.) for the current production model in the model registry.
+ - On each candidate model: evaluate on the held-out test set; compare to baseline.
+ - Define the maximum acceptable regression (e.g., F1 must not drop more than 0.01 from baseline).
+ - A regression beyond the threshold is a failing CI gate; it blocks deployment unless an ADR explicitly accepts the tradeoff.
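The comparison step reduces to a one-line gate; `max_drop` mirrors the example threshold above and the metric values are illustrative:

```python
def regression_gate(candidate_f1, baseline_f1, max_drop=0.01):
    # CI gate: pass only when the candidate's F1 has not dropped more
    # than max_drop below the baseline persisted in the model registry.
    return baseline_f1 - candidate_f1 <= max_drop


assert regression_gate(0.912, 0.915)     # 0.003 drop: within tolerance
assert not regression_gate(0.89, 0.915)  # 0.025 drop: failing gate
```

For metrics where lower is better (e.g. RMSE), the sign of the comparison flips, so the gate function is best parameterized per metric in a real pipeline.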
+
+ ### Inference Latency Tests
+ Model serving has an SLA. Test it:
+ - At staging: send representative input at the defined peak request rate.
+ - Assert p95 latency ≤ SLA threshold (document the threshold).
+ - Assert throughput ≥ minimum requests/second.
+ - Assert memory footprint under sustained load stays within spec.
+ Latency tests run post-deploy to staging, not pre-deploy. They are a promotion gate, not a build gate.
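The p95 assertion might look like the following sketch; the sample latencies and the SLA value are hypothetical, and a nearest-rank percentile is good enough for a gate check:

```python
import math


def percentile(samples, pct):
    # Nearest-rank percentile: the smallest sample such that at least
    # pct percent of all samples are <= it.
    ordered = sorted(samples)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


# Latencies (ms) collected from a staging load run at peak request rate.
latencies_ms = [12] * 10 + [20] * 8 + [45, 120]
SLA_P95_MS = 50  # hypothetical documented threshold

p95 = percentile(latencies_ms, 95)
assert p95 <= SLA_P95_MS, f"p95 {p95}ms breaches the {SLA_P95_MS}ms SLA"
```

Note the single 120 ms outlier does not fail the gate: p95 deliberately tolerates a small tail, which is why the SLA should also document which percentile it covers.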
+
+ ### Adversarial Input Tests
+ Models fail in unexpected ways on distribution-edge inputs:
+ - **Out-of-distribution inputs**: feature values well outside training range; entirely unseen categories.
+ - **Corrupted inputs**: nulls in required fields, wrong types, empty sequences, extreme numeric values.
+ - **Edge-case inputs**: minimum/maximum sequence length, single-token inputs, very long inputs if applicable.
+ Assert: the model does not crash; output is within plausible range; uncertainty (if modeled) is high for OOD inputs.
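The assertion pattern can be sketched with a stand-in model; in a real suite `predict_proba` would be your loaded model or serving endpoint, and the `age` feature is purely illustrative:

```python
def predict_proba(features):
    # Stand-in model: handles nulls and clamps inputs so it always
    # returns a probability. A real test calls the actual model.
    x = 0.0 if features.get("age") is None else float(features["age"])
    x = min(max(x, 0.0), 120.0)  # clamp to a plausible range
    return x / 120.0


cases = [
    {"age": None},  # corrupted: null in a required field
    {"age": -5},    # out of distribution: below training range
    {"age": 1e9},   # extreme numeric value
]
for case in cases:
    p = predict_proba(case)  # must not raise
    assert 0.0 <= p <= 1.0   # output stays within plausible range
```

The test's value is in the case list: it encodes the three bullet categories above as concrete inputs the model must survive.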
+
+ ### Bias / Fairness Assertions
+ Where the model's output affects people and protected attributes are present in the data:
+ - Define the protected attributes and the fairness metric (demographic parity, equalized odds, calibration by group).
+ - Evaluate the fairness metric on the test set per protected-attribute group.
+ - Assert the fairness gap is within the documented acceptable threshold.
+ - If no protected attributes are present but the output has downstream human impact, document this determination and its rationale in an ADR.
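For demographic parity, the per-group evaluation and gap assertion can be sketched as follows; the group labels, predictions, and `MAX_GAP` threshold are hypothetical:

```python
from collections import defaultdict


def positive_rates(predictions, groups):
    # Per-group positive-prediction rate, the ingredient of the
    # demographic parity gap.
    totals, positives = defaultdict(int), defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += pred
    return {g: positives[g] / totals[g] for g in totals}


preds  = [1, 0, 1, 1, 0, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
rates = positive_rates(preds, groups)
gap = max(rates.values()) - min(rates.values())
MAX_GAP = 0.5  # hypothetical documented threshold
assert gap <= MAX_GAP, f"demographic parity gap {gap:.2f} exceeds {MAX_GAP}"
```

Equalized odds and per-group calibration follow the same shape but partition by true label as well; libraries such as fairlearn provide these metrics ready-made if the project prefers not to hand-roll them.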