@cubis/foundry 0.3.10 → 0.3.11
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/Ai Agent Workflow/powers/database-skills/POWER.md +15 -2
- package/Ai Agent Workflow/powers/database-skills/SKILL.md +26 -2
- package/Ai Agent Workflow/powers/database-skills/engines/mongodb/POWER.md +10 -0
- package/Ai Agent Workflow/powers/database-skills/engines/mysql/POWER.md +10 -0
- package/Ai Agent Workflow/powers/database-skills/engines/neki/POWER.md +10 -0
- package/Ai Agent Workflow/powers/database-skills/engines/postgres/POWER.md +10 -0
- package/Ai Agent Workflow/powers/database-skills/engines/redis/POWER.md +10 -0
- package/Ai Agent Workflow/powers/database-skills/engines/sqlite/POWER.md +10 -0
- package/Ai Agent Workflow/powers/database-skills/engines/supabase/POWER.md +10 -0
- package/Ai Agent Workflow/powers/database-skills/engines/vitess/POWER.md +10 -0
- package/Ai Agent Workflow/powers/database-skills/steering/readme.md +18 -6
- package/Ai Agent Workflow/skills/database-skills/LATEST_VERSIONS.md +36 -0
- package/Ai Agent Workflow/skills/database-skills/README.md +11 -2
- package/Ai Agent Workflow/skills/database-skills/SKILL.md +85 -20
- package/Ai Agent Workflow/skills/database-skills/skills/mongodb/SKILL.md +29 -7
- package/Ai Agent Workflow/skills/database-skills/skills/mongodb/references/aggregation.md +153 -0
- package/Ai Agent Workflow/skills/database-skills/skills/mongodb/references/modeling.md +95 -4
- package/Ai Agent Workflow/skills/database-skills/skills/mongodb/references/mongoose-nestjs.md +133 -4
- package/Ai Agent Workflow/skills/database-skills/skills/mysql/SKILL.md +33 -7
- package/Ai Agent Workflow/skills/database-skills/skills/mysql/references/locking-ddl.md +103 -4
- package/Ai Agent Workflow/skills/database-skills/skills/mysql/references/query-indexing.md +103 -4
- package/Ai Agent Workflow/skills/database-skills/skills/mysql/references/replication.md +142 -0
- package/Ai Agent Workflow/skills/database-skills/skills/neki/SKILL.md +18 -7
- package/Ai Agent Workflow/skills/database-skills/skills/neki/references/architecture.md +135 -4
- package/Ai Agent Workflow/skills/database-skills/skills/neki/references/operations.md +76 -4
- package/Ai Agent Workflow/skills/database-skills/skills/postgres/SKILL.md +31 -7
- package/Ai Agent Workflow/skills/database-skills/skills/postgres/references/connection-pooling.md +142 -0
- package/Ai Agent Workflow/skills/database-skills/skills/postgres/references/migrations.md +126 -0
- package/Ai Agent Workflow/skills/database-skills/skills/postgres/references/performance-ops.md +116 -4
- package/Ai Agent Workflow/skills/database-skills/skills/postgres/references/schema-indexing.md +78 -4
- package/Ai Agent Workflow/skills/database-skills/skills/redis/SKILL.md +28 -7
- package/Ai Agent Workflow/skills/database-skills/skills/redis/references/cache-patterns.md +153 -4
- package/Ai Agent Workflow/skills/database-skills/skills/redis/references/data-modeling.md +152 -0
- package/Ai Agent Workflow/skills/database-skills/skills/redis/references/operations.md +143 -4
- package/Ai Agent Workflow/skills/database-skills/skills/sqlite/SKILL.md +28 -7
- package/Ai Agent Workflow/skills/database-skills/skills/sqlite/references/local-first.md +94 -4
- package/Ai Agent Workflow/skills/database-skills/skills/sqlite/references/performance.md +104 -4
- package/Ai Agent Workflow/skills/database-skills/skills/supabase/SKILL.md +27 -7
- package/Ai Agent Workflow/skills/database-skills/skills/supabase/references/performance-operations.md +94 -4
- package/Ai Agent Workflow/skills/database-skills/skills/supabase/references/rls-auth.md +105 -4
- package/Ai Agent Workflow/skills/database-skills/skills/vitess/SKILL.md +27 -7
- package/Ai Agent Workflow/skills/database-skills/skills/vitess/references/operational-safety.md +104 -4
- package/Ai Agent Workflow/skills/database-skills/skills/vitess/references/sharding-routing.md +124 -4
- package/Ai Agent Workflow/workflows/agent-environment-setup/platforms/antigravity/agents/backend-specialist.md +1 -1
- package/Ai Agent Workflow/workflows/agent-environment-setup/platforms/antigravity/agents/database-architect.md +8 -1
- package/Ai Agent Workflow/workflows/agent-environment-setup/platforms/antigravity/agents/performance-optimizer.md +2 -0
- package/Ai Agent Workflow/workflows/agent-environment-setup/platforms/antigravity/workflows/database.md +11 -6
- package/Ai Agent Workflow/workflows/agent-environment-setup/platforms/codex/agents/backend-specialist.md +1 -1
- package/Ai Agent Workflow/workflows/agent-environment-setup/platforms/codex/agents/database-architect.md +8 -1
- package/Ai Agent Workflow/workflows/agent-environment-setup/platforms/codex/agents/performance-optimizer.md +2 -0
- package/Ai Agent Workflow/workflows/agent-environment-setup/platforms/codex/workflows/database.md +11 -6
- package/Ai Agent Workflow/workflows/agent-environment-setup/platforms/copilot/agents/backend-specialist.md +1 -1
- package/Ai Agent Workflow/workflows/agent-environment-setup/platforms/copilot/agents/database-architect.md +8 -1
- package/Ai Agent Workflow/workflows/agent-environment-setup/platforms/copilot/agents/performance-optimizer.md +2 -0
- package/Ai Agent Workflow/workflows/agent-environment-setup/platforms/copilot/workflows/database.md +11 -6
- package/package.json +1 -1
package/Ai Agent Workflow/skills/database-skills/skills/mongodb/references/mongoose-nestjs.md
CHANGED
@@ -1,5 +1,134 @@
-# Mongoose
+# MongoDB — Mongoose and NestJS Patterns
 
-
-
-
+## Repository pattern with Mongoose + NestJS
+
+Keep data access behind a repository class — don't scatter `Model.find()` calls across services.
+
+```ts
+// order.repository.ts
+@Injectable()
+export class OrderRepository {
+  constructor(@InjectModel(Order.name) private model: Model<Order>) {}
+
+  async findByUser(userId: string, limit = 20, afterId?: string): Promise<Order[]> {
+    const query: FilterQuery<Order> = { userId };
+    if (afterId) query._id = { $gt: new Types.ObjectId(afterId) };
+    return this.model
+      .find(query)
+      .sort({ _id: 1 })
+      .limit(limit)
+      .lean() // return plain JS objects — skip Mongoose hydration overhead
+      .exec();
+  }
+
+  async create(dto: CreateOrderDto): Promise<Order> {
+    return this.model.create(dto);
+  }
+}
+```
+
+## Schema index definition
+
+Define indexes on the schema, not as ad-hoc calls. This makes them part of your codebase and reviewable.
+
+```ts
+@Schema({ timestamps: true })
+export class Order {
+  @Prop({ required: true, index: true })
+  userId: string;
+
+  @Prop({ required: true })
+  status: string;
+
+  @Prop({ type: [{ sku: String, qty: Number, price: Number }] })
+  lineItems: LineItem[];
+}
+
+// Compound index — define at schema level
+OrderSchema.index({ userId: 1, status: 1, createdAt: -1 });
+
+// TTL index — auto-delete after 90 days
+OrderSchema.index({ createdAt: 1 }, { expireAfterSeconds: 60 * 60 * 24 * 90 });
+```
+
+Always track index definitions in migration scripts when adding to existing collections.
+
+## Lean reads and projection
+
+- `lean()` returns plain JS objects instead of Mongoose Document instances — no hydration overhead, no change tracking. Use it for read paths.
+- Always project only the fields you need to reduce transfer size.
+
+```ts
+// Lean + projection for list endpoints
+this.model
+  .find({ userId })
+  .select('status createdAt total') // project only needed fields
+  .lean()
+  .exec();
+
+// Full document with Mongoose methods only when saving/updating
+const doc = await this.model.findById(id).exec();
+doc.status = 'complete';
+await doc.save();
+```
+
+## Transactions (MongoDB 4.0+ replica set or sharded cluster)
+
+```ts
+const session = await this.connection.startSession();
+session.startTransaction();
+try {
+  await this.orderModel.create([orderData], { session });
+  await this.inventoryModel.updateOne(
+    { sku: orderData.sku },
+    { $inc: { qty: -1 } },
+    { session }
+  );
+  await session.commitTransaction();
+} catch (e) {
+  await session.abortTransaction();
+  throw e;
+} finally {
+  session.endSession();
+}
+```
+
+Transactions are only needed for multi-document atomicity. Single-document operations are always atomic in MongoDB.
+
+## Aggregation pipeline in NestJS
+
+```ts
+const result = await this.model.aggregate([
+  { $match: { userId, status: 'complete' } },
+  { $group: { _id: '$category', total: { $sum: '$amount' } } },
+  { $sort: { total: -1 } },
+  { $limit: 10 },
+]);
+```
+
+Use `.aggregate()` for reporting/analytics. For regular queries, prefer `.find()` so Mongoose can apply schema type casting.
+
+## Connection and pool setup (NestJS module)
+
+```ts
+MongooseModule.forRoot(uri, {
+  maxPoolSize: 10, // driver default is 100 — tune to app concurrency
+  serverSelectionTimeoutMS: 5000,
+  socketTimeoutMS: 45000,
+  connectTimeoutMS: 10000,
+})
+```
+
+## Common mistakes
+
+- Calling `Model.find()` directly in a service/controller — bypasses the repository, untestable.
+- Forgetting `.lean()` on list endpoints — returns Mongoose Documents with full overhead.
+- Defining compound indexes ad-hoc in `onModuleInit` — use schema-level definition instead.
+- Not projecting fields on list queries — transfers full documents when only 3 fields are needed.
+- Using `new Model(data).save()` in a loop — batch with `Model.insertMany()` instead.
+
+## Sources
+- Mongoose documentation: https://mongoosejs.com/docs/
+- MongoDB index strategies: https://www.mongodb.com/docs/manual/indexes/
+- MongoDB data modeling: https://www.mongodb.com/docs/manual/data-modeling/
+- MongoDB transactions: https://www.mongodb.com/docs/manual/core/transactions/
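Editor's note: the "batch with `Model.insertMany()`" rule in this file's Common mistakes list can be sketched as a small chunking helper. This is an illustrative sketch, not code from the package; `bulkInsert` and its minimal model shape are hypothetical names standing in for a Mongoose model.

```typescript
// Split an array into fixed-size batches.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Instead of `for (const d of docs) await new Model(d).save()` (one round trip
// per document), send one insertMany per batch of 500.
async function bulkInsert(
  model: { insertMany(docs: object[]): Promise<unknown> },
  docs: object[],
): Promise<void> {
  for (const batch of chunk(docs, 500)) {
    await model.insertMany(batch); // one round trip per 500 docs
  }
}
```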
package/Ai Agent Workflow/skills/database-skills/skills/mysql/SKILL.md
CHANGED
@@ -1,15 +1,41 @@
 ---
 name: mysql
-description: MySQL/InnoDB schema design, indexing, query tuning, and operational safety.
+description: MySQL/InnoDB schema design, indexing, pagination, query tuning, and operational safety.
 ---
 
 # MySQL
 
-
+## Version posture
+
+- Prefer **8.4 LTS** for long-lived production stability.
+- Use **9.x Innovation** only when you need the newest features and can absorb a faster change cadence.
+
+## Optimization workflow
+
+1. Baseline with `EXPLAIN` and `EXPLAIN ANALYZE`.
+2. Tune indexes around dominant filter and sort paths.
+3. Validate the pagination path (`ORDER BY` + index coverage).
+4. Evaluate DDL lock/replication impact before migration.
+
+## Indexing techniques
+
+- Composite indexes that match predicate and ordering direction.
+- Covering indexes for hot read endpoints.
+- Keep the clustered primary key narrow to reduce secondary index overhead.
+- Avoid shotgun indexing; measure write amplification impact.
+
+## Pagination techniques
+
+- Prefer seek/keyset pagination with deterministic ordering.
+- Include a unique tie-breaker for stable page boundaries.
+- Avoid large-offset pagination for deep traversal.
+
+## Operational guardrails
+
+- Treat online DDL mode and lock behavior as explicit rollout risks.
+- Test DDL on production-like data volume and replica topology.
+
+## References
+
 - `references/query-indexing.md`
 - `references/locking-ddl.md`
-
-Key rules:
-- Use `EXPLAIN` before optimization.
-- Prefer online-safe schema change plans.
-- Track lock waits and deadlocks during rollout.
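Editor's note: the seek/keyset pagination bullets above can be sketched as a small query builder that produces the SQL and parameter list for a `(created_at, id)` cursor. This is an illustrative sketch, not package code; the `?` placeholder style assumes a mysql2-like driver, and `Cursor`/`keysetPage` are hypothetical names.

```typescript
interface Cursor {
  createdAt: string; // last row's created_at, e.g. '2025-01-01 00:00:00'
  id: number;        // unique tie-breaker for stable page boundaries
}

// Build a keyset-paginated query: first page when cursor is null,
// otherwise a row-value comparison against the last seen (created_at, id).
function keysetPage(cursor: Cursor | null, limit: number): { sql: string; params: (string | number)[] } {
  const base = 'SELECT id, created_at, status FROM orders';
  const order = 'ORDER BY created_at DESC, id DESC LIMIT ?';
  if (!cursor) {
    return { sql: `${base} ${order}`, params: [limit] };
  }
  // Deterministic ordering + unique tie-breaker avoids skipped/duplicated rows
  return {
    sql: `${base} WHERE (created_at, id) < (?, ?) ${order}`,
    params: [cursor.createdAt, cursor.id, limit],
  };
}
```

An index on `(created_at, id)` (descending scan) keeps each page an index range read regardless of how deep the traversal goes, unlike `OFFSET`.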
package/Ai Agent Workflow/skills/database-skills/skills/mysql/references/locking-ddl.md
CHANGED
@@ -1,5 +1,104 @@
-# MySQL Locking and DDL
+# MySQL — Locking and DDL Safety
 
-
-
-
+## Online DDL algorithms
+
+MySQL InnoDB can perform many DDL operations without blocking reads/writes. Always check the algorithm before running in production.
+
+| Algorithm | Write impact | When used |
+| --- | --- | --- |
+| `INSTANT` | None | Adding nullable columns at end (MySQL 8.0+), some metadata-only changes |
+| `INPLACE` | No copy; may block briefly at start/end | Most index adds, some column modifications |
+| `COPY` | Full table rewrite; blocks writes for the duration | Changing the primary key, column type changes, some charset changes |
+
+Check before applying:
+```sql
+ALTER TABLE orders ADD COLUMN notes TEXT, ALGORITHM=INPLACE, LOCK=NONE;
+-- If MySQL rejects it, it needs COPY → use pt-online-schema-change or gh-ost
+```
+
+Stating the algorithm makes MySQL fail fast instead of silently falling back:
+```sql
+-- Errors immediately if INPLACE isn't possible (note: if it is possible, the DDL runs)
+ALTER TABLE orders ADD INDEX idx_test (status), ALGORITHM=INPLACE, LOCK=NONE;
+```
+
+## Metadata lock (MDL) exposure
+
+DDL acquires a Metadata Lock on the table. Even an `INSTANT` or `INPLACE` operation blocks if a long-running transaction or idle connection holds a conflicting MDL.
+
+```sql
+-- Check for metadata lock waits before running DDL (sys schema, MySQL 8.0+)
+SELECT * FROM sys.schema_table_lock_waits\G
+
+-- Or inspect metadata locks directly
+-- (requires the 'wait/lock/metadata/sql/mdl' instrument to be enabled)
+SELECT object_schema, object_name, lock_type, lock_status
+FROM performance_schema.metadata_locks
+WHERE object_type = 'TABLE';
+
+-- Also check for long-running active transactions
+SELECT * FROM information_schema.innodb_trx WHERE trx_started < NOW() - INTERVAL 30 SECOND;
+```
+
+Kill blockers with caution before DDL:
+```sql
+KILL <thread_id>; -- kills the connection, rolls back its transaction
+```
+
+## Replication lag impact
+
+- `COPY` algorithm: the full table rewrite flows through the binary log as row events — the replica must replay every row.
+- `INPLACE` lock-free DDL: usually light on replicas.
+- Monitor `Seconds_Behind_Master` / `Seconds_Behind_Source` during DDL.
+
+```sql
+-- On replica
+SHOW REPLICA STATUS\G
+-- Watch: Seconds_Behind_Source
+```
+
+## Online schema change tools
+
+For tables too large or busy for native online DDL:
+- **gh-ost** (GitHub): uses binlog streaming, minimal impact, best for production.
+- **pt-online-schema-change** (Percona): trigger-based, established tooling.
+
+Both create a shadow table, migrate data in the background, then atomically cut over with a brief lock.
+
+## InnoDB row-level locking
+
+- InnoDB locks rows, not tables (except DDL).
+- `SELECT ... FOR UPDATE` takes exclusive row locks — keep the duration short.
+- `REPEATABLE READ` (the default) uses **gap locks** to prevent phantom reads; it causes more lock contention than `READ COMMITTED`.
+- Switch to `READ COMMITTED` for high-contention OLTP workloads:
+```sql
+SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
+```
+
+## Deadlock handling
+
+```sql
+-- Show last deadlock detail
+SHOW ENGINE INNODB STATUS\G -- search for LATEST DETECTED DEADLOCK
+
+-- Enable deadlock logging
+SET GLOBAL innodb_print_all_deadlocks = ON;
+```
+
+Prevention:
+- Always access rows in a consistent order across transactions.
+- Keep transactions short — do I/O and computation outside the transaction boundary.
+- Retry with exponential backoff on error 1213 (`ER_LOCK_DEADLOCK`).
+
+## MySQL release tracks
+
+| Track | Description |
+| --- | --- |
+| **LTS** (e.g. 8.4, 9.7+) | Long-term support; production recommended |
+| **Innovation** (8.1, 8.2, etc.) | Frequent releases with new features, shorter support window |
+
+Check which features are available for your version before proposing DDL changes.
+
+## Sources
+- Online DDL operations: https://dev.mysql.com/doc/refman/8.4/en/innodb-online-ddl-operations.html
+- InnoDB locking: https://dev.mysql.com/doc/refman/8.4/en/innodb-locking.html
+- MySQL release tracks: https://dev.mysql.com/doc/refman/8.4/en/mysql-releases.html
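Editor's note: the "retry with exponential backoff on error 1213" guidance above can be sketched as a small wrapper. This is an illustrative sketch, not package code; the `errno` field shape assumes a mysql2-style error object, and `withDeadlockRetry` is a hypothetical name. The sleep function is injectable so the backoff can be tested without real delays.

```typescript
// Retry an operation when it fails with ER_LOCK_DEADLOCK (errno 1213),
// backing off exponentially; rethrow any other error immediately.
async function withDeadlockRetry<T>(
  op: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 50,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((r) => setTimeout(r, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await op();
    } catch (e) {
      const errno = (e as { errno?: number }).errno;
      // Only deadlocks are safely retryable here; InnoDB already rolled back
      if (errno !== 1213 || attempt >= maxRetries) throw e;
      await sleep(baseDelayMs * 2 ** attempt); // 50ms, 100ms, 200ms, ...
    }
  }
}
```

Keep the retried operation idempotent (or wrap the whole transaction), since the deadlocked transaction was rolled back before the retry.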
package/Ai Agent Workflow/skills/database-skills/skills/mysql/references/query-indexing.md
CHANGED
@@ -1,5 +1,104 @@
-# MySQL Query and Indexing
+# MySQL — Query Optimization and Indexing
 
-
-
-
+## EXPLAIN workflow
+
+Always run `EXPLAIN` (or `EXPLAIN ANALYZE`) before and after index changes.
+
+```sql
+EXPLAIN SELECT * FROM orders WHERE user_id = 42 ORDER BY created_at DESC LIMIT 20;
+EXPLAIN ANALYZE SELECT ...; -- MySQL 8.0+: shows actual row counts and timing
+```
+
+Key columns to read:
+
+| Column | Red flag |
+| --- | --- |
+| `type` | `ALL` = full table scan. Target: `ref`, `eq_ref`, `range`, or `const`. |
+| `Extra` | `Using filesort` = no index satisfying ORDER BY. `Using temporary` = costly in-memory or on-disk sort. |
+| `rows` | Estimated rows examined. Should be close to rows returned. |
+| `key` | Which index MySQL chose. `NULL` = no index used. |
+
+## Composite index design (leftmost prefix rule)
+
+- Column order is critical: **equality predicates first, then sort columns, then range predicates**.
+- The planner can use any leading prefix of the index — columns after a range predicate stop being used.
+- **Good**: `(status, user_id, created_at)` for `WHERE status = 'open' AND user_id = 42 ORDER BY created_at`.
+- **Bad**: `(created_at, status)` for `WHERE status = 'open'` — the planner must scan the whole index.
+
+```sql
+-- Supports: WHERE status = ? AND user_id = ? ORDER BY created_at
+-- Also supports any leftmost prefix, e.g. WHERE status = ?
+CREATE INDEX idx_orders_status_user_created ON orders (status, user_id, created_at);
+```
+
+## Covering indexes
+
+Include all selected columns in the index to avoid the lookup back into the clustered index (an index-only read).
+
+```sql
+-- Query: SELECT status, total FROM orders WHERE user_id = 42
+-- MySQL has no INCLUDE clause — the composite key itself must cover all selected columns
+CREATE INDEX idx_orders_user_covering ON orders (user_id, status, total);
+```
+
+A range predicate stops later key columns from being used for filtering, but trailing columns still make the index covering — place frequently selected columns after the range column.
+
+## Seek (cursor) pagination — avoid OFFSET
+
+`OFFSET N` forces MySQL to scan and discard N rows. On large tables this is catastrophically slow.
+
+```sql
+-- BAD: OFFSET pagination
+SELECT * FROM orders ORDER BY id LIMIT 20 OFFSET 10000;
+
+-- GOOD: seek pagination using the last seen ID
+SELECT * FROM orders WHERE id > :last_seen_id ORDER BY id LIMIT 20;
+
+-- For composite sort keys
+SELECT * FROM orders
+WHERE (created_at, id) < (:last_created_at, :last_id)
+ORDER BY created_at DESC, id DESC
+LIMIT 20;
+```
+
+## Function calls on indexed columns break index usage
+
+```sql
+-- BAD: function call prevents index use
+SELECT * FROM users WHERE YEAR(created_at) = 2025;
+
+-- GOOD: range predicate preserves the index
+SELECT * FROM users WHERE created_at >= '2025-01-01' AND created_at < '2026-01-01';
+```
+
+## Index maintenance
+
+```sql
+-- Find unused indexes (after collecting stats for a while)
+SELECT object_schema, object_name, index_name, count_read
+FROM performance_schema.table_io_waits_summary_by_index_usage
+WHERE count_read = 0 AND object_schema NOT IN ('mysql', 'sys', 'information_schema')
+ORDER BY object_schema, object_name;
+
+-- Check index sizes
+SELECT table_name, index_name, stat_value * @@innodb_page_size / 1024 / 1024 AS size_mb
+FROM mysql.innodb_index_stats
+WHERE stat_name = 'size' AND database_name = DATABASE()
+ORDER BY size_mb DESC;
+```
+
+- Drop indexes with `count_read = 0` — every index adds write and lock overhead.
+- Use `ALTER TABLE ... ADD INDEX ..., ALGORITHM=INPLACE, LOCK=NONE` for online index changes.
+
+## Key guardrails
+
+- Avoid a leading `%` in LIKE (`LIKE '%foo'`) — it can't use a B-tree index.
+- Avoid `OR` across different indexed columns — use `UNION ALL` instead.
+- Avoid implicit type coercion in `WHERE` (e.g., `WHERE varchar_col = 123`) — it breaks index usage.
+- Batch large INSERTs (500–5000 rows per statement) to reduce per-statement overhead.
+
+## Sources
+- Using EXPLAIN: https://dev.mysql.com/doc/refman/8.4/en/using-explain.html
+- Optimization and indexes: https://dev.mysql.com/doc/refman/8.4/en/optimization-indexes.html
+- LIMIT/OFFSET optimization: https://dev.mysql.com/doc/refman/8.4/en/limit-optimization.html
+- performance_schema index stats: https://dev.mysql.com/doc/refman/8.4/en/table-io-waits-summary-by-index-usage-table.html
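Editor's note: the "function calls on indexed columns" fix above can be sketched as a tiny helper that computes the sargable half-open `[start, end)` range for a year, so the predicate stays a plain range the index can use. This is an illustrative sketch, not package code; `yearRange` is a hypothetical name, and the placeholder comment assumes a mysql2-style driver.

```typescript
// Compute the half-open date range equivalent to YEAR(created_at) = year,
// so the query can use a range predicate on the indexed column instead.
function yearRange(year: number): { start: string; end: string } {
  return { start: `${year}-01-01`, end: `${year + 1}-01-01` };
}

// Usage with a parameterized query (mysql2-style placeholders assumed):
// const { start, end } = yearRange(2025);
// pool.query('SELECT * FROM users WHERE created_at >= ? AND created_at < ?', [start, end]);
```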
package/Ai Agent Workflow/skills/database-skills/skills/mysql/references/replication.md
ADDED
@@ -0,0 +1,142 @@
+# MySQL — Replication
+
+## Replication basics
+
+MySQL replication streams changes from a **source** (primary) to one or more **replicas** (secondaries) using the **binary log (binlog)**.
+
+Common setups:
+- **Single primary + read replicas**: route writes to the primary, reads to replicas.
+- **Group Replication / InnoDB Cluster**: single- or multi-primary with automatic failover.
+- **Semi-sync**: the primary waits for at least one replica to acknowledge before commit.
+
+## Binary log formats
+
+| Format | What it logs | Use for |
+| --- | --- | --- |
+| `ROW` (recommended) | Actual changed rows | Best consistency — exact row deltas |
+| `STATEMENT` | SQL statements | Smaller binlog size, but non-deterministic functions (`NOW()`, `UUID()`) are unsafe |
+| `MIXED` | Statement by default, row for unsafe statements | Compromise |
+
+```sql
+-- Check current format
+SHOW VARIABLES LIKE 'binlog_format';
+
+-- Set to ROW (recommended for most setups)
+SET GLOBAL binlog_format = 'ROW';
+```
+
+## Check replication health
+
+```sql
+-- On replica
+SHOW REPLICA STATUS\G
+
+-- Key fields to monitor:
+-- Seconds_Behind_Source → replication lag in seconds (0 = caught up)
+-- Replica_SQL_Running → must be YES
+-- Replica_IO_Running → must be YES
+-- Last_SQL_Error → empty = no error
+-- Last_IO_Error → empty = no error
+```
+
+Alert when `Seconds_Behind_Source > N`, where N depends on your acceptable staleness (typically < 30s for OLTP).
+
+## GTID-based replication (MySQL 5.6+, recommended)
+
+A GTID (Global Transaction Identifier) gives every committed transaction a unique ID. This enables:
+- Automatic failover without manually computing binlog coordinates.
+- Easier replica promotion.
+
+```ini
+# my.cnf on the source and all replicas
+gtid_mode = ON
+enforce_gtid_consistency = ON
+```
+
+```sql
+-- Check the executed GTID set on a replica
+SHOW GLOBAL VARIABLES LIKE 'gtid_executed';
+-- Should match the source's gtid_executed when fully caught up
+```
+
+## Read replica routing
+
+Route reads to replicas only for **eventually consistent** reads — data on the replica may be seconds behind the source.
+
+```ts
+// Example: separate pools per role
+const writePool = createPool({ host: PRIMARY_HOST });
+const readPool = createPool({ host: REPLICA_HOST });
+
+// Writes always go to the primary
+await writePool.query('INSERT INTO orders ...');
+
+// Reads that can tolerate slight lag
+const orders = await readPool.query('SELECT * FROM orders WHERE ...');
+```
+
+**Never** route reads to a replica for:
+- Reading immediately after a write in the same request ("read your own writes").
+- Reads that feed a write depending on current state (check-and-set patterns).
+
+## Replication lag and DDL impact
+
+DDL with the `COPY` algorithm generates a full table rewrite in the binlog — the replica must replay every row. This causes massive lag on large tables.
+
+Best practices:
+- Use `ALGORITHM=INPLACE, LOCK=NONE` for all DDL when possible.
+- Schedule large `COPY`-algorithm DDL during off-peak hours.
+- Monitor `Seconds_Behind_Source` during DDL and pause if lag grows.
+- Consider gh-ost or pt-online-schema-change for zero-downtime DDL on replicas.
+
+## Binlog retention
+
+```sql
+-- How long binlogs are kept
+SHOW VARIABLES LIKE 'binlog_expire_logs_seconds'; -- MySQL 8.0+ (seconds)
+SHOW VARIABLES LIKE 'expire_logs_days';           -- MySQL 5.7 (days)
+
+-- Set retention (in seconds, MySQL 8.0+)
+SET GLOBAL binlog_expire_logs_seconds = 604800; -- 7 days
+```
+
+Keep binlogs long enough to:
+- Recover from a replica rebuild without a full dump.
+- Support point-in-time recovery.
+- Feed CDC (change data capture) consumers.
+
+## Semi-synchronous replication
+
+Prevents data loss on a primary crash at the cost of slightly higher write latency.
+
+```sql
+-- Install and enable on the source
+INSTALL PLUGIN rpl_semi_sync_source SONAME 'semisync_source.so';
+SET GLOBAL rpl_semi_sync_source_enabled = 1;
+
+-- Install and enable on the replica
+INSTALL PLUGIN rpl_semi_sync_replica SONAME 'semisync_replica.so';
+SET GLOBAL rpl_semi_sync_replica_enabled = 1;
+```
+
+With semi-sync, the source waits for at least one replica ACK per commit. If no replica ACKs within `rpl_semi_sync_source_timeout` ms, it falls back to async automatically.
+
+## Monitoring replication in production
+
+```sql
+-- Source: check active replica connections
+SHOW PROCESSLIST; -- look for "Waiting for semi-sync ACK" or "Binlog Dump"
+
+-- Replica: approximate applier lag (timestamps read as zero when the applier is idle)
+SELECT CHANNEL_NAME, TIMESTAMPDIFF(SECOND, APPLYING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP, NOW()) AS approx_lag_s
+FROM performance_schema.replication_applier_status_by_worker;
+
+-- Source binlog position
+SHOW BINARY LOG STATUS\G -- MySQL 8.2+; SHOW MASTER STATUS on older versions
+-- File, Position — use for replica setup without GTID
+```
+
+## Sources
+- Replication overview: https://dev.mysql.com/doc/refman/8.4/en/replication.html
+- GTID-based replication: https://dev.mysql.com/doc/refman/8.4/en/replication-gtids.html
+- Semi-sync replication: https://dev.mysql.com/doc/refman/8.4/en/replication-semisync.html
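Editor's note: the "read your own writes" routing rule above can be sketched as a session-pinning router — after a write, the session's reads are pinned to the primary for a grace window longer than typical replica lag. This is an illustrative sketch, not package code; `ReplicaRouter` is a hypothetical name, and the clock is injectable for testability.

```typescript
// Pin a session's reads to the primary for pinMs after each write, so a
// request that just wrote never reads stale data from a lagging replica.
class ReplicaRouter {
  private pinnedUntil = new Map<string, number>();

  constructor(private pinMs = 5000, private now: () => number = Date.now) {}

  recordWrite(sessionId: string): void {
    this.pinnedUntil.set(sessionId, this.now() + this.pinMs);
  }

  // 'primary' while pinned; otherwise eventually consistent reads may hit a replica
  targetFor(sessionId: string): 'primary' | 'replica' {
    const until = this.pinnedUntil.get(sessionId) ?? 0;
    return this.now() < until ? 'primary' : 'replica';
  }
}
```

Pick the pin window from observed `Seconds_Behind_Source`; a fixed window is a simple heuristic, while GTID-wait-based routing gives a stronger guarantee at higher cost.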
package/Ai Agent Workflow/skills/database-skills/skills/neki/SKILL.md
CHANGED
@@ -1,15 +1,26 @@
 ---
 name: neki
-description: Neki
+description: Neki planning guidance for sharded Postgres architecture decisions and operational guardrails.
 ---
 
 # Neki
 
-
+Neki is currently pre-GA (announced and under active development), so guidance is architecture-first and risk-aware.
+
+## Planning workflow
+
+1. Define the shard key, tenant locality, and cross-shard boundaries.
+2. Map query classes to expected shard-local or cross-shard paths.
+3. Define migration milestones and fallback checkpoints.
+4. Preserve a compatibility path with the current managed Postgres baseline.
+
+## Performance planning focus
+
+- Prioritize shard-local access for hot request paths.
+- Plan read/write amplification expectations early.
+- Avoid hard assumptions about undocumented internals.
+
+## References
+
 - `references/architecture.md`
 - `references/operations.md`
-
-Key rules:
-- Treat shard boundaries as primary architecture decisions.
-- Model tenant/data locality early.
-- Validate failover and maintenance behavior before production.