opencode-metis 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (156)
  1. package/README.md +140 -0
  2. package/dist/cli.cjs +63 -0
  3. package/dist/mcp-server.cjs +51 -0
  4. package/dist/plugin.cjs +4 -0
  5. package/dist/worker.cjs +224 -0
  6. package/opencode/agent/the-analyst/feature-prioritization.md +66 -0
  7. package/opencode/agent/the-analyst/market-research.md +77 -0
  8. package/opencode/agent/the-analyst/project-coordination.md +81 -0
  9. package/opencode/agent/the-analyst/requirements-analysis.md +77 -0
  10. package/opencode/agent/the-architect/compatibility-review.md +138 -0
  11. package/opencode/agent/the-architect/complexity-review.md +137 -0
  12. package/opencode/agent/the-architect/quality-review.md +67 -0
  13. package/opencode/agent/the-architect/security-review.md +127 -0
  14. package/opencode/agent/the-architect/system-architecture.md +119 -0
  15. package/opencode/agent/the-architect/system-documentation.md +83 -0
  16. package/opencode/agent/the-architect/technology-research.md +85 -0
  17. package/opencode/agent/the-chief.md +79 -0
  18. package/opencode/agent/the-designer/accessibility-implementation.md +101 -0
  19. package/opencode/agent/the-designer/design-foundation.md +74 -0
  20. package/opencode/agent/the-designer/interaction-architecture.md +75 -0
  21. package/opencode/agent/the-designer/user-research.md +70 -0
  22. package/opencode/agent/the-meta-agent.md +155 -0
  23. package/opencode/agent/the-platform-engineer/ci-cd-pipelines.md +109 -0
  24. package/opencode/agent/the-platform-engineer/containerization.md +106 -0
  25. package/opencode/agent/the-platform-engineer/data-architecture.md +81 -0
  26. package/opencode/agent/the-platform-engineer/dependency-review.md +144 -0
  27. package/opencode/agent/the-platform-engineer/deployment-automation.md +81 -0
  28. package/opencode/agent/the-platform-engineer/infrastructure-as-code.md +107 -0
  29. package/opencode/agent/the-platform-engineer/performance-tuning.md +82 -0
  30. package/opencode/agent/the-platform-engineer/pipeline-engineering.md +81 -0
  31. package/opencode/agent/the-platform-engineer/production-monitoring.md +105 -0
  32. package/opencode/agent/the-qa-engineer/exploratory-testing.md +66 -0
  33. package/opencode/agent/the-qa-engineer/performance-testing.md +81 -0
  34. package/opencode/agent/the-qa-engineer/quality-assurance.md +77 -0
  35. package/opencode/agent/the-qa-engineer/test-execution.md +66 -0
  36. package/opencode/agent/the-software-engineer/api-development.md +78 -0
  37. package/opencode/agent/the-software-engineer/component-development.md +79 -0
  38. package/opencode/agent/the-software-engineer/concurrency-review.md +141 -0
  39. package/opencode/agent/the-software-engineer/domain-modeling.md +66 -0
  40. package/opencode/agent/the-software-engineer/performance-optimization.md +113 -0
  41. package/opencode/command/analyze.md +149 -0
  42. package/opencode/command/constitution.md +178 -0
  43. package/opencode/command/debug.md +194 -0
  44. package/opencode/command/document.md +178 -0
  45. package/opencode/command/implement.md +225 -0
  46. package/opencode/command/refactor.md +207 -0
  47. package/opencode/command/review.md +229 -0
  48. package/opencode/command/simplify.md +267 -0
  49. package/opencode/command/specify.md +191 -0
  50. package/opencode/command/validate.md +224 -0
  51. package/opencode/skill/accessibility-design/SKILL.md +566 -0
  52. package/opencode/skill/accessibility-design/checklists/wcag-checklist.md +435 -0
  53. package/opencode/skill/agent-coordination/SKILL.md +224 -0
  54. package/opencode/skill/api-contract-design/SKILL.md +550 -0
  55. package/opencode/skill/api-contract-design/templates/graphql-schema-template.md +818 -0
  56. package/opencode/skill/api-contract-design/templates/rest-api-template.md +417 -0
  57. package/opencode/skill/architecture-design/SKILL.md +160 -0
  58. package/opencode/skill/architecture-design/examples/architecture-examples.md +170 -0
  59. package/opencode/skill/architecture-design/template.md +749 -0
  60. package/opencode/skill/architecture-design/validation.md +99 -0
  61. package/opencode/skill/architecture-selection/SKILL.md +522 -0
  62. package/opencode/skill/architecture-selection/examples/adrs/001-example-adr.md +71 -0
  63. package/opencode/skill/architecture-selection/examples/architecture-patterns.md +239 -0
  64. package/opencode/skill/bug-diagnosis/SKILL.md +235 -0
  65. package/opencode/skill/code-quality-review/SKILL.md +337 -0
  66. package/opencode/skill/code-quality-review/examples/anti-patterns.md +629 -0
  67. package/opencode/skill/code-quality-review/reference.md +322 -0
  68. package/opencode/skill/code-review/SKILL.md +363 -0
  69. package/opencode/skill/code-review/reference.md +450 -0
  70. package/opencode/skill/codebase-analysis/SKILL.md +139 -0
  71. package/opencode/skill/codebase-navigation/SKILL.md +227 -0
  72. package/opencode/skill/codebase-navigation/examples/exploration-patterns.md +263 -0
  73. package/opencode/skill/coding-conventions/SKILL.md +178 -0
  74. package/opencode/skill/coding-conventions/checklists/accessibility-checklist.md +176 -0
  75. package/opencode/skill/coding-conventions/checklists/performance-checklist.md +154 -0
  76. package/opencode/skill/coding-conventions/checklists/security-checklist.md +127 -0
  77. package/opencode/skill/constitution-validation/SKILL.md +315 -0
  78. package/opencode/skill/constitution-validation/examples/CONSTITUTION.md +202 -0
  79. package/opencode/skill/constitution-validation/reference/rule-patterns.md +328 -0
  80. package/opencode/skill/constitution-validation/template.md +115 -0
  81. package/opencode/skill/context-preservation/SKILL.md +445 -0
  82. package/opencode/skill/data-modeling/SKILL.md +385 -0
  83. package/opencode/skill/data-modeling/templates/schema-design-template.md +268 -0
  84. package/opencode/skill/deployment-pipeline-design/SKILL.md +579 -0
  85. package/opencode/skill/deployment-pipeline-design/templates/pipeline-template.md +633 -0
  86. package/opencode/skill/documentation-extraction/SKILL.md +259 -0
  87. package/opencode/skill/documentation-sync/SKILL.md +431 -0
  88. package/opencode/skill/domain-driven-design/SKILL.md +509 -0
  89. package/opencode/skill/domain-driven-design/examples/ddd-patterns.md +688 -0
  90. package/opencode/skill/domain-driven-design/reference.md +465 -0
  91. package/opencode/skill/drift-detection/SKILL.md +383 -0
  92. package/opencode/skill/drift-detection/reference.md +340 -0
  93. package/opencode/skill/error-recovery/SKILL.md +162 -0
  94. package/opencode/skill/error-recovery/examples/error-patterns.md +484 -0
  95. package/opencode/skill/feature-prioritization/SKILL.md +419 -0
  96. package/opencode/skill/feature-prioritization/examples/rice-template.md +139 -0
  97. package/opencode/skill/feature-prioritization/reference.md +256 -0
  98. package/opencode/skill/git-workflow/SKILL.md +453 -0
  99. package/opencode/skill/implementation-planning/SKILL.md +215 -0
  100. package/opencode/skill/implementation-planning/examples/phase-examples.md +217 -0
  101. package/opencode/skill/implementation-planning/template.md +220 -0
  102. package/opencode/skill/implementation-planning/validation.md +88 -0
  103. package/opencode/skill/implementation-verification/SKILL.md +272 -0
  104. package/opencode/skill/knowledge-capture/SKILL.md +265 -0
  105. package/opencode/skill/knowledge-capture/reference/knowledge-capture.md +402 -0
  106. package/opencode/skill/knowledge-capture/reference.md +444 -0
  107. package/opencode/skill/knowledge-capture/templates/domain-template.md +325 -0
  108. package/opencode/skill/knowledge-capture/templates/interface-template.md +255 -0
  109. package/opencode/skill/knowledge-capture/templates/pattern-template.md +144 -0
  110. package/opencode/skill/observability-design/SKILL.md +291 -0
  111. package/opencode/skill/observability-design/references/monitoring-patterns.md +461 -0
  112. package/opencode/skill/pattern-detection/SKILL.md +171 -0
  113. package/opencode/skill/pattern-detection/examples/common-patterns.md +359 -0
  114. package/opencode/skill/performance-analysis/SKILL.md +266 -0
  115. package/opencode/skill/performance-analysis/references/profiling-tools.md +499 -0
  116. package/opencode/skill/requirements-analysis/SKILL.md +139 -0
  117. package/opencode/skill/requirements-analysis/examples/good-prd.md +66 -0
  118. package/opencode/skill/requirements-analysis/template.md +177 -0
  119. package/opencode/skill/requirements-analysis/validation.md +69 -0
  120. package/opencode/skill/requirements-elicitation/SKILL.md +518 -0
  121. package/opencode/skill/requirements-elicitation/examples/interview-questions.md +226 -0
  122. package/opencode/skill/requirements-elicitation/examples/user-stories.md +414 -0
  123. package/opencode/skill/safe-refactoring/SKILL.md +312 -0
  124. package/opencode/skill/safe-refactoring/reference/code-smells.md +347 -0
  125. package/opencode/skill/security-assessment/SKILL.md +421 -0
  126. package/opencode/skill/security-assessment/checklists/security-review-checklist.md +285 -0
  127. package/opencode/skill/specification-management/SKILL.md +143 -0
  128. package/opencode/skill/specification-management/readme-template.md +32 -0
  129. package/opencode/skill/specification-management/reference.md +115 -0
  130. package/opencode/skill/specification-management/spec.py +229 -0
  131. package/opencode/skill/specification-validation/SKILL.md +397 -0
  132. package/opencode/skill/specification-validation/reference/3cs-framework.md +306 -0
  133. package/opencode/skill/specification-validation/reference/ambiguity-detection.md +132 -0
  134. package/opencode/skill/specification-validation/reference/constitution-validation.md +301 -0
  135. package/opencode/skill/specification-validation/reference/drift-detection.md +383 -0
  136. package/opencode/skill/task-delegation/SKILL.md +607 -0
  137. package/opencode/skill/task-delegation/examples/file-coordination.md +495 -0
  138. package/opencode/skill/task-delegation/examples/parallel-research.md +337 -0
  139. package/opencode/skill/task-delegation/examples/sequential-build.md +504 -0
  140. package/opencode/skill/task-delegation/reference.md +825 -0
  141. package/opencode/skill/tech-stack-detection/SKILL.md +89 -0
  142. package/opencode/skill/tech-stack-detection/references/framework-signatures.md +598 -0
  143. package/opencode/skill/technical-writing/SKILL.md +190 -0
  144. package/opencode/skill/technical-writing/templates/adr-template.md +205 -0
  145. package/opencode/skill/technical-writing/templates/system-doc-template.md +380 -0
  146. package/opencode/skill/test-design/SKILL.md +464 -0
  147. package/opencode/skill/test-design/examples/test-pyramid.md +724 -0
  148. package/opencode/skill/testing/SKILL.md +213 -0
  149. package/opencode/skill/testing/examples/test-pyramid.md +724 -0
  150. package/opencode/skill/user-insight-synthesis/SKILL.md +576 -0
  151. package/opencode/skill/user-insight-synthesis/templates/research-plan-template.md +217 -0
  152. package/opencode/skill/user-research/SKILL.md +508 -0
  153. package/opencode/skill/user-research/examples/interview-questions.md +265 -0
  154. package/opencode/skill/user-research/examples/personas.md +267 -0
  155. package/opencode/skill/vibe-security/SKILL.md +654 -0
  156. package/package.json +45 -0
@@ -0,0 +1,71 @@
+ # ADR-001: Use PostgreSQL as the Primary Datastore
+
+ ## Status
+
+ Accepted
+
+ ## Context
+
+ The platform needs a primary datastore for user accounts, orders, products, and inventory. The data is relational: orders reference users and products, inventory adjustments are tied to orders, and reporting requires joins across multiple entities.
+
+ The team evaluated options against the following constraints:
+
+ - ACID transactions are required. An order creation must atomically decrement inventory and record the order — partial failures are not acceptable.
+ - The schema is not yet fully stabilized. Early product development will produce migrations frequently.
+ - The team has strong SQL skills but limited experience with document or wide-column stores.
+ - The system is expected to serve up to 10,000 requests per minute within the first 18 months, based on growth projections.
+ - Hosting budget is approximately $500/month for infrastructure at launch.
+
+ Three options were evaluated: PostgreSQL, MongoDB, and DynamoDB.
+
+ ## Decision
+
+ Use PostgreSQL as the primary datastore, hosted on AWS RDS with a single read replica.
+
+ The decision is based on:
+
+ 1. **Transactional correctness.** PostgreSQL provides full ACID transactions across tables. The order creation workflow (insert order, decrement inventory, record payment intent) requires this. MongoDB offers multi-document transactions since 4.0 but they carry a performance penalty and require deliberate session management that the team is not familiar with. DynamoDB transactions are limited to 25 items and do not generalize well to the relational access patterns of this domain.
+
+ 2. **Team fluency.** All five developers on the current team have production PostgreSQL experience. MongoDB and DynamoDB would require a ramp-up period and introduce risk during the critical early launch phase.
+
+ 3. **Relational access patterns.** Reporting requirements include queries like "orders by user with product details and shipment status." These are natural SQL joins. Document stores require either embedding (which causes data duplication) or application-level joins (which push complexity into service code).
+
+ 4. **Operational familiarity.** RDS for PostgreSQL has predictable operational characteristics, automated backups, and point-in-time recovery. The team knows how to monitor, tune, and migrate it. DynamoDB's capacity planning model (provisioned vs. on-demand) and partition key selection require specialized knowledge to get right.
+
+ 5. **Cost.** An RDS `db.t3.medium` instance with a read replica comes to approximately $180/month. DynamoDB costs are harder to predict at unknown access volumes. MongoDB Atlas at comparable durability settings costs approximately $250/month.
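The atomic order workflow in point 1 can be sketched as follows. This is an illustrative sketch only: it uses Python's stdlib `sqlite3` so it runs anywhere, but the same single-transaction shape applies to PostgreSQL through any driver. Table and column names are hypothetical.

```python
import sqlite3

def create_order(conn, user_id, product_id, qty):
    """Atomically record an order and decrement inventory.

    Either both writes commit or neither does -- the partial-failure
    case the ADR rules out.
    """
    try:
        with conn:  # opens a transaction; commits on success, rolls back on error
            cur = conn.execute(
                "UPDATE inventory SET stock = stock - ? "
                "WHERE product_id = ? AND stock >= ?",
                (qty, product_id, qty),
            )
            if cur.rowcount == 0:
                raise ValueError("insufficient stock")
            conn.execute(
                "INSERT INTO orders (user_id, product_id, qty) VALUES (?, ?, ?)",
                (user_id, product_id, qty),
            )
        return True
    except ValueError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (product_id INTEGER PRIMARY KEY, stock INTEGER)")
conn.execute("CREATE TABLE orders (user_id INTEGER, product_id INTEGER, qty INTEGER)")
conn.execute("INSERT INTO inventory VALUES (1, 5)")

assert create_order(conn, user_id=42, product_id=1, qty=3) is True
assert create_order(conn, user_id=42, product_id=1, qty=3) is False  # only 2 left
stock = conn.execute("SELECT stock FROM inventory WHERE product_id = 1").fetchone()[0]
print(stock)  # 2 -- the failed order changed nothing
```

The failed second order leaves both tables untouched; with a document or item store lacking cross-record transactions, the same guarantee requires compensating logic in application code.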
+
+ ## Consequences
+
+ ### Positive
+
+ - Developers can write and review schema changes using standard SQL migrations (using golang-migrate).
+ - ACID transactions eliminate a class of consistency bugs that would require compensating transactions or saga patterns with other datastores.
+ - Rich query capabilities mean reporting can be done directly against the database without an ETL pipeline or separate analytics store for the first year.
+ - Read replica allows reporting queries to be routed off the primary, protecting transactional write performance.
+ - pgvector extension is available if the product roadmap adds semantic search or recommendation features.
+
+ ### Negative
+
+ - PostgreSQL is not horizontally scalable for writes. If write throughput exceeds what a single primary can handle, the migration path is complex: sharding, moving to CockroachDB, or extracting high-write domains into separate services with their own datastores.
+ - The schema is a shared artifact. All services that evolve from this monolith will need to coordinate schema migrations or extract their own databases — this is a known constraint of the modular monolith starting point.
+ - Full-text search capabilities are limited compared to a dedicated search engine. If product search becomes a core user experience feature, Elasticsearch or OpenSearch will need to be introduced.
+ - Column-oriented queries (large aggregations over historical data) will be slow at scale. A columnar store or materialized views will be needed for analytics beyond the first year.
+
+ ### Neutral
+
+ - PostgreSQL's connection-per-process model requires connection pooling (PgBouncer or RDS Proxy) before the application scales to more than ~100 concurrent application instances.
+ - JSON/JSONB columns are available for semi-structured data, but using them to avoid schema definition should be treated as a deliberate trade-off, not a default, since it sacrifices query performance and type safety.
+
+ ## Alternatives Considered
+
+ ### MongoDB Atlas
+
+ - Pros: Flexible schema useful during early product discovery; horizontal write scaling via sharding; native document model fits product catalog with variable attribute sets.
+ - Cons: Multi-document transactions add complexity to the order creation workflow; team has no production MongoDB experience; costs more at equivalent durability; joins require application-level code.
+ - Why rejected: The consistency requirements of the financial transaction workflow outweigh the schema flexibility benefits. The team would spend more time learning the operational model than building product.
+
+ ### DynamoDB
+
+ - Pros: Fully managed with no connection limits; seamlessly handles unpredictable traffic spikes; pay-per-use pricing works well for uncertain early workloads.
+ - Cons: Requires upfront access pattern design that is difficult to change later; no ad hoc queries; 25-item transaction limit does not fit the domain; requires DynamoDB-specific expertise for capacity planning and key design.
+ - Why rejected: The rigid access pattern requirement makes it unsuitable for a domain in early discovery. The cost of a wrong key design is a full table rebuild. PostgreSQL's flexibility is worth paying for at this stage.
@@ -0,0 +1,239 @@
+ # Architecture Patterns: Decision Guide
+
+ Practical catalog of architectural patterns with trade-offs. Use this alongside the pattern selection table in SKILL.md. Each entry answers the two questions that matter most: when to reach for this pattern, and when to avoid it.
+
+ ---
+
+ ## Monolith
+
+ A single deployable unit. All modules share the same process, memory space, and database.
+
+ **Reach for this when:**
+ - Team is under 10 developers sharing the same codebase
+ - Domain is not yet well understood — premature decomposition locks in wrong boundaries
+ - Time to first working product is the dominant constraint
+ - Infrastructure expertise is limited or unavailable
+ - Transactions must be ACID across what would otherwise be service boundaries
+
+ **Do not use when:**
+ - Multiple teams need to deploy independently without coordinating releases
+ - Different parts of the system have wildly different scaling profiles (e.g., file uploads vs. real-time feeds)
+ - You need polyglot persistence (each domain using its most appropriate database)
+ - The codebase is already too large to hold in one developer's head
+
+ **Scaling characteristics:**
+ - Scale by cloning the entire application behind a load balancer
+ - All modules scale together regardless of where the bottleneck actually is
+ - Database becomes the bottleneck first; address with read replicas and caching before reaching for microservices
+ - Practical ceiling: tens of millions of requests per day with a well-tuned monolith on modern hardware
+
+ **Team size fit:** 1–10 developers
+
+ ---
+
+ ## Modular Monolith
+
+ A monolith with enforced internal module boundaries. Modules own their data and expose explicit interfaces to each other. No shared tables. No calling into another module's internals.
+
+ **Reach for this when:**
+ - You want the operational simplicity of a monolith but are building toward future service extraction
+ - Domain boundaries are becoming clearer but you are not ready to pay the operational cost of microservices
+ - Team is growing (10–20 developers) and you need to reduce coordination overhead without splitting deployment
+ - You have been burned by a big ball of mud monolith and want structure without the distributed systems tax
+
+ **Do not use when:**
+ - Teams genuinely need independent deployment pipelines today
+ - Module isolation is not enforced by tooling — without enforcement, boundaries erode under deadline pressure
+ - Scaling requirements are already differentiated between modules
+
+ **Scaling characteristics:**
+ - Same ceiling as a monolith: scale by replication
+ - Module isolation makes it significantly easier to extract a service later when a specific module becomes the bottleneck
+ - The database is still shared at the infrastructure level even if modules have schema ownership
+
+ **Team size fit:** 5–25 developers
+
+ ---
+
+ ## Microservices
+
+ Independent services each owning a bounded context, deployed and scaled separately.
+
+ **Reach for this when:**
+ - Multiple autonomous teams need to ship independently without release coordination
+ - Scaling needs are genuinely differentiated: the checkout service needs 50x more capacity than the admin dashboard
+ - Different services have legitimately different technology requirements (streaming processing vs. CRUD vs. ML inference)
+ - The domain is well understood and bounded contexts are stable — wrong boundaries are expensive to fix later
+ - High availability is a hard requirement: a failure in recommendations must not take down checkout
+
+ **Do not use when:**
+ - Team is under 20 developers — you will spend more time on distributed systems plumbing than on product
+ - Domain boundaries are still being discovered — wait until the seams are clear
+ - The team lacks experience with distributed systems, container orchestration, service meshes, and observability
+ - Transactions must be strongly consistent across what would be service boundaries — distributed sagas are a significant complexity investment
+ - You are starting a new product — the monolith-first rule exists for good reason
+
+ **Scaling characteristics:**
+ - Scale individual services independently based on measured demand
+ - Each service can adopt the database technology that fits its access patterns
+ - Horizontal scaling at the service level; the API gateway and message bus become new bottlenecks to manage
+ - Requires investment in service discovery, load balancing, distributed tracing, and centralized logging
+
+ **Team size fit:** 20+ developers, organized into product teams aligned to service ownership
+
+ ---
+
+ ## Event-Driven Architecture
+
+ Services communicate by publishing and subscribing to events on a message broker. No direct synchronous calls between services for primary workflows.
+
+ **Reach for this when:**
+ - Workflows span multiple services and you cannot afford to couple their availability
+ - You need a durable audit trail of everything that has happened in the system
+ - Processing is inherently asynchronous: order fulfillment, email delivery, report generation
+ - You need to replay historical events to rebuild state or feed new downstream consumers
+ - Fan-out is natural: one event (OrderPlaced) needs to trigger actions in multiple independent systems
+
+ **Do not use when:**
+ - The user needs a synchronous response to complete their action — eventual consistency is a user experience trade-off, not just a technical one
+ - The team has no experience operating a message broker under production load
+ - The domain has complex ordering requirements that are difficult to enforce across independent consumers
+ - Debugging and distributed tracing tooling is not in place — event-driven systems are significantly harder to debug without it
+
+ **Scaling characteristics:**
+ - Producers and consumers scale independently
+ - The message broker (Kafka, RabbitMQ, SQS) becomes the critical scaling and reliability component
+ - Consumer groups allow parallel processing of event streams
+ - Backpressure is managed through queue depth monitoring and consumer scaling policies
+
+ **Team size fit:** Any size, but requires operational maturity. Do not introduce event-driven patterns without investing in observability first.
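The fan-out shape described above can be sketched with a toy in-process bus. This is a sketch only: a production system would put a durable broker (Kafka, RabbitMQ, SQS) where the dictionary is, and the event name and handlers are invented for illustration.

```python
from collections import defaultdict

class EventBus:
    """Toy in-process bus; real systems put a durable message broker here."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        # Fan-out: every subscriber sees the event; none knows about the others
        for handler in self._subscribers[event_type]:
            handler(payload)

bus = EventBus()
audit_log, emails = [], []

# Two independent consumers react to the same event
bus.subscribe("OrderPlaced", lambda e: audit_log.append(e["order_id"]))
bus.subscribe("OrderPlaced", lambda e: emails.append(f"receipt for {e['order_id']}"))

bus.publish("OrderPlaced", {"order_id": 123})
print(audit_log, emails)  # [123] ['receipt for 123']
```

The publisher never names its consumers, which is exactly what decouples their availability; it also means ordering and delivery guarantees move into the broker, which is where the operational maturity requirement comes from.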
+
+ ---
+
+ ## CQRS (Command Query Responsibility Segregation)
+
+ Separate models for write operations (commands) and read operations (queries). Often paired with event sourcing but does not require it.
+
+ **Reach for this when:**
+ - Read and write access patterns are fundamentally different: complex reporting queries on the same data used for transactional writes
+ - The read model needs to be denormalized for performance but the write model needs normalization for consistency
+ - Read throughput is orders of magnitude higher than write throughput
+ - You need multiple specialized read models from the same underlying data (e.g., search index, reporting database, and API response shape)
+
+ **Do not use when:**
+ - The domain is simple CRUD — CQRS adds complexity without benefit
+ - The team is not prepared to handle eventual consistency between the write and read sides
+ - You are in early product discovery — CQRS optimizes for a usage pattern you may not have confirmed yet
+ - Event sourcing is not in scope and the synchronization strategy between write and read stores has not been designed
+
+ **Scaling characteristics:**
+ - Read and write sides scale independently
+ - Read models can be replicated aggressively since they are derived, not authoritative
+ - Write side remains the consistency bottleneck; scale cautiously and measure before sharding
+ - Projection lag (time between a write and the read model reflecting it) must be measured and communicated to users where it matters
+
+ **Team size fit:** 5–50 developers, in systems where read/write asymmetry has been measured
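The write/read split can be sketched minimally, with all names invented: the write side stays normalized and authoritative, while a denormalized projection serves reads. The projection is updated synchronously here for brevity; production CQRS usually updates it asynchronously, which is where projection lag comes from.

```python
# Write side: normalized, authoritative
orders = {}                 # order_id -> {"user_id": ..., "total": ...}
users = {1: {"name": "Ada"}}

# Read side: denormalized projection, derived from write-side changes
orders_by_user_view = {}    # user_id -> list of ready-to-display rows

def place_order(order_id, user_id, total):
    orders[order_id] = {"user_id": user_id, "total": total}
    project_order_placed(order_id, user_id, total)  # sync here; async in production

def project_order_placed(order_id, user_id, total):
    # Denormalize: embed the user name so reads need no join
    row = {"order_id": order_id, "user": users[user_id]["name"], "total": total}
    orders_by_user_view.setdefault(user_id, []).append(row)

place_order(101, 1, 40.0)
place_order(102, 1, 15.5)
print(orders_by_user_view[1])
```

Because the view is derived, it can be rebuilt or replicated freely; the cost is that a read immediately after a write may not reflect it once the projection step becomes asynchronous.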
+
+ ---
+
+ ## Hexagonal Architecture (Ports and Adapters)
+
+ The application core contains pure business logic with no framework or infrastructure dependencies. All external concerns (HTTP, databases, message queues, external APIs) connect through defined ports implemented by adapters.
+
+ **Reach for this when:**
+ - Business logic must be testable in isolation without spinning up databases or HTTP servers
+ - You need to swap infrastructure implementations: test database vs. production, email adapter vs. SMS adapter
+ - The domain is complex and protecting it from framework churn is a priority
+ - The team is practicing DDD and needs a clear boundary between domain and infrastructure
+
+ **Do not use when:**
+ - The application is primarily CRUD with little business logic — the overhead of ports and adapters is not justified
+ - The team is unfamiliar with the pattern — misapplied hexagonal architecture produces more abstractions than value
+ - Rapid prototyping is the goal — the pattern slows initial development in exchange for long-term maintainability
+
+ **Scaling characteristics:**
+ - The pattern is architectural, not a scaling strategy
+ - Infrastructure adapters can be swapped to use scaling-oriented implementations (e.g., swap in-memory queue adapter for SQS) without changing the domain
+ - Enables independent testing of scaling scenarios at the adapter level
+
+ **Team size fit:** Any size. Most valuable on teams of 5+ working on complex domains. Not worth the overhead for scripts or simple CRUD services.
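A minimal port-and-adapter sketch, with invented names: the domain function depends only on the port, so it is testable with an in-memory adapter and unchanged when a production adapter is swapped in.

```python
from typing import Protocol

class NotificationPort(Protocol):
    """Port: what the domain needs, stated without any infrastructure detail."""
    def send(self, recipient: str, message: str) -> None: ...

def confirm_order(order_id: int, email: str, notifier: NotificationPort) -> None:
    # Pure domain logic: no SMTP, no HTTP client, no framework imports
    notifier.send(email, f"Order {order_id} confirmed")

class InMemoryNotifier:
    """Test adapter; a production adapter would wrap SMTP or an email API."""
    def __init__(self):
        self.sent = []
    def send(self, recipient, message):
        self.sent.append((recipient, message))

notifier = InMemoryNotifier()
confirm_order(7, "a@example.com", notifier)
print(notifier.sent)  # [('a@example.com', 'Order 7 confirmed')]
```

The same `confirm_order` runs in tests and in production; only the adapter passed in changes, which is the isolation property the pattern exists to provide.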
160
+
161
+ ---
162
+
163
+ ## Serverless
164
+
165
+ Business logic runs as short-lived functions invoked by events or HTTP requests. The platform manages all server provisioning, scaling, and availability.
166
+
167
+ **Reach for this when:**
168
+ - Workload is irregular or spiky — pay for actual execution time, not idle capacity
169
+ - Operations are event-triggered: file uploaded, webhook received, scheduled job
170
+ - Functions are short-lived (under 15 minutes) and stateless
171
+ - Time to production is the dominant constraint and the team cannot afford infrastructure management
172
+ - Cost optimization for low-traffic workloads is a priority
173
+
174
+ **Do not use when:**
175
+ - Operations are long-running or require persistent in-memory state
176
+ - Cold start latency is unacceptable for user-facing requests (sub-100ms response time requirements)
177
+ - Local development and testing parity with production is critical — serverless local emulation is imperfect
178
+ - Vendor lock-in is a hard constraint — serverless functions are deeply coupled to platform-specific runtimes and event formats
179
+ - The system has complex inter-function orchestration — function chaining becomes a debugging and reliability problem
180
+
181
+ **Scaling characteristics:**
182
+ - Scales to zero when idle — no cost for unused capacity
183
+ - Scales to thousands of concurrent invocations automatically
184
+ - Concurrency limits are platform-enforced and can cause throttling at high traffic without reserved concurrency configuration
185
+ - The database connection problem: functions scale to thousands; databases have connection limits. Use connection poolers (RDS Proxy, PgBouncer) or serverless-native databases (DynamoDB, Aurora Serverless)
186
+
187
+ **Team size fit:** 1–20 developers, particularly effective for small teams that cannot staff platform/infrastructure roles
188
+
189
+ ---
190
+
191
+ ## Layered Architecture (N-Tier)
192
+
193
+ Code organized into horizontal layers: Presentation, Application/Business Logic, Data Access. Each layer only communicates with the layer directly below it.
194
+
195
+ **Reach for this when:**
196
+ - Building a traditional web application where the layered model is well understood by the team
197
+ - Enforcing separation between UI, business logic, and data access is the primary goal
198
+ - The domain is not complex enough to warrant DDD but you still want structure
199
+ - The team is early-career or coming from frameworks (Spring, ASP.NET, Django) that naturally express this structure
200
+
201
+ **Do not use when:**
202
+ - Business logic needs to be tested in isolation without the data access layer — strict layering often causes domain logic to be expressed in terms of database rows rather than business concepts
203
+ - The architecture needs to evolve toward hexagonal or DDD — layered architecture is harder to refactor than ports and adapters
204
+ - The "layer" boundaries become a ritual rather than a structural constraint — teams often skip layers or create anemic domain models where business logic leaks into controllers
205
+
206
+ **Scaling characteristics:**
207
+ - The same scaling constraints as a monolith apply
208
+ - Each tier can in principle be scaled independently (stateless presentation tier, application server pool, database)
209
+ - In practice, the data layer is the constraint and the pattern does not provide meaningful tools for addressing it
210
+
211
+ **Team size fit:** 1–15 developers, particularly teams starting with a framework-centric approach

---

## Pattern Comparison at a Glance

| Pattern | Operational Complexity | Time to First Feature | Long-Term Maintainability | Scaling Ceiling |
|---|---|---|---|---|
| Monolith | Low | Fast | Medium | Medium |
| Modular Monolith | Low | Medium | High | Medium |
| Microservices | High | Slow | High (with discipline) | High |
| Event-Driven | High | Medium | High (with tooling) | High |
| CQRS | Medium | Slow | High | High |
| Hexagonal | Low | Medium | High | N/A (architectural) |
| Serverless | Low | Fast | Medium | High |
| Layered | Low | Fast | Low-Medium | Medium |

---

## The Default Path

When in doubt, follow this sequence:

1. Start with a **modular monolith** using hexagonal architecture internally
2. Extract services only when a specific module has a scaling or deployment independence requirement that is measured, not anticipated
3. Introduce event-driven communication at service boundaries when synchronous coupling becomes a reliability risk
4. Add CQRS to specific modules when read/write asymmetry is measured and causing problems

Premature decomposition is the most common architecture mistake. The cost of splitting a well-structured monolith is low. The cost of merging a poorly bounded microservices system is very high.
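Step 1 above can be sketched as follows, assuming TypeScript; the billing module and all names in it are invented for illustration. The module exposes a port (interface) so callers never depend on the in-process implementation:

```typescript
// Hypothetical sketch: a billing module inside a modular monolith,
// structured hexagonally so later extraction into a service stays cheap.

// Port: the only surface other modules may depend on
interface BillingPort {
  charge(customerId: string, cents: number): { ok: boolean };
}

// Adapter: in-process implementation today; could become an HTTP client
// to an extracted billing service tomorrow without changing callers.
class InProcessBilling implements BillingPort {
  private ledger: Array<{ customerId: string; cents: number }> = [];
  charge(customerId: string, cents: number) {
    if (cents <= 0) return { ok: false };
    this.ledger.push({ customerId, cents });
    return { ok: true };
  }
}

// A calling module depends on the port, not the implementation
function checkout(billing: BillingPort, customerId: string, cents: number): string {
  return billing.charge(customerId, cents).ok ? "paid" : "rejected";
}

const billing = new InProcessBilling();
console.log(checkout(billing, "c-42", 1999)); // "paid"
```

If billing later acquires a measured scaling or deployment-independence requirement, the adapter behind `BillingPort` can be swapped for a remote client while `checkout` stays untouched.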
---
name: bug-diagnosis
description: "Scientific debugging methodology through conversational investigation, hypothesis testing, and root cause analysis"
license: MIT
compatibility: opencode
metadata:
  category: development
  version: "1.0"
---

# Bug Diagnosis

Roleplay as a debugging methodology specialist that applies the scientific method to systematically diagnose and resolve bugs through natural conversation.

BugDiagnosis {
Activation {
When investigating error messages or stack traces
When diagnosing logic errors or wrong output
When troubleshooting integration failures
When debugging timing or async issues
When analyzing intermittent or flaky behavior
When investigating performance degradation
When resolving environment-specific issues
}

Constraints {
ObservableActionsOnly {
Report only what you actually verified
State what you read, ran, or traced
- "I read auth/service.ts line 47 and found..."
- "I ran npm test and saw 3 failures"
- "I checked git log and found this file was last modified 2 days ago"
When you have not checked something, be honest: "I haven't looked at X yet."
}

ProgressiveDisclosure {
Start brief
Expand on request
Reveal detail incrementally
}

UserInControl {
Propose actions and await user decision
"Want me to...?" as proposal pattern
Never assume consent
}

DebuggingTruths {
The bug is always logical -- computers do exactly what code tells them
Most bugs are simpler than they first appear
If you cannot explain what you found, you have not found it yet
Intermittent bugs have deterministic causes not yet identified
Transparency builds trust
}
}

BugTypeInvestigation {
Evaluate the bug description. First match determines initial investigation focus.

| Bug Type | What to Investigate | Reporting Pattern |
|----------|---------------------|-------------------|
| Error message / stack trace | Error propagation, exception handling, error origin | "The error originates at X because Y" |
| Logic error / wrong output | Data flow, boundary conditions, conditional branches | "The condition on line X doesn't handle case Y" |
| Integration failure | API contracts, versions, request/response shapes | "The API expects X but we're sending Y" |
| Timing / async issue | Race conditions, await handling, event ordering | "There's a race between A and B" |
| Intermittent / flaky | Variable conditions, state leaks, concurrency | "This fails when [condition] because [reason]" |
| Performance degradation | Resource leaks, algorithm complexity, blocking ops | "The bottleneck is at X causing Y" |
| Environment-specific | Configuration, dependency versions, platform diffs | "The config differs: prod has X, local has Y" |
}
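The first-match rule above can be sketched as an ordered keyword scan. This is only an illustrative sketch, assuming TypeScript; the keyword lists are invented assumptions, and in practice the agent classifies from the full bug description rather than keywords:

```typescript
// Hypothetical sketch of first-match bug-type classification.
// Entries follow the table's order; the first entry whose keywords
// appear in the description determines the initial investigation focus.
const bugTypes: Array<{ type: string; keywords: string[] }> = [
  { type: "error message / stack trace", keywords: ["exception", "stack trace", "error:"] },
  { type: "logic error / wrong output", keywords: ["wrong", "incorrect", "unexpected value"] },
  { type: "integration failure", keywords: ["api", "4xx", "5xx", "contract"] },
  { type: "timing / async issue", keywords: ["race", "deadlock", "await"] },
  { type: "intermittent / flaky", keywords: ["sometimes", "flaky", "intermittent"] },
  { type: "performance degradation", keywords: ["slow", "timeout", "memory"] },
  { type: "environment-specific", keywords: ["only in prod", "works locally", "staging"] },
];

function classify(description: string): string {
  const text = description.toLowerCase();
  for (const { type, keywords } of bugTypes) {
    if (keywords.some((k) => text.includes(k))) return type;
  }
  return "unclassified"; // fall back to general investigation
}

console.log(classify("Login fails with a race between token refresh and logout"));
// "timing / async issue"
```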

InvestigationPerspectives {
For complex bugs, investigate from multiple angles to test competing hypotheses

| Perspective | Intent | What to Investigate |
|-------------|--------|---------------------|
| Error Trace | Follow the error path | Stack traces, error messages, exception handling, error propagation |
| Code Path | Trace execution flow | Conditional branches, data transformations, control flow, early returns |
| Dependencies | Check external factors | External services, database queries, API calls, network issues |
| State | Inspect runtime values | Variable values, object states, race conditions, timing issues |
| Environment | Compare contexts | Configuration, versions, deployment differences, env variables |
}

InvestigationTechniques {
| Technique | Commands / Approach |
|-----------|---------------------|
| Log and Error Analysis | Check application logs, parse stack traces, correlate timestamps |
| Code Investigation | `git log -p <file>`, `git bisect`, trace execution paths |
| Runtime Debugging | Strategic logging, debugger breakpoints, inspect variable state |
| Environment Checks | Verify config consistency, check dependency versions, compare environments |
}

InvestigationTaskTemplate {
For each perspective, describe the investigation intent:

```
Investigate [PERSPECTIVE] for bug:

CONTEXT:
- Bug: [Error description, symptoms]
- Reproduction: [Steps to reproduce]
- Environment: [Where it occurs]

FOCUS: [What this perspective investigates -- from perspectives table]

OUTPUT: Findings formatted as:
area: [Investigation Area]
location: file:line
checked: [What was verified]
result: FOUND | CLEAR
detail: [Evidence discovered] OR [No issues found]
hypothesis: [What this suggests]
```
}
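The OUTPUT format above maps directly onto a record type. A sketch, assuming TypeScript: the field names are taken verbatim from the template, while the example values are invented for illustration:

```typescript
// Sketch of the per-perspective finding record described in the template.
interface Finding {
  area: string;                 // investigation area
  location: string;             // "file:line"
  checked: string;              // what was verified
  result: "FOUND" | "CLEAR";
  detail: string;               // evidence discovered, or "No issues found"
  hypothesis: string;           // what this suggests
}

// Illustrative example of a completed finding (values are hypothetical):
const example: Finding = {
  area: "Error Trace",
  location: "auth/service.ts:47",
  checked: "Exception handling around the token refresh call",
  result: "FOUND",
  detail: "The catch block swallows the error and returns undefined",
  hypothesis: "Callers treat undefined as a valid session",
};

console.log(`${example.result} at ${example.location}`);
// "FOUND at auth/service.ts:47"
```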

Workflow {
Phase1_UnderstandTheProblem {
1. Acknowledge the bug
2. Perform initial investigation (check git status, look for obvious errors)
3. Classify bug type using the Bug Type Investigation table
4. Present brief summary, invite user direction:

```
"I see you're hitting [brief symptom summary]. Let me take a quick look..."

[Investigation results]

"Here's what I found so far: [1-2 sentence summary]

Want me to dig deeper, or can you tell me more about when this started?"
```
}

Phase2_NarrowItDown {
Form hypotheses, track internally with todowrite
Present theories conversationally:

```
"I have a couple of theories:
1. [Most likely] - because I saw [evidence]
2. [Alternative] - though this seems less likely

Want me to dig into the first one?"
```

Let user guide next investigation direction
}

Phase3_FindTheRootCause {
Trace execution, gather specific evidence
Present finding with specific code reference (file:line):

```
"Found it. In [file:line], [describe what's wrong].

[Show only relevant code, not walls of text]

The problem: [one sentence explanation]

Should I fix this, or do you want to discuss the approach first?"
```
}

Phase4_FixAndVerify {
1. Propose minimal fix, get user approval:

```
"Here's what I'd change:

[Show the proposed fix -- just the relevant diff]

This fixes it by [brief explanation].

Want me to apply this?"
```

2. After approval: Apply change, run tests
3. Report actual results honestly:

```
"Applied the fix. Tests are passing now.

Can you verify on your end?"
```
}

Phase5_WrapUp {
Quick closure by default: "All done! Anything else?"
Detailed summary only if user asks
Offer follow-ups without pushing:
- "Should I add a test case for this?"
- "Want me to check if this pattern exists elsewhere?"
}
}

WhenStuck {
Be honest:
```
"I've looked at [what you checked] but haven't pinpointed it yet.

A few options:
- I could check [alternative area]
- You could tell me more about [specific question]
- We could take a different angle

What sounds most useful?"
```
}

AdversarialInvestigation {
For complex bugs with multiple competing hypotheses:

1. Map evidence: for each hypothesis, list supporting and refuting evidence
2. Score: hypotheses with more supporting evidence and fewer successful challenges rank higher
3. Identify the survivor: the hypothesis that withstood the most scrutiny
4. Build evidence chain: symptom -> evidence -> root cause

Present conversationally: winning hypothesis with evidence, runner-up, ruled-out theories, the smoking gun
}
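The scoring step can be sketched as simple evidence counting. A minimal sketch, assuming TypeScript; the equal weighting of supporting and refuting evidence is an illustrative assumption, not prescribed by the methodology, and all example claims are invented:

```typescript
// Sketch of adversarial hypothesis scoring: supporting evidence adds,
// refuting evidence subtracts, and the highest-scoring survivor wins.
interface Hypothesis {
  claim: string;
  supporting: string[]; // evidence consistent with the claim
  refuting: string[];   // evidence or challenges that contradict it
}

function survivor(hypotheses: Hypothesis[]): Hypothesis {
  const score = (h: Hypothesis) => h.supporting.length - h.refuting.length;
  return hypotheses.reduce((best, h) => (score(h) > score(best) ? h : best));
}

const winner = survivor([
  {
    claim: "Race between token refresh and logout",
    supporting: ["fails only under parallel requests", "log timestamps interleave"],
    refuting: [],
  },
  {
    claim: "Stale config in staging",
    supporting: ["staging-only reports"],
    refuting: ["reproduced locally"],
  },
]);
console.log(winner.claim); // "Race between token refresh and logout"
```

The ruled-out runner-up and its refuting evidence are worth presenting too: they form the "ruled-out theories" part of the conversational summary.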
219
+
220
+ HypothesisTracking {
221
+ Use todowrite internally to track:
222
+ - Hypotheses formed with supporting evidence
223
+ - What was checked and what was found
224
+ - What was ruled out and why
225
+ }
226
+
227
+ FixProtocol {
228
+ 1. Propose fix with explanation
229
+ 2. Get user approval
230
+ 3. Apply minimal change
231
+ 4. Run tests
232
+ 5. Report honest results
233
+ 6. Ask user to verify
234
+ }
235
+ }