ag-cortex 0.1.0

Files changed (162)

package/.agent/skills/agent-native-architecture/references/mcp-tool-design.md ADDED

@@ -0,0 +1,506 @@
+<overview>
+How to design MCP tools following prompt-native principles. Tools should be primitives that enable capability, not workflows that encode decisions.
+**Core principle:** Whatever a user can do, the agent should be able to do. Don't artificially limit the agent—give it the same primitives a power user would have.
+</overview>
+<principle name="primitives-not-workflows">
+## Tools Are Primitives, Not Workflows
+**Wrong approach:** Tools that encode business logic
+```typescript
+tool("process_feedback", {
+  feedback: z.string(),
+  category: z.enum(["bug", "feature", "question"]),
+  priority: z.enum(["low", "medium", "high"]),
+}, async ({ feedback, category, priority }) => {
+  // Tool decides how to process
+  const processed = categorize(feedback);
+  const stored = await saveToDatabase(processed);
+  const notification = await notify(priority);
+  return { processed, stored, notification };
+});
+```
+**Right approach:** Primitives that enable any workflow
+```typescript
+tool("store_item", {
+  key: z.string(),
+  value: z.any(),
+}, async ({ key, value }) => {
+  await db.set(key, value);
+  return { text: `Stored ${key}` };
+});
+tool("send_message", {
+  channel: z.string(),
+  content: z.string(),
+}, async ({ channel, content }) => {
+  await messenger.send(channel, content);
+  return { text: "Sent" };
+});
+```
+The agent decides categorization, priority, and when to notify based on the system prompt.
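To make the split concrete, here is a minimal sketch of how an agent's decisions can live entirely in the tool arguments. It assumes hypothetical in-memory stand-ins for `db` and `messenger`, and a hand-written `agentPlan` array in place of the tool calls a model would emit after reading the system prompt. None of these names come from an SDK.

```typescript
// Hypothetical in-memory stand-ins for the db and messenger used above.
const db = new Map<string, unknown>();
const outbox: { channel: string; content: string }[] = [];

type ToolCall =
  | { tool: "store_item"; args: { key: string; value: unknown } }
  | { tool: "send_message"; args: { channel: string; content: string } };

// Primitives: each does exactly one thing and makes no decisions.
function runTool(call: ToolCall): string {
  switch (call.tool) {
    case "store_item":
      db.set(call.args.key, call.args.value);
      return `Stored ${call.args.key}`;
    case "send_message":
      outbox.push(call.args);
      return "Sent";
  }
}

// A plan the agent might emit. Category and priority are plain data
// chosen by the agent, not enum parameters enforced by a tool.
const agentPlan: ToolCall[] = [
  {
    tool: "store_item",
    args: {
      key: "feedback/1",
      value: { text: "Login fails on Safari", category: "bug", priority: "high" },
    },
  },
  {
    tool: "send_message",
    args: { channel: "#bugs", content: "High-priority bug stored as feedback/1" },
  },
];

const results = agentPlan.map(runTool);
```

Changing how feedback is categorized, or when a notification goes out, would then mean editing the prompt rather than the tools.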
+</principle>
+<principle name="descriptive-names">
+## Tools Should Have Descriptive, Primitive Names
+Names should describe the capability, not the use case:
+| Wrong | Right |
+|-------|-------|
+| `process_user_feedback` | `store_item` |
+| `create_feedback_summary` | `write_file` |
+| `send_notification` | `send_message` |
+| `deploy_to_production` | `git_push` |
+The prompt tells the agent *when* to use primitives. The tool just provides *capability*.
+</principle>
+<principle name="simple-inputs">
+## Inputs Should Be Simple
+Tools accept data. They don't accept decisions.
+**Wrong:** Tool accepts decisions
+```typescript
+tool("format_content", {
+  content: z.string(),
+  format: z.enum(["markdown", "html", "json"]),
+  style: z.enum(["formal", "casual", "technical"]),
+}, ...)
+```
+**Right:** Tool accepts data, agent decides format
+```typescript
+tool("write_file", {
+  path: z.string(),
+  content: z.string(),
+}, ...)
+// Agent decides to write index.html with HTML content, or data.json with JSON
+```
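The handler is elided in the snippet above. One possible shape for it is sketched below, assuming a hypothetical in-memory `files` map in place of a real file system. The handler never inspects the path or the content, so the choice of format stays with the agent.

```typescript
// Hypothetical in-memory file system; a real server would write to disk.
const files = new Map<string, string>();

// Stores whatever content it is given. It does not look at the extension
// or rewrite the content, so format remains the agent's decision.
function writeFile(args: { path: string; content: string }): { text: string } {
  files.set(args.path, args.content);
  return { text: `Wrote ${args.content.length} characters to ${args.path}` };
}

const htmlResult = writeFile({ path: "index.html", content: "<h1>Hello</h1>" });
const jsonResult = writeFile({ path: "data.json", content: JSON.stringify({ ok: true }) });
```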
+</principle>
+<principle name="rich-outputs">
+## Outputs Should Be Rich
+Return enough information for the agent to verify and iterate.
+**Wrong:** Minimal output
+```typescript
+async ({ key }) => {
+  await db.delete(key);
+  return { text: "Deleted" };
+}
+```
+**Right:** Rich output
+```typescript
+async ({ key }) => {
+  const existed = await db.has(key);
+  if (!existed) {
+    return { text: `Key ${key} did not exist` };
+  }
+  await db.delete(key);
+  return { text: `Deleted ${key}. ${await db.count()} items remaining.` };
+}
+```
+</principle>
+<design_template>
+## Tool Design Template
+```typescript
+import { createSdkMcpServer, tool } from "@antigravity/agent-sdk";
+import { z } from "zod";
+export const serverName = createSdkMcpServer({
+  name: "server-name",
+  version: "1.0.0",
+  tools: [
+    // READ operations
+    tool(
+      "read_item",
+      "Read an item by key",
+      { key: z.string().describe("Item key") },
+      async ({ key }) => {
+        const item = await storage.get(key);
+        return {
+          content: [{
+            type: "text",
+            text: item ? JSON.stringify(item, null, 2) : `Not found: ${key}`,
+          }],
+          isError: !item,
+        };
+      }
+    ),
+    tool(
+      "list_items",
+      "List all items, optionally filtered",
+      {
+        prefix: z.string().optional().describe("Filter by key prefix"),
+        limit: z.number().default(100).describe("Max items"),
+      },
+      async ({ prefix, limit }) => {
+        const items = await storage.list({ prefix, limit });
+        return {
+          content: [{
+            type: "text",
+            text: `Found ${items.length} items:\n${items.map(i => i.key).join("\n")}`,
+          }],
+        };
+      }
+    ),
+    // WRITE operations
+    tool(
+      "store_item",
+      "Store an item",
+      {
+        key: z.string().describe("Item key"),
+        value: z.any().describe("Item data"),
+      },
+      async ({ key, value }) => {
+        await storage.set(key, value);
+        return {
+          content: [{ type: "text", text: `Stored ${key}` }],
+        };
+      }
+    ),
+    tool(
+      "delete_item",
+      "Delete an item",
+      { key: z.string().describe("Item key") },
+      async ({ key }) => {
+        const existed = await storage.delete(key);
+        return {
+          content: [{
+            type: "text",
+            text: existed ? `Deleted ${key}` : `${key} did not exist`,
+          }],
+        };
+      }
+    ),
+    // EXTERNAL operations
+    tool(
+      "call_api",
+      "Make an HTTP request",
+      {
+        url: z.string().url(),
+        method: z.enum(["GET", "POST", "PUT", "DELETE"]).default("GET"),
+        body: z.any().optional(),
+      },
+      async ({ url, method, body }) => {
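+        const response = await fetch(url, {
+          method,
+          // Send a JSON body and matching header only when a body was provided
+          headers: body === undefined ? undefined : { "Content-Type": "application/json" },
+          body: body === undefined ? undefined : JSON.stringify(body),
+        });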
+        const text = await response.text();
+        return {
+          content: [{
+            type: "text",
+            text: `${response.status} ${response.statusText}\n\n${text}`,
+          }],
+          isError: !response.ok,
+        };
+      }
+    ),
+  ],
+});
+```
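Every handler in the template builds the same `content` array by hand. A small helper can cut that repetition; the sketch below assumes a hypothetical `textResult` function and an in-memory `storage` map, neither of which is part of any SDK. The result shape mirrors the `content` and `isError` fields used in the template.

```typescript
interface TextContent {
  type: "text";
  text: string;
}

interface ToolResult {
  content: TextContent[];
  isError?: boolean;
}

// Hypothetical helper: wraps a string in the result shape used by the template.
function textResult(text: string, isError = false): ToolResult {
  const result: ToolResult = { content: [{ type: "text", text }] };
  if (isError) result.isError = true;
  return result;
}

// Hypothetical in-memory storage standing in for `storage` in the template.
const storage = new Map<string, unknown>([["config", { theme: "dark" }]]);

function readItem(key: string): ToolResult {
  const item = storage.get(key);
  return item !== undefined
    ? textResult(JSON.stringify(item, null, 2))
    : textResult(`Not found: ${key}`, true);
}

const found = readItem("config");
const missing = readItem("nope");
```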
+</design_template>
+<example name="feedback-server">
+## Example: Feedback Storage Server
+This server provides primitives for storing feedback. It does NOT decide how to categorize or organize feedback—that's the agent's job via the prompt.
+```typescript
+export const feedbackMcpServer = createSdkMcpServer({
+  name: "feedback",
+  version: "1.0.0",
+  tools: [
+    tool(
+      "store_feedback",
+      "Store a feedback item",
+      {
+        item: z.object({
+          id: z.string(),
+          author: z.string(),
+          content: z.string(),
+          importance: z.number().min(1).max(5),
+          timestamp: z.string(),
+          status: z.string().optional(),
+          urls: z.array(z.string()).optional(),
+          metadata: z.any().optional(),
+        }).describe("Feedback item"),
+      },
+      async ({ item }) => {
+        await db.feedback.insert(item);
+        return {
+          content: [{
+            type: "text",
+            text: `Stored feedback ${item.id} from ${item.author}`,
+          }],
+        };
+      }
+    ),
+    tool(
+      "list_feedback",
+      "List feedback items",
+      {
+        limit: z.number().default(50),
+        status: z.string().optional(),
+      },
+      async ({ limit, status }) => {
+        const items = await db.feedback.list({ limit, status });
+        return {
+          content: [{
+            type: "text",
+            text: JSON.stringify(items, null, 2),
+          }],
+        };
+      }
+    ),
+    tool(
+      "update_feedback",
+      "Update a feedback item",
+      {
+        id: z.string(),
+        updates: z.object({
+          status: z.string().optional(),
+          importance: z.number().min(1).max(5).optional(),
+          metadata: z.any().optional(),
+        }),
+      },
+      async ({ id, updates }) => {
+        await db.feedback.update(id, updates);
+        return {
+          content: [{ type: "text", text: `Updated ${id}` }],
+        };
+      }
+    ),
+  ],
+});
+```
+The system prompt then tells the agent *how* to use these primitives:
+```markdown
+## Feedback Processing
+When someone shares feedback:
+1. Extract author, content, and any URLs
+2. Rate importance 1-5 based on actionability
+3. Store using feedback.store_feedback
+4. If high importance (4-5), notify the channel
+Use your judgment about importance ratings.
+```
+</example>
+<principle name="dynamic-capability-discovery">
+## Dynamic Capability Discovery vs Static Tool Mapping
+**This pattern is specifically for agent-native apps** where you want the agent to have full access to an external API—the same access a user would have. It follows the core agent-native principle: "Whatever the user can do, the agent can do."
+If you're building a constrained agent with limited capabilities, static tool mapping may be intentional. But for agent-native apps integrating with HealthKit, HomeKit, GraphQL, or similar APIs, compare the two approaches below:
+**Static Tool Mapping (Anti-pattern for Agent-Native):**
+Build an individual tool for each API capability. The tool set is always out of date and limits the agent to only what you anticipated.
+```typescript
+// ❌ Static: Every API type needs a hardcoded tool
+tool("read_steps", async ({ startDate, endDate }) => {
+  return healthKit.query(HKQuantityType.stepCount, startDate, endDate);
+});
+tool("read_heart_rate", async ({ startDate, endDate }) => {
+  return healthKit.query(HKQuantityType.heartRate, startDate, endDate);
+});
+tool("read_sleep", async ({ startDate, endDate }) => {
+  return healthKit.query(HKCategoryType.sleepAnalysis, startDate, endDate);
+});
+// When HealthKit adds glucose tracking... you need a code change
+```
+**Dynamic Capability Discovery (Preferred):**
+Build a meta-tool that discovers what's available, and a generic tool that can access anything.
+```typescript
+// ✅ Dynamic: Agent discovers and uses any capability
+// Discovery tool - returns what's available at runtime
+tool("list_available_capabilities", async () => {
+  const quantityTypes = await healthKit.availableQuantityTypes();
+  const categoryTypes = await healthKit.availableCategoryTypes();
+  return {
+    text: `Available health metrics:\n` +
+          `Quantity types: ${quantityTypes.join(", ")}\n` +
+          `Category types: ${categoryTypes.join(", ")}\n` +
+          `\nUse read_health_data with any of these types.`
+  };
+});
+// Generic access tool - type is a string, API validates
+tool("read_health_data", {
+  dataType: z.string(),  // NOT z.enum - let HealthKit validate
+  startDate: z.string(),
+  endDate: z.string(),
+  aggregation: z.enum(["sum", "average", "samples"]).optional()
+}, async ({ dataType, startDate, endDate, aggregation }) => {
+  // HealthKit validates the type, returns helpful error if invalid
+  const result = await healthKit.query(dataType, startDate, endDate, aggregation);
+  return { text: JSON.stringify(result, null, 2) };
+});
+```
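The same pattern can be tried without HealthKit. The sketch below assumes a hypothetical in-memory API (`metrics`, `availableTypes`, `query`) that stands in for the external service. It shows the two properties the pattern relies on: the API rejects unknown types with a helpful message, and a type added after the tools shipped is usable with no tool changes.

```typescript
// Hypothetical external API: its set of metrics can grow independently of the tools.
const metrics = new Map<string, number[]>([
  ["stepCount", [4200, 8100, 6050]],
  ["heartRate", [61, 72, 68]],
]);

function availableTypes(): string[] {
  return Array.from(metrics.keys());
}

function query(type: string): number[] {
  const samples = metrics.get(type);
  if (!samples) {
    throw new Error(`Unknown type "${type}". Valid types: ${availableTypes().join(", ")}`);
  }
  return samples;
}

interface ToolResult {
  text: string;
  isError?: boolean;
}

// Discovery tool: reports whatever the API offers at call time.
function listAvailableCapabilities(): ToolResult {
  return { text: `Available types: ${availableTypes().join(", ")}` };
}

// Generic access tool: dataType is a plain string and the API is the validator.
function readHealthData(dataType: string): ToolResult {
  try {
    return { text: JSON.stringify(query(dataType)) };
  } catch (err) {
    return { text: err instanceof Error ? err.message : String(err), isError: true };
  }
}

// The API gains a metric after the tools shipped; no tool code changes.
metrics.set("bloodGlucose", [5.4, 5.9]);

const discovered = listAvailableCapabilities();
const glucose = readHealthData("bloodGlucose");
const unknown = readHealthData("moodScore");
```

The error returned for `moodScore` lists the valid types, which gives the agent enough to correct itself on the next call.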
+**When to Use Each Approach:**
+| Dynamic (Agent-Native) | Static (Constrained Agent) |
+|------------------------|---------------------------|
+| Agent should access anything user can | Agent has intentionally limited scope |
+| External API with many endpoints (HealthKit, HomeKit, GraphQL) | Internal domain with fixed operations |
+| API evolves independently of your code | Tightly coupled domain logic |
+| You want full action parity | You want strict guardrails |
+**The agent-native default is Dynamic.** Only use Static when you're intentionally limiting the agent's capabilities.
+**Complete Dynamic Pattern:**
+```swift
+// 1. Discovery tool: What can I access?
+tool("list_health_types", "Get available health data types") { _ in
+    let store = HKHealthStore()
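+    // Assumes the app supplies `allCases` (e.g. via an extension); HealthKit's identifier types are not CaseIterable by default
+    let quantityTypes = HKQuantityTypeIdentifier.allCases.map { $0.rawValue }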
+    let categoryTypes = HKCategoryTypeIdentifier.allCases.map { $0.rawValue }
+    let characteristicTypes = HKCharacteristicTypeIdentifier.allCases.map { $0.rawValue }
+    return ToolResult(text: """
+        Available HealthKit types:
+        ## Quantity Types (numeric values)
+        \(quantityTypes.joined(separator: ", "))
+        ## Category Types (categorical data)
+        \(categoryTypes.joined(separator: ", "))
+        ## Characteristic Types (user info)
+        \(characteristicTypes.joined(separator: ", "))
+        Use read_health_data or write_health_data with any of these.
+        """)
+}
+// 2. Generic read: Access any type by name
+tool("read_health_data", "Read any health metric", {
+    dataType: z.string().describe("Type name from list_health_types"),
+    startDate: z.string(),
+    endDate: z.string()
+}) { request in
+    // Let HealthKit validate the type name
+    guard let type = HKQuantityTypeIdentifier(rawValue: request.dataType)
+                     ?? HKCategoryTypeIdentifier(rawValue: request.dataType) else {
+        return ToolResult(
+            text: "Unknown type: \(request.dataType). Use list_health_types to see available types.",
+            isError: true
+        )
+    }
+    let samples = try await healthStore.querySamples(type: type, start: request.startDate, end: request.endDate)
+    return ToolResult(text: samples.formatted())
+}
+// 3. Context injection: Tell agent what's available in system prompt
+func buildSystemPrompt() -> String {
+    let availableTypes = healthService.getAuthorizedTypes()
+    return """
+    ## Available Health Data
+    You have access to these health metrics:
+    \(availableTypes.map { "- \($0)" }.joined(separator: "\n"))
+    Use read_health_data with any type above. For new types not listed,
+    use list_health_types to discover what's available.
+    """
+}
+```
+**Benefits:**
+- Agent can use any API capability, including ones added after your code shipped
+- API is the validator, not your enum definition
+- Smaller tool surface (2-3 tools vs N tools)
+- Agent naturally discovers capabilities by asking
+- Works with any API that has introspection (HealthKit, GraphQL, OpenAPI)
+</principle>
+<principle name="crud-completeness">
+## CRUD Completeness
+Every data type the agent can create, it should be able to read, update, and delete. Incomplete CRUD = broken action parity.
+**Anti-pattern: Create-only tools**
+```typescript
+// ❌ Can create but not modify or delete
+tool("create_experiment", { hypothesis, variable, metric })
+tool("write_journal_entry", { content, author, tags })
+// User: "Delete that experiment" → Agent: "I can't do that"
+```
+**Correct: Full CRUD for each entity**
+```typescript
+// ✅ Complete CRUD
+tool("create_experiment", { hypothesis, variable, metric })
+tool("read_experiment", { id })
+tool("update_experiment", { id, updates: { hypothesis?, status?, endDate? } })
+tool("delete_experiment", { id })
+tool("create_journal_entry", { content, author, tags })
+tool("read_journal", { query?, dateRange?, author? })
+tool("update_journal_entry", { id, content, tags? })
+tool("delete_journal_entry", { id })
+```
+**The CRUD Audit:**
+For each entity type in your app, verify:
+- [ ] Create: Agent can create new instances
+- [ ] Read: Agent can query/search/list instances
+- [ ] Update: Agent can modify existing instances
+- [ ] Delete: Agent can remove instances
+If any operation is missing, users will eventually ask for it and the agent will fail.
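The audit can be partly automated. The sketch below assumes tool names follow a `<verb>_<entity>` convention, which is a naming assumption for this example and not something MCP enforces; `auditCrud` is a hypothetical helper.

```typescript
const CRUD_VERBS = ["create", "read", "update", "delete"] as const;
type CrudVerb = (typeof CRUD_VERBS)[number];

// Returns, for each entity, the CRUD verbs that have no matching tool.
// Entities with full coverage are omitted from the result.
function auditCrud(toolNames: string[]): Map<string, CrudVerb[]> {
  const seen = new Map<string, Set<CrudVerb>>();
  for (const name of toolNames) {
    const verb = CRUD_VERBS.find((v) => name.startsWith(`${v}_`));
    if (!verb) continue; // not a CRUD-shaped tool name
    const entity = name.slice(verb.length + 1);
    const verbs = seen.get(entity) ?? new Set<CrudVerb>();
    verbs.add(verb);
    seen.set(entity, verbs);
  }

  const missing = new Map<string, CrudVerb[]>();
  seen.forEach((verbs, entity) => {
    const gaps = CRUD_VERBS.filter((v) => !verbs.has(v));
    if (gaps.length > 0) missing.set(entity, gaps);
  });
  return missing;
}

const gaps = auditCrud([
  "create_experiment",
  "read_experiment",
  "update_experiment",
  "delete_experiment",
  "create_journal_entry",
]);
// gaps: journal_entry -> ["read", "update", "delete"]
```

A check like this only matches on names, so a tool such as `read_journal` would be counted under a different entity than `create_journal_entry` unless names are normalized first.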
+</principle>
+<checklist>
+## MCP Tool Design Checklist
+**Fundamentals:**
+- [ ] Tool names describe capability, not use case
+- [ ] Inputs are data, not decisions
+- [ ] Outputs are rich (enough for agent to verify)
+- [ ] CRUD operations are separate tools (not one mega-tool)
+- [ ] No business logic in tool implementations
+- [ ] Error states clearly communicated via `isError`
+- [ ] Descriptions explain what the tool does, not when to use it
+**Dynamic Capability Discovery (for agent-native apps):**
+- [ ] For external APIs where agent should have full access, use dynamic discovery
+- [ ] Include a `list_*` or `discover_*` tool for each API surface
+- [ ] Use string inputs (not enums) when the API validates
+- [ ] Inject available capabilities into system prompt at runtime
+- [ ] Only use static tool mapping if intentionally limiting agent scope
+**CRUD Completeness:**
+- [ ] Every entity has create, read, update, delete operations
+- [ ] Every UI action has a corresponding agent tool
+- [ ] Test: "Can the agent undo what it just did?"
+</checklist>