@agentled/cli 0.1.5 → 0.4.3

Files changed (43)
  1. package/README.md +136 -0
  2. package/dist/commands/auth.js +30 -0
  3. package/dist/commands/auth.js.map +1 -1
  4. package/dist/commands/examples.d.ts +15 -0
  5. package/dist/commands/examples.js +100 -0
  6. package/dist/commands/examples.js.map +1 -0
  7. package/dist/commands/scaffold.d.ts +14 -0
  8. package/dist/commands/scaffold.js +103 -0
  9. package/dist/commands/scaffold.js.map +1 -0
  10. package/dist/commands/schema.d.ts +10 -0
  11. package/dist/commands/schema.js +58 -0
  12. package/dist/commands/schema.js.map +1 -0
  13. package/dist/commands/skills.d.ts +9 -0
  14. package/dist/commands/skills.js +94 -0
  15. package/dist/commands/skills.js.map +1 -0
  16. package/dist/commands/workflows.js +227 -9
  17. package/dist/commands/workflows.js.map +1 -1
  18. package/dist/index.js +6 -0
  19. package/dist/index.js.map +1 -1
  20. package/dist/utils/preflight.d.ts +25 -0
  21. package/dist/utils/preflight.js +185 -0
  22. package/dist/utils/preflight.js.map +1 -0
  23. package/dist/utils/skills.d.ts +49 -0
  24. package/dist/utils/skills.js +214 -0
  25. package/dist/utils/skills.js.map +1 -0
  26. package/package.json +4 -1
  27. package/patterns/v1/00-why-agentic-ops.md +107 -0
  28. package/patterns/v1/01-trigger-design.md +107 -0
  29. package/patterns/v1/02-dedup-gates.md +135 -0
  30. package/patterns/v1/03-credit-efficiency.md +130 -0
  31. package/patterns/v1/04-loop-patterns.md +147 -0
  32. package/patterns/v1/05-child-workflow-contracts.md +151 -0
  33. package/patterns/v1/06-conditional-routing.md +151 -0
  34. package/patterns/v1/07-error-handling.md +157 -0
  35. package/patterns/v1/08-composed-email-approval.md +130 -0
  36. package/patterns/v1/09-reports-and-knowledge-storage.md +166 -0
  37. package/scaffolds/README.md +61 -0
  38. package/scaffolds/email-polling-dedup.json +71 -0
  39. package/scaffolds/extract-threshold-alert.json +131 -0
  40. package/scaffolds/lead-scoring-kg.json +84 -0
  41. package/scaffolds/list-match-email.json +131 -0
  42. package/scaffolds/minimal.json +20 -0
  43. package/skills/agentled/SKILL.md +568 -0
@@ -0,0 +1,107 @@
# 01 — Trigger design: polling vs event triggers

**Problem**: Developers default to event triggers for email/document intake workflows, creating fragile pipelines that drop records, can't backfill, and are hard to debug.

**Why it fails silently**: Event triggers appear to work in testing (low volume, reliable delivery). At production scale, re-deliveries cause duplicates, Pub/Sub TTL loses events during outages, and there's no way to backfill records missed during downtime — without any visible error.

---

## Decision framework

| | Schedule (polling) | App Event (real-time) |
|---|---|---|
| **Latency** | minutes–hours | seconds |
| **Idempotency** | trivial — label/flag marks processed | must dedupe on messageId; re-deliveries happen |
| **Backfill** | built-in — widen the query window | doesn't exist; needs a separate bootstrap run |
| **Replay after outage** | automatic on next scheduled run | events can be permanently lost (TTL) |
| **Debugging** | read last execution log | subscription status + delivery + filter + dedupe all need checking |
| **Infrastructure** | none | webhook receiver, watch renewal, Pub/Sub |

**Default rule: polling for intake, events for reactions.**

---

## Anti-pattern

Using an event trigger for email intake because it "feels more real-time":

```yaml
# Wrong: event trigger for deal flow email intake
trigger:
  type: app_event
  app: gmail
  event: GMAIL_NEW_MESSAGE_RECEIVED
  filters:
    query: "from:investor subject:pitch"
```

Problems:
- Duplicate delivery means the same email gets processed 2–3× with no dedup mechanism
- Gmail watch tokens expire — you need a renewal job or emails stop arriving silently
- No backfill: if the workflow is down for 2 days, those emails are gone
- Debugging requires checking: is the watch active? Did the webhook fire? Did the filter match? Did the dedup run?

---

## Correct pattern

Schedule trigger with label-based dedup:

```yaml
# Correct: scheduled polling with dedup gate
trigger:
  type: schedule
  config:
    frequency: daily
    time: "08:00"

steps:
  - id: ensure-label          # idempotent: returns the existing label if present
    action: GMAIL_CREATE_LABEL
    input:
      name: "processed"

  - id: fetch-emails
    action: GMAIL_FETCH_EMAILS
    input:
      query: "-label:processed newer_than:1d"
      max_results: 50

  - id: process-email
    type: loop
    over: "{{steps.fetch-emails.messages}}"
    # ... processing steps ...

  - id: mark-processed        # runs inside the loop, once per message
    action: GMAIL_ADD_LABEL
    input:
      message_id: "{{currentItem.id}}"
      label_id: "{{steps.ensure-label.id}}"  # resolved ID, not display name
```

The `-label:processed` filter does the dedup work. Each email is processed exactly once. If the workflow goes down for a week, widen to `newer_than:7d` on the next run to backfill.
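
The polling-plus-backfill behavior comes down to one query string; as a sketch, a tiny helper (the function name and its defaults are illustrative, not a platform API):

```javascript
// Illustrative helper for the polling query above. The name and defaults
// are ours, not part of any SDK. Widening `days` backfills after an outage.
function buildIntakeQuery({ label = "processed", days = 1 } = {}) {
  // Exclude already-labeled mail and bound the lookback window.
  return `-label:${label} newer_than:${days}d`;
}

// Normal daily run:
buildIntakeQuery();            // "-label:processed newer_than:1d"
// After a week-long outage, widen the window to backfill:
buildIntakeQuery({ days: 7 }); // "-label:processed newer_than:7d"
```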

---

## When to use event triggers

Event triggers are correct when:
- The user explicitly requires sub-minute latency ("alert within 30 seconds", "as soon as", "real-time")
- The workflow is a **side effect** (fire-and-forget notification), not a record-of-truth producer
- Missed events are acceptable (or you have a separate reconciliation job)

---

## Trigger type cheatsheet

| User says | Trigger |
|---|---|
| "process inbound pitch emails" | Schedule (daily) |
| "triage support emails every morning" | Schedule (daily 08:00) |
| "every Monday summarize last week's emails" | Schedule (weekly) |
| "analyze my inbox and create Notion entries" | Schedule (daily) |
| "page oncall within 30s of an escalation email" | App event |
| "create a ticket the moment a customer emails" | App event |
| "run every time a form is submitted" | Webhook |
| "user clicks Run" | Manual |

---

## One-line rule

> Default to Schedule + label-based dedup for email and document intake; use event triggers only when the user explicitly states a latency requirement under one minute.
@@ -0,0 +1,135 @@
# 02 — Dedup gates: idempotency for agentic workflows

**Problem**: Without a dedup gate, every record in a polling or webhook workflow gets processed multiple times — silently, expensively, and often with conflicting writes.

**Why it fails silently**: The first few runs look correct. Duplicates only surface when you notice your CRM has 3 entries for the same company, your enrichment bill is 2× expected, or your outreach tool sent the same email twice. By then, the damage is done.

---

## The Gmail label-ID bug (the most common dedup failure)

This is the one that wastes 2 hours and isn't documented anywhere.

You build an email polling workflow. You add a step to mark each email as processed. You pass the label name:

```json
{
  "action": "GMAIL_ADD_LABEL",
  "input": {
    "message_id": "{{currentItem.id}}",
    "label_id": "processed"
  }
}
```

Result: `400 Bad Request: Invalid label: processed`

The Gmail API does not accept label **display names**. It requires internal label **IDs** — strings that look like `Label_3456789012345678`. The display name "processed" is what you see in Gmail's UI. The ID is what the API needs.

Same bug with any user-created label: `"agentled"`, `"reviewed"`, `"done"` — all invalid.
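
Resolving a display name to its ID is a lookup over the labels list; the Gmail API's `users.labels.list` returns `{ id, name }` pairs. A minimal sketch, with the response abbreviated to the two fields that matter:

```javascript
// Resolve a Gmail label display name to its internal ID.
// `labels` is the array from a users.labels.list response, abbreviated here.
function resolveLabelId(labels, displayName) {
  const match = labels.find((l) => l.name === displayName);
  if (!match) throw new Error(`No label named "${displayName}", create it first`);
  return match.id;
}

const labels = [
  { id: "INBOX", name: "INBOX" },
  { id: "Label_3456789012345678", name: "processed" },
];
resolveLabelId(labels, "processed"); // "Label_3456789012345678"
```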

---

## Anti-pattern

```json
// Wrong: passing label display name
{
  "action": "GMAIL_ADD_LABEL",
  "input": {
    "message_id": "{{currentItem.id}}",
    "label_id": "processed"
  }
}
// → 400: Invalid label: processed
```

---

## Correct pattern

Always resolve the label ID first using a create-or-get-label step:

```json
// Step 1: create label if it doesn't exist, or get existing (idempotent)
{
  "id": "ensure-label",
  "action": "GMAIL_CREATE_LABEL",
  "input": { "name": "processed" }
}
// Returns: { "id": "Label_3456789012345678", "name": "processed" }

// Step 2: fetch unprocessed emails
{
  "id": "fetch-emails",
  "action": "GMAIL_FETCH_EMAILS",
  "input": {
    "query": "-label:processed newer_than:1d",
    "max_results": 50
  }
}

// Step 3 (inside loop): mark each email processed using the resolved ID
{
  "id": "mark-processed",
  "action": "GMAIL_ADD_LABEL",
  "input": {
    "message_id": "{{currentItem.id}}",
    "label_id": "{{steps.ensure-label.id}}"  // ← resolved ID, not display name
  }
}
```

`GMAIL_CREATE_LABEL` is idempotent — if the label already exists, it returns the existing label's ID. Run it every time with no side effects.

---

## How label-based dedup works

The `-label:processed` filter in the fetch query does the dedup work:

1. First run: fetches 50 emails. Processes each. Adds `processed` label to each.
2. Second run: fetches emails without `processed` label. Those 50 are now excluded. Only new emails are returned.
3. Outage for 3 days: widen to `newer_than:7d` on the next run. All unprocessed emails in the window are caught. Processed ones are excluded.

This gives you effectively exactly-once processing with no database, no external state store, and no coordination overhead.

---

## Dedup for webhook triggers

Webhooks re-deliver. Always. Your endpoint will receive the same event 2–5× under normal conditions (retries on timeout, delivery confirmation failures). Without dedup:

```
Webhook fires → workflow starts → enrichment call × 3 duplicates → 3 CRM entries
```

The fix: use a unique event ID as an idempotency key and check before processing:

```javascript
// Code step at workflow entry
const eventId = input.webhookPayload.id; // or messageId, leadId, etc.
const alreadyProcessed = await kv.get(`processed:${eventId}`);
if (alreadyProcessed) {
  return { skipped: true, reason: "duplicate" };
}
await kv.set(`processed:${eventId}`, true, { ttl: 86400 });
```
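
One caveat with the get-then-set above: two concurrent deliveries can both read "not processed" before either writes. Where the store supports it, prefer an atomic check-and-set; a sketch with an in-memory `Map` standing in for the KV store (the `kv` API is illustrative):

```javascript
// Idempotency gate with check-and-set in a single synchronous step.
// A Map stands in for the platform KV store; `markIfFirst` returns true
// only for the first caller with a given event ID.
const seen = new Map();

function markIfFirst(eventId) {
  const key = `processed:${eventId}`;
  if (seen.has(key)) return false; // duplicate delivery, skip
  seen.set(key, Date.now());       // no await between check and set
  return true;
}

markIfFirst("evt-123"); // true, first delivery: process it
markIfFirst("evt-123"); // false, re-delivery: skip
```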

---

## Dedup patterns by source

| Source | Dedup mechanism |
|---|---|
| Gmail polling | `-label:processed` query + `GMAIL_ADD_LABEL` after processing |
| Webhook | Idempotency key from event ID, stored in KV or DB |
| Scheduled API poll | Cursor / `since_id` / `updated_at` timestamp stored in persistent memory |
| File/S3 intake | Move to `processed/` prefix after reading |
| Form submissions | Unique submission ID checked before processing |

---

## One-line rule

> Always resolve label IDs before passing them to the Gmail API — display names fail with a `400 Invalid label` error — and always add the processed label as the final step in every email intake loop.
@@ -0,0 +1,130 @@
# 03 — Credit efficiency: not burning money while building

**Problem**: Developers restart full workflow executions to debug a single failed step, burning credits on work that was already done correctly.

**Why it fails silently**: The restarted execution appears to succeed. The wasted spend accumulates in the background — 3–5× expected credit usage during development — until the invoice arrives.

---

## The core discipline: fix → retry → verify

Every debugging cycle should follow exactly this sequence:

1. **Identify** the failed step and its error
2. **Fix** the configuration, prompt, or code
3. **Retry from the failed step** — not from the beginning
4. **Verify** the step output

Starting a new full execution to debug a failed step is the most expensive habit in agentic development. It re-runs every step that already succeeded: the enrichment API call, the LLM prompt, the database read. All paid again. None of them changed.

---

## Anti-pattern

```
Execution fails at step 5 (AI scoring)
→ Developer reads the error
→ Fixes the prompt
→ Starts a NEW execution from step 1
→ Steps 1-4 run again: enrichment (5 credits), profile fetch (2 credits), web scrape (0 credits), data parse (0 credits)
→ Step 5 runs with the fixed prompt
→ Total wasted: 7 credits × every debug cycle
```

In a workflow with 3 debug cycles per feature: 21 wasted credits before it works.
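
The waste compounds linearly with debug cycles; the arithmetic, as a sketch (step costs taken from the example above):

```javascript
// Credits wasted by full restarts: every successful upstream step is paid
// again on each debug cycle. Costs come from the example above.
const upstreamCosts = [5, 2, 0, 0]; // enrichment, profile fetch, scrape, parse

function wastedCredits(costs, debugCycles) {
  const perCycle = costs.reduce((sum, c) => sum + c, 0);
  return perCycle * debugCycles;
}

wastedCredits(upstreamCosts, 3); // 21, vs 0 when retrying from the failed step
```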

---

## Correct pattern

```
Execution fails at step 5 (AI scoring)
→ Developer reads the error
→ Fixes the prompt in the workflow config
→ Retries from step 5 — the platform reuses outputs from steps 1-4
→ Step 5 runs with the fixed prompt
→ Total wasted: 0 credits
```

Most workflow platforms expose a "retry from this step" action on failed executions. Use it every time.

---

## Test steps in isolation before wiring them

Before adding a step to a live workflow, test it standalone with representative input data:

```
# Test an AI step with real input — no execution, no credits for upstream steps
test_ai_action(
  template: "Analyze this company: {{input.company}}. Score fit 0-100.",
  responseStructure: { score: "number", reasoning: "string" },
  input: { company: { name: "Stripe", industry: "fintech", employees: 4000 } }
)

# Test a code step in the same sandbox as production
test_code_action(
  code: "return input.items.filter(i => i.score > 70)",
  input: { items: [{ name: "A", score: 85 }, { name: "B", score: 60 }] }
)
```

This catches errors before they're in a running execution. Zero credits for upstream steps.
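
The same check can be run locally before the step ever touches the platform; a sketch where `runCodeStep` is our stand-in for the sandbox, treating the step body as a function of `input`:

```javascript
// Minimal local stand-in for a code-step sandbox: wrap the step body in a
// function of `input` and run it against fixture data.
// `runCodeStep` is illustrative, not a platform API.
function runCodeStep(body, input) {
  return new Function("input", body)(input);
}

const result = runCodeStep(
  "return input.items.filter(i => i.score > 70)",
  { items: [{ name: "A", score: 85 }, { name: "B", score: 60 }] }
);
// result: [{ name: "A", score: 85 }]
```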

---

## Mock downstream steps with prior output

When you need to test a downstream step (step 6) but don't want to re-run expensive upstream steps (steps 1-5):

1. Find a prior execution where steps 1-5 succeeded
2. Copy the output of step 5 from that execution
3. Use it as mock input to `test_ai_action` or `test_code_action` for step 6

```
# Prior execution step 5 output (saved from execution abc-123):
priorOutput = {
  company: { name: "Stripe", score: 85, signals: ["YC", "series B"] }
}

# Test step 6 in isolation using that output
test_ai_action(
  template: "Based on this profile, draft a 3-sentence outreach: {{input.company}}",
  input: priorOutput
)
```

No re-enrichment. No re-fetching. No wasted credits.

---

## One execution at a time

Don't start a new execution while one is in flight for the same workflow. Reasons:
- Parallel executions on the same data produce duplicate writes
- You can't read the output of execution A while debugging it if execution B is also running
- If both fail, you now have two half-processed states to reconcile

The discipline: start → observe → retry or fix → verify. Sequential, not parallel.
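
A minimal sketch of that discipline as an in-flight guard (illustrative; on most platforms the equivalent is a per-workflow concurrency limit of 1):

```javascript
// One-execution-at-a-time guard: reject a new start while a run is in flight.
const inFlight = new Set();

function startExecution(workflowId) {
  if (inFlight.has(workflowId)) {
    return { started: false, reason: "execution already in flight" };
  }
  inFlight.add(workflowId);
  return { started: true };
}

function finishExecution(workflowId) {
  inFlight.delete(workflowId);
}

startExecution("intake"); // { started: true }
startExecution("intake"); // { started: false, reason: "execution already in flight" }
finishExecution("intake");
startExecution("intake"); // { started: true }
```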

---

## Credit cost by step type (reference)

| Step type | Typical cost | Notes |
|---|---|---|
| AI action (standard model) | 5–15 credits | Varies by model tier and output length |
| Data enrichment (LinkedIn, Hunter) | 2–5 credits | Per-record cost |
| Web scrape | 0 credits | Free |
| HTTP request | 0 credits | Free |
| Code step | 0 credits | Free |
| Knowledge graph read/write | 1 credit | Flat |
| Browser automation | 10–15 credits | Per task |

Expensive steps are AI and enrichment. These are the ones you never want to re-run unnecessarily.

---

## One-line rule

> When a step fails, fix it and retry from that step — never start a new execution; use isolated step testing to catch errors before they're in a running workflow.
@@ -0,0 +1,147 @@
# 04 — Loop patterns: iterating without N+1 or data loss

**Problem**: Loops in agentic workflows silently drop items, produce N+1 API calls, or pass incomplete results to downstream steps because the loop hasn't finished yet.

**Why it fails silently**: A loop that processes 10 items looks the same in logs as one that processes 9 — the missing item has no error, just an absence. Downstream steps that read loop output before completion get partial data with no warning.

---

## The loop completion trap

The most common loop mistake: a downstream step reads loop results before the loop has finished.

```yaml
# Wrong: downstream step starts before loop finishes
steps:
  - id: enrich-companies
    type: loop
    over: "{{input.companies}}"
    step: enrich-each

  - id: generate-report   # starts immediately, reads partial results
    type: ai-action
    input: "{{steps.enrich-companies.results}}"
```

In async execution, `generate-report` may start with 3 of 10 companies enriched. The report is incomplete. No error is raised.

---

## Anti-pattern

```yaml
# Wrong: no loop completion gate
- id: process-items
  type: loop
  over: "{{steps.fetch.items}}"
  step: process-each

- id: summarize   # may run with 0 items if loop is still in flight
  type: ai-action
  prompt: "Summarize these results: {{steps.process-items.outputs}}"
```

---

## Correct pattern

Add a `loop_completion` entry condition on every step that consumes loop output:

```yaml
- id: process-items
  type: loop
  over: "{{steps.fetch.items}}"
  step: process-each

- id: summarize
  type: ai-action
  entryConditions:
    onCriteriaFail: "wait"   # block until condition is met
    conditionText: "Wait for all processing to complete"
    criteria:
      - type: loop_completion
        stepId: process-items   # which loop to wait for
        operator: "=="
        value: true
  prompt: "Summarize these results: {{steps.process-items.outputs}}"
```

`onCriteriaFail: "wait"` blocks this step until all loop iterations finish. The step then runs once with the complete output.
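
What the gate waits for reduces to a predicate over the loop's iteration status; a sketch (the status shape here is illustrative, not a platform API):

```javascript
// The loop_completion criterion reduces to: every dispatched iteration has
// reached a terminal state. Only then is the aggregate output complete.
function loopComplete(status) {
  return status.completed + status.failed === status.total;
}

loopComplete({ total: 10, completed: 3, failed: 0 }); // false, partial data
loopComplete({ total: 10, completed: 9, failed: 1 }); // true, safe to aggregate
```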

---

## Pairing loop results back to source records

After a loop that calls an external API or runs an AI step per item, you often need to pair each result back to the original record for a KG or CRM write.

The problem: loop outputs are indexed by iteration order, not by the original record's ID.

```javascript
// Code step: pair loop outputs with source records
const sourceItems = input.sourceItems; // original array
const loopOutputs = input.loopOutputs; // same-length array of results

return sourceItems.map((item, index) => ({
  ...item,               // original fields
  ...loopOutputs[index], // enriched fields
  sourceId: item.id,     // explicit ID link
}));
```

Place this code step after the loop completion gate, before the write step.
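
Positional pairing assumes no iteration was skipped or reordered. When each output carries its source ID, a keyed join is the safer variant; a sketch (`pairById` is ours):

```javascript
// Keyed join: pair outputs to sources by ID instead of position, so a
// skipped or reordered iteration can't mis-assign results.
function pairById(sourceItems, loopOutputs) {
  const byId = new Map(loopOutputs.map((out) => [out.sourceId, out]));
  return sourceItems.map((item) => {
    const { sourceId, ...enriched } = byId.get(item.id) ?? {};
    return { ...item, ...enriched }; // missing output leaves the record unenriched
  });
}

pairById(
  [{ id: "a", name: "Acme" }, { id: "b", name: "Bolt" }],
  [{ sourceId: "b", score: 70 }] // iteration for "a" was skipped
);
// → [{ id: "a", name: "Acme" }, { id: "b", name: "Bolt", score: 70 }]
```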

---

## N+1: when to loop vs when to batch

A loop that calls an LLM or enrichment API once per item is an N+1 pattern. For 100 items: 100 API calls, 100 credit charges, 100× the latency.

**Ask: does the API support batch input?**

```yaml
# Wrong (N+1): one LLM call per item
- id: classify-each
  type: loop
  over: "{{input.emails}}"
  step:
    type: ai-action
    prompt: "Classify this email: {{currentItem.body}}"

# Correct (batch): one LLM call for all items
- id: classify-all
  type: ai-action
  prompt: |
    Classify each of these emails. Return a JSON array in the same order.
    Emails: {{input.emails}}
  responseStructure:
    classifications: "array of { id: string, category: string, priority: string }"
```

Not every step supports batching — enrichment APIs often don't. But AI steps almost always do. Default to batch for AI classification, extraction, and scoring over lists.
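
For long lists, one giant batch prompt can overflow the context window; a middle ground is chunked batching: `ceil(N / size)` calls instead of N. A sketch of the chunking helper (illustrative):

```javascript
// Chunked batching: split N items into ceil(N / size) batches, so an AI
// step makes a handful of calls instead of N, without overflowing context.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

chunk([1, 2, 3, 4, 5, 6, 7], 3); // [[1, 2, 3], [4, 5, 6], [7]]
```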

---

## Fire-and-forget anti-pattern

```yaml
# Wrong: loop dispatches child workflows with no completion tracking
- id: dispatch-scoring
  type: loop
  over: "{{input.candidates}}"
  step:
    type: call-workflow
    workflowId: score-candidate
    input: "{{currentItem}}"

- id: aggregate-scores   # starts immediately — child workflows haven't finished
  type: ai-action
  prompt: "Aggregate these scores: {{steps.dispatch-scoring.outputs}}"
```

When the loop calls child workflows, completion tracking is especially important — child workflow execution time varies. Always add a `loop_completion` gate before aggregating.

---

## One-line rule

> Always gate the step that consumes loop output on `loop_completion` with `onCriteriaFail: "wait"` — loops run asynchronously and downstream steps will read partial data without it.
@@ -0,0 +1,151 @@
# 05 — Child workflow contracts: composable workflows with typed returns

**Problem**: Monolithic workflows become unmaintainable, and child workflows called from orchestrators fail silently because their return contracts aren't defined — the calling workflow gets `undefined` for every field it tries to read.

**Why it fails silently**: A child workflow that ends with a `milestone` step instead of a `return` step completes successfully from the platform's perspective. The calling workflow receives no data and no error. Every field reference like `{{steps.call-child.score}}` resolves to empty string.

---

## The milestone vs return mistake

```yaml
# Wrong: child workflow ends with milestone
- id: score-company
  type: ai-action
  prompt: "Score this company 0-100..."
  responseStructure:
    score: "number"
    decision: "string"

- id: done   # ← milestone = terminal, no data returned to caller
  type: milestone
  name: "Complete"
```

The calling orchestrator runs `call-workflow` and gets back nothing. `{{steps.call-child.score}}` is empty. No error is raised.

---

## Anti-pattern

```yaml
# Wrong: child workflow
steps:
  - id: enrich
    ...
  - id: score
    ...
  - id: done   # milestone doesn't return data
    type: milestone

# Calling orchestrator:
  - id: call-child
    type: call-workflow
    input: { companyUrl: "{{input.url}}" }

  - id: use-result
    type: ai-action
    prompt: "Based on score {{steps.call-child.score}}..."
    # ^ always empty — milestone returned nothing
```

---

## Correct pattern

Child workflows must end with a `return` step that explicitly declares what they return:

```yaml
# Correct: child workflow
steps:
  - id: enrich
    ...
  - id: score
    type: ai-action
    responseStructure:
      score: "number 0-100"
      decision: "invest | pass | monitor"
      summary: "string"

  - id: return-results   # ← return step, not milestone
    type: return
    returnConfig:
      fields:
        - name: score     # the name the caller uses
          stepId: score   # which step produced it
          field: score    # which field from that step
        - name: decision
          stepId: score
          field: decision
        - name: summary
          stepId: score
          field: summary

# Calling orchestrator:
  - id: call-child
    type: call-workflow
    input: { companyUrl: "{{input.url}}" }

  - id: use-result
    type: ai-action
    prompt: "Based on score {{steps.call-child.score}}, decision: {{steps.call-child.decision}}..."
    # ^ now populated correctly
```

---

## Designing a return contract

A return contract is the interface between the child workflow and its callers. Treat it like a function signature:

1. **Be explicit**: list every field the caller might need — don't assume they'll dig into nested objects
2. **Use flat field names**: `score` not `scoringCard.total_score` — callers reference these as template variables
3. **Match names to caller expectations**: if the orchestrator uses `{{steps.call-child.decision}}`, the return contract must export a field named `decision`
4. **Version changes carefully**: adding fields is safe; renaming or removing fields breaks every orchestrator that calls this child

```yaml
# Comprehensive return contract
returnConfig:
  fields:
    - { name: score, stepId: score-step, field: total_score }
    - { name: decision, stepId: score-step, field: decision }
    - { name: summary, stepId: score-step, field: executive_summary }
    - { name: teamEvaluations, stepId: eval-team, field: evaluations }
    - { name: rawData, stepId: enrich, field: companyProfile }
```
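
The contract can also be checked mechanically on the caller side before any field is read; a sketch following the `returnConfig` shape above (the validator itself is ours, not a platform API):

```javascript
// Caller-side guard: verify a child workflow's return payload carries every
// field the contract declares, instead of silently reading undefined.
function validateReturn(contractFields, payload) {
  const missing = contractFields
    .map((f) => f.name)
    .filter((name) => payload[name] === undefined);
  return { ok: missing.length === 0, missing };
}

const contract = [
  { name: "score", stepId: "score-step", field: "total_score" },
  { name: "decision", stepId: "score-step", field: "decision" },
];

validateReturn(contract, { score: 85, decision: "invest" }); // { ok: true, missing: [] }
validateReturn(contract, {}); // { ok: false, missing: ["score", "decision"] }
```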

---

## The god-workflow anti-pattern

A single workflow with 25+ steps that handles intake, enrichment, scoring, routing, outreach, and CRM sync:

```
trigger → fetch → enrich → score → route → draft-email → approve → send →
update-crm → update-kg → notify-slack → generate-report → archive → done
```

Problems:
- A failure in step 12 requires re-running steps 1-11
- You can't reuse the scoring logic in another context
- Testing requires running the entire pipeline end-to-end
- A single developer change can break the entire flow

**Break it into composable child workflows:**

```
Orchestrator:
  trigger → call: enrich-workflow → call: score-workflow → call: route-workflow → done

enrich-workflow: fetch → enrich → return { profile }
score-workflow:  receive profile → score → return { score, decision }
route-workflow:  receive decision → route → draft → approve → send → return { sent }
```

Each child workflow can be tested independently, retried independently, and reused by other orchestrators.

---

## One-line rule

> Child workflows must end with a `return` step (not `milestone`) with an explicit `returnConfig.fields` list — milestone completes silently with no data returned to the caller.