@kairos-sdk/core 0.3.2 → 0.4.5

This diff compares the contents of two publicly released versions of the package as they appear in their public registries. It is provided for informational purposes only.
package/README.md CHANGED
@@ -8,7 +8,7 @@
8
8
 
9
9
  ![Kairos SDK Demo](demo.gif)
10
10
 
11
- Kairos turns plain-English workflow descriptions into validated, deployable n8n workflow JSON. Use it as an **MCP server** (connect to Claude Code, Claude Desktop, or any MCP host — your LLM generates, Kairos validates and deploys, no Anthropic API key needed) or as a **TypeScript SDK** for programmatic control (calls Claude internally with a specialized prompt). Either way, workflows pass through a **23-rule structural validator** with automatic correction, and a local workflow library with **hybrid retrieval** (TF-IDF + node fingerprinting + outcome history + cluster reranking) injects past failure patterns into future generations. With a seeded template library, Kairos achieves **100% first-try structural validation pass rate** across 20 benchmark prompts (meaning the generated JSON is structurally valid on the first attempt — runtime behavior depends on your credentials and node configuration).
11
+ Kairos turns plain-English workflow descriptions into validated, deployable n8n workflow JSON. Use it as an **MCP server** (connect to Claude Code, Claude Desktop, or any MCP host — your LLM generates, Kairos validates and deploys, no Anthropic API key needed) or as a **TypeScript SDK** for programmatic control (calls Claude internally with a specialized prompt). Either way, workflows pass through a **26-rule structural validator** with automatic correction, and a local workflow library with **hybrid retrieval** (TF-IDF + node fingerprinting + outcome history + cluster reranking) injects past failure patterns into future generations. With a seeded template library, Kairos achieves **100% first-try structural validation pass rate** across 20 benchmark prompts (meaning the generated JSON is structurally valid on the first attempt — runtime behavior depends on your credentials and node configuration).
12
12
 
13
13
  ```ts
14
14
  import { Kairos } from '@kairos-sdk/core'
@@ -27,11 +27,21 @@ console.log(result.workflowId) // deployed workflow ID
27
27
  console.log(result.credentialsNeeded) // what the user still needs to configure
28
28
  ```
29
29
 
30
+ ### What Kairos does and does not do
31
+
32
+ | Kairos does | Kairos does not guarantee (yet) |
33
+ |---|---|
34
+ | Generates valid n8n workflow JSON | Perfect business logic |
35
+ | Validates structure before deploy (26 rules) | Correct credentials |
36
+ | Syncs node types from your live instance | Runtime success for every API |
37
+ | Learns from prior successful builds | That every workflow matches intent perfectly |
38
+ | Works through MCP, SDK, or CLI | Full replacement for human review |
39
+
30
40
  ---
31
41
 
32
42
  ## Use as MCP Server (no code required)
33
43
 
34
- Connect Kairos to any MCP-compatible host — Claude Code, Claude Desktop, ChatGPT, Cursor, or any agent that supports the Model Context Protocol. Your host LLM generates the workflow using Kairos's specialized context, then Kairos validates and deploys it. No Anthropic API key needed — no double-LLM calls, no wasted tokens. Kairos auto-syncs your n8n instance's node types so the catalog always matches your exact setup.
44
+ Connect Kairos to any MCP-compatible host — Claude Code, Claude Desktop, Cursor, or any agent that supports the Model Context Protocol. Your host LLM generates the workflow using Kairos's specialized context, then Kairos validates and deploys it. No Anthropic API key needed — no double-LLM calls, no wasted tokens. Kairos auto-syncs your n8n instance's node types so the catalog always matches your exact setup.
35
45
 
36
46
  ### Setup
37
47
 
@@ -85,7 +95,7 @@ The MCP server does **not** call an LLM internally. Instead, it gives your host
85
95
 
86
96
  1. **Host LLM calls `kairos_prompt`** — gets the n8n system prompt, node catalog, library matches, and failure patterns
87
97
  2. **Host LLM generates the workflow JSON** using that context (no separate API call)
88
- 3. **Host LLM calls `kairos_validate`** — checks the JSON against 23 structural rules
98
+ 3. **Host LLM calls `kairos_validate`** — checks the JSON against 26 structural rules
89
99
  4. If invalid, the host LLM fixes the issues and validates again
90
100
  5. **Host LLM calls `kairos_deploy`** — sends the validated workflow to n8n
91
101
 
@@ -98,7 +108,7 @@ This means Kairos works with **any LLM** — Claude, GPT, Gemini, Llama, or anyt
98
108
  | Tool | Description |
99
109
  |------|-------------|
100
110
  | `kairos_prompt` | Returns the specialized system prompt, node catalog, library matches, and failure patterns for a given description |
101
- | `kairos_validate` | Validates workflow JSON against 23 structural rules — returns errors and warnings |
111
+ | `kairos_validate` | Validates workflow JSON against 26 structural rules — returns errors and warnings |
102
112
  | `kairos_search` | Searches the local workflow library for similar past builds |
103
113
  | `kairos_sync` | Manually refresh the node catalog from your n8n instance (auto-runs on first `kairos_prompt` call) |
104
114
 
@@ -170,7 +180,7 @@ console.log(deployed.workflowId) // now live in n8n
170
180
 
171
181
  ## Benchmark Results
172
182
 
173
- Tested against 20 workflow prompts of varying complexity (simple triggers, multi-step conditional logic, AI agents with memory). Results measure **structural validation pass rate** — whether the generated workflow passes all 23 validator rules, not end-to-end execution correctness.
183
+ Tested against 20 workflow prompts of varying complexity (simple triggers, multi-step conditional logic, AI agents with memory). Results measure **structural validation pass rate** — whether the generated workflow passes all 26 validator rules, not end-to-end execution correctness.
174
184
 
175
185
  ### Before vs After: Template-Seeded Library
176
186
 
@@ -182,7 +192,7 @@ Tested against 20 workflow prompts of varying complexity (simple triggers, multi
182
192
  | Avg generation time | 30.6s | **20.7s** | -32% |
183
193
  | Failures | 0 | 0 | — |
184
194
 
185
- The baseline run used Claude with the 22-rule validator and correction loop but no library. The seeded run used the same validator plus a library of 105 workflows (16 organic + 89 ingested from the n8n community). Template seeding eliminated the correction loop entirely and cut generation time by a third.
195
+ The baseline run used Claude with the 26-rule validator and correction loop but no library. The seeded run used the same validator plus a library of 105 workflows (16 organic + 89 ingested from the n8n community). The broader local development library now contains 286+ generated/ingested workflows. Template seeding eliminated the correction loop entirely and cut generation time by a third.
186
196
 
187
197
  > **Note:** These results confirm that generated workflows are structurally valid and deployable to n8n. They do not verify runtime execution correctness, credential configuration, or whether the workflow output matches user intent.
188
198
 
@@ -195,7 +205,7 @@ The baseline run used Claude with the 22-rule validator and correction loop but
195
205
  1. **Search** — Kairos searches its local workflow library for similar past builds. Matching workflows and their failure patterns are pulled into context.
196
206
  2. **Warn** — Known failure patterns (from library matches and global telemetry rates) are injected into the system prompt so Claude avoids repeating known mistakes.
197
207
  3. **Generate** — Your description is sent to Claude with a detailed system prompt, forcing a `generate_workflow` tool call that produces structured n8n workflow JSON.
198
- 4. **Validate** — The workflow is checked against **23 structural rules** covering node IDs, types, versions, names, positions, connections, forbidden fields, trigger presence, AI connection direction, cycle detection, webhook pairing, and required parameters.
208
+ 4. **Validate** — The workflow is checked against **26 structural rules** covering node IDs, types, versions, names, positions, connections, forbidden fields, trigger presence, AI connection direction, cycle detection, webhook pairing, and required parameters.
199
209
  5. **Correct** — If validation fails, the specific rule violations are sent back to Claude for correction (up to 3 attempts, with tighter temperature on the final try).
200
210
  6. **Strip** — Forbidden server-assigned fields (`id`, `createdAt`, `updatedAt`, etc.) are stripped before deployment.
201
211
  7. **Deploy** — The validated workflow is posted to your n8n instance via REST API.
@@ -205,7 +215,7 @@ The baseline run used Claude with the 22-rule validator and correction loop but
205
215
 
206
216
  1. **Prompt** — Your LLM calls `kairos_prompt`, which searches the library and returns the specialized system prompt, node catalog, library matches, and failure patterns.
207
217
  2. **Generate** — Your LLM generates the workflow JSON itself using that context. No separate API call.
208
- 3. **Validate** — Your LLM calls `kairos_validate`, which checks the JSON against the same 23 structural rules.
218
+ 3. **Validate** — Your LLM calls `kairos_validate`, which checks the JSON against the same 26 structural rules.
209
219
  4. **Correct** — If validation fails, your LLM fixes the issues and calls `kairos_validate` again.
210
220
  5. **Deploy** — Your LLM calls `kairos_deploy`, which strips forbidden fields and posts the workflow to n8n.
211
221
  6. **Record** — The deployed workflow is saved to the local library for future retrieval.
@@ -214,7 +224,7 @@ The baseline run used Claude with the 22-rule validator and correction loop but
214
224
 
215
225
  ## Validator Rules
216
226
 
217
- The 22-rule validator is the core of what makes Kairos reliable. In baseline testing (no library), Claude needed the correction loop 45% of the time. Each rule targets a specific class of error:
227
+ The 26-rule validator is the core of what makes Kairos reliable. In baseline testing (no library), Claude needed the correction loop 45% of the time. Each rule targets a specific class of error:
218
228
 
219
229
  | Rule | Severity | What it checks |
220
230
  |------|----------|----------------|
@@ -241,6 +251,9 @@ The 22-rule validator is the core of what makes Kairos reliable. In baseline tes
241
251
  | 21 | warn | Webhook with responseMode="responseNode" has respondToWebhook |
242
252
  | 22 | warn | Required parameters present for known node types |
243
253
  | 23 | warn | Node type is recognized in the registry (unknown types may not exist in n8n) |
254
+ | 24 | warn | No deprecated `$node["..."]` accessor syntax in expressions |
255
+ | 25 | warn | No `$json.items[n]` array access (n8n flattens items automatically) |
256
+ | 26 | warn | Node references use `.first()` or `.all()` (bare `$('Node').json` throws at runtime) |
244
257
 
245
258
  Errors block deployment. Warnings are recorded and fed back into the prompt for future builds.
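
Rules 24 to 26 in the table above all target n8n expression syntax. The diff does not show how the validator implements them, so the following is only a minimal sketch of how such checks could be written with regular expressions. The function name `checkExpression`, the `ExpressionIssue` shape, and the patterns themselves are assumptions made for illustration, not the package's API.

```ts
// Hypothetical issue shape; the real validator's types are not shown in this diff.
interface ExpressionIssue {
  rule: 24 | 25 | 26
  message: string
}

// Rule 24: deprecated $node["..."] accessor.
const DEPRECATED_NODE_ACCESSOR = /\$node\[/
// Rule 25: indexed access such as $json.items[0].
const JSON_ITEMS_INDEX = /\$json\.items\[\d+\]/
// Rule 26: $('Node').json with no .first() / .all() in between.
const BARE_NODE_JSON = /\$\(\s*(['"])[^'"]*\1\s*\)\.json\b/

// Returns one issue per rule that the expression string appears to violate.
function checkExpression(expr: string): ExpressionIssue[] {
  const issues: ExpressionIssue[] = []
  if (DEPRECATED_NODE_ACCESSOR.test(expr)) {
    issues.push({ rule: 24, message: 'Deprecated $node["..."] accessor' })
  }
  if (JSON_ITEMS_INDEX.test(expr)) {
    issues.push({ rule: 25, message: '$json.items[n] array access' })
  }
  if (BARE_NODE_JSON.test(expr)) {
    issues.push({ rule: 26, message: "Bare $('Node').json without .first() or .all()" })
  }
  return issues
}
```

Under these assumed patterns, `$('Webhook').first().json.body` produces no issues, while `$('Webhook').json.body` is reported under rule 26. Regex matching is a deliberate simplification: it can misfire on string literals that merely contain these sequences, which a parser-based check would avoid.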
246
259
 
@@ -359,6 +372,9 @@ try {
359
372
  for (const issue of err.issues) {
360
373
  console.error(`[Rule ${issue.rule}] ${issue.message}`)
361
374
  }
375
+ // Attempt metadata and warned rules are also available
376
+ console.log(err.attemptMetadata) // per-attempt timing, tokens, issues
377
+ console.log(err.warnedRules) // which pattern rules were warned about
362
378
  } else if (err instanceof GenerationError) {
363
379
  // Anthropic API call failed (auth, quota, timeout)
364
380
  console.error(err.message, err.cause)
@@ -376,7 +392,7 @@ try {
376
392
  |---|---|
377
393
  | `GenerationError` | Anthropic API call failed |
378
394
  | `ResponseParseError` | Claude responded but produced no usable tool call |
379
- | `ValidationError` | Workflow failed 22-rule validation after max retries |
395
+ | `ValidationError` | Workflow failed 26-rule validation after max retries (carries `.attemptMetadata` and `.warnedRules`) |
380
396
  | `ProviderError` | Network/auth failure talking to n8n |
381
397
  | `ApiError` | n8n returned a 4xx or 5xx (carries `.statusCode`) |
382
398
  | `GuardError` | Input validation failed (empty description) or `delete()` called without `{ confirm: true }` |
@@ -397,6 +413,10 @@ kairos build "Monitor a webhook and log payloads" --dry-run
397
413
  # Seed library with n8n community templates
398
414
  kairos sync-templates --max 200
399
415
 
416
+ # View pattern analysis
417
+ kairos patterns
418
+ kairos patterns --days 60 --json
419
+
400
420
  # Manage workflows
401
421
  kairos list
402
422
  kairos get <workflow-id>
@@ -438,7 +458,22 @@ telemetry: '/path/to/telemetry/dir'
438
458
 
439
459
  Each event includes timestamp, session ID, token counts, validation issues, and duration — useful for benchmarking and analyzing the correction loop.
440
460
 
441
- Kairos also reads telemetry data to compute **per-rule failure rates** across all builds. Rules that fail frequently (>= 15% of builds) are automatically surfaced as warnings in the generation prompt, helping Claude avoid systemic issues. Failure rates use distinct session counting to avoid inflation from retry loops, and results are cached for 5 minutes.
461
+ ### Pattern Learning
462
+
463
+ When telemetry is enabled, Kairos runs a **pattern analyzer** that learns from every build — successes and failures. The analyzer produces `patterns.json` which is fed back into future generations:
464
+
465
+ - **Composite scoring** — patterns are scored using `rawConfidence × impact × recency × (1 + stickinessBoost)`, so frequent, recent, sticky failures rank highest
466
+ - **Stickiness detection** — rules that persist across consecutive failed retry attempts (the LLM can't self-correct) get a scoring boost
467
+ - **State lifecycle** — patterns progress through `draft → confirmed → resolved`, with per-rule resolved thresholds (5 clean builds) and 90-day TTL on resolved patterns
468
+ - **Regression detection** — if a resolved rule starts failing again, it's flagged as regressed and prioritized in the prompt
469
+ - **Warning effectiveness** — tracks whether warning the LLM about a rule actually prevented the failure, with per-rule pass/fail rates
470
+ - **Schema migration** — pattern data auto-migrates across versions (currently v2) so no accumulated knowledge is lost on upgrades
471
+ - **Rule co-occurrence** — identifies pairs of rules that commonly fail together (e.g., rules 5+17 always break at the same time)
472
+ - **Session depth analysis** — tracks how many attempts each session needed (e.g., 80% are 1-attempt, 15% need 2, 5% need all 3)
473
+ - **Warning cap** — max 10 patterns in the LLM prompt, prioritized: regressed > confirmed > drafts
474
+ - **Analysis history** — each analysis run appends a summary to `pattern-history.jsonl` for trend tracking over time
475
+
476
+ Run `kairos patterns` to view the current analysis, or `kairos patterns --json` for raw output.
442
477
 
443
478
  For CLI usage, set `KAIROS_TELEMETRY=true` in your environment.
444
479
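
The Pattern Learning hunk states a composite score of `rawConfidence × impact × recency × (1 + stickinessBoost)` and a cap of 10 patterns prioritized as regressed > confirmed > drafts. The sketch below shows one way those two statements could fit together. The `Pattern` shape, the value ranges, treating `regressed` as its own state, and excluding resolved patterns from the prompt are all assumptions; the actual `patterns.json` schema is not shown in this diff.

```ts
// Hypothetical pattern record; field names follow the formula in the README,
// everything else is assumed for illustration.
type PatternState = 'draft' | 'confirmed' | 'resolved' | 'regressed'

interface Pattern {
  rule: number
  state: PatternState
  rawConfidence: number   // assumed range 0..1
  impact: number          // assumed range 0..1
  recency: number         // assumed range 0..1, where 1 means seen very recently
  stickinessBoost: number // assumed >= 0; 0 means the rule never persisted across retries
}

// Composite score exactly as the README states it.
function compositeScore(p: Pattern): number {
  return p.rawConfidence * p.impact * p.recency * (1 + p.stickinessBoost)
}

// Lower number = higher priority: regressed > confirmed > drafts.
const STATE_PRIORITY: Record<PatternState, number> = {
  regressed: 0,
  confirmed: 1,
  draft: 2,
  resolved: 3,
}

// Picks the patterns to warn about: resolved ones are dropped (an assumption),
// the rest are ordered by state priority, then by descending score, then capped.
function selectWarnings(patterns: Pattern[], cap = 10): Pattern[] {
  return patterns
    .filter((p) => p.state !== 'resolved')
    .sort(
      (a, b) =>
        STATE_PRIORITY[a.state] - STATE_PRIORITY[b.state] ||
        compositeScore(b) - compositeScore(a),
    )
    .slice(0, cap)
}
```

With this ordering, a regressed pattern with a low score still outranks a draft with a high score, which matches the stated priority. Whether the real analyzer breaks ties within a state by score is not confirmed by the diff.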