@kindlm/core 0.2.0 → 0.2.1

Files changed (2)
  1. package/README.md +108 -0
  2. package/package.json +3 -2
package/README.md ADDED
@@ -0,0 +1,108 @@
+ # KindLM
+
+ ![CI](https://github.com/petr-kin/kindlm/actions/workflows/ci.yml/badge.svg)
+
+ Behavioral regression testing for AI agents. Test what your agents **do** — not just what they say.
+
+ ## Why KindLM?
+
+ LLM evals measure text quality. KindLM tests **behavior** — the tool calls your agent makes, the decisions it takes, and whether it leaks PII or violates compliance rules. It runs in CI so regressions never ship.
+
+ ## Features
+
+ - **Tool call assertions** — verify agents call the right tools with the right arguments, in the right order
+ - **Schema validation** — structured output checked against JSON Schema (AJV)
+ - **PII detection** — catch leaked SSNs, credit cards, emails, phone numbers, IBANs
+ - **LLM-as-judge** — score responses against natural-language criteria (0.0–1.0)
+ - **Drift detection** — semantic + field-level comparison against saved baselines
+ - **Keyword guards** — require or forbid specific phrases in output
+ - **Latency & cost budgets** — fail tests that exceed time or token-cost thresholds
+ - **EU AI Act compliance** — generate Annex IV documentation from test results
+ - **CI-native** — exit code 0/1, JUnit XML reporter, GitHub Actions ready
+
+ ## Supported Providers
+
+ | Provider | Example config |
+ |----------|---------------|
+ | OpenAI | `openai:gpt-4o` |
+ | Anthropic | `anthropic:claude-sonnet-4-5-20250929` |
+ | Ollama | `ollama:llama3` |
+ | Google Gemini | `google:gemini-2.0-flash` |
+ | AWS Bedrock | `bedrock:anthropic.claude-sonnet-4-5-20250929-v1:0` |
+ | Azure OpenAI | `azure:my-gpt4o-deployment` |
+
+ ## Quick Start
+
+ ```bash
+ npm install -g @kindlm/cli
+ kindlm init
+ ```
+
+ Edit the generated `kindlm.yaml`:
+
+ ```yaml
+ version: "1"
+ defaults:
+   provider: openai:gpt-4o
+   temperature: 0
+   runs: 3
+
+ suites:
+   - name: refund-agent
+     system_prompt: "You are a refund support agent."
+     tests:
+       - name: looks-up-order
+         input: "I want to return order #12345"
+         assert:
+           - type: tool_called
+             value: lookup_order
+           - type: no_pii
+           - type: judge
+             criteria: "Response is empathetic and professional"
+             threshold: 0.8
+ ```
+
+ Run your tests:
+
+ ```bash
+ kindlm test
+ ```
+
+ Output:
+
+ ```
+ refund-agent
+   ✓ looks-up-order (3/3 runs passed)
+     ✓ tool_called: lookup_order
+     ✓ no_pii
+     ✓ judge: 0.92 ≥ 0.8
+
+ 1 suite, 1 test, 3 assertions — all passed
+ ```
+
+ ## CI Integration
+
+ ```yaml
+ # .github/workflows/test.yml
+ - run: npm install -g @kindlm/cli
+ - run: kindlm test --reporter junit --output results.xml
+ ```
+
+ ## Repository Layout
+
+ ```
+ packages/
+   core/     @kindlm/core — Business logic, zero I/O dependencies
+   cli/      @kindlm/cli — CLI entry point
+   cloud/    @kindlm/cloud — Cloudflare Workers API + D1 database
+ docs/       Technical specs and documentation
+ site/       Documentation website (Next.js)
+ ```
+
+ ## Documentation
+
+ Full docs: [kindlm.dev](https://kindlm.dev) | Source: [`docs/`](./docs/)
+
+ ## License
+
+ MIT (core + CLI) | AGPL (cloud)
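
Reviewer note on the `tool_called` assertion used in the README's Quick Start: this diff adds only documentation, so the implementation is not visible here. The idea the README describes (right tool, right arguments, right order) can be sketched roughly as below. Every name in this sketch (`ToolCall`, `AssertionResult`, `assertToolCalled`, `assertToolOrder`, and the `issue_refund` tool) is hypothetical and is not taken from the @kindlm/core API.

```typescript
// Hypothetical sketch only; none of these names come from @kindlm/core.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

interface AssertionResult {
  passed: boolean;
  message: string;
}

// Passes when at least one recorded call has the given tool name and,
// if expectedArgs is supplied, matches every expected key/value pair.
function assertToolCalled(
  calls: ToolCall[],
  tool: string,
  expectedArgs: Record<string, unknown> = {},
): AssertionResult {
  const match = calls.find(
    (c) =>
      c.name === tool &&
      Object.entries(expectedArgs).every(
        ([key, value]) => JSON.stringify(c.args[key]) === JSON.stringify(value),
      ),
  );
  return match
    ? { passed: true, message: `tool_called: ${tool}` }
    : { passed: false, message: `expected a call to ${tool}` };
}

// Passes when the expected tool names occur in the given order
// (as a subsequence; other calls may appear in between).
function assertToolOrder(calls: ToolCall[], expected: string[]): AssertionResult {
  let next = 0;
  for (const c of calls) {
    if (next < expected.length && c.name === expected[next]) next++;
  }
  const passed = next === expected.length;
  return {
    passed,
    message: passed ? "tool order ok" : `missing ${expected[next]} in expected order`,
  };
}

// Example trace an agent might produce for "I want to return order #12345".
const calls: ToolCall[] = [
  { name: "lookup_order", args: { order_id: "12345" } },
  { name: "issue_refund", args: { order_id: "12345", amount: 20 } }, // hypothetical tool
];

console.log(assertToolCalled(calls, "lookup_order", { order_id: "12345" }));
console.log(assertToolOrder(calls, ["lookup_order", "issue_refund"]));
```

Treating order as a subsequence match is one possible design choice; a stricter check could require the calls to be adjacent or to be the only calls made. How KindLM itself defines "in the right order" is not stated in this diff.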
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@kindlm/core",
-   "version": "0.2.0",
+   "version": "0.2.1",
    "type": "module",
    "license": "MIT",
    "description": "Core engine for KindLM — behavioral regression testing for AI agents",
@@ -36,7 +36,8 @@
    "types": "./dist/index.d.ts",
    "sideEffects": false,
    "files": [
-     "dist"
+     "dist",
+     "README.md"
    ],
    "scripts": {
      "build": "tsup",