npm - @princetheprogrammerbtw/husk - Versions diffs - 0.1.1 → 0.2.0 - Mend

@princetheprogrammerbtw/husk 0.1.1 → 0.2.0

Files changed (5)

package/README.md CHANGED

@@ -3,13 +3,16 @@
 > The agent harness that gives your LLM memory, hands, and a nervous system.
 [![npm version](https://img.shields.io/npm/v/%40princetheprogrammerbtw%2Fhusk.svg)](https://www.npmjs.com/package/@princetheprogrammerbtw/husk)
+[![npm downloads](https://img.shields.io/npm/dm/%40princetheprogrammerbtw%2Fhusk.svg)](https://www.npmjs.com/package/@princetheprogrammerbtw/husk)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 [![Node](https://img.shields.io/node/v/%40princetheprogrammerbtw%2Fhusk.svg)](https://nodejs.org)
-[![CI](https://github.com/10xdev4u-alt/husk/actions/workflows/ci.yml/badge.svg)](./.github/workflows/ci.yml)
+[![CI](https://github.com/10xdev4u-alt/husk/actions/workflows/ci.yml/badge.svg)](https://github.com/10xdev4u-alt/husk/actions/workflows/ci.yml)
+[![GitHub stars](https://img.shields.io/github/stars/10xdev4u-alt/husk.svg)](https://github.com/10xdev4u-alt/husk/stargazers)
+[![Bundle size](https://img.shields.io/bundlephobia/minzip/%40princetheprogrammerbtw%2Fhusk.svg)](https://bundlephobia.com/package/@princetheprogrammerbtw/husk)
 ## What is Husk?
-Most LLM calls are a brain in a jar — they can think, but can't act, remember, verify their own work, or show you what they did. **Husk** is the body, hands, memory, and nervous system you wrap around any LLM (Claude, GPT, Gemini, local models) to turn it into a real agent.
+Most LLM calls are a **brain in a jar** — they can think, but can't act, remember, verify their own work, or show you what they did. **Husk** is the body, hands, memory, and nervous system you wrap around any LLM (Claude, GPT, Gemini, local models) to turn it into a real agent.
 ```ts
 import { Agent, AnthropicProvider, Read, Write, Edit, Bash, Grep, FileStore } from '@princetheprogrammerbtw/husk';
@@ -31,16 +34,27 @@ const result = await agent.run('Review src/core/agent.ts');
 console.log(result.output);
 ```
+## Why Husk?
+| You're used to… | Husk gives you… |
+|---|---|
+| One-shot LLM calls with no memory | Persistent file-backed or in-memory memory across calls |
+| Hand-rolled tool-calling loops | A small, typed event stream you can subscribe to |
+| Tied to one provider's SDK | Provider-agnostic core; swap Anthropic ↔ OpenAI in one line |
+| Reinventing agent loops in every project | Drop-in `Agent` class with stop conditions, parallel tool execution, and error recovery |
+| No observability into what the model actually did | Typed events for every iteration, tool call, and provider response |
 ## Features
 - 🧠 **Provider-agnostic** — Anthropic, OpenAI, more coming. Bring your own model.
-- 🛠️ **5 built-in tools** — `Read`, `Write`, `Edit`, `Bash` (with safety denylist), `Grep` (ripgrep with grep fallback)
+- 🛠️ **5 built-in tools** — `Read`, `Write`, `Edit`, `Bash` (with safety denylist for `rm -rf /`, fork bombs, etc.), `Grep` (ripgrep with grep fallback)
 - 💾 **Memory** — `InMemoryStore` for sessions, `FileStore` for persistence
 - 👀 **Observability** — typed event emitter, drop in any logger or tracer
 - 🧭 **Steering** — system prompts, numbered rules, few-shot examples
 - 🤝 **Sub-agents** — compose agents inside agents (see [multi-agent example](./examples/03-multi-agent))
-- 📦 **Batteries included** — 35KB ESM bundle, full TypeScript types
+- 📦 **Batteries included** — 35KB ESM bundle, 26KB d.ts, zero runtime deps except the provider SDKs
 - 🖥️ **CLI** — `husk run "<prompt>"` for one-shot invocations
+- 🔒 **Type-safe** — strict TypeScript, no `any`, full type definitions shipped
 ## Install
@@ -50,6 +64,8 @@ npm install @princetheprogrammerbtw/husk
 pnpm add @princetheprogrammerbtw/husk
 # or
 bun add @princetheprogrammerbtw/husk
+# or
+yarn add @princetheprogrammerbtw/husk
 ```
 You'll also need an API key for the provider you choose:
@@ -61,7 +77,7 @@ export OPENAI_API_KEY=sk-...           # for GPT
 ## Quickstart
-The smallest possible agent:
+The smallest possible agent — model, prompt, done:
 ```ts
 import { Agent, AnthropicProvider } from '@princetheprogrammerbtw/husk';
@@ -74,6 +90,41 @@ const result = await agent.run('What is the capital of France? Answer in one sen
 console.log(result.output); // "Paris"
 ```
+A more realistic agent — with tools, memory, and steering:
+```ts
+import {
+  Agent, AnthropicProvider, Read, Write, Edit, Bash, Grep,
+  FileStore, InMemoryStore,
+} from '@princetheprogrammerbtw/husk';
+const agent = new Agent({
+  model: new AnthropicProvider({ apiKey: process.env.ANTHROPIC_API_KEY }),
+  tools: [Read, Write, Edit, Bash, Grep],
+  memory: new FileStore({ path: './.husk/memory' }),
+  steering: {
+    systemPrompt: 'You are a careful code reviewer.',
+    rules: [
+      'Read the file in full before commenting.',
+      'Cite specific line numbers for every finding.',
+    ],
+  },
+});
+const result = await agent.run('Review src/core/agent.ts');
+```
+Swapping to OpenAI is a one-line change:
+```ts
+import { OpenAIProvider } from '@princetheprogrammerbtw/husk';
+const agent = new Agent({
+  model: new OpenAIProvider({ apiKey: process.env.OPENAI_API_KEY }),
+  // ...same config otherwise
+});
+```
 ## CLI
 ```bash
@@ -84,9 +135,11 @@ husk run "Summarize README.md" --provider openai --model gpt-5
 husk run --help
 ```
+The CLI wraps the same `Agent` class — flags map directly to `AgentConfig` fields.
 ## Examples
-Three worked examples in the `examples/` directory:
+Three worked examples in the [`examples/`](./examples) directory:
 - **[01-hello-agent](./examples/01-hello-agent)** — minimal agent, no tools
 - **[02-code-reviewer](./examples/02-code-reviewer)** — full tool set + steering for code review
@@ -96,9 +149,10 @@ Run any example with `bun run examples/0X-name/index.ts`.
 ## Documentation
-- **[Learning Journal](./LEARNING.md)** — design decisions, trade-offs, and lessons learned
-- **[Changelog](./CHANGELOG.md)** — release history
-- **[Contributing](./CONTRIBUTING.md)** — how to contribute
+- 📓 **[Learning Journal](./LEARNING.md)** — design decisions, trade-offs, and lessons learned while building
+- 📋 **[Changelog](./CHANGELOG.md)** — release history
+- 🤝 **[Contributing](./CONTRIBUTING.md)** — how to contribute
+- 🏗️ **[Architecture](#architecture)** — the module layout, below
 ## Architecture
@@ -111,15 +165,26 @@ src/
 └── index.ts       # public API surface
 ```
-Every piece composes through a typed event stream. The agent loop is ~150 lines. Provider adapters are the only files that know about provider-specific wire formats.
+Every piece composes through a **typed event stream**. The agent loop is ~150 lines. Provider adapters are the only files that know about provider-specific wire formats. Tools are plain objects implementing a 4-field interface — register by passing an array to the Agent.
 ## Roadmap
 - **v0.1.0** ✅ Core loop, Anthropic + OpenAI, 5 built-in tools, memory, observability, CLI
+- **v0.1.1** ✅ CLI shebang fix, version bump
 - **v0.2.0** Eval runner, OTel export, Ollama adapter
 - **v0.3.0** Vector memory, hosted dashboard
 - **v1.0.0** Stable API, marketplace, enterprise features
+## Contributing
+PRs welcome! See [CONTRIBUTING.md](./CONTRIBUTING.md) for the dev setup, scripts, and commit conventions.
+The project follows Conventional Commits. Every commit body explains *why*, not what — the diff already shows what.
+## Show your support
+If Husk saves you time, ⭐️ the [GitHub repo](https://github.com/10xdev4u-alt/husk) — it helps others find the project. Issues, PRs, and feedback all welcome.
 ## License
-MIT © 2026 princetheprogrammerbtw
+MIT © 2026 [princetheprogrammerbtw](https://github.com/10xdev4u-alt)
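
The README's `steering` config pairs a `systemPrompt` with a list of `rules`, and the feature list calls them "numbered rules". The sketch below shows one plausible way such rules could be folded into a single prompt string. `SteeringSketch` and `composeSystemPrompt` are hypothetical names for illustration; the package exports its own `buildSystemPrompt`, whose signature and output format are not shown in this diff and may differ.

```typescript
// Shape borrowed from the README's `steering` example; only these two
// fields are assumed here.
interface SteeringSketch {
  readonly systemPrompt: string;
  readonly rules?: readonly string[];
}

// Hypothetical helper: append the rules as a numbered list under the
// base prompt. With no rules, the base prompt is returned unchanged.
function composeSystemPrompt(steering: SteeringSketch): string {
  const rules = steering.rules ?? [];
  if (rules.length === 0) return steering.systemPrompt;
  const numbered = rules.map((rule, i) => `${i + 1}. ${rule}`).join('\n');
  return `${steering.systemPrompt}\n\nRules:\n${numbered}`;
}

const prompt = composeSystemPrompt({
  systemPrompt: 'You are a careful code reviewer.',
  rules: [
    'Read the file in full before commenting.',
    'Cite specific line numbers for every finding.',
  ],
});
console.log(prompt);
```

Numbering the rules gives the model (and a human reading traces) a stable way to refer to "rule 2" when explaining a decision.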

package/dist/index.d.ts CHANGED

@@ -525,6 +525,48 @@ declare class OpenAIProvider implements Provider {
     chat(request: ChatRequest): Promise<ChatResponse>;
 }
+/**
+ * Husk — Ollama provider adapter.
+ *
+ * Wraps Ollama's OpenAI-compatible Chat Completions API. Because Ollama
+ * exposes the exact same wire format as OpenAI, we can reuse the OpenAI
+ * adapter internally — only the default model name, base URL, and the
+ * provider 'name' field differ.
+ *
+ * Why this exists: local models (llama3.2, deepseek-r1, qwen2.5, etc.)
+ * are a first-class use case. Privacy, cost, and offline-ability all
+ * matter. Ollama is the dominant local-model runtime and uses the
+ * OpenAI API surface, so the adapter is a thin shell.
+ *
+ * Defaults:
+ *   - model: 'llama3.2' (override via constructor)
+ *   - baseURL: 'http://localhost:11434/v1' (override for remote Ollama)
+ *   - apiKey: 'ollama' (Ollama ignores the value but the OpenAI SDK
+ *     requires a non-empty string)
+ *
+ * Usage:
+ *   const agent = new Agent({ model: new OllamaProvider() });
+ *   const result = await agent.run('Explain quantum entanglement');
+ *
+ * For a list of models: `ollama list` (in your terminal).
+ */
+interface OllamaProviderOptions {
+    /** Model id (run `ollama list` to see what's pulled locally). Default: 'llama3.2'. */
+    readonly model?: string;
+    /** Ollama server URL. Default: 'http://localhost:11434/v1'. */
+    readonly baseURL?: string;
+    /** API key — Ollama ignores this but the OpenAI SDK requires it. Default: 'ollama'. */
+    readonly apiKey?: string;
+}
+declare class OllamaProvider implements Provider {
+    readonly name = "ollama";
+    readonly model: string;
+    private readonly inner;
+    constructor(options?: OllamaProviderOptions);
+    chat(request: Parameters<Provider['chat']>[0]): ReturnType<Provider['chat']>;
+}
 /**
  * Husk — tool registry helpers.
  *
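
The `OllamaProvider` doc comment in the hunk above describes a thin shell: apply three defaults, build an inner OpenAI-compatible provider, and forward `chat()`. A minimal sketch of that delegation pattern follows. The `Provider`, `ChatRequest`, and `ChatResponse` shapes here are simplified stand-ins (the real ones are not shown in this diff), and `resolveOllamaOptions`, `OllamaLikeProvider`, and the `makeInner` parameter are hypothetical names, not the package's implementation.

```typescript
// Simplified stand-ins; the package's real types are not shown in this diff.
interface ChatRequest { readonly messages: readonly { role: string; content: string }[] }
interface ChatResponse { readonly text: string }
interface Provider {
  readonly name: string;
  readonly model: string;
  chat(request: ChatRequest): Promise<ChatResponse>;
}

// Copied from the declaration in the hunk above.
interface OllamaProviderOptions {
  readonly model?: string;
  readonly baseURL?: string;
  readonly apiKey?: string;
}

// Default values as stated in the doc comment.
const OLLAMA_DEFAULTS = {
  model: 'llama3.2',
  baseURL: 'http://localhost:11434/v1',
  apiKey: 'ollama', // ignored by Ollama, but must be a non-empty string
} as const;

// Hypothetical helper: fill in any option the caller left out.
function resolveOllamaOptions(
  options: OllamaProviderOptions = {},
): Required<OllamaProviderOptions> {
  return {
    model: options.model ?? OLLAMA_DEFAULTS.model,
    baseURL: options.baseURL ?? OLLAMA_DEFAULTS.baseURL,
    apiKey: options.apiKey ?? OLLAMA_DEFAULTS.apiKey,
  };
}

// Hypothetical thin shell: owns an inner provider and forwards chat() to it.
// `makeInner` is injected so the sketch runs without the OpenAI SDK.
class OllamaLikeProvider implements Provider {
  readonly name = 'ollama';
  readonly model: string;
  private readonly inner: Provider;

  constructor(
    options: OllamaProviderOptions,
    makeInner: (resolved: Required<OllamaProviderOptions>) => Provider,
  ) {
    const resolved = resolveOllamaOptions(options);
    this.model = resolved.model;
    this.inner = makeInner(resolved);
  }

  chat(request: ChatRequest): Promise<ChatResponse> {
    return this.inner.chat(request);
  }
}
```

Keeping the defaults in one resolved object means the outer `name` and `model` fields and the inner adapter cannot drift apart.
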
@@ -688,6 +730,232 @@ interface GrepInput {
 }
 declare const Grep: ToolDefinition<GrepInput>;
+/**
+ * Husk — eval runner types and API.
+ *
+ * The eval runner lets users assert that an agent's output meets
+ * expectations. Three primitives:
+ *
+ *   1. EvalCase — an input + the expected outcome (an assertion or a set of them)
+ *   2. Assertion — a function that takes the agent's result and returns pass/fail
+ *   3. EvalSuite — a named collection of eval cases, runnable as a unit
+ *
+ * The design choice: assertions are plain async functions, not a DSL.
+ * Users can use the 4 built-ins (equals, contains, matches, fn) or
+ * write their own. The DSL is intentionally tiny — a heavy DSL
+ * (think Jest matchers) is a maintainability trap.
+ *
+ * Example:
+ *
+ *   const suite = defineSuite({
+ *     name: 'hello-agent',
+ *     cases: [
+ *       {
+ *         name: 'answers geography',
+ *         input: 'What is the capital of France? Answer in one word.',
+ *         assertions: [
+ *           contains('Paris'),
+ *           matches(/^[A-Z][a-z]+$/),  // single capitalized word
+ *         ],
+ *       },
+ *     ],
+ *   });
+ *
+ *   const results = await runSuite(suite, () => new Agent({ model: ... }));
+ *   console.log(`${results.passed}/${results.total} passed`);
+ */
+/**
+ * A function that checks whether an agent's output meets a criterion.
+ * Returns a pass/fail with an optional message explaining the failure.
+ */
+type Assertion = (result: AgentResult) => AssertionResult | Promise<AssertionResult>;
+interface AssertionResult {
+    /** Whether the assertion passed. */
+    readonly pass: boolean;
+    /** Human-readable name shown in eval reports. */
+    readonly name: string;
+    /** Optional message — required when pass is false to explain why. */
+    readonly message?: string;
+}
+/** Output exactly equals the expected string. */
+declare function equals(expected: string): Assertion;
+/** Output contains the expected substring (case-sensitive). */
+declare function contains(needle: string): Assertion;
+/** Output matches the expected regex. */
+declare function matches(pattern: RegExp): Assertion;
+/** Output passes a custom predicate. Use this for shape-based checks. */
+declare function fn(name: string, predicate: (output: string) => boolean, message?: string): Assertion;
+/** Output does NOT contain the given substring. */
+declare function notContains(needle: string): Assertion;
+/** Output length is within bounds. */
+declare function lengthBetween(min: number, max: number): Assertion;
+interface EvalCase {
+    /** Human-readable name shown in eval reports. */
+    readonly name: string;
+    /** The input to pass to agent.run(). */
+    readonly input: string;
+    /** Assertions to run on the result. All must pass for the case to pass. */
+    readonly assertions: readonly Assertion[];
+    /**
+     * Optional max iterations override. Lets you cap runaway agents per-case
+     * without affecting other cases in the suite.
+     */
+    readonly maxIterations?: number;
+}
+interface EvalSuite {
+    /** Suite name shown in reports. */
+    readonly name: string;
+    /** Cases in this suite, run sequentially. */
+    readonly cases: readonly EvalCase[];
+}
+interface CaseResult {
+    readonly caseName: string;
+    readonly passed: boolean;
+    readonly assertionResults: readonly AssertionResult[];
+    readonly agentResult: AgentResult;
+    readonly durationMs: number;
+}
+interface SuiteResult {
+    readonly suiteName: string;
+    readonly results: readonly CaseResult[];
+    readonly passed: number;
+    readonly total: number;
+    readonly durationMs: number;
+}
+/**
+ * Husk — eval runner.
+ *
+ * Takes an EvalSuite + a factory that returns an Agent, runs each
+ * case sequentially, applies the assertions, and reports results.
+ *
+ * Why a factory (not an Agent instance): each case might want its
+ * own agent configuration. The factory pattern gives the user full
+ * control without forcing a specific shape.
+ *
+ * Why sequential (not parallel): LLM calls compete for rate limits
+ * and cost $$$. Sequential gives predictable billing and easier
+ * debugging. Parallel mode is a v0.3.0 addition.
+ *
+ * Failure handling: an agent run that throws an error is reported
+ * as a case failure (not a runner crash). The error message is
+ * included in the assertion results so the user can see what broke.
+ */
+/**
+ * A factory that produces a fresh Agent per case. Called once per
+ * case so each case can have isolated memory, config, etc.
+ */
+type AgentFactory = () => Agent | Promise<Agent>;
+interface RunSuiteOptions {
+    /** Stop on first failing case. Default: false (run all cases regardless). */
+    readonly failFast?: boolean;
+    /** Custom logger for runner-level events. Default: silent. */
+    readonly onCaseStart?: (caseName: string) => void;
+    readonly onCaseEnd?: (result: CaseResult) => void;
+}
+declare function runSuite(suite: EvalSuite, factory: AgentFactory, options?: RunSuiteOptions): Promise<SuiteResult>;
+/**
+ * Build a suite with less boilerplate. Equivalent to constructing
+ * the object inline, but reads more clearly at the call site.
+ */
+declare function defineSuite(suite: {
+    name: string;
+    cases: readonly EvalCase[];
+}): EvalSuite;
+/**
+ * Husk — observability types (tracer interface).
+ *
+ * A minimal, OTel-inspired tracer interface. Husk's events are mapped
+ * to spans by the mapper in ./tracer.ts. Users can plug in the real
+ * @opentelemetry/api tracer via the adapter (see ./otel-adapter.ts)
+ * or any other compatible backend.
+ *
+ * Design choice: we don't depend on @opentelemetry/api directly. The
+ * interface here is a strict subset of OTel's Span interface (just
+ * what's needed for agent observability). Keeping the dep out of
+ * Husk's core means users who don't need OTel pay nothing for it.
+ *
+ * For users who want full OTel:
+ *   import { trace } from '@opentelemetry/api';
+ *   import { toOtelTracer } from '@princetheprogrammerbtw/husk/otel-adapter';
+ *   agent.onAny(toOtelTracer(trace.getTracer('husk')).onEvent);
+ */
+type SpanKind = 'internal' | 'client' | 'server';
+interface SpanContext {
+    /** Unique trace id (all spans in one agent.run share this). */
+    readonly traceId: string;
+    /** Unique span id. */
+    readonly spanId: string;
+    /** Parent span id, if any. */
+    readonly parentSpanId?: string;
+}
+interface SpanOptions {
+    readonly name: string;
+    readonly kind?: SpanKind;
+    readonly attributes?: Readonly<Record<string, unknown>>;
+    readonly startTimeNs?: bigint;
+}
+interface Span {
+    readonly context: SpanContext;
+    /** Record an event (timestamped annotation) on the span. */
+    addEvent(name: string, attributes?: Record<string, unknown>): void;
+    /** Set or update an attribute on the span. */
+    setAttribute(key: string, value: string | number | boolean | null): void;
+    /** Record an exception. */
+    recordException(err: Error): void;
+    /** Mark the span as failed. */
+    setStatus(status: 'ok' | 'error', message?: string): void;
+    /** End the span. Must be called exactly once. */
+    end(endTimeNs?: bigint): void;
+}
+interface Tracer {
+    /**
+     * Start a new span. If parent is provided, the new span becomes a
+     * child of it. Returns the new span; caller is responsible for
+     * calling .end() on it.
+     */
+    startSpan(options: SpanOptions, parent?: SpanContext): Span;
+}
+/**
+ * A tracer that does nothing. Used when no real tracer is configured.
+ * Zero overhead — every method is a no-op, so the cost is one virtual
+ * call per event.
+ */
+declare class NoopTracer implements Tracer {
+    startSpan(_options: SpanOptions, _parent?: SpanContext): Span;
+}
+/**
+ * Husk — agent event → tracer mapper.
+ *
+ * Translates the typed AgentEvent stream into tracer spans. The top-
+ * level 'agent:start' begins a trace, each iteration becomes a child
+ * span, and tool calls become their own spans under the iteration.
+ *
+ * Design: spans are created in startSpanOrder. Tool spans nest under
+ * the iteration span. The end of the agent run ends the trace span.
+ *
+ * Usage:
+ *   const mapper = new EventTracer(myTracer);
+ *   agent.onAny(mapper.onEvent.bind(mapper));
+ *   await agent.run(...);  // emits spans to myTracer
+ */
+declare class EventTracer {
+    private readonly tracer;
+    private traceSpan;
+    private iterationSpan;
+    private toolSpans;
+    constructor(tracer: Tracer);
+    /**
+     * Bind as an event handler: `agent.onAny(tracer.onEvent.bind(tracer))`
+     */
+    onEvent: AgentEventHandler;
+}
 /**
  * Husk — public API entry point.
  *
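
The `Tracer` and `Span` interfaces declared in the hunk above are small enough to implement by hand. As a rough illustration, the sketch below copies those interfaces and backs them with an in-memory recorder, which can be handy in tests. `InMemoryTracer` and `RecordedSpan` are hypothetical names, and the id scheme is an assumption made for readability, not something the package prescribes.

```typescript
// Interfaces copied from the declarations in the hunk above.
type SpanKind = 'internal' | 'client' | 'server';
interface SpanContext {
  readonly traceId: string;
  readonly spanId: string;
  readonly parentSpanId?: string;
}
interface SpanOptions {
  readonly name: string;
  readonly kind?: SpanKind;
  readonly attributes?: Readonly<Record<string, unknown>>;
  readonly startTimeNs?: bigint;
}
interface Span {
  readonly context: SpanContext;
  addEvent(name: string, attributes?: Record<string, unknown>): void;
  setAttribute(key: string, value: string | number | boolean | null): void;
  recordException(err: Error): void;
  setStatus(status: 'ok' | 'error', message?: string): void;
  end(endTimeNs?: bigint): void;
}
interface Tracer {
  startSpan(options: SpanOptions, parent?: SpanContext): Span;
}

// Hypothetical record kept for each span so tests can inspect it later.
interface RecordedSpan {
  readonly name: string;
  readonly context: SpanContext;
  readonly attributes: Record<string, unknown>;
  readonly events: string[];
  status: 'ok' | 'error' | 'unset';
  ended: boolean;
}

class InMemoryTracer implements Tracer {
  readonly spans: RecordedSpan[] = [];
  private nextId = 1;

  startSpan(options: SpanOptions, parent?: SpanContext): Span {
    const spanId = `span-${this.nextId++}`;
    // A child inherits its parent's trace id; a root span starts a new trace.
    const context: SpanContext = parent
      ? { traceId: parent.traceId, spanId, parentSpanId: parent.spanId }
      : { traceId: `trace-${spanId}`, spanId };
    const record: RecordedSpan = {
      name: options.name,
      context,
      attributes: { ...(options.attributes ?? {}) },
      events: [],
      status: 'unset',
      ended: false,
    };
    this.spans.push(record);
    return {
      context,
      addEvent: (name) => { record.events.push(name); },
      setAttribute: (key, value) => { record.attributes[key] = value; },
      recordException: (err) => { record.events.push(`exception: ${err.message}`); },
      setStatus: (status) => { record.status = status; },
      // The interface says end() must be called exactly once, so enforce it.
      end: () => {
        if (record.ended) throw new Error(`span "${record.name}" ended twice`);
        record.ended = true;
      },
    };
  }
}

// Usage: a run span, one iteration under it, one tool call under the iteration.
const tracer = new InMemoryTracer();
const runSpan = tracer.startSpan({ name: 'agent.run' });
const iterationSpan = tracer.startSpan({ name: 'iteration 1' }, runSpan.context);
const toolSpan = tracer.startSpan({ name: 'tool: Read', kind: 'client' }, iterationSpan.context);
toolSpan.setStatus('ok');
toolSpan.end();
iterationSpan.end();
runSpan.end();
```

An instance like this could be passed to `new EventTracer(...)` as shown in the doc comment's usage lines, since it satisfies the declared `Tracer` interface.
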
@@ -699,4 +967,4 @@ declare const Grep: ToolDefinition<GrepInput>;
  */
 declare const VERSION = "0.1.0";
-export { Agent, type AgentConfig, type AgentEvent, AgentEventEmitter, type AgentEventHandler, type AgentResult, AnthropicProvider, type AnthropicProviderOptions, Bash, type BashInput, type ChatChunk, type ChatRequest, type ChatResponse, ConsoleLogger, type ContentBlock, Edit, type EditInput, type Example, FileStore, type FileStoreOptions, Grep, type GrepInput, InMemoryStore, type JSONSchema, type JSONSchemaField, type LogLevel, type Logger, type MemoryStore, type Message, type MessageContent, OpenAIProvider, type OpenAIProviderOptions, type Provider, Read, type ReadInput, type Role, type SteeringConfig, type StopReason, type TextBlock, type TokenUsage, type ToolContext, type ToolDefinition, type ToolResult, type ToolResultBlock, type ToolUseBlock, VERSION, Write, type WriteInput, arrayField, booleanField, buildExampleMessages, buildSystemPrompt, defineTool, integerField, logEventsTo, numberField, objectField, objectSchema, stringField };
+export { Agent, type AgentConfig, type AgentEvent, AgentEventEmitter, type AgentEventHandler, type AgentFactory, type AgentResult, AnthropicProvider, type AnthropicProviderOptions, type Assertion, type AssertionResult, Bash, type BashInput, type CaseResult, type ChatChunk, type ChatRequest, type ChatResponse, ConsoleLogger, type ContentBlock, Edit, type EditInput, type EvalCase, type EvalSuite, EventTracer, type Example, FileStore, type FileStoreOptions, Grep, type GrepInput, InMemoryStore, type JSONSchema, type JSONSchemaField, type LogLevel, type Logger, type MemoryStore, type Message, type MessageContent, NoopTracer, OllamaProvider, type OllamaProviderOptions, OpenAIProvider, type OpenAIProviderOptions, type Provider, Read, type ReadInput, type Role, type RunSuiteOptions, type Span, type SpanContext, type SpanKind, type SpanOptions, type SteeringConfig, type StopReason, type SuiteResult, type TextBlock, type TokenUsage, type ToolContext, type ToolDefinition, type ToolResult, type ToolResultBlock, type ToolUseBlock, type Tracer, VERSION, Write, type WriteInput, arrayField, booleanField, buildExampleMessages, buildSystemPrompt, contains, defineSuite, defineTool, equals, fn, integerField, lengthBetween, logEventsTo, matches, notContains, numberField, objectField, objectSchema, runSuite, stringField };
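
The eval helpers newly exported above follow the design described in the declarations: an assertion is a plain function from result to pass/fail, cases run sequentially, and a throwing agent is reported as a failed case rather than crashing the runner. The sketch below re-creates that flow in a self-contained form. It is not the package's `runSuite`: `AgentLike`, `containsSketch`, `endsWithPeriod`, `runCase`, and `runCasesSequentially` are hypothetical names, `CaseResult` is trimmed, and `AgentResult` is assumed to carry only the `output` field the README uses.

```typescript
// Stand-ins: only `output` is assumed, since that is what the README reads.
interface AgentResult { readonly output: string }
interface AgentLike { run(input: string): Promise<AgentResult> }

// Trimmed copies of the declared eval types.
interface AssertionResult {
  readonly pass: boolean;
  readonly name: string;
  readonly message?: string;
}
type Assertion = (result: AgentResult) => AssertionResult | Promise<AssertionResult>;
interface EvalCase {
  readonly name: string;
  readonly input: string;
  readonly assertions: readonly Assertion[];
}
interface CaseResult {
  readonly caseName: string;
  readonly passed: boolean;
  readonly assertionResults: readonly AssertionResult[];
}

// Illustrative re-implementation of a substring check (case-sensitive).
function containsSketch(needle: string): (result: AgentResult) => AssertionResult {
  const name = `contains(${JSON.stringify(needle)})`;
  return (result) =>
    result.output.indexOf(needle) !== -1
      ? { pass: true, name }
      : { pass: false, name, message: `output was ${JSON.stringify(result.output)}` };
}

// A custom assertion: the same plain-function shape, no DSL required.
function endsWithPeriod(): (result: AgentResult) => AssertionResult {
  const name = 'endsWithPeriod';
  return (result) =>
    /\.\s*$/.test(result.output)
      ? { pass: true, name }
      : { pass: false, name, message: 'output does not end with a period' };
}

// Run one case with a fresh agent. Any error thrown by the factory, the run,
// or an assertion is converted into a failed assertion result.
async function runCase(
  evalCase: EvalCase,
  factory: () => AgentLike | Promise<AgentLike>,
): Promise<CaseResult> {
  try {
    const agent = await factory();
    const agentResult = await agent.run(evalCase.input);
    const assertionResults: AssertionResult[] = [];
    for (const assertion of evalCase.assertions) {
      assertionResults.push(await assertion(agentResult));
    }
    const passed = assertionResults.every((r) => r.pass);
    return { caseName: evalCase.name, passed, assertionResults };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return {
      caseName: evalCase.name,
      passed: false,
      assertionResults: [{ pass: false, name: 'agent.run', message }],
    };
  }
}

// Sequential loop with an optional stop-on-first-failure flag.
async function runCasesSequentially(
  cases: readonly EvalCase[],
  factory: () => AgentLike | Promise<AgentLike>,
  failFast = false,
): Promise<CaseResult[]> {
  const results: CaseResult[] = [];
  for (const evalCase of cases) {
    const result = await runCase(evalCase, factory);
    results.push(result);
    if (failFast && !result.passed) break;
  }
  return results;
}
```

Calling the factory once per case is what gives each case isolated state; passing a single shared agent instance would let memory from one case leak into the next.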