npm - @denizokcu/haze - Versions diffs - 0.1.0 → 0.2.0 - Mend

@denizokcu/haze 0.1.0 → 0.2.0

Files changed (23)

package/CHANGELOG.md +17 -0
package/README.md +14 -13
package/dist/cli/commands/chat.js +1 -1
package/dist/cli/commands/formatters.js +19 -3
package/dist/cli/commands/streaming.js +7 -5
package/dist/core/agent/compaction.js +3 -1
package/dist/core/goal/completionPolicy.d.ts +2 -1
package/dist/core/goal/completionPolicy.js +17 -10
package/dist/core/safety/bashClassifier.d.ts +10 -0
package/dist/core/safety/bashClassifier.js +51 -0
package/dist/core/subagent/subagentRunner.d.ts +1 -1
package/dist/core/subagent/subagentRunner.js +9 -8
package/dist/core/validation/outputParser.d.ts +12 -0
package/dist/core/validation/outputParser.js +79 -0
package/dist/llm/hazeTools.d.ts +19 -7
package/dist/llm/hazeTools.js +66 -26
package/dist/llm/systemPrompt.js +72 -34
package/dist/llm/toolResultTypes.d.ts +38 -0
package/dist/llm/toolResultTypes.js +9 -0
package/dist/skills/builder/SkillBuilder.js +6 -8
package/dist/ui/components/TextInput.d.ts +2 -1
package/dist/ui/components/TextInput.js +95 -7
package/package.json +2 -1

package/CHANGELOG.md CHANGED

@@ -2,6 +2,23 @@
 ## Unreleased
+## 0.2.0 - 2026-06-07
+- Improved coding-loop reliability with stronger continuation behavior after failed edits, failed validation, missing validation, tool-budget interruptions, and incomplete assistant responses.
+- Added structured bash command classification for read-only, mutating, destructive, network, validation, and unknown commands, with cwd, duration, timeout, and classification metadata in bash results.
+- Added validation-output parsing for common test, typecheck, lint, and build commands, including failed files, failed tests, diagnostics, summaries, and suggested next steps.
+- Added shared structured tool result types and more specific file-edit failure reason codes so edit recovery can reread the affected file and retry with better guidance.
+- Reworked the system prompt, subagent prompt, compaction prompt, and generated-skill guidance around autonomous expert developer workflows with concise final status reporting.
+- Removed hard-coded `temperature: 0` from model calls so providers/models that reject temperature options can run without warning workarounds.
+- Removed bash confirmation gates, including for destructive classifications; Haze now assumes expert users know what they asked for and relies on transparent tool output rather than permission prompts.
+- Improved chat input editing with wrapped multi-line display, vertical cursor movement across wrapped lines, and better cursor mapping for compacted paste blocks.
+- Added and updated tests for bash classification, bash execution behavior, validation parsing, edit recovery, system-prompt guidance, and skill generation.
+## 0.1.1 - 2026-06-07
+- Bundled ripgrep with `@vscode/ripgrep` and updated the `grep` tool to use the package-provided binary path, removing the requirement for users to install `rg` separately or expose it on `PATH`.
+- Updated release documentation and site copy for the 0.1.1 patch release.
 ## 0.1.0 - 2026-06-07
 - Added ripgrep-backed `grep` for fast workspace search with regex, glob, context-line, case-insensitive, and result-limit options.
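
The bash classification entry in the 0.2.0 changelog can be pictured as a small rule table. This is a minimal sketch, not the contents of `bashClassifier.js`, which is not reproduced here. The `RULES` table, its regular expressions, and `classifyCommand` are hypothetical. Only the risk-level names are taken from the `BashRiskLevel` type declared in `bashClassifier.d.ts`.

```javascript
// Hypothetical sketch: the rule table and function name are assumptions,
// not the package's real classifier. First matching rule wins.
const RULES = [
  { riskLevel: 'destructive', reason: 'deletes files or discards work', pattern: /^(rm|rmdir)\b|^git\s+(reset\s+--hard|clean)\b/ },
  { riskLevel: 'network', reason: 'talks to the network', pattern: /^(curl|wget|ssh)\b/ },
  { riskLevel: 'mutating', reason: 'changes files or git state', pattern: /^(mv|cp|mkdir|touch)\b|^git\s+(add|commit|checkout)\b|^npm\s+install\b/ },
  { riskLevel: 'read_only', reason: 'only inspects state', pattern: /^(ls|cat|pwd|grep|rg)\b|^git\s+(status|diff|log)\b/ },
];

function classifyCommand(command) {
  const trimmed = command.trim();
  for (const rule of RULES) {
    if (rule.pattern.test(trimmed)) {
      return { riskLevel: rule.riskLevel, reason: rule.reason };
    }
  }
  // Anything the table does not recognise is reported as unknown, not blocked.
  return { riskLevel: 'unknown', reason: 'no rule matched' };
}

for (const command of ['git status', 'npm install left-pad', 'rm -rf build', 'make']) {
  console.log(command, '->', classifyCommand(command).riskLevel);
}
```

Ordering the rules from most to least severe means a command is labelled by the riskiest rule it matches. Consistent with the changelog, the result is metadata for the transcript and not a confirmation gate.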

package/README.md CHANGED

@@ -2,16 +2,17 @@
 A minimal LLM harness for your terminal.
-## What's new in 0.1.0
+## What's new in 0.2.0
-Haze 0.1.0 is the foundation release: the agent can now *find*, *delegate*, and *show its work* without turning your terminal into soup.
+Haze 0.2.0 is a reliability release for the everyday coding loop: inspect, edit, validate, and report what happened without getting timid or noisy.
-- `grep` gives the model fast, targeted codebase search with regex, globs, context lines, and `.gitignore` awareness — no more brute-force file spelunking.
-- Subagents let Haze fan out independent investigations into fresh contexts, then fold the result back into the main turn as a concise summary.
-- File edits now render compact, colorized inline diffs with one context line around the change; big diffs stay summarized so signal beats scrollback.
-- Long-turn handling is calmer: truncated model output and tool-heavy loops recover more gracefully.
+- The agent loop is more persistent after failed edits, failed validation, missing validation, and tool-heavy turns. Haze now pushes toward a concrete final status instead of stopping at a vague recap.
+- Bash execution now includes command classification, working directory, duration, timeout state, and parsed validation summaries for common test/typecheck/lint/build output.
+- File-tool failures carry structured reason codes and recovery hints, making exact-edit failures easier for the model to repair with a fresh read and targeted retry.
+- The system and subagent prompts now assume expert users: relevant commands should run directly, including mutating shell workflows, while blockers are reserved for concrete tool failures or real ambiguity.
+- The chat input wraps across multiple visible lines and supports vertical cursor movement, which makes longer prompts and pasted context easier to edit.
-The result is a more capable agent loop while keeping the core small and inspectable. Haze gives an AI model transparent local tools — read, search, edit, write, list, and run commands — plus focused delegation when work can split safely. Tiny spell, sharper goblin.
+The result is a sharper supervised coding loop while keeping the core small and inspectable. Haze gives an AI model transparent local tools — read, search, edit, write, list, and run commands — plus focused delegation when work can split safely. Tiny spell, steadier goblin.
 Haze works with OpenAI-compatible providers, including OpenRouter and local endpoints. Use `/provider` to choose or add one, then `/model` to select a model.
@@ -24,7 +25,7 @@ Haze works with OpenAI-compatible providers, including OpenRouter and local endp
  |_| |_|\__,_/___\___|
 ```
-Haze keeps guardrails light. The LLM can work from the terminal with freedoms close to yours, while trying to stay scoped to the current project. Watch the tool calls. Keep your hands near the wheel. Progress.
+Haze keeps guardrails light. The LLM can work from the terminal with freedoms close to yours, while trying to stay scoped to the current project. It is aimed at developers who want an expert-oriented tool, not a permission dialog factory. Watch the tool calls. Keep your hands near the wheel. Progress.
 ## Getting started
@@ -77,7 +78,7 @@ Open a project and ask for work:
 create a calculator in calc-app in ruby with add subtract multiply divide
 ```
-Haze will inspect, search, write files, run commands, and show compact tool activity inline. Small file edits include a colorized line diff with one context line before and after the change; large diffs stay summarized so the transcript does not become a wall of noise. Sessions are saved by default so you can resume the latest workspace conversation with `haze --continue` or `/resume`.
+Haze will inspect, search, write files, run commands, and show compact tool activity inline. Small file edits include a colorized line diff with one context line before and after the change; large diffs stay summarized so the transcript does not become a wall of noise. Bash validation output is summarized when possible so failures point at the relevant files, tests, or diagnostics. Sessions are saved by default so you can resume the latest workspace conversation with `haze --continue` or `/resume`.
 Use `/` to discover commands and skills. `Tab` completes the top suggestion.
@@ -194,10 +195,10 @@ Haze exposes a deliberately small toolset:
 - `editFile` — unique text replacements, with line-number-prefix tolerance for common model mistakes.
 - `replaceLines` — line-range edits when exact replacements are awkward; slightly-too-large EOF ranges are clamped.
 - `writeFile` — create files and parent directories.
-- `bash` — run tests, builds, git commands, and inspections.
+- `bash` — run tests, builds, git commands, inspections, scripts, installs, and other shell workflows with command classification metadata.
 - `skill_*` — load Markdown skill instructions on demand.
-Tool calls are grouped in the transcript so you can see what happened without reading a novella. Successful targeted file edits show a compact diff with colored additions/removals and one context line around the change when the diff is small; larger diffs are summarized with a pointer to `git diff`. File-tool failures return structured recovery hints instead of mystery stack traces.
+Tool calls are grouped in the transcript so you can see what happened without reading a novella. Successful targeted file edits show a compact diff with colored additions/removals and one context line around the change when the diff is small; larger diffs are summarized with a pointer to `git diff`. File-tool failures return structured reason codes and recovery hints instead of mystery stack traces. Bash validation commands can return parsed summaries with failed files, failed tests, diagnostics, and suggested next steps.
 ## Subagents
@@ -223,8 +224,8 @@ Use `AGENTS.md` for project conventions, commands, architecture notes, and thing
 - File tools are restricted to the current workspace.
 - File tools follow `.gitignore` by default.
 - Ignored files require an explicit override.
-- Bash mutations are discouraged by the tool contract.
-- Destructive actions should require explicit user confirmation.
+- Bash commands are classified and shown with working-directory metadata, but Haze does not use command confirmation gates.
+- Mutating and destructive commands can run when they are relevant to the user's request; this is intentional for expert users.
 - Haze is powerful enough to help and dumb enough to deserve supervision. Ideal software, basically.
 ## Local development
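
The README's "parsed summaries with failed files, failed tests, diagnostics, and suggested next steps" can be illustrated with a small parser for one kind of validation output. This is a sketch under stated assumptions, not the code in `outputParser.js`: `parseTypecheckOutput` and the `passed`, `failedFiles`, and `diagnostics` field names are hypothetical, while `summaryText` and `suggestedNextStep` are the field names read by `formatters.js` in this diff. The input format assumed here is the plain `file(line,col): error TSxxxx: message` shape that `tsc` prints without pretty output.

```javascript
// Hypothetical sketch of validation-output parsing for tsc-style diagnostics.
function parseTypecheckOutput(text) {
  const linePattern = /^(.+?)\((\d+),(\d+)\): error (TS\d+): (.*)$/;
  const diagnostics = [];
  for (const line of text.split('\n')) {
    const match = linePattern.exec(line.trim());
    if (match) {
      diagnostics.push({
        file: match[1],
        line: Number(match[2]),
        column: Number(match[3]),
        code: match[4],
        message: match[5],
      });
    }
  }
  // De-duplicate file names while keeping first-seen order.
  const failedFiles = [...new Set(diagnostics.map((d) => d.file))];
  if (diagnostics.length === 0) {
    return { passed: true, failedFiles, diagnostics, summaryText: 'typecheck passed' };
  }
  return {
    passed: false,
    failedFiles,
    diagnostics,
    summaryText: `typecheck failed: ${diagnostics.length} error(s) in ${failedFiles.length} file(s)`,
    suggestedNextStep: `open ${failedFiles[0]} at line ${diagnostics[0].line}`,
  };
}

const sampleOutput = [
  "src/a.ts(3,5): error TS2322: Type 'string' is not assignable to type 'number'.",
  "src/a.ts(9,1): error TS2304: Cannot find name 'foo'.",
  "src/b.ts(1,10): error TS2307: Cannot find module './c'.",
].join('\n');

console.log(parseTypecheckOutput(sampleOutput).summaryText);
```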

package/dist/cli/commands/chat.js CHANGED

@@ -721,7 +721,7 @@ function ChatScreen({ debug = false, version, continueSession = false, noSession
     ];
     return _jsxs(Box, { flexDirection: "column", children: [_jsx(Static, { items: staticItems, children: item => item.kind === 'header'
                     ? _jsx(Header, { subtitle: item.subtitle, version: version }, item.key)
-                    : _jsx(MessageView, { message: item.message, width: width }, item.key) }), activeLiveMessages.length > 0 && _jsx(Box, { flexDirection: "column", flexShrink: 0, children: activeLiveMessages.map((message, index) => _jsx(MessageView, { message: message, width: width }, messageKey(message, index))) }), debug && debugLogs.length > 0 && _jsxs(Box, { flexDirection: "column", flexShrink: 0, marginBottom: 1, borderStyle: "round", borderColor: theme.muted, paddingX: 1, children: [_jsx(Text, { color: theme.muted, bold: true, children: "Debug" }), debugLogs.map((line, index) => _jsxs(Text, { color: theme.muted, children: ["\u2022 ", line] }, index))] }), queuedFollowUps.length > 0 && _jsxs(Box, { flexDirection: "column", flexShrink: 0, marginBottom: 1, children: [_jsx(Text, { color: theme.muted, children: "Queued follow-ups:" }), queuedFollowUps.map((item, index) => _jsxs(Text, { color: theme.muted, dimColor: true, children: ["  ", index + 1, ". ", item] }, `${index}-${item}`))] }), busy && _jsx(Box, { flexShrink: 0, marginBottom: 1, children: _jsxs(Text, { children: [_jsxs(Text, { color: theme.orange, bold: true, children: [_jsx(Spinner, { type: "dots" }), " ", busyLabel] }), _jsx(Text, { color: theme.muted, dimColor: true, children: " \u00B7 type to queue follow-up \u00B7 esc to interrupt" })] }) }), goalText && _jsx(Box, { flexShrink: 0, children: _jsxs(Text, { wrap: "truncate-end", children: [_jsx(Text, { color: theme.blue, bold: true, children: "Goal:" }), _jsxs(Text, { color: "white", children: [" ", goalRequest] }), goalStatusText ? 
_jsxs(Text, { color: theme.orange, children: [" \u00B7 ", goalStatusText] }) : null] }) }), _jsx(Box, { borderStyle: "round", borderColor: theme.deepPurple, paddingX: 1, flexShrink: 0, children: _jsx(Box, { flexGrow: 1, minWidth: 0, children: _jsx(TextInput, { placeholder: placeholder, disabled: busy && mode !== 'chat', mask: mode === 'providerAddKey', historyItems: inputHistory, recordHistory: mode === 'chat', suggestions: inputSuggestions, suggestionMode: mode === 'provider' || mode === 'providerAction' || mode === 'model' ? 'always' : 'slash', submitOnEmpty: mode === 'providerAddKey', onHistoryAdd: persistInputHistory, onCancel: cancelThinking, onEscape: () => {
+                    : _jsx(MessageView, { message: item.message, width: width }, item.key) }), activeLiveMessages.length > 0 && _jsx(Box, { flexDirection: "column", flexShrink: 0, children: activeLiveMessages.map((message, index) => _jsx(MessageView, { message: message, width: width }, messageKey(message, index))) }), debug && debugLogs.length > 0 && _jsxs(Box, { flexDirection: "column", flexShrink: 0, marginBottom: 1, borderStyle: "round", borderColor: theme.muted, paddingX: 1, children: [_jsx(Text, { color: theme.muted, bold: true, children: "Debug" }), debugLogs.map((line, index) => _jsxs(Text, { color: theme.muted, children: ["\u2022 ", line] }, index))] }), queuedFollowUps.length > 0 && _jsxs(Box, { flexDirection: "column", flexShrink: 0, marginBottom: 1, children: [_jsx(Text, { color: theme.muted, children: "Queued follow-ups:" }), queuedFollowUps.map((item, index) => _jsxs(Text, { color: theme.muted, dimColor: true, children: ["  ", index + 1, ". ", item] }, `${index}-${item}`))] }), busy && _jsx(Box, { flexShrink: 0, marginBottom: 1, children: _jsxs(Text, { children: [_jsxs(Text, { color: theme.orange, bold: true, children: [_jsx(Spinner, { type: "dots" }), " ", busyLabel] }), _jsx(Text, { color: theme.muted, dimColor: true, children: " \u00B7 type to queue follow-up \u00B7 esc to interrupt" })] }) }), goalText && _jsx(Box, { flexShrink: 0, children: _jsxs(Text, { wrap: "truncate-end", children: [_jsx(Text, { color: theme.blue, bold: true, children: "Goal:" }), _jsxs(Text, { color: "white", children: [" ", goalRequest] }), goalStatusText ? 
_jsxs(Text, { color: theme.orange, children: [" \u00B7 ", goalStatusText] }) : null] }) }), _jsx(Box, { borderStyle: "round", borderColor: theme.deepPurple, paddingX: 1, flexShrink: 0, children: _jsx(Box, { flexGrow: 1, minWidth: 0, children: _jsx(TextInput, { placeholder: placeholder, disabled: busy && mode !== 'chat', mask: mode === 'providerAddKey', historyItems: inputHistory, recordHistory: mode === 'chat', suggestions: inputSuggestions, suggestionMode: mode === 'provider' || mode === 'providerAction' || mode === 'model' ? 'always' : 'slash', submitOnEmpty: mode === 'providerAddKey', width: Math.max(20, width - 4), onHistoryAdd: persistInputHistory, onCancel: cancelThinking, onEscape: () => {
                             if (busy)
                                 cancelThinking();
                             else

package/dist/cli/commands/formatters.js CHANGED

@@ -50,8 +50,17 @@ export function toolResultSummary(event) {
         const count = output.totalMatches;
         return count === 0 ? 'no matches' : `${count} match${count === 1 ? '' : 'es'}`;
     }
-    if (typeof output?.code === 'number')
-        return `exited with code ${output.code}`;
+    if (typeof output?.validationSummary === 'object' && output.validationSummary != null && 'summaryText' in output.validationSummary) {
+        const summary = output.validationSummary;
+        const next = typeof summary.suggestedNextStep === 'string' ? `; next: ${summary.suggestedNextStep}` : '';
+        return `${String(summary.summaryText)}${next}`;
+    }
+    if (typeof output?.code === 'number') {
+        const risk = typeof output.classification?.riskLevel === 'string'
+            ? ` (${output.classification.riskLevel})`
+            : '';
+        return `exited with code ${output.code}${risk}`;
+    }
     if (typeof output?.status === 'string' && typeof output?.summary === 'string') {
         const summary = output.summary.split('\n')[0] ?? '';
         const preview = summary.length > 120 ? `${summary.slice(0, 120).trimEnd()}…` : summary;
@@ -69,7 +78,8 @@ export function toolResultSummary(event) {
             }
             return 'completed';
         }
-        return typeof output.error === 'string' ? `failed: ${compact(output.error)}` : 'failed';
+        const reason = typeof output.reasonCode === 'string' ? ` (${output.reasonCode})` : '';
+        return typeof output.error === 'string' ? `failed${reason}: ${compact(output.error)}` : `failed${reason}`;
     }
     return 'completed';
 }
@@ -82,7 +92,13 @@ export function toolOutputDetails(value) {
     const output = value;
     const stdout = output.stdout?.text?.trim();
     const stderr = output.stderr?.text?.trim();
+    const meta = [
+        output.cwd ? `cwd: ${output.cwd}` : '',
+        output.classification?.riskLevel ? `classification: ${output.classification.riskLevel}${output.classification.reason ? ` — ${output.classification.reason}` : ''}` : '',
+        output.validationSummary?.summaryText ? `validation: ${output.validationSummary.summaryText}${output.validationSummary.suggestedNextStep ? `\nnext: ${output.validationSummary.suggestedNextStep}` : ''}` : '',
+    ].filter(Boolean).join('\n');
     const parts = [
+        meta,
         stdout ? `stdout:\n${compact(stdout, 1200)}` : '',
         stderr ? `stderr:\n${compact(stderr, 1200)}` : '',
     ].filter(Boolean);
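
The effect of the new `toolResultSummary` branches is easiest to see on sample inputs. The function below is a condensed standalone copy of only the two bash-related branches from the hunk above. The name `summarizeBashResult` and the sample objects are illustrative assumptions, not part of the package.

```javascript
// Condensed copy of the validation-summary and exit-code branches for illustration.
function summarizeBashResult(output) {
  const summary = output?.validationSummary;
  // A parsed validation summary takes precedence over the raw exit code.
  if (typeof summary === 'object' && summary != null && 'summaryText' in summary) {
    const next = typeof summary.suggestedNextStep === 'string' ? `; next: ${summary.suggestedNextStep}` : '';
    return `${String(summary.summaryText)}${next}`;
  }
  if (typeof output?.code === 'number') {
    const risk = typeof output.classification?.riskLevel === 'string'
      ? ` (${output.classification.riskLevel})`
      : '';
    return `exited with code ${output.code}${risk}`;
  }
  return 'completed';
}

console.log(summarizeBashResult({
  code: 1,
  validationSummary: { summaryText: '2 tests failed', suggestedNextStep: 'fix test/a.test.js' },
}));
console.log(summarizeBashResult({ code: 0, classification: { riskLevel: 'read_only' } }));
```

Because the validation branch comes first, a failing test run is summarized by what failed, and the bare exit code is shown only when no parsed summary is available.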

package/dist/cli/commands/streaming.js CHANGED

@@ -126,7 +126,7 @@ export async function runAgentTurn(value, displayValue, contextFiles, callbacks,
         const likelyPlanImplementationRequest = isPlanImplementationRequest(value);
         const likelyActionRequest = isActionRequest(value);
         const likelyValidationRequest = isValidationRequest(value);
-        const planImplementationGuidance = 'When implementing a plan file, first identify the concrete required checklist items and compare them with the current files. Do not edit source or tests when the required behavior is already present. Implement the smallest clearly required phase or required items, skip optional/design-question items unless explicitly requested, add tests rather than exploratory one-off scripts where possible, use file tools (not bash) for any file changes, run validation once after code/test edits, then update plan status with file tools if requested. Do not call unresolved optional scope a blocker.';
+        const planImplementationGuidance = 'Haze internal guidance for implementing plan files. The original user request remains authoritative. First identify the concrete required checklist items and compare them with the current files. Do not edit source or tests when the required behavior is already present. Implement the smallest clearly required phase or required items, skip optional/design-question items unless explicitly requested, add tests rather than exploratory one-off scripts where possible, prefer file tools for source changes, run validation once after code/test edits, then update plan status with file tools if requested. Do not call unresolved optional scope a blocker.';
         const requestMessages = retryingExistingRequest
             ? callbacks.getConversation()
             : likelyPlanImplementationRequest
@@ -151,6 +151,7 @@ export async function runAgentTurn(value, displayValue, contextFiles, callbacks,
         let completionContinuationCount = 0;
         const maxCompletionContinuations = COMPLETION_CONTINUATION_LIMIT;
         let editRecoveryPath;
+        let editRecoveryReasonCode;
         let editRecoveryReadSatisfied = false;
         const toolSummaries = [];
         const visibleAssistantTexts = new Set();
@@ -294,6 +295,7 @@ export async function runAgentTurn(value, displayValue, contextFiles, callbacks,
             if (!ok && ['editFile', 'replaceLines', 'writeFile'].includes(event.toolCall.toolName)) {
                 editFileFailed = true;
                 editRecoveryPath = path;
+                editRecoveryReasonCode = typeof event.output === 'object' && event.output != null && 'reasonCode' in event.output && typeof event.output.reasonCode === 'string' ? event.output.reasonCode : undefined;
                 editRecoveryReadSatisfied = false;
             }
             if (ok && ['listFiles', 'readFile'].includes(event.toolCall.toolName))
@@ -305,6 +307,7 @@ export async function runAgentTurn(value, displayValue, contextFiles, callbacks,
                 mutatingToolSucceeded = true;
                 if (!path || path === editRecoveryPath) {
                     editRecoveryPath = undefined;
+                    editRecoveryReasonCode = undefined;
                     editRecoveryReadSatisfied = false;
                     editFileFailed = false;
                 }
@@ -329,7 +332,6 @@ export async function runAgentTurn(value, displayValue, contextFiles, callbacks,
             ];
             const followUp = streamText({
                 model: activeModel,
-                temperature: 0,
                 maxOutputTokens: DEFAULT_MAX_OUTPUT_TOKENS,
                 system: buildSystemPrompt(contextFiles),
                 messages: continuationMessages,
@@ -363,7 +365,7 @@ export async function runAgentTurn(value, displayValue, contextFiles, callbacks,
                             activeTools: ['readFile'],
                             messages: [
                                 ...messages,
-                                { role: 'user', content: `A previous edit failed for ${editRecoveryPath}. Before any further edit or bash inspection, call readFile on exactly ${editRecoveryPath}. Bash/cat does not satisfy this recovery step.` },
+                                { role: 'user', content: `A previous edit failed for ${editRecoveryPath}${editRecoveryReasonCode ? ` (${editRecoveryReasonCode})` : ''}. Before any further edit or bash inspection, call readFile on exactly ${editRecoveryPath}. Bash/cat does not satisfy this recovery step.` },
                             ],
                         };
                     }
@@ -436,7 +438,6 @@ export async function runAgentTurn(value, displayValue, contextFiles, callbacks,
         let lastFinishReason;
         const result = streamText({
             model: activeModel,
-            temperature: 0,
             maxOutputTokens: DEFAULT_MAX_OUTPUT_TOKENS,
             system: buildSystemPrompt(contextFiles),
             messages: requestMessages,
@@ -467,7 +468,7 @@ export async function runAgentTurn(value, displayValue, contextFiles, callbacks,
                         activeTools: ['readFile'],
                         messages: [
                             ...messages,
-                            { role: 'user', content: `A previous edit failed for ${editRecoveryPath}. Before any further edit or bash inspection, call readFile on exactly ${editRecoveryPath}. Bash/cat does not satisfy this recovery step.` },
+                            { role: 'user', content: `A previous edit failed for ${editRecoveryPath}${editRecoveryReasonCode ? ` (${editRecoveryReasonCode})` : ''}. Before any further edit or bash inspection, call readFile on exactly ${editRecoveryPath}. Bash/cat does not satisfy this recovery step.` },
                         ],
                     };
                 }
@@ -597,6 +598,7 @@ export async function runAgentTurn(value, displayValue, contextFiles, callbacks,
             validationToolFailed,
             editFileFailed,
             editRecoveryPath,
+            editRecoveryReasonCode,
         });
         let decision = decideCompletion(combinedAssistantText);
         async function runCompletionLoop(seedConversation, seedText) {
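
The edit-recovery changes in this file reduce to two small steps: pull a string `reasonCode` out of a failed tool result, then append it to the recovery instruction. The sketch below isolates those steps. The helper names `extractReasonCode` and `buildRecoveryMessage` are assumptions introduced for illustration (the package inlines this logic), and `'no_match'` is a made-up reason code, since the real codes are not shown in this part of the diff.

```javascript
// Mirrors the inline guard added to runAgentTurn: only a string reasonCode is accepted.
function extractReasonCode(output) {
  return typeof output === 'object' && output != null && 'reasonCode' in output && typeof output.reasonCode === 'string'
    ? output.reasonCode
    : undefined;
}

// Mirrors the recovery message template, with the reason code added in parentheses when present.
function buildRecoveryMessage(path, reasonCode) {
  return `A previous edit failed for ${path}${reasonCode ? ` (${reasonCode})` : ''}. Before any further edit or bash inspection, call readFile on exactly ${path}. Bash/cat does not satisfy this recovery step.`;
}

const failedOutput = { ok: false, reasonCode: 'no_match', error: 'old text not found' };
console.log(buildRecoveryMessage('src/app.js', extractReasonCode(failedOutput)));
console.log(buildRecoveryMessage('src/app.js', extractReasonCode('plain string error')));
```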

package/dist/core/agent/compaction.js CHANGED

@@ -18,7 +18,9 @@ export function compactModelMessages(messages, options = {}) {
         return text ? `- ${message.role}: ${text.slice(0, 500)}` : '';
     }).filter(Boolean).join('\n');
     const summary = [
-        'Compacted prior Haze conversation. Continue preserving the user goal, constraints, decisions, files touched, validation results, and unresolved next steps from this summary.',
+        'Compacted prior Haze conversation. Treat this as continuity context, not a new user request.',
+        'Preserve especially: current user goal and success condition; explicit user constraints/preferences/decisions; files created/changed/read; validation commands and pass/fail results; blockers or pending product decisions; exact next action if work was unfinished.',
+        'Do not treat older tool outputs as current unless the recent conversation confirms they still apply.',
         options.instructions ? `User compaction instructions: ${options.instructions}` : undefined,
         '',
         'Older context summary:',

package/dist/core/goal/completionPolicy.d.ts CHANGED

@@ -13,6 +13,7 @@ export interface CompletionPolicyInput {
     validationToolFailed: boolean;
     editFileFailed: boolean;
     editRecoveryPath?: string;
+    editRecoveryReasonCode?: string;
 }
 export interface CompletionDecision {
     needsActionContinuation: boolean;
@@ -25,4 +26,4 @@ export interface CompletionDecision {
 export declare function completionDecision(input: CompletionPolicyInput): CompletionDecision;
 export declare function toolLoopBudgetPrompt(): string;
 export declare function postContinuationPrompt(): string;
-export declare function noTextAfterToolPrompt(allowTools: boolean): "Continue the original request now. If it asks for a change, edit or write the necessary files. If it asks to run or verify tests, run the command. Do not provide only a retrospective summary unless blocked." | "Continue from the tool result and answer my original request. Do not call tools. Summarize only current-turn changes and validation; do not recap unrelated earlier tasks.";
+export declare function noTextAfterToolPrompt(allowTools: boolean): "Continue the original request now. If it asks for a change, edit or write the necessary files. If it asks to run or verify tests, run the command. Do not provide only a retrospective summary unless blocked or needing a user decision." | "Continue from the tool result and answer my original request. Do not call tools. Use the final status template for implementation-like requests; summarize only current-turn changes and validation; do not recap unrelated earlier tasks.";

package/dist/core/goal/completionPolicy.js CHANGED

@@ -44,23 +44,30 @@ export function completionDecision(input) {
         && !requestCompletedByTools
         && !input.validationToolSucceeded
         && !assistantReportsBlocker;
+    const stateLines = [
+        `User goal: ${input.request}`,
+        input.editRecoveryPath ? `Edit recovery path: ${input.editRecoveryPath}` : undefined,
+        input.editRecoveryReasonCode ? `Edit failure reason: ${input.editRecoveryReasonCode}` : undefined,
+        input.mutatingToolSucceeded ? 'Files changed in this turn: yes' : 'Files changed in this turn: no',
+        input.validationToolSucceeded ? 'Validation status: passed' : input.validationToolFailed ? 'Validation status: failed' : 'Validation status: not run',
+    ].filter((line) => line !== undefined).join('\n');
     let continuationPrompt;
     if (input.editFileFailed) {
-        continuationPrompt = 'Your editFile attempt failed. Use the latest readFile line-numbered output and replaceLines to complete the requested change. Continue with any remaining tests or validation if relevant. Do not stop with a summary.';
+        continuationPrompt = `State:\n${stateLines}\n\nRequired next action: call readFile on the exact edit recovery path first. Then use the latest line-numbered output with replaceLines, or a corrected editFile call, to complete the requested change. Continue with relevant validation if practical. Do not stop with a summary while tools are available.`;
     }
     else if (input.validationToolFailed && input.mutatingToolSucceeded) {
-        continuationPrompt = 'Validation failed after files changed in this task. Inspect the failure output, fix failures that are plausibly caused by the current change, then rerun the relevant validation once. If the failure is clearly unrelated or environment-specific, summarize the blocker instead of expanding scope.';
+        continuationPrompt = `State:\n${stateLines}\n\nRequired next action: Validation failed after files changed in this task. Use the validation summary/output to inspect the first relevant failure, make one focused fix if it is plausibly caused by this change, then rerun the same relevant validation once. If it is an environment/dependency/unrelated failure, finish with Status: blocked or Status: partial and concrete evidence.`;
     }
     else if (needsValidationContinuation) {
         continuationPrompt = changedActionNeedsValidation
-            ? 'Files changed for this request, but no validation has run yet. Continue by running the smallest relevant test/check command you can identify from the project. If no practical validation exists, state that concrete blocker briefly instead of claiming the goal is complete.'
-            : 'You have not run the requested validation yet. Continue now by running the appropriate test/check command. Summarize only after the command finishes.';
+            ? `State:\n${stateLines}\n\nRequired next action: files changed for this request, but no validation has run. Run the smallest relevant test/typecheck/build command you can identify. If no practical validation exists, finish with the final status template and say why validation was not run.`
+            : `State:\n${stateLines}\n\nRequired next action: run the requested validation now. Summarize only after the command finishes.`;
     }
     else if (input.mutatingToolSucceeded && assistantAdmitsIncomplete) {
-        continuationPrompt = 'Your previous response says the current request is incomplete. Continue now with the remaining edits and validation for this same request. Do not summarize a plan unless blocked.';
+        continuationPrompt = `State:\n${stateLines}\n\nRequired next action: your previous response described unfinished work. Continue with the remaining in-scope edits and validation for this same request. Do not summarize a plan unless concretely blocked.`;
     }
     else if (needsActionContinuation) {
-        continuationPrompt = 'You inspected files but have not made the requested change yet. Continue now by editing or writing the necessary files. Do not summarize a plan unless blocked.';
+        continuationPrompt = `State:\n${stateLines}\n\nRequired next action: you inspected files but have not made the requested change yet. Edit or write the necessary files now. Do not summarize a plan unless concretely blocked.`;
     }
     return {
         needsActionContinuation,
@@ -72,13 +79,13 @@ export function completionDecision(input) {
     };
 }
 export function toolLoopBudgetPrompt() {
-    return 'Tool slice reached for this model step. Do not output XML, JSON tool-call syntax, <tool_call> blocks, or function-call markup. If the current request is complete, summarize only current-turn changes and validation. If the requested change is incomplete, state the next concrete unfinished action briefly so Haze can continue autonomously in a fresh tool slice. Do not claim tools are unavailable, recap unrelated earlier tasks, or provide a generic remains list.';
+    return 'Tool slice reached for this model step. Do not output XML, JSON tool-call syntax, <tool_call> blocks, or function-call markup. If the current request is complete, answer with the final status template using only current-turn changes and validation evidence. If incomplete, state the single next concrete unfinished action so Haze can continue autonomously in a fresh tool slice. Do not claim tools are unavailable, recap unrelated earlier tasks, or provide a generic remains list.';
 }
 export function postContinuationPrompt() {
-    return 'Your previous response still described unfinished work, missing validation, or a tool-budget issue. If any tools are still available, complete the remaining edit or run the final validation now. Only call something a blocker if a concrete tool failure prevents progress.';
+    return 'Your previous response still described unfinished work, missing validation, or a tool-budget issue. If tools are available, complete the remaining edit or run the final validation now. Only call something blocked for a concrete tool failure, missing dependency/permission, or unavoidable ambiguity.';
 }
 export function noTextAfterToolPrompt(allowTools) {
     return allowTools
-        ? 'Continue the original request now. If it asks for a change, edit or write the necessary files. If it asks to run or verify tests, run the command. Do not provide only a retrospective summary unless blocked.'
-        : 'Continue from the tool result and answer my original request. Do not call tools. Summarize only current-turn changes and validation; do not recap unrelated earlier tasks.';
+        ? 'Continue the original request now. If it asks for a change, edit or write the necessary files. If it asks to run or verify tests, run the command. Do not provide only a retrospective summary unless blocked or needing a user decision.'
+        : 'Continue from the tool result and answer my original request. Do not call tools. Use the final status template for implementation-like requests; summarize only current-turn changes and validation; do not recap unrelated earlier tasks.';
 }

package/dist/core/safety/bashClassifier.d.ts ADDED

@@ -0,0 +1,10 @@
+export type BashRiskLevel = 'read_only' | 'mutating' | 'destructive' | 'network' | 'unknown';
+export type BashTrait = 'reads_files' | 'writes_files' | 'deletes_files' | 'installs_dependencies' | 'runs_tests' | 'runs_build' | 'uses_network' | 'changes_git_state' | 'changes_permissions';
+export type BashClassification = {
+    riskLevel: BashRiskLevel;
+    traits: BashTrait[];
+    confidence: 'high' | 'medium' | 'low';
+    reason: string;
+};
+export declare function classifyBashCommand(command: string): BashClassification;
+export declare function isValidationClassification(classification: BashClassification): boolean;
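
Note on the declarations above: `BashRiskLevel` has no `validation` member, so a validation command appears to be identified through its traits rather than its risk level. The sketch below illustrates that reading. The types are copied from the declaration file; `isValidation` is a local stand-in that mirrors the one-line body of `isValidationClassification` in `bashClassifier.js`, and the two sample objects are hypothetical values, not output captured from the package.

```typescript
// Types copied from bashClassifier.d.ts.
type BashRiskLevel = 'read_only' | 'mutating' | 'destructive' | 'network' | 'unknown';
type BashTrait = 'reads_files' | 'writes_files' | 'deletes_files' | 'installs_dependencies'
    | 'runs_tests' | 'runs_build' | 'uses_network' | 'changes_git_state' | 'changes_permissions';
type BashClassification = {
    riskLevel: BashRiskLevel;
    traits: BashTrait[];
    confidence: 'high' | 'medium' | 'low';
    reason: string;
};

// Local stand-in (assumption): same check as isValidationClassification in the .js file.
function isValidation(c: BashClassification): boolean {
    return c.traits.includes('runs_tests') || c.traits.includes('runs_build');
}

// Hypothetical sample values shaped like the classifier's return objects.
const testRun: BashClassification = {
    riskLevel: 'read_only', traits: ['runs_tests'], confidence: 'high', reason: 'validation command',
};
const inspection: BashClassification = {
    riskLevel: 'read_only', traits: ['reads_files'], confidence: 'high', reason: 'read-only inspection command',
};

console.log(isValidation(testRun));    // true
console.log(isValidation(inspection)); // false
```

Both samples share the `read_only` risk level; only the trait list distinguishes a validation run from plain inspection.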

package/dist/core/safety/bashClassifier.js ADDED

@@ -0,0 +1,51 @@
+function has(command, pattern) {
+    return pattern.test(command);
+}
+function uniq(values) {
+    return [...new Set(values)];
+}
+export function classifyBashCommand(command) {
+    const trimmed = command.trim();
+    const traits = [];
+    const lower = trimmed.toLowerCase();
+    const complex = /[`$()]|\b(eval|xargs|sh\s+-c|bash\s+-c)\b/.test(trimmed);
+    if (!trimmed) {
+        return { riskLevel: 'unknown', traits: [], confidence: 'high', reason: 'empty command' };
+    }
+    if (has(lower, /(^|[;&|]\s*)(rm\b|rm\s+-|git\s+reset\s+--hard\b|git\s+clean\b|git\s+restore\s+\.|git\s+checkout\s+--\b)/) || has(lower, /push\b.*--force|drop\s+database|truncate\s+table/)) {
+        if (has(lower, /\brm\b|git\s+clean|git\s+restore|git\s+checkout\s+--|drop\s+database|truncate\s+table/))
+            traits.push('deletes_files');
+        if (has(lower, /\bgit\b/))
+            traits.push('changes_git_state');
+        return { riskLevel: 'destructive', traits: uniq(traits), confidence: complex ? 'medium' : 'high', reason: 'command can delete files or irreversibly change repository state' };
+    }
+    if (has(lower, /(^|[;&|]\s*)(curl\b|wget\b|scp\b|ssh\b|npm\s+(install|i|add)\b|pnpm\s+(install|add)\b|yarn\s+(add|install)\b|pip\s+install\b|brew\s+install\b)/)) {
+        traits.push('uses_network');
+        if (has(lower, /\b(npm|pnpm|yarn|pip|brew)\b/))
+            traits.push('installs_dependencies', 'writes_files');
+        return { riskLevel: has(lower, /\b(curl|wget|scp|ssh)\b/) && !has(lower, /\binstall|\badd\b/) ? 'network' : 'mutating', traits: uniq(traits), confidence: complex ? 'medium' : 'high', reason: 'command uses the network or installs dependencies' };
+    }
+    if (has(trimmed, /(^|\s)(>|>>)(\s|\S)/) || has(lower, /(^|[;&|]\s*)(sed\s+-i|perl\s+-pi|tee\b|chmod\b|mv\b|cp\b|mkdir\b|touch\b|git\s+(add|commit|merge|rebase|checkout|restore)\b)/) || has(trimmed, /\b(File\.write|writeFileSync|writeFile|appendFileSync|appendFile)\b/)) {
+        traits.push('writes_files');
+        if (has(lower, /\bchmod\b/))
+            traits.push('changes_permissions');
+        if (has(lower, /\bgit\b/))
+            traits.push('changes_git_state');
+        return { riskLevel: 'mutating', traits: uniq(traits), confidence: complex ? 'medium' : 'high', reason: 'command can modify files or repository state' };
+    }
+    if (has(lower, /(^|[;&|]\s*)(npm\s+test|npm\s+run\s+(test|typecheck|lint|build)|pnpm\s+(test|run\s+(test|typecheck|lint|build))|yarn\s+(test|run\s+(test|typecheck|lint|build))|vitest\b|jest\b|tsc\b|eslint\b)/)) {
+        if (has(lower, /test|vitest|jest/))
+            traits.push('runs_tests');
+        if (has(lower, /build|tsc|typecheck|lint|eslint/))
+            traits.push('runs_build');
+        return { riskLevel: 'read_only', traits: uniq(traits), confidence: complex ? 'medium' : 'high', reason: 'validation command' };
+    }
+    if (has(lower, /(^|[;&|]\s*)(git\s+(status|diff|log|show|branch)\b|rg\b|grep\b|find\b|ls\b|pwd\b|cat\b|head\b|tail\b|node\s+--version|npm\s+--version|which\b)/)) {
+        traits.push('reads_files');
+        return { riskLevel: complex ? 'unknown' : 'read_only', traits: uniq(traits), confidence: complex ? 'low' : 'high', reason: complex ? 'read-like command with complex shell syntax' : 'read-only inspection command' };
+    }
+    return { riskLevel: 'unknown', traits: [], confidence: 'low', reason: 'command did not match known safe patterns' };
+}
+export function isValidationClassification(classification) {
+    return classification.traits.includes('runs_tests') || classification.traits.includes('runs_build');
+}
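
Note on the command-position anchor: most patterns in `classifyBashCommand` start with `(^|[;&|]\s*)`, which only matches a keyword at the start of the string or directly after a `;`, `&`, or `|` separator. The sketch below exercises the first destructive pattern in isolation to show what that anchor does and does not catch. The regular expression is copied from the file above; `matchesDestructivePrefix` and the sample commands are hypothetical, and the sketch covers only this one pattern, not the full classifier.

```typescript
// Copied from the first destructive check in classifyBashCommand.
const DESTRUCTIVE_PREFIX =
    /(^|[;&|]\s*)(rm\b|rm\s+-|git\s+reset\s+--hard\b|git\s+clean\b|git\s+restore\s+\.|git\s+checkout\s+--\b)/;

// Hypothetical helper: trims and lowercases the command, as the classifier does, then tests the pattern.
function matchesDestructivePrefix(command: string): boolean {
    return DESTRUCTIVE_PREFIX.test(command.trim().toLowerCase());
}

console.log(matchesDestructivePrefix('rm -rf build'));            // true  (start of command)
console.log(matchesDestructivePrefix('ls && rm -rf build'));      // true  (after a separator)
console.log(matchesDestructivePrefix('git reset --hard HEAD~1')); // true
console.log(matchesDestructivePrefix('grep rm notes.txt'));       // false (rm is an argument)
console.log(matchesDestructivePrefix('rmdir old'));               // false (\b rejects the longer word)
console.log(matchesDestructivePrefix('sudo rm -rf build'));       // false (rm is not in command position)
```

The last case suggests that a command behind a wrapper such as `sudo` is not caught by this pattern; in the full function such input would have to match a later branch or fall through to the final `unknown` result.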

package/dist/core/subagent/subagentRunner.d.ts CHANGED

@@ -28,6 +28,6 @@ export declare function createSubagentTool(options: {
     contextFiles: ContextFile[];
 }): import("ai").Tool<{
     task: string;
-    tools?: ("editFile" | "replaceLines" | "writeFile" | "listFiles" | "readFile" | "grep" | "bash")[] | undefined;
+    tools?: ("listFiles" | "editFile" | "replaceLines" | "writeFile" | "readFile" | "grep" | "bash")[] | undefined;
     maxSteps?: number | undefined;
 }, SubagentResult>;

package/dist/core/subagent/subagentRunner.js CHANGED

@@ -2,15 +2,17 @@ import { streamText, stepCountIs, tool } from 'ai';
 import { z } from 'zod';
 import { buildSystemPrompt } from '../../llm/systemPrompt.js';
 import { hazeTools } from '../../llm/hazeTools.js';
-const SUBAGENT_SYSTEM_PROMPT = `You are a focused subagent. Complete the assigned task using all tools available to you, then return a clear summary.
+const SUBAGENT_SYSTEM_PROMPT = `You are a focused subagent for a professional developer workflow. Complete the assigned task autonomously using the available tools, then return a clear summary.
 Rules:
-- Use whatever tools you need. You have full access to file tools and bash.
-- If the task requires creating or modifying files, do it directly — do not ask for permission.
-- If a tool fails, read the file again and retry with the correct content.
-- After completing the task, summarize what you did, what files you created or changed, and any important findings.
+- Use whatever tools you need within the assigned scope. You have full access to file tools and bash.
+- If the task requires creating or modifying files, do it directly with file tools or an efficient non-destructive shell command; do not ask for permission for ordinary edits.
+- Assume expert users understand what they asked for. Do not ask for command confirmations; run relevant commands directly.
+- Destructive and ordinary professional workflows should proceed without extra ceremony when they are relevant to the assigned task.
+- If a file edit tool fails, read the exact file again and retry once with current content or line numbers.
+- After completing the task, summarize what you did, what files you created or changed, validation run, and important findings.
 - If you cannot complete the task, explain exactly what blocked you and what you tried.
-- Your summary is all the parent agent will see. Be specific: include file paths, function names, and concrete results.`;
+- Your summary is all the parent agent will see. Be specific: include file paths, function names, command results, and concrete next steps.`;
 const ALL_TOOLS = ['listFiles', 'readFile', 'grep', 'bash', 'editFile', 'replaceLines', 'writeFile'];
 const STEP_LIMIT = 25;
 const MAX_SUMMARY = 4000;
@@ -56,7 +58,6 @@ export async function runSubagent(task, options) {
     try {
         const result = streamText({
             model: options.model,
-            temperature: 0,
             maxOutputTokens: 4096,
             system: `${SUBAGENT_SYSTEM_PROMPT}\n\n${buildSystemPrompt(options.contextFiles)}`,
             messages: [{ role: 'user', content: task }],
@@ -71,7 +72,7 @@ export async function runSubagent(task, options) {
                     return {
                         toolChoice: 'none',
                         messages: [
-                            { role: 'user', content: 'You have done enough tool work. Summarize what you found or did right now.' },
+                            { role: 'user', content: 'Tool budget reached for this subtask. Summarize what you found or changed, validation evidence, and the exact remaining action if incomplete. Do not claim tools are unavailable.' },
                         ],
                     };
                 }

package/dist/core/validation/outputParser.d.ts ADDED

@@ -0,0 +1,12 @@
+import type { BashClassification } from '../safety/bashClassifier.js';
+import type { ValidationSummary } from '../../llm/toolResultTypes.js';
+export declare function parseValidationOutput(input: {
+    command: string;
+    code: number | null;
+    stdout: string;
+    stderr: string;
+    timedOut?: boolean;
+    stdoutTruncated?: boolean;
+    stderrTruncated?: boolean;
+    classification?: BashClassification;
+}): ValidationSummary;

package/dist/core/validation/outputParser.js ADDED

@@ -0,0 +1,79 @@
+function uniq(values) {
+    return [...new Set(values.filter(Boolean))];
+}
+function inferKind(command, classification) {
+    const lower = command.toLowerCase();
+    if (/typecheck|\btsc\b/.test(lower))
+        return 'typecheck';
+    if (/\beslint\b|\blint\b/.test(lower))
+        return 'lint';
+    if (/\bbuild\b/.test(lower) || classification?.traits.includes('runs_build'))
+        return 'build';
+    if (/\b(test|vitest|jest)\b/.test(lower) || classification?.traits.includes('runs_tests'))
+        return 'test';
+    return 'generic';
+}
+export function parseValidationOutput(input) {
+    const text = `${input.stdout}\n${input.stderr}`;
+    const lines = text.split(/\r?\n/);
+    const diagnostics = [];
+    const failedTests = [];
+    const failedFiles = [];
+    const kind = inferKind(input.command, input.classification);
+    const status = input.timedOut ? 'timed_out' : input.code === 0 ? 'passed' : input.code == null ? 'unknown' : 'failed';
+    for (const line of lines) {
+        const ts = line.match(/^(.+?\.(?:ts|tsx|js|jsx|mts|cts))\((\d+),(\d+)\):\s+(error|warning)\s+TS\d+:\s+(.+)$/);
+        if (ts) {
+            const [, file, lineNo, column, severity, message] = ts;
+            diagnostics.push({ file, line: Number(lineNo), column: Number(column), severity: severity === 'warning' ? 'warning' : 'error', message: message ?? '' });
+            failedFiles.push(file ?? '');
+            continue;
+        }
+        const eslint = line.match(/^(.+?\.(?:ts|tsx|js|jsx|mts|cts))\s*$/);
+        if (eslint) {
+            const currentFile = eslint[1] ?? '';
+            const next = lines[lines.indexOf(line) + 1];
+            if (next && /^\s*\d+:\d+\s+/.test(next))
+                failedFiles.push(currentFile);
+        }
+        const eslintDiag = line.match(/^\s*(\d+):(\d+)\s+(error|warning)\s+(.+?)(?:\s{2,}\S+)?$/);
+        if (eslintDiag) {
+            const [, lineNo, column, severity, message] = eslintDiag;
+            diagnostics.push({ line: Number(lineNo), column: Number(column), severity: severity === 'warning' ? 'warning' : 'error', message: message ?? '' });
+            continue;
+        }
+        const vitestFile = line.match(/^\s*(?:FAIL|FAILED|✓|✗|❯)?\s*([^\s]+\.(?:test|spec)\.(?:ts|tsx|js|jsx))/i);
+        if (vitestFile)
+            failedFiles.push(vitestFile[1] ?? '');
+        const testName = line.match(/^\s*(?:FAIL|✗|×|●|-)\s+(.+)$/);
+        if (testName && !/^(FAIL|FAILED)\s+\S+\.(?:test|spec)\./i.test(line.trim()))
+            failedTests.push((testName[1] ?? '').trim());
+        const genericFile = line.match(/([^\s()]+\.(?:ts|tsx|js|jsx|mts|cts)):(\d+):(\d+)/);
+        if (genericFile) {
+            const [, file, lineNo, column] = genericFile;
+            failedFiles.push(file ?? '');
+            diagnostics.push({ file, line: Number(lineNo), column: Number(column), severity: /warn/i.test(line) ? 'warning' : 'error', message: line.trim() });
+        }
+    }
+    const uniqueFiles = uniq(failedFiles).slice(0, 10);
+    const uniqueTests = uniq(failedTests).slice(0, 10);
+    const diagCount = diagnostics.length;
+    const rawOutputTruncated = Boolean(input.stdoutTruncated || input.stderrTruncated);
+    let summaryText;
+    if (status === 'passed')
+        summaryText = `${kind} passed`;
+    else if (status === 'timed_out')
+        summaryText = `${kind} timed out`;
+    else if (uniqueTests.length > 0)
+        summaryText = `${kind} failed: ${uniqueTests.length} failed test${uniqueTests.length === 1 ? '' : 's'}${uniqueFiles.length ? ` in ${uniqueFiles.join(', ')}` : ''}`;
+    else if (diagCount > 0)
+        summaryText = `${kind} failed: ${diagCount} diagnostic${diagCount === 1 ? '' : 's'}${uniqueFiles.length ? ` in ${uniqueFiles.join(', ')}` : ''}`;
+    else
+        summaryText = `${kind} ${status}`;
+    const suggestedNextStep = status === 'failed'
+        ? uniqueFiles.length > 0
+            ? `Inspect ${uniqueFiles.slice(0, 3).join(', ')} and fix the first relevant failure.`
+            : 'Inspect the command output and fix the first relevant failure.'
+        : undefined;
+    return { kind, status, failedFiles: uniqueFiles, failedTests: uniqueTests, diagnostics: diagnostics.slice(0, 20), summaryText, suggestedNextStep, rawOutputTruncated };
+}
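
Note on the parser above: two pieces of `parseValidationOutput` are easy to check in isolation, the status precedence (timeout first, then exit code) and the `tsc`-style diagnostic pattern. The sketch below restates both. The regular expression is copied from the file; `inferStatus`, `parseTscLine`, the `Status` alias, and the sample line are hypothetical names and values introduced for illustration, and the `Status` union is inferred from the expression in the code rather than from `toolResultTypes.d.ts`.

```typescript
// Assumed union, inferred from the status expression in parseValidationOutput.
type Status = 'passed' | 'failed' | 'timed_out' | 'unknown';

// Same precedence as the original ternary: timeout wins, then exit code 0, then a missing code.
function inferStatus(code: number | null, timedOut?: boolean): Status {
    if (timedOut) return 'timed_out';
    if (code === 0) return 'passed';
    if (code == null) return 'unknown';
    return 'failed';
}

// Copied from the TypeScript-diagnostic match in parseValidationOutput.
const TS_DIAGNOSTIC =
    /^(.+?\.(?:ts|tsx|js|jsx|mts|cts))\((\d+),(\d+)\):\s+(error|warning)\s+TS\d+:\s+(.+)$/;

// Hypothetical helper: turns one tsc output line into a diagnostic-like object, or undefined.
function parseTscLine(line: string) {
    const m = line.match(TS_DIAGNOSTIC);
    if (!m) return undefined;
    const [, file, lineNo, column, severity, message] = m;
    return { file, line: Number(lineNo), column: Number(column), severity, message };
}

const sample = "src/index.ts(12,5): error TS2322: Type 'string' is not assignable to type 'number'.";
console.log(parseTscLine(sample));
// file 'src/index.ts', line 12, column 5, severity 'error'

console.log(inferStatus(0, true)); // timed_out (a timeout overrides a zero exit code)
console.log(inferStatus(null));    // unknown
console.log(inferStatus(2));       // failed
```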

package/dist/llm/hazeTools.d.ts CHANGED

@@ -1,9 +1,4 @@
-type ToolDiffLine = {
-    type: 'add' | 'remove' | 'context';
-    oldLine?: number;
-    newLine?: number;
-    text: string;
-};
+import type { ToolDiffLine, ToolFailureReasonCode } from './toolResultTypes.js';
 export declare const hazeTools: {
     listFiles: import("ai").Tool<{
         path: string;
@@ -16,8 +11,11 @@ export declare const hazeTools: {
         toolName: string;
         path: string | undefined;
         error: string;
+        reasonCode: ToolFailureReasonCode | undefined;
         recoverable: boolean;
         suggestedNextStep: string;
+        recoveryTool: string | undefined;
+        recoveryInput: unknown;
     } | {
         ok: true;
         duplicateSkipped: true;
@@ -47,8 +45,11 @@ export declare const hazeTools: {
         toolName: string;
         path: string | undefined;
         error: string;
+        reasonCode: ToolFailureReasonCode | undefined;
         recoverable: boolean;
         suggestedNextStep: string;
+        recoveryTool: string | undefined;
+        recoveryInput: unknown;
     } | {
         ok: true;
         duplicateSkipped: true;
@@ -85,8 +86,11 @@ export declare const hazeTools: {
         toolName: string;
         path: string | undefined;
         error: string;
+        reasonCode: ToolFailureReasonCode | undefined;
         recoverable: boolean;
         suggestedNextStep: string;
+        recoveryTool: string | undefined;
+        recoveryInput: unknown;
     } | {
         ok: true;
         duplicateSkipped: true;
@@ -117,8 +121,11 @@ export declare const hazeTools: {
         toolName: string;
         path: string | undefined;
         error: string;
+        reasonCode: ToolFailureReasonCode | undefined;
         recoverable: boolean;
         suggestedNextStep: string;
+        recoveryTool: string | undefined;
+        recoveryInput: unknown;
     } | {
         ok: true;
         duplicateSkipped: true;
@@ -148,8 +155,11 @@ export declare const hazeTools: {
         toolName: string;
         path: string | undefined;
         error: string;
+        reasonCode: ToolFailureReasonCode | undefined;
         recoverable: boolean;
         suggestedNextStep: string;
+        recoveryTool: string | undefined;
+        recoveryInput: unknown;
     } | {
         ok: true;
         duplicateSkipped: true;
@@ -173,8 +183,11 @@ export declare const hazeTools: {
         toolName: string;
         path: string | undefined;
         error: string;
+        reasonCode: ToolFailureReasonCode | undefined;
         recoverable: boolean;
         suggestedNextStep: string;
+        recoveryTool: string | undefined;
+        recoveryInput: unknown;
     } | {
         ok: true;
         duplicateSkipped: true;
@@ -197,4 +210,3 @@ export declare const hazeTools: {
     }, unknown>;
 };
 export type HazeTools = typeof hazeTools;
-export {};