npm - @workermill/agent - Versions diffs - 0.4.7 → 0.5.0 - Mend

@workermill/agent 0.4.7 → 0.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (4)

package/dist/plan-validator.d.ts CHANGED

@@ -39,7 +39,7 @@ export interface CriticResult {
         suggestedChanges?: string[];
     }>;
 }
-declare const AUTO_APPROVAL_THRESHOLD = 75;
+declare const AUTO_APPROVAL_THRESHOLD = 80;
 /**
  * Parse execution plan JSON from raw Claude CLI output.
  * Mirrors server-side parseExecutionPlan() in planning-agent-local.ts.

package/dist/plan-validator.js CHANGED

@@ -16,7 +16,7 @@ import { generateText } from "./providers.js";
 // CONSTANTS
 // ============================================================================
 const MAX_TARGET_FILES = 5;
-const AUTO_APPROVAL_THRESHOLD = 75;
+const AUTO_APPROVAL_THRESHOLD = 80;
 // ============================================================================
 // PLAN PARSING
 // ============================================================================
@@ -100,7 +100,7 @@ Review this execution plan against the PRD:
 1. **Missing Requirements** - Does the plan cover what the PRD asks for?
 2. **Vague Instructions** - Will the worker know what to do?
 3. **Security Issues** - Only for tasks involving auth, user data, or external input
-4. **Unrealistic Scope** - Any step targeting >3 files MUST score below 85 (auto-rejection threshold). Each step should modify at most 3 files. If a step needs more, split it into multiple steps first.
+4. **Unrealistic Scope** - Any step targeting >3 files MUST score below 80 (auto-rejection threshold). Each step should modify at most 3 files. If a step needs more, split it into multiple steps first.
 5. **Missing Operational Steps** - If the PRD requires deployment, provisioning, migrations, or running commands, does the plan include operational steps? Writing code is not the same as deploying it.
 6. **Overlapping File Scope** - If two or more steps share the same targetFiles, this causes parallel merge conflicts. Steps MUST NOT overlap on targetFiles. Deduct 10 points per shared file across steps.
@@ -117,7 +117,7 @@ Respond with ONLY a JSON object (no markdown, no explanation):
 {"approved": boolean, "score": number, "risks": ["risk1", "risk2"], "suggestions": ["suggestion1", "suggestion2"], "storyFeedback": [{"storyId": "step-0", "feedback": "specific feedback", "suggestedChanges": ["change1"]}]}
 Rules:
-- approved = true if score >= 85 AND plan is right-sized for task
+- approved = true if score >= 80 AND plan is right-sized for task
 - risks = specific issues (empty array if none)
 - suggestions = actionable improvements (empty array if none)
 - storyFeedback = per-step feedback (optional, only for steps that need changes)`;
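
Note on the hunks above: in 0.4.7 the constant was 75 while the critic prompt quoted 85; in 0.5.0 both read 80. The following is a minimal sketch, not the package's code, of how such a threshold could gate a critic result, together with the two file-scope rules stated in the prompt (at most 3 target files per step, no shared targetFiles across steps). Only `AUTO_APPROVAL_THRESHOLD`, `approved`, `score`, and `targetFiles` come from the diff; `MAX_FILES_PER_STEP`, `isAutoApproved`, `findScopeIssues`, and the step shape with an `id` field are hypothetical names chosen for illustration.

```javascript
// Value taken from the 0.5.0 side of the diff.
const AUTO_APPROVAL_THRESHOLD = 80;
// Per-step limit quoted in the critic prompt; the constant name is hypothetical.
const MAX_FILES_PER_STEP = 3;

// Hypothetical helper: auto-approve only when the critic approved the plan
// and the score meets the threshold.
function isAutoApproved(result) {
  return result.approved === true && result.score >= AUTO_APPROVAL_THRESHOLD;
}

// Hypothetical helper: report steps that exceed the per-step file limit and
// files that appear in the targetFiles of more than one step.
function findScopeIssues(steps) {
  const oversized = steps
    .filter((step) => (step.targetFiles ?? []).length > MAX_FILES_PER_STEP)
    .map((step) => step.id);

  // Map each file to the ids of the steps that target it.
  const owners = new Map();
  for (const step of steps) {
    for (const file of new Set(step.targetFiles ?? [])) {
      owners.set(file, [...(owners.get(file) ?? []), step.id]);
    }
  }
  const shared = [...owners]
    .filter(([, ids]) => ids.length > 1)
    .map(([file]) => file);

  return { oversized, shared };
}

// Usage: step-1 targets four files, and b.js is claimed by both steps.
const examplePlan = [
  { id: "step-0", targetFiles: ["a.js", "b.js"] },
  { id: "step-1", targetFiles: ["b.js", "c.js", "d.js", "e.js"] },
];
console.log(findScopeIssues(examplePlan));
console.log(isAutoApproved({ approved: true, score: 79 }));
```

A deterministic check like this could run alongside the LLM critic, so the file-scope rules would not depend solely on the model following the prompt; whether the package does so is not visible in this diff.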

package/dist/planner.js CHANGED

@@ -390,42 +390,65 @@ function runAnalyst(name, claudePath, model, prompt, repoPath, env, timeoutMs =
     });
 }
 /** Analyst prompt templates */
-const CODEBASE_ANALYST_PROMPT = `You are analyzing a codebase to help plan a development task.
-Use Glob and Read to explore the repository structure.
-Report:
-1. Key directories and their purposes
-2. Frameworks, languages, and patterns used
-3. Existing test patterns and locations
-4. CI/CD configuration
-5. Key configuration files (.env, tsconfig, etc.)
-Keep your report under 2000 words. Focus on facts, not opinions.`;
+const CODEBASE_ANALYST_PROMPT = `You are a codebase analyst. Your job is to explore this repository using tools and report what you find.
+IMPORTANT: You MUST use tools to explore the repository. Do NOT guess or make assumptions.
+Step 1: Run Glob with pattern "**/*" to see the top-level directory structure.
+Step 2: Read key files: package.json, tsconfig.json, README.md, .env.example, or equivalents.
+Step 3: Run Glob on src/ or the main source directory to understand the code layout.
+Step 4: Read 2-3 representative source files to understand patterns and frameworks.
+After exploring, write a report covering:
+1. Directory structure and organization
+2. Languages, frameworks, and key dependencies (from package.json, requirements.txt, etc.)
+3. Existing test files and testing patterns (search for test/, __tests__, *.test.*, *.spec.*)
+4. CI/CD configuration (search for .github/workflows/, Jenkinsfile, etc.)
+5. Configuration files and environment setup
+Keep your report under 2000 words. Only report facts you verified with tools.`;
 function makeRequirementsAnalystPrompt(task) {
-    return `Given this task description:
+    return `You are a requirements analyst. Analyze the following task and the repository to identify what needs to be built.
-Title: ${task.summary}
+Task: ${task.summary}
 ${task.description ? `\nDescription:\n${task.description}` : ""}
-Analyze the requirements and report:
-1. Explicit acceptance criteria (what MUST be done)
-2. Implicit requirements (what's assumed but not stated)
-3. Ambiguities that could lead to wrong implementation
-4. Affected components based on the requirement scope
-5. Suggested personas for each component
+IMPORTANT: You MUST use tools to understand the existing codebase before analyzing requirements.
+Step 1: Run Glob with pattern "**/*" to see what already exists in the repository.
+Step 2: Read any existing README, docs, or configuration to understand the current state.
+Step 3: Search for any code related to the task requirements using Grep.
+After exploring, write a report covering:
+1. Explicit acceptance criteria — what MUST be built based on the description
+2. Implicit requirements — what's assumed but not stated (auth, error handling, etc.)
+3. What already exists vs what needs to be created (based on your file exploration)
+4. Ambiguities that could lead to wrong implementation
+5. Suggested components/modules and which persona should own each
 Keep your report under 1500 words.`;
 }
 function makeRiskAssessorPrompt(task) {
-    return `You are assessing risks for a development task on this codebase.
-The task: ${task.summary}
+    return `You are a risk assessor. Your job is to search this repository for potential risks and blockers for a development task.
+Task: ${task.summary}
 ${task.description ? `\nDescription:\n${task.description}` : ""}
-Use Grep and Read to check for potential blockers.
-Report:
-1. Files likely to be modified (search for relevant code)
-2. Files that are heavily coupled (imports/dependencies)
-3. Existing tests that may need updating
-4. Environment/config dependencies
-5. Migration or deployment considerations
-Keep your report under 1500 words.`;
+IMPORTANT: You MUST use tools to search the codebase. Do NOT guess file paths or make assumptions.
+Step 1: Run Glob with pattern "**/*" to see the full repository structure.
+Step 2: Use Grep to search for code related to the task (relevant keywords, APIs, components).
+Step 3: Read files that are likely to be modified or affected by this task.
+Step 4: Search for existing tests (Grep for "test", "spec", "describe", "it(") to find test coverage.
+After exploring, write a report covering:
+1. Specific files that will need to be modified (exact paths from your search)
+2. Files with heavy coupling or shared dependencies (imports you found)
+3. Existing tests that will need updating (exact file paths)
+4. Environment, config, or migration requirements
+5. Deployment or infrastructure risks
+Keep your report under 1500 words. Only report facts you verified with tools.`;
 }
 /**
  * Run team planning: spawn 3 parallel analyst agents, then synthesize
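
The trailing comment describes spawning three analysts in parallel and then synthesizing their reports. A minimal sketch of that fan-out pattern follows, assuming each analyst is an async call that resolves to a text report. `fakeRunAnalyst` and `runTeamAnalysis` are hypothetical names; the package's real `runAnalyst` takes more parameters (its signature is truncated in the hunk header), and it is not reproduced here. Tolerating a failed analyst via `Promise.allSettled` is also an assumption, not something the diff confirms.

```javascript
// Hypothetical stand-in for the package's runAnalyst(); this stub only
// returns a string and does not start any external process.
async function fakeRunAnalyst(name, prompt) {
  return `${name}: report for a prompt of ${prompt.length} characters`;
}

// Run the three analysts concurrently and collect one report per analyst.
// A rejected analyst yields a placeholder instead of aborting the run
// (an assumption made for this sketch).
async function runTeamAnalysis(task, run = fakeRunAnalyst) {
  const analysts = [
    ["codebase", "Explore the repository and report verified facts."],
    ["requirements", `Task: ${task.summary}`],
    ["risk", `Task: ${task.summary}`],
  ];

  const settled = await Promise.allSettled(
    analysts.map(([name, prompt]) => run(name, prompt))
  );

  return Object.fromEntries(
    settled.map((outcome, i) => [
      analysts[i][0],
      outcome.status === "fulfilled"
        ? outcome.value
        : `(analyst failed: ${outcome.reason})`,
    ])
  );
}

// Usage: prints an object with one report per analyst.
runTeamAnalysis({ summary: "Add login page" }).then((reports) => {
  console.log(reports);
});
```

The three prompts in 0.5.0 all end with a word limit, which keeps the combined input to a later synthesis step bounded regardless of how the reports are gathered.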

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@workermill/agent",
-  "version": "0.4.7",
+  "version": "0.5.0",
   "description": "WorkerMill Remote Agent - Run AI workers locally with your Claude Max subscription",
   "type": "module",
   "main": "./dist/index.js",