npm - prjct-cli - Versions diffs - 1.7.1 → 1.7.3 - Mend

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11)

package/CHANGELOG.md +59 -0
package/core/__tests__/agentic/injection-validator.test.ts +255 -0
package/core/__tests__/workflow/state-machine.test.ts +216 -0
package/core/agentic/injection-validator.ts +192 -0
package/core/agentic/prompt-builder.ts +35 -30
package/core/schemas/state.ts +6 -1
package/core/storage/state-storage.ts +298 -30
package/core/utils/next-steps.ts +2 -0
package/core/workflow/state-machine.ts +23 -10
package/dist/bin/prjct.mjs +534 -210
package/package.json +1 -1

package/CHANGELOG.md CHANGED

@@ -1,5 +1,64 @@
 # Changelog
+## [1.7.3] - 2026-02-07
+### Bug Fixes
+- add Zod validation and token budgets for prompt injection (PRJ-282) (#142)
+## [1.7.3] - 2026-02-07
+### Bug Fixes
+- **Validate auto-injected state in prompt builder (PRJ-282)**: Added `safeInject()` validation utility, token-aware truncation via `InjectionBudgetTracker`, and domain-based skill filtering to prevent oversized or irrelevant content in LLM prompts. Replaced hardcoded character limits with configurable token budgets.
+### Implementation Details
+- Created `core/agentic/injection-validator.ts` with `safeInject()`, `safeInjectString()`, `truncateToTokenBudget()`, `estimateTokens()`, `filterSkillsByDomains()`, and `InjectionBudgetTracker` class
+- Wired validation into `prompt-builder.ts`: auto-context truncation, agent/skill token budgets, cumulative state budget tracking
+- Skills filtered by detected task domains before injection to reduce token waste
+- 33 new unit tests covering all validation, filtering, and truncation paths
+### Test Plan
+#### For QA
+1. Run `bun test` — all 526 tests pass (33 new)
+2. Verify `safeInject()` returns fallback on corrupt data
+3. Verify `filterSkillsByDomains()` excludes irrelevant skills
+4. Verify `InjectionBudgetTracker` enforces cumulative limits
+#### For Users
+- No user-facing changes — validation is automatic
+- Breaking changes: None
+## [1.7.2] - 2026-02-07
+### Bug Fixes
+- add missing state machine transitions and dead-end states (PRJ-280) (#141)
+## [1.7.2] - 2026-02-07
+### Bug Fix
+- **Fix state machine completeness: missing transitions and dead-end states (PRJ-280)**: Added missing transitions (`completed → pause`, `paused → ship`, `completed → reopen`), subtask states (`skipped`, `blocked` with reason tracking), migrated `previousTask` to `pausedTasks[]` array with max limit (5) and staleness detection (30 days), and enforced all transitions through the state machine at the storage level.
+### Implementation Details
+Added `reopen` command to `WorkflowCommand` type. Updated `getCurrentState()` to detect paused state from `pausedTasks[]` array and legacy `previousTask`. `failSubtask()` now advances to the next subtask instead of halting. New `skipSubtask(reason)` and `blockSubtask(blocker)` methods mark subtasks and advance. `pauseTask()` pushes onto a `pausedTasks[]` array (max 5), `resumeTask()` pops from array or by ID. `getPausedTasksFromState()` handles backward compat by migrating legacy `previousTask` format. All storage mutation methods (`startTask`, `completeTask`, `pauseTask`, `resumeTask`) validate transitions through the state machine before executing.
+### Test Plan
+#### For QA
+1. Verify `completed → pause`, `paused → ship`, and `completed → reopen` transitions work
+2. Start a task with subtasks, call `failSubtask()` — verify it records reason AND advances to next subtask
+3. Call `skipSubtask(reason)` and `blockSubtask(blocker)` — verify they record reasons and advance
+4. Pause 3+ tasks — verify `pausedTasks[]` array stores all, respects max limit of 5
+5. State.json with old `previousTask` format — verify auto-migration into array
+6. Attempt invalid transition (e.g., `done` from `idle`) — verify error thrown at storage level
+#### For Users
+**What changed:** Workflow supports reopening completed tasks, shipping paused tasks directly, and multiple paused tasks. Subtask failures auto-advance instead of halting.
+**Breaking changes:** `previousTask` deprecated in favor of `pausedTasks[]`. Backward compat maintained via auto-migration.
 ## [1.7.1] - 2026-02-07
 ### Bug Fixes
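
The 1.7.2 entry above describes a `pausedTasks[]` array with a limit of 5, staleness detection at 30 days, and migration of the legacy `previousTask` field. The following is a minimal sketch of that behavior, not the code in `core/storage/state-storage.ts`. The names `PausedTask`, `StateLike`, `pausedAt`, `readPausedTasks`, `pushPausedTask`, and `isStale` are hypothetical, and dropping the oldest entry on overflow is an assumption, since the changelog only states that a maximum exists.

```typescript
// Hypothetical shapes; the real schema in core/schemas/state.ts may differ.
interface PausedTask {
  id: string
  description: string
  pausedAt: string // ISO timestamp (assumed field name)
}

interface StateLike {
  previousTask?: PausedTask | null // legacy single-slot format
  pausedTasks?: PausedTask[] // array format introduced in 1.7.2
}

const MAX_PAUSED_TASKS = 5 // limit stated in the changelog
const STALE_AFTER_DAYS = 30 // threshold stated in the changelog
const MS_PER_DAY = 24 * 60 * 60 * 1000

// Read paused tasks, folding a legacy `previousTask` into the array form.
function readPausedTasks(state: StateLike): PausedTask[] {
  if (state.pausedTasks && state.pausedTasks.length > 0) return state.pausedTasks
  return state.previousTask ? [state.previousTask] : []
}

// Newest task goes first. ASSUMPTION: entries beyond the limit (the oldest) are dropped.
function pushPausedTask(tasks: PausedTask[], task: PausedTask): PausedTask[] {
  return [task, ...tasks].slice(0, MAX_PAUSED_TASKS)
}

// A task is stale once it has been paused for more than STALE_AFTER_DAYS.
function isStale(task: PausedTask, now: Date): boolean {
  const pausedForMs = now.getTime() - new Date(task.pausedAt).getTime()
  return pausedForMs > STALE_AFTER_DAYS * MS_PER_DAY
}
```

Returning a new array from `pushPausedTask` instead of mutating keeps the sketch easy to test; the packaged implementation may well mutate state in place.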

package/core/__tests__/agentic/injection-validator.test.ts ADDED

@@ -0,0 +1,255 @@
+/**
+ * Injection Validator Tests
+ * Tests for safeInject, truncation, skill filtering, and budget tracking.
+ */
+import { describe, expect, it } from 'bun:test'
+import { z } from 'zod'
+import {
+  DEFAULT_BUDGETS,
+  estimateTokens,
+  filterSkillsByDomains,
+  InjectionBudgetTracker,
+  safeInject,
+  safeInjectString,
+  truncateToTokenBudget,
+} from '../../agentic/injection-validator'
+// =============================================================================
+// safeInject
+// =============================================================================
+describe('safeInject', () => {
+  const schema = z.object({ name: z.string(), value: z.number() })
+  const fallback = { name: 'unknown', value: 0 }
+  it('returns validated data on valid input', () => {
+    const data = { name: 'test', value: 42 }
+    expect(safeInject(data, schema, fallback)).toEqual(data)
+  })
+  it('returns fallback on invalid input', () => {
+    const data = { name: 123, value: 'bad' }
+    expect(safeInject(data, schema, fallback)).toEqual(fallback)
+  })
+  it('returns fallback on null input', () => {
+    expect(safeInject(null, schema, fallback)).toEqual(fallback)
+  })
+  it('returns fallback on undefined input', () => {
+    expect(safeInject(undefined, schema, fallback)).toEqual(fallback)
+  })
+  it('strips extra fields via Zod', () => {
+    const data = { name: 'test', value: 42, extra: 'ignored' }
+    const result = safeInject(data, schema, fallback)
+    expect(result.name).toBe('test')
+    expect(result.value).toBe(42)
+  })
+})
+// =============================================================================
+// safeInjectString
+// =============================================================================
+describe('safeInjectString', () => {
+  const schema = z.object({ count: z.number() })
+  const formatter = (d: { count: number }) => `Items: ${d.count}`
+  it('returns formatted string on valid input', () => {
+    expect(safeInjectString({ count: 5 }, schema, formatter, 'N/A')).toBe('Items: 5')
+  })
+  it('returns fallback string on invalid input', () => {
+    expect(safeInjectString({ count: 'bad' }, schema, formatter, 'N/A')).toBe('N/A')
+  })
+  it('returns fallback string on null', () => {
+    expect(safeInjectString(null, schema, formatter, 'no data')).toBe('no data')
+  })
+})
+// =============================================================================
+// truncateToTokenBudget
+// =============================================================================
+describe('truncateToTokenBudget', () => {
+  it('returns text unchanged if within budget', () => {
+    const text = 'short text'
+    expect(truncateToTokenBudget(text, 100)).toBe(text)
+  })
+  it('truncates text that exceeds budget', () => {
+    const text = 'a'.repeat(500) // 500 chars = ~125 tokens
+    const result = truncateToTokenBudget(text, 50) // 50 tokens = 200 chars
+    expect(result.length).toBeLessThan(500)
+    expect(result).toContain('truncated')
+    expect(result).toContain('~50 tokens')
+  })
+  it('truncates to exact char limit', () => {
+    const text = 'a'.repeat(100)
+    const result = truncateToTokenBudget(text, 10) // 10 tokens = 40 chars
+    expect(result.startsWith('a'.repeat(40))).toBe(true)
+  })
+  it('handles empty string', () => {
+    expect(truncateToTokenBudget('', 100)).toBe('')
+  })
+  it('handles zero budget', () => {
+    const result = truncateToTokenBudget('some text', 0)
+    expect(result).toContain('truncated')
+  })
+})
+// =============================================================================
+// estimateTokens
+// =============================================================================
+describe('estimateTokens', () => {
+  it('estimates tokens at ~4 chars per token', () => {
+    expect(estimateTokens('a'.repeat(100))).toBe(25)
+  })
+  it('rounds up partial tokens', () => {
+    expect(estimateTokens('abc')).toBe(1) // 3/4 = 0.75, ceil = 1
+  })
+  it('handles empty string', () => {
+    expect(estimateTokens('')).toBe(0)
+  })
+})
+// =============================================================================
+// filterSkillsByDomains
+// =============================================================================
+describe('filterSkillsByDomains', () => {
+  const skills = [
+    { name: 'react-patterns', content: 'React component patterns and hooks' },
+    { name: 'api-design', content: 'RESTful API design and endpoint patterns' },
+    { name: 'jest-testing', content: 'Jest test patterns and assertions' },
+    { name: 'docker-deploy', content: 'Docker and Kubernetes deployment' },
+    { name: 'general-coding', content: 'General coding best practices' },
+  ]
+  it('returns all skills when no domains detected', () => {
+    expect(filterSkillsByDomains(skills, [])).toEqual(skills)
+  })
+  it('returns all skills when skills array is empty', () => {
+    expect(filterSkillsByDomains([], ['frontend'])).toEqual([])
+  })
+  it('filters to frontend-relevant skills', () => {
+    const result = filterSkillsByDomains(skills, ['frontend'])
+    expect(result.some((s) => s.name === 'react-patterns')).toBe(true)
+    expect(result.some((s) => s.name === 'docker-deploy')).toBe(false)
+  })
+  it('filters to backend-relevant skills', () => {
+    const result = filterSkillsByDomains(skills, ['backend'])
+    expect(result.some((s) => s.name === 'api-design')).toBe(true)
+    expect(result.some((s) => s.name === 'react-patterns')).toBe(false)
+  })
+  it('filters to testing-relevant skills', () => {
+    const result = filterSkillsByDomains(skills, ['testing'])
+    expect(result.some((s) => s.name === 'jest-testing')).toBe(true)
+  })
+  it('supports multiple domains', () => {
+    const result = filterSkillsByDomains(skills, ['frontend', 'testing'])
+    expect(result.some((s) => s.name === 'react-patterns')).toBe(true)
+    expect(result.some((s) => s.name === 'jest-testing')).toBe(true)
+    expect(result.some((s) => s.name === 'docker-deploy')).toBe(false)
+  })
+  it('matches domain name itself as keyword', () => {
+    const customSkills = [{ name: 'devops-helper', content: 'general devops tools' }]
+    const result = filterSkillsByDomains(customSkills, ['devops'])
+    expect(result).toHaveLength(1)
+  })
+  it('is case insensitive', () => {
+    const result = filterSkillsByDomains(skills, ['Frontend'])
+    expect(result.some((s) => s.name === 'react-patterns')).toBe(true)
+  })
+})
+// =============================================================================
+// InjectionBudgetTracker
+// =============================================================================
+describe('InjectionBudgetTracker', () => {
+  it('tracks cumulative token usage', () => {
+    const tracker = new InjectionBudgetTracker({ totalPrompt: 100 })
+    tracker.addSection('a'.repeat(200), 100) // 200 chars = 50 tokens
+    expect(tracker.totalUsed).toBe(50)
+    expect(tracker.remaining).toBe(50)
+  })
+  it('truncates sections to per-section budget', () => {
+    const tracker = new InjectionBudgetTracker({ totalPrompt: 1000 })
+    const result = tracker.addSection('a'.repeat(500), 50) // budget: 50 tokens = 200 chars
+    expect(result.length).toBeLessThan(500)
+  })
+  it('returns empty string when total budget exhausted', () => {
+    const tracker = new InjectionBudgetTracker({ totalPrompt: 10 })
+    tracker.addSection('a'.repeat(100), 50) // uses all 10 tokens of total budget
+    const result = tracker.addSection('more content', 50)
+    expect(result).toBe('')
+  })
+  it('fits content to remaining total budget', () => {
+    const tracker = new InjectionBudgetTracker({ totalPrompt: 30 })
+    tracker.addSection('a'.repeat(80), 30) // 80 chars = 20 tokens
+    // Remaining: 10 tokens
+    const result = tracker.addSection('b'.repeat(200), 100) // wants 100 tokens, only 10 left
+    expect(result.length).toBeLessThan(200)
+    expect(tracker.remaining).toBe(0)
+  })
+  it('uses default budgets when none provided', () => {
+    const tracker = new InjectionBudgetTracker()
+    expect(tracker.config.totalPrompt).toBe(DEFAULT_BUDGETS.totalPrompt)
+    expect(tracker.config.autoContext).toBe(DEFAULT_BUDGETS.autoContext)
+  })
+  it('allows partial budget overrides', () => {
+    const tracker = new InjectionBudgetTracker({ totalPrompt: 5000 })
+    expect(tracker.config.totalPrompt).toBe(5000)
+    expect(tracker.config.autoContext).toBe(DEFAULT_BUDGETS.autoContext) // unchanged
+  })
+  it('remaining never goes negative', () => {
+    const tracker = new InjectionBudgetTracker({ totalPrompt: 5 })
+    tracker.addSection('a'.repeat(1000), 500)
+    expect(tracker.remaining).toBe(0)
+  })
+})
+// =============================================================================
+// DEFAULT_BUDGETS
+// =============================================================================
+describe('DEFAULT_BUDGETS', () => {
+  it('has all required fields', () => {
+    expect(DEFAULT_BUDGETS.autoContext).toBeGreaterThan(0)
+    expect(DEFAULT_BUDGETS.agentContent).toBeGreaterThan(0)
+    expect(DEFAULT_BUDGETS.skillContent).toBeGreaterThan(0)
+    expect(DEFAULT_BUDGETS.stateData).toBeGreaterThan(0)
+    expect(DEFAULT_BUDGETS.memories).toBeGreaterThan(0)
+    expect(DEFAULT_BUDGETS.totalPrompt).toBeGreaterThan(0)
+  })
+  it('totalPrompt is larger than individual budgets', () => {
+    expect(DEFAULT_BUDGETS.totalPrompt).toBeGreaterThan(DEFAULT_BUDGETS.autoContext)
+    expect(DEFAULT_BUDGETS.totalPrompt).toBeGreaterThan(DEFAULT_BUDGETS.agentContent)
+    expect(DEFAULT_BUDGETS.totalPrompt).toBeGreaterThan(DEFAULT_BUDGETS.skillContent)
+    expect(DEFAULT_BUDGETS.totalPrompt).toBeGreaterThan(DEFAULT_BUDGETS.stateData)
+  })
+})
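
The tests above pin down the observable behavior of the validator: roughly 4 characters per token, rounding up, a truncation marker that mentions the budget, and a tracker that clamps each section to both its own budget and the remaining total. A sketch that would satisfy those expectations might look like the following. It is not the contents of `core/agentic/injection-validator.ts`, which this part of the diff does not show. `SafeParser` is a hypothetical structural stand-in for a Zod schema (only the `safeParse` result shape is relied on), `BudgetTracker` is a simplified stand-in for `InjectionBudgetTracker`, and the exact marker wording is assumed.

```typescript
// Structural stand-in for a Zod schema: only `safeParse` is used here.
type ParseResult<T> = { success: true; data: T } | { success: false; error?: unknown }
interface SafeParser<T> {
  safeParse(input: unknown): ParseResult<T>
}

const CHARS_PER_TOKEN = 4 // heuristic implied by the tests

// Return the parsed value when validation succeeds, otherwise the fallback.
function safeInject<T>(data: unknown, schema: SafeParser<T>, fallback: T): T {
  const result = schema.safeParse(data)
  return result.success ? result.data : fallback
}

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN)
}

// Keep at most maxTokens * 4 characters and append a marker when text is cut.
function truncateToTokenBudget(text: string, maxTokens: number): string {
  if (estimateTokens(text) <= maxTokens) return text
  const kept = text.slice(0, Math.max(0, maxTokens) * CHARS_PER_TOKEN)
  return `${kept}\n[truncated to ~${maxTokens} tokens]` // marker wording is assumed
}

// Simplified tracker: one total budget, per-section budgets passed by the caller.
class BudgetTracker {
  private used = 0
  private readonly totalBudget: number

  constructor(totalBudget: number) {
    this.totalBudget = totalBudget
  }

  get totalUsed(): number {
    return this.used
  }

  get remaining(): number {
    return Math.max(0, this.totalBudget - this.used)
  }

  addSection(content: string, sectionBudget: number): string {
    const allowed = Math.min(sectionBudget, this.remaining)
    if (allowed <= 0) return '' // total budget exhausted
    this.used += Math.min(estimateTokens(content), allowed)
    return truncateToTokenBudget(content, allowed)
  }
}
```

In this sketch the marker text is not charged against the budget, so a truncated section counts as exactly `allowed` tokens. Whether the packaged tracker accounts for the marker is not visible from the tests.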

package/core/__tests__/workflow/state-machine.test.ts ADDED

@@ -0,0 +1,216 @@
+/**
+ * State Machine Tests
+ *
+ * Tests for workflow state machine transitions:
+ * - All valid transitions work
+ * - Invalid transitions are rejected
+ * - New transitions: completed→paused, paused→shipped, completed→reopen
+ * - getCurrentState detects paused tasks from pausedTasks array
+ */
+import { describe, expect, it } from 'bun:test'
+import type { WorkflowCommand, WorkflowState } from '../../workflow/state-machine'
+import { WorkflowStateMachine } from '../../workflow/state-machine'
+const sm = new WorkflowStateMachine()
+// =============================================================================
+// getCurrentState
+// =============================================================================
+describe('getCurrentState', () => {
+  it('returns idle when no task and no paused tasks', () => {
+    expect(sm.getCurrentState({ currentTask: null })).toBe('idle')
+    expect(sm.getCurrentState({})).toBe('idle')
+  })
+  it('returns working for in_progress status', () => {
+    expect(sm.getCurrentState({ currentTask: { status: 'in_progress' } })).toBe('working')
+    expect(sm.getCurrentState({ currentTask: { status: 'working' } })).toBe('working')
+  })
+  it('returns completed for completed/done status', () => {
+    expect(sm.getCurrentState({ currentTask: { status: 'completed' } })).toBe('completed')
+    expect(sm.getCurrentState({ currentTask: { status: 'done' } })).toBe('completed')
+  })
+  it('returns shipped for shipped status', () => {
+    expect(sm.getCurrentState({ currentTask: { status: 'shipped' } })).toBe('shipped')
+  })
+  it('returns paused when currentTask has paused status', () => {
+    expect(sm.getCurrentState({ currentTask: { status: 'paused' } })).toBe('paused')
+  })
+  it('returns paused when no currentTask but pausedTasks array has entries', () => {
+    expect(sm.getCurrentState({ currentTask: null, pausedTasks: [{ id: '1' }] })).toBe('paused')
+  })
+  it('returns paused when no currentTask but legacy previousTask is paused', () => {
+    expect(sm.getCurrentState({ currentTask: null, previousTask: { status: 'paused' } })).toBe(
+      'paused'
+    )
+  })
+  it('returns idle when no currentTask and empty pausedTasks', () => {
+    expect(sm.getCurrentState({ currentTask: null, pausedTasks: [] })).toBe('idle')
+  })
+  it('returns working for unknown status when task exists', () => {
+    expect(sm.getCurrentState({ currentTask: { status: 'active' } })).toBe('working')
+    expect(sm.getCurrentState({ currentTask: {} })).toBe('working')
+  })
+})
+// =============================================================================
+// canTransition - valid transitions
+// =============================================================================
+describe('canTransition - valid', () => {
+  const validTransitions: [WorkflowState, WorkflowCommand][] = [
+    // idle
+    ['idle', 'task'],
+    ['idle', 'next'],
+    // working
+    ['working', 'done'],
+    ['working', 'pause'],
+    // paused
+    ['paused', 'resume'],
+    ['paused', 'task'],
+    ['paused', 'ship'], // NEW: fast-track ship
+    // completed
+    ['completed', 'ship'],
+    ['completed', 'task'],
+    ['completed', 'next'],
+    ['completed', 'pause'], // NEW: reopen for review
+    ['completed', 'reopen'], // NEW: reopen for rework
+    // shipped
+    ['shipped', 'task'],
+    ['shipped', 'next'],
+  ]
+  for (const [state, command] of validTransitions) {
+    it(`${state} → ${command} is valid`, () => {
+      const result = sm.canTransition(state, command)
+      expect(result.valid).toBe(true)
+      expect(result.error).toBeUndefined()
+    })
+  }
+})
+// =============================================================================
+// canTransition - invalid transitions
+// =============================================================================
+describe('canTransition - invalid', () => {
+  const invalidTransitions: [WorkflowState, WorkflowCommand][] = [
+    ['idle', 'done'],
+    ['idle', 'pause'],
+    ['idle', 'resume'],
+    ['idle', 'ship'],
+    ['idle', 'reopen'],
+    ['working', 'task'],
+    ['working', 'ship'],
+    ['working', 'resume'],
+    ['working', 'next'],
+    ['working', 'reopen'],
+    ['paused', 'done'],
+    ['paused', 'pause'],
+    ['paused', 'reopen'],
+    ['shipped', 'done'],
+    ['shipped', 'pause'],
+    ['shipped', 'resume'],
+    ['shipped', 'ship'],
+    ['shipped', 'reopen'],
+  ]
+  for (const [state, command] of invalidTransitions) {
+    it(`${state} → ${command} is invalid`, () => {
+      const result = sm.canTransition(state, command)
+      expect(result.valid).toBe(false)
+      expect(result.error).toBeDefined()
+      expect(result.suggestion).toBeDefined()
+    })
+  }
+})
+// =============================================================================
+// getNextState
+// =============================================================================
+describe('getNextState', () => {
+  it('task → working', () => {
+    expect(sm.getNextState('idle', 'task')).toBe('working')
+    expect(sm.getNextState('paused', 'task')).toBe('working')
+    expect(sm.getNextState('completed', 'task')).toBe('working')
+  })
+  it('done → completed', () => {
+    expect(sm.getNextState('working', 'done')).toBe('completed')
+  })
+  it('pause → paused', () => {
+    expect(sm.getNextState('working', 'pause')).toBe('paused')
+    expect(sm.getNextState('completed', 'pause')).toBe('paused')
+  })
+  it('resume → working', () => {
+    expect(sm.getNextState('paused', 'resume')).toBe('working')
+  })
+  it('ship → shipped', () => {
+    expect(sm.getNextState('completed', 'ship')).toBe('shipped')
+    expect(sm.getNextState('paused', 'ship')).toBe('shipped')
+  })
+  it('reopen → working', () => {
+    expect(sm.getNextState('completed', 'reopen')).toBe('working')
+  })
+  it('next preserves current state', () => {
+    expect(sm.getNextState('idle', 'next')).toBe('idle')
+    expect(sm.getNextState('completed', 'next')).toBe('completed')
+  })
+})
+// =============================================================================
+// getValidCommands
+// =============================================================================
+describe('getValidCommands', () => {
+  it('idle allows task, next', () => {
+    expect(sm.getValidCommands('idle')).toEqual(['task', 'next'])
+  })
+  it('working allows done, pause', () => {
+    expect(sm.getValidCommands('working')).toEqual(['done', 'pause'])
+  })
+  it('paused allows resume, task, ship', () => {
+    expect(sm.getValidCommands('paused')).toEqual(['resume', 'task', 'ship'])
+  })
+  it('completed allows ship, task, next, pause, reopen', () => {
+    expect(sm.getValidCommands('completed')).toEqual(['ship', 'task', 'next', 'pause', 'reopen'])
+  })
+  it('shipped allows task, next', () => {
+    expect(sm.getValidCommands('shipped')).toEqual(['task', 'next'])
+  })
+})
+// =============================================================================
+// formatNextSteps
+// =============================================================================
+describe('formatNextSteps', () => {
+  it('includes reopen in completed state steps', () => {
+    const steps = sm.formatNextSteps('completed')
+    expect(steps.some((s) => s.includes('reopen'))).toBe(true)
+  })
+  it('includes ship in paused state steps', () => {
+    const steps = sm.formatNextSteps('paused')
+    expect(steps.some((s) => s.includes('ship'))).toBe(true)
+  })
+})
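
The transition rules these tests encode can be summarized as a lookup table. The sketch below reconstructs that table from the test expectations only; it is not the code in `core/workflow/state-machine.ts`. The constant names `TRANSITIONS` and `TARGET_STATE` and the error and suggestion wording are hypothetical. It also assumes `getNextState` does not validate the transition itself, which the tests neither confirm nor rule out.

```typescript
type WorkflowState = 'idle' | 'working' | 'paused' | 'completed' | 'shipped'
type WorkflowCommand = 'task' | 'next' | 'done' | 'pause' | 'resume' | 'ship' | 'reopen'

// Valid commands per state, in the order the getValidCommands tests expect.
const TRANSITIONS: Record<WorkflowState, WorkflowCommand[]> = {
  idle: ['task', 'next'],
  working: ['done', 'pause'],
  paused: ['resume', 'task', 'ship'],
  completed: ['ship', 'task', 'next', 'pause', 'reopen'],
  shipped: ['task', 'next'],
}

// Target state per command. `next` is absent because it keeps the current state.
const TARGET_STATE: Partial<Record<WorkflowCommand, WorkflowState>> = {
  task: 'working',
  done: 'completed',
  pause: 'paused',
  resume: 'working',
  ship: 'shipped',
  reopen: 'working',
}

interface TransitionResult {
  valid: boolean
  error?: string
  suggestion?: string
}

function getValidCommands(state: WorkflowState): WorkflowCommand[] {
  return TRANSITIONS[state]
}

function canTransition(state: WorkflowState, command: WorkflowCommand): TransitionResult {
  const allowed = TRANSITIONS[state]
  if (allowed.includes(command)) return { valid: true }
  return {
    valid: false,
    error: `Cannot run '${command}' while ${state}`, // wording is assumed
    suggestion: `Valid commands: ${allowed.join(', ')}`, // wording is assumed
  }
}

// ASSUMPTION: no validation here; callers check canTransition first.
function getNextState(state: WorkflowState, command: WorkflowCommand): WorkflowState {
  return TARGET_STATE[command] ?? state
}
```

Keeping the rules in data rather than in branching logic means a new transition such as `paused → ship` is a one-line change, and the valid and invalid transition lists in the tests map directly onto table entries.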