npm - @harness-engineering/cli - Versions diffs - 1.2.0 → 1.3.0 - Mend

@harness-engineering/cli 1.2.0 → 1.3.0

Files changed (52)

package/dist/agents/commands/gemini-cli/harness/refactoring.toml DELETED

@@ -1,209 +0,0 @@
-# Generated by harness generate-slash-commands. Do not edit.
-description = "Safe refactoring with validation before and after changes"
-prompt = """
-<context>
-Cognitive mode: meticulous-implementer
-Type: flexible
-</context>
-<objective>
-Safe refactoring with validation before and after changes
-</objective>
-<execution_context>
---- SKILL.md (agents/skills/claude-code/harness-refactoring/SKILL.md) ---
-# Harness Refactoring
-> Safe refactoring with constraint verification at every step. Change structure without changing behavior, with harness checks as your safety net.
-## When to Use
-- When improving code structure, readability, or maintainability without changing behavior
-- When reducing duplication (DRY refactoring)
-- When moving code to the correct architectural layer
-- When splitting large files or functions into smaller, focused ones
-- When renaming for clarity across the codebase
-- After completing a feature (post-TDD cleanup beyond single-cycle refactoring)
-- NOT when adding new behavior (use harness-tdd instead)
-- NOT when fixing bugs (use harness-tdd — write a failing test first)
-- NOT when the test suite is already failing — fix the tests before refactoring
-## Process
-### Iron Rule
-**All tests must pass BEFORE you start refactoring and AFTER every single change.**
-If tests are not green before you start, you are not refactoring — you are debugging. Fix the tests first. If tests break during refactoring, undo the last change immediately. Do not try to fix forward.
-### Phase 1: Prepare — Verify Starting State
-1. **Run the full test suite.** Every test must pass. Record the count of passing tests — this number must not decrease at any point.
-2. **Run `harness validate`** and **`harness check-deps`**. Both must pass. You are establishing a clean baseline. If either reports issues, fix those first (that is a separate task, not part of this refactoring).
-3. **Identify the refactoring target.** Be specific: which file, function, class, or module? What is wrong with the current structure? What will be better after refactoring?
-4. **Plan the steps.** Break the refactoring into the smallest possible individual changes. Each step should be independently committable and verifiable. If you cannot describe a step in one sentence, it is too large.
-### Phase 2: Execute — One Small Change at a Time
-For EACH step in the plan:
-1. **Make ONE small change.** Examples of "one small change":
-   - Rename one variable or function
-   - Extract one function from a larger function
-   - Move one function to a different file
-   - Inline one unnecessary abstraction
-   - Replace one conditional with polymorphism
-   - Remove one instance of duplication
-2. **Run the full test suite.** All tests must pass. If any test fails:
-   - **STOP immediately.**
-   - **Undo the change** (git checkout the file or revert manually).
-   - **Analyze why it broke.** Either the change was not purely structural (it changed behavior) or the tests are coupled to implementation details.
-   - **Try a smaller step** or a different approach.
-3. **Run `harness validate` and `harness check-deps`.** Both must pass. A refactoring that fixes code structure but violates architectural constraints is not safe.
-4. **Commit the step.** Each step gets its own commit. The commit message describes the structural change: "extract validateInput from processOrder" or "move UserRepository to data-access layer."
-5. **Repeat** for the next step in the plan.
-### Phase 3: Verify — Confirm the Refactoring is Complete
-1. **Run the full test suite one final time.** Same number of passing tests as Phase 1.
-2. **Run `harness validate` and `harness check-deps` one final time.** Clean output.
-3. **Review the cumulative diff.** Does the final state match the intended improvement? Is the code genuinely better, or just different?
-4. **If the refactoring introduced no improvement,** revert the entire sequence. Refactoring for its own sake is churn.
-## Common Refactoring Patterns
-### Extract Function
-**When:** A function is doing too many things, or a block of code is reused in multiple places.
-**How:** Identify the block. Ensure all variables it uses are either parameters or local. Cut the block into a new function with a descriptive name. Replace the original block with a call to the new function.
-**Harness guidance:** If the extracted function belongs in a different layer, move it there AND update the import. Run `harness check-deps` to verify the new import respects layer boundaries.
-### Move to Layer
-**When:** Code is in the wrong architectural layer (e.g., business logic in a UI component, database queries in a service).
-**How:** Create the function in the correct layer. Update all callers to import from the new location. Delete the old function. Run `harness check-deps` after each step.
-**Harness guidance:** This is where `harness check-deps` is most valuable. Moving code between layers changes the dependency graph. The tool will tell you immediately if the move created a violation.
-### Split File
-**When:** A file has grown too large or contains unrelated responsibilities.
-**How:** Identify the cohesive groups within the file. Create new files, one per group. Move functions/classes to their new files. Update the original file to re-export from the new files (for backward compatibility) or update all callers.
-**Harness guidance:** Run `harness validate` after splitting to ensure the new files follow naming conventions and are properly structured. Run `harness check-deps` to verify no new boundary violations.
-### Inline Abstraction
-**When:** An abstraction (class, interface, wrapper function) adds complexity without value. It has only one implementation, is never extended, and obscures what the code actually does.
-**How:** Replace uses of the abstraction with the concrete implementation. Delete the abstraction. Run tests.
-**Harness guidance:** Removing an abstraction may expose a layer violation that the abstraction was hiding. Run `harness check-deps` to check.
-### Rename for Clarity
-**When:** A name is misleading, ambiguous, or no longer reflects what the code does.
-**How:** Use your editor's rename/refactor tool to change the name everywhere it appears. If the name is part of a public API, check for external consumers first.
-**Harness guidance:** Run `harness check-docs` after renaming to detect documentation that still uses the old name. AGENTS.md, inline comments, and doc pages may all need updating.
-## Harness Integration
-- **`harness validate`** — Run before starting, after each step, and at the end. Catches structural issues, naming violations, and configuration drift.
-- **`harness check-deps`** — Run after each step, especially when moving code between files or layers. Catches dependency violations introduced by structural changes.
-- **`harness check-docs`** — Run after renaming or moving public APIs. Catches documentation that references old names or locations.
-- **`harness cleanup`** — Run after completing a refactoring sequence. Detects dead code that the refactoring may have created (unused exports, orphaned files).
-## Success Criteria
-- All tests pass before, during, and after refactoring (same count, same results)
-- `harness validate` passes at every step
-- `harness check-deps` passes at every step
-- Each step is an atomic commit with a clear structural description
-- The code is measurably better after refactoring (clearer names, less duplication, correct layering, smaller functions)
-- No behavioral changes were introduced (the test suite is the proof)
-- No dead code was left behind (run `harness cleanup` to verify)
-## Examples
-### Example: Moving business logic out of a UI component
-**Target:** `src/components/OrderSummary.tsx` contains a `calculateDiscount()` function with complex business rules. This logic belongs in the service layer.
-**Step 1:** Create `src/services/discount-service.ts` with the `calculateDiscount` function copied from the component.
-- Run tests: pass
-- Run `harness check-deps`: pass (new file, no violations)
-- Commit: "extract calculateDiscount to discount-service"
-**Step 2:** Update `OrderSummary.tsx` to import `calculateDiscount` from `discount-service` instead of using the local function.
-- Run tests: pass
-- Run `harness check-deps`: pass (UI importing from service is allowed)
-- Commit: "update OrderSummary to use discount-service"
-**Step 3:** Delete the original `calculateDiscount` function from `OrderSummary.tsx`.
-- Run tests: pass
-- Run `harness check-deps`: pass
-- Run `harness cleanup`: no dead code detected
-- Commit: "remove duplicate calculateDiscount from OrderSummary"
-**Final verification:** 3 steps, 3 commits, all tests green throughout, all harness checks passing. The business logic is now in the correct layer.
-## Escalation
-- **When tests fail during refactoring and you cannot figure out why:** Revert to the last green commit. The test failure means the change was not purely structural. Analyze the test to understand what behavioral assumption it depends on, then plan a different approach.
-- **When `harness check-deps` fails after a move:** The code you moved may have dependencies that are not allowed in its new layer. You may need to refactor the moved code itself (remove forbidden imports) before it can live in the new layer.
-- **When a refactoring requires changing tests:** This is a warning sign. If the tests need to change, the refactoring may be changing behavior. The only valid reason to change tests during refactoring is if the tests were testing implementation details (not behavior) — and in that case, fix the tests first as a separate step before refactoring.
-- **When the refactoring scope keeps growing:** Stop. Commit what you have (if it is clean). Re-plan with a smaller scope. Large refactorings should be broken into multiple sessions, each leaving the code in a better state.
---- skill.yaml (agents/skills/claude-code/harness-refactoring/skill.yaml) ---
-name: harness-refactoring
-version: "1.0.0"
-description: Safe refactoring with validation before and after changes
-cognitive_mode: meticulous-implementer
-triggers:
-  - manual
-  - on_refactor
-platforms:
-  - claude-code
-  - gemini-cli
-tools:
-  - Bash
-  - Read
-  - Write
-  - Edit
-  - Glob
-  - Grep
-cli:
-  command: harness skill run harness-refactoring
-  args:
-    - name: path
-      description: Project root path
-      required: false
-mcp:
-  tool: run_skill
-  input:
-    skill: harness-refactoring
-    path: string
-type: flexible
-state:
-  persistent: false
-  files: []
-depends_on: []
-</execution_context>
-<process>
-1. Try: invoke mcp__harness__run_skill with skill: "harness-refactoring"
-2. If MCP unavailable: follow the SKILL.md workflow provided above directly
-3. Pass through any arguments provided by the user
-</process>
-"""
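
The Phase 1 and Phase 2 workflow in the removed SKILL.md (verify a green baseline, make one small change, re-run every check, then commit or undo) can be sketched roughly as follows. This is a minimal illustration, not code from the package: `Step`, `Result`, and `run_refactoring` are hypothetical names, and the checks are plain callables standing in for the test suite, `harness validate`, and `harness check-deps`.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Step:
    """One small structural change (hypothetical helper type)."""
    description: str           # one-sentence commit message
    apply: Callable[[], None]  # performs the single change
    undo: Callable[[], None]   # reverts it if any check fails


@dataclass
class Result:
    committed: List[str] = field(default_factory=list)
    reverted: Optional[str] = None


def run_refactoring(steps: List[Step],
                    checks: List[Callable[[], bool]],
                    commit: Callable[[str], None]) -> Result:
    """Apply steps one at a time; stop and undo at the first failing check."""
    # Phase 1: the baseline must be green before anything changes.
    if not all(check() for check in checks):
        raise RuntimeError("baseline is not green; fix that before refactoring")

    result = Result()
    for step in steps:
        step.apply()
        if all(check() for check in checks):
            # Each passing step gets its own commit.
            commit(step.description)
            result.committed.append(step.description)
        else:
            # Do not fix forward: undo the last change and stop.
            step.undo()
            result.reverted = step.description
            break
    return result


# Simulated run: the second step changes behavior, so the check fails.
state = {"broken": False}
commits: List[str] = []
steps = [
    Step("extract calculateDiscount to discount-service",
         lambda: None, lambda: None),
    Step("change that alters behavior",
         lambda: state.update(broken=True),
         lambda: state.update(broken=False)),
    Step("never reached", lambda: None, lambda: None),
]
result = run_refactoring(steps, [lambda: not state["broken"]], commits.append)
print(result.committed, result.reverted)
```

In a real project each check would presumably shell out to the test runner and the harness commands named in the skill, and `undo` would correspond to reverting the working tree; those details are left out here on purpose.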

package/dist/agents/commands/gemini-cli/harness/skill-authoring.toml DELETED

@@ -1,350 +0,0 @@
-# Generated by harness generate-slash-commands. Do not edit.
-description = "Create and maintain harness skills following the rich skill format"
-prompt = """
-<context>
-Cognitive mode: constructive-architect
-Type: flexible
-</context>
-<objective>
-Create and maintain harness skills following the rich skill format
-</objective>
-<execution_context>
---- SKILL.md (agents/skills/claude-code/harness-skill-authoring/SKILL.md) ---
-# Harness Skill Authoring
-> Create and extend harness skills following the rich skill format. Define purpose, choose type, write skill.yaml and SKILL.md with all required sections, validate, and test.
-## When to Use
-- Creating a new skill for a team's recurring workflow
-- Extending an existing skill with new phases, gates, or examples
-- Converting an informal process ("how we do code reviews") into a formal harness skill
-- When a team notices they repeat the same multi-step process and wants to codify it
-- NOT when running an existing skill (use the skill directly)
-- NOT when listing or discovering skills (use `harness skill list`)
-- NOT when the process is a one-off task that will not recur
-## Process
-### Phase 1: DEFINE — Establish the Skill's Purpose
-1. **Identify the recurring process.** What does the team do repeatedly? Name it. Describe it in one sentence. This becomes the skill's `description` in `skill.yaml` and the blockquote summary in `SKILL.md`.
-2. **Define the scope boundary.** A good skill does one thing well. If the process has distinct phases that could be done independently, consider whether it should be multiple skills. Signs of a skill that is too broad: more than 6 phases, multiple unrelated triggers, trying to serve two different audiences.
-3. **Identify the trigger conditions.** When should this skill activate?
-   - `manual` — Only when explicitly invoked
-   - `on_new_feature` — When starting a new feature
-   - `on_bug_fix` — When fixing a bug
-   - `on_pr_review` — When reviewing a pull request
-   - `on_project_init` — When initializing or entering a project
-   - Multiple triggers are fine if the skill genuinely applies to all of them
-4. **Determine required tools.** What tools does the skill need? Common sets:
-   - Read-only analysis: `Read`, `Glob`, `Grep`
-   - Code modification: `Read`, `Write`, `Edit`, `Glob`, `Grep`, `Bash`
-   - Full workflow: all of the above plus specialized tools
-### Phase 2: CHOOSE TYPE — Rigid or Flexible
-1. **Choose rigid when:**
-   - The process has strict ordering that must not be violated
-   - Skipping steps causes real damage (data loss, security holes, broken deployments)
-   - Compliance or policy requires auditability of each step
-   - The process includes mandatory checkpoints where human approval is needed
-   - Examples: TDD cycle, deployment pipeline, security audit, database migration
-2. **Choose flexible when:**
-   - The process has recommended steps but the order can adapt to context
-   - The agent should use judgment about which steps to emphasize
-   - Different situations call for different subsets of the process
-   - The process is more about guidelines than rigid procedure
-   - Examples: code review, onboarding, brainstorming, project initialization
-3. **Key difference in SKILL.md:** Rigid skills require `## Gates` and `## Escalation` sections. Flexible skills may omit them (though they can include them if useful).
-### Phase 3: WRITE SKILL.YAML — Define Metadata
-1. **Create the skill directory** under `agents/skills/<platform>/<skill-name>/`.
-2. **Write `skill.yaml`** with all required fields:
-```yaml
-name: <skill-name> # Kebab-case, matches directory name
-version: '1.0.0' # Semver
-description: <one-line summary> # What this skill does
-triggers:
-  - <trigger-1>
-  - <trigger-2>
-platforms:
-  - claude-code # Which agent platforms support this skill
-tools:
-  - <tool-1> # Tools the skill requires
-  - <tool-2>
-cli:
-  command: harness skill run <skill-name>
-  args:
-    - name: <arg-name>
-      description: <arg-description>
-      required: <true|false>
-mcp:
-  tool: run_skill
-  input:
-    skill: <skill-name>
-type: <rigid|flexible>
-state:
-  persistent: <true|false> # Does this skill maintain state across sessions?
-  files:
-    - <state-file-path> # List state files if persistent
-depends_on:
-  - <prerequisite-skill> # Skills that must be available (not necessarily run first)
-```
-3. **Validate the YAML.** Ensure proper indentation, correct field names, and valid values. The `name` field must match the directory name exactly.
-### Phase 4: WRITE SKILL.MD — Author the Skill Content
-1. **Start with the heading and summary:**
-```markdown
-# <Skill Name>
-> <One-sentence description of what the skill does and why.>
-```
-2. **Write `## When to Use`.** Include both positive (when TO use) and negative (when NOT to use) conditions. Be specific. Negative conditions prevent misapplication and point to the correct alternative skill.
-3. **Write `## Process`.** This is the core of the skill. Guidelines for writing good process sections:
-   - **Use phases to organize.** Group related steps into named phases (e.g., ASSESS, IMPLEMENT, VERIFY). Each phase should have a clear purpose and completion criteria.
-   - **Number every step.** Steps within a phase are numbered. This makes them referenceable ("go back to Phase 2, step 3").
-   - **Be prescriptive about actions.** Say "Run `harness validate`" not "consider validating." Say "Read the file" not "you might want to read the file."
-   - **Include decision points.** When the process branches, state the conditions clearly: "If X, do A. If Y, do B."
-   - **State what NOT to do.** Prohibitions prevent common mistakes: "Do not proceed to Phase 3 if validation fails."
-   - **For rigid skills:** Add an Iron Law at the top — the one inviolable principle. Then define phases with mandatory ordering and explicit gates between them.
-   - **For flexible skills:** Describe the recommended flow but acknowledge that adaptation is expected. Focus on outcomes rather than exact commands.
-4. **Write `## Harness Integration`.** List every harness CLI command the skill uses, with a brief description of when to use it. This section connects the skill to the harness toolchain.
-5. **Write `## Success Criteria`.** Define how to know the skill was executed well. Each criterion should be observable and verifiable — not subjective.
-6. **Write `## Examples`.** At least one concrete example showing the full process from start to finish. Use realistic project names, file paths, and commands. Show both the commands and their expected outputs.
-7. **For rigid skills, write `## Gates`.** Gates are hard stops — conditions that must be true to proceed. Each gate should state what happens if violated. Format: "**<condition> = <consequence>.**"
-8. **For rigid skills, write `## Escalation`.** Define when to stop and ask for help. Each escalation condition should describe the symptom, the likely cause, and what to report.
-### Phase 5: VALIDATE — Verify the Skill
-1. **Run `harness skill validate`** to check:
-   - `skill.yaml` has all required fields and valid values
-   - `SKILL.md` has all required sections (`## When to Use`, `## Process`, `## Harness Integration`, `## Success Criteria`, `## Examples`)
-   - Rigid skills have `## Gates` and `## Escalation` sections
-   - The `name` in `skill.yaml` matches the directory name
-   - Referenced tools exist
-   - Referenced dependencies exist
-2. **Fix any validation errors.** Common issues:
-   - Missing required section in `SKILL.md`
-   - `name` field does not match directory name
-   - Invalid trigger name
-   - Missing `type` field in `skill.yaml`
-3. **Test by running the skill:** `harness skill run <name>`. Verify it loads correctly and the process instructions make sense in context.
-### Skill Quality Checklist
-Evaluate every skill along two dimensions:
-|                             | **Clear activation**                | **Ambiguous activation**                | **Missing activation** |
-| --------------------------- | ----------------------------------- | --------------------------------------- | ---------------------- |
-| **Specific implementation** | Good skill                          | Wasted — good instructions nobody finds | Broken                 |
-| **Vague implementation**    | Trap — agents activate but flounder | Bad skill                               | Empty shell            |
-| **Missing implementation**  | Stub                                | Stub                                    | Does not exist         |
-- **Good skill** = clear activation + specific implementation. The agent knows when to use it and exactly what to do.
-- **Clear activation + vague implementation** = trap. The skill fires correctly but the agent has no concrete instructions, leading to inconsistent results.
-- **Ambiguous activation + specific implementation** = wasted. Great instructions that never get used because the agent does not know when to activate the skill.
-Use this checklist as a final quality gate before declaring a skill complete.
-## Harness Integration
-- **`harness skill validate`** — Validate a skill's `skill.yaml` and `SKILL.md` against the schema and structure requirements.
-- **`harness skill run <name>`** — Execute a skill to test it in context.
-- **`harness skill list`** — List all available skills, useful for checking that a new skill appears after creation.
-- **`harness add skill <name> --type <type>`** — Scaffold a new skill directory with template files (alternative to manual creation).
-## Success Criteria
-- `skill.yaml` exists with all required fields and passes schema validation
-- `SKILL.md` exists with all required sections filled with substantive content (not placeholders)
-- The skill name in `skill.yaml` matches the directory name
-- `harness skill validate` passes with zero errors
-- The process section has clear, numbered, actionable steps organized into phases
-- When to Use includes both positive and negative conditions
-- At least one concrete example demonstrates the full process
-- Rigid skills include Gates and Escalation sections with specific conditions and consequences
-- The skill can be loaded and run with `harness skill run <name>`
-## Examples
-### Example: Creating a Flexible Skill for Database Migration Review
-**DEFINE:**
-```
-Process: The team reviews database migrations before applying them.
-Scope: Review only — not creating or applying migrations.
-Triggers: manual (invoked when a migration PR is opened).
-Tools: Read, Glob, Grep, Bash.
-```
-**CHOOSE TYPE:** Flexible — the review steps can vary based on migration complexity. Some migrations need data impact analysis, others do not.
-**WRITE skill.yaml:**
-```yaml
-name: review-db-migration
-version: '1.0.0'
-description: Review database migration files for safety and correctness
-triggers:
-  - manual
-platforms:
-  - claude-code
-tools:
-  - Read
-  - Glob
-  - Grep
-  - Bash
-cli:
-  command: harness skill run review-db-migration
-  args:
-    - name: migration-file
-      description: Path to the migration file to review
-      required: true
-mcp:
-  tool: run_skill
-  input:
-    skill: review-db-migration
-type: flexible
-state:
-  persistent: false
-  files: []
-depends_on: []
-```
-**WRITE SKILL.md:**
-```markdown
-# Review Database Migration
-> Review database migration files for safety, correctness, and
-> reversibility before they are applied to any environment.
-## When to Use
-- When a new migration file has been created and needs review
-- When a migration PR is opened
-- NOT when writing migrations (write first, then review)
-- NOT when applying migrations to environments (that is a deployment concern)
-## Process
-### Phase 1: ANALYZE — Understand the Migration
-1. Read the migration file completely...
-   [... full process content ...]
-## Harness Integration
-- `harness validate` — Verify project health after migration review
-  [... etc ...]
-```
-**VALIDATE:**
-```bash
-harness skill validate review-db-migration  # Pass
-harness skill run review-db-migration       # Loads correctly
-```
-### Example: Creating a Rigid Skill for Release Deployment
-**DEFINE:**
-```
-Process: Deploy a release to production. Strict ordering — cannot skip steps.
-Triggers: manual.
-Tools: Bash, Read, Glob.
-```
-**CHOOSE TYPE:** Rigid — skipping the smoke test or rollback verification step could cause production outages. Mandatory checkpoints for human approval before each environment promotion.
-**WRITE SKILL.md (key rigid sections):**
-```markdown
-## Gates
-- **Tests must pass before build.** If the test suite fails, do not
-  proceed to build. Fix the tests first.
-- **Staging must be verified before production.** If staging smoke tests
-  fail, do not promote to production. Roll back staging and investigate.
-- **Human approval required at each promotion.** Use [checkpoint:human-verify]
-  before promoting from staging to production. No auto-promotion.
-## Escalation
-- **When staging smoke tests fail on a test that passed locally:**
-  Report: "Smoke test [name] fails in staging but passes locally.
-  Likely cause: environment-specific configuration or data difference.
-  Need to investigate before proceeding."
-- **When rollback verification fails:** This is critical. Report immediately:
-  "Rollback to version [X] failed. Current state: [description].
-  Manual intervention required."
-```
---- skill.yaml (agents/skills/claude-code/harness-skill-authoring/skill.yaml) ---
-name: harness-skill-authoring
-version: "1.0.0"
-description: Create and maintain harness skills following the rich skill format
-cognitive_mode: constructive-architect
-triggers:
-  - manual
-platforms:
-  - claude-code
-  - gemini-cli
-tools:
-  - Bash
-  - Read
-  - Write
-  - Edit
-  - Glob
-  - Grep
-cli:
-  command: harness skill run harness-skill-authoring
-  args:
-    - name: path
-      description: Project root path
-      required: false
-mcp:
-  tool: run_skill
-  input:
-    skill: harness-skill-authoring
-    path: string
-type: flexible
-state:
-  persistent: false
-  files: []
-depends_on: []
-</execution_context>
-<process>
-1. Try: invoke mcp__harness__run_skill with skill: "harness-skill-authoring"
-2. If MCP unavailable: follow the SKILL.md workflow provided above directly
-3. Pass through any arguments provided by the user
-</process>
-"""
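
The section requirements listed under Phase 5 of the removed skill-authoring SKILL.md (five sections for every skill, plus `## Gates` and `## Escalation` for rigid skills) can be illustrated with a small check. This is only a sketch under stated assumptions: `missing_sections` is a hypothetical helper, not the implementation behind `harness skill validate`, and it does a naive line scan that would also count headings appearing inside fenced code blocks.

```python
from typing import List

# Section headings taken from the Phase 5 checklist in the removed SKILL.md.
REQUIRED = [
    "## When to Use",
    "## Process",
    "## Harness Integration",
    "## Success Criteria",
    "## Examples",
]
RIGID_ONLY = ["## Gates", "## Escalation"]


def missing_sections(skill_md: str, skill_type: str) -> List[str]:
    """Return the required headings absent from a SKILL.md body."""
    # Naive scan: collect every level-2 heading line in the document.
    headings = {line.strip() for line in skill_md.splitlines()
                if line.startswith("## ")}
    required = REQUIRED + (RIGID_ONLY if skill_type == "rigid" else [])
    return [heading for heading in required if heading not in headings]


doc = "\n".join([
    "# Review Database Migration",
    "## When to Use",
    "## Process",
    "## Harness Integration",
    "## Success Criteria",
    "## Examples",
])
flexible_missing = missing_sections(doc, "flexible")
rigid_missing = missing_sections(doc, "rigid")
print(flexible_missing, rigid_missing)
```

The same document passes as a flexible skill but would be reported as missing two sections if its `type` were `rigid`, which mirrors the rigid/flexible distinction described in Phase 2 of the skill.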