npm: @rely-ai/caliber 1.12.16 → 1.12.18 (Mend version diff)

This diff shows the content of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their public registries.

Files changed (2)

package/dist/bin.js +39 -23
package/package.json +1 -1

package/dist/bin.js CHANGED

@@ -1693,41 +1693,46 @@ CoreSetup schema:
 }
 IMPORTANT: Do NOT generate full skill content. Only output skill topic names and descriptions.
-Skills will be generated separately. Generate 3-6 skill topics per target platform based on project complexity. Each topic should cover a distinct tool, workflow, or domain.
+Skills will be generated separately. Generate 3-6 skill topics per target platform based on project complexity.
-Skill topic description MUST include WHAT it does + WHEN to use it with specific trigger phrases.
-Example: "Manages database migrations. Use when user says 'run migration', 'create migration', 'db schema change', or modifies files in db/migrations/."
+Skills serve two purposes:
+1. **Codify repeating patterns** \u2014 look at the codebase for patterns that developers repeat: how to create a new API route, how to write to the database, how to add a new page/component, how to write tests. These are the most valuable skills because they ensure every developer (and every LLM session) follows the same patterns.
+2. **Enforce team consistency** \u2014 skills act as executable coding standards. When multiple developers each work with their own LLM sessions on the same codebase, skills ensure everyone writes code the same way \u2014 same file structure, same error handling, same naming conventions, same patterns.
+Derive skill topics from actual code in the project. Look at existing files for the patterns being used, then create skills that replicate those patterns for new work.
+Skill topic description MUST follow this formula: [What it does] + [When to use it] + [Key capabilities].
+Include specific trigger phrases users would actually say. Also include negative triggers to prevent over-triggering.
+Example: "Creates a new API endpoint following the project's route pattern. Handles request validation, error responses, and DB queries. Use when user says 'add endpoint', 'new route', 'create API', or adds files to src/routes/. Do NOT use for modifying existing routes."
 The "fileDescriptions" object MUST include a one-liner for every file that will be created or modified.
 The "deletions" array should list files that should be removed (e.g. stale configs). Omit if empty.
-SCORING CRITERIA \u2014 your output is scored deterministically. Optimize for 100/100:
+SCORING CRITERIA \u2014 your output is scored deterministically against the actual filesystem. Optimize for 100/100:
 Existence (25 pts):
-- CLAUDE.md exists (6 pts) \u2014 always generate for claude/both targets
+- CLAUDE.md exists (6 pts) \u2014 always generate for claude targets
 - AGENTS.md exists (6 pts) \u2014 always generate for codex target
-- Skills configured (8 pts) \u2014 3 skill topics = full points
+- Skills configured (8 pts) \u2014 generate 3+ skill topics for full points
 - For "both" target: .cursor/rules/ exist (3+3 pts), cross-platform parity (2 pts)
 Quality (25 pts):
-- Build/test/lint commands documented (8 pts) \u2014 include actual commands from the project
-- Concise context files (6 pts) \u2014 keep CLAUDE.md under 100 lines (200=4pts, 300=3pts, 500+=0pts)
-- No vague instructions (4 pts) \u2014 avoid "follow best practices", "write clean code", "ensure quality"
+- Executable content (8 pts) \u2014 include 3+ code blocks with project commands (3 blocks = full points)
+- Concise config (6 pts) \u2014 total tokens across ALL config files must be under 2000 for full points (5000=4pts, 8000+=low)
+- Concrete instructions (4 pts) \u2014 every line should reference specific files, paths, or code in backticks. Avoid generic prose.
 - No directory tree listings (3 pts) \u2014 do NOT include tree-style file listings
-- No contradictions (2 pts) \u2014 consistent tool/style recommendations
+- Structured with headings (2 pts) \u2014 use at least 3 ## sections and bullet lists
-Coverage (20 pts):
-- Dependency coverage (10 pts) \u2014 CRITICAL: the exact dependency list is provided in your input. Mention AT LEAST 85% by name in CLAUDE.md. Full points at 85%+.
-- Service/MCP coverage (6 pts) \u2014 reference detected services
-- MCP completeness (4 pts) \u2014 full points if no external services detected
+Grounding (20 pts) \u2014 CRITICAL:
+- Project grounding (12 pts) \u2014 reference the project's actual directories and files by name. The scoring checks which project dirs/files appear in your config. Mention key directories from the file tree.
+- Reference density (8 pts) \u2014 use backticks and inline code extensively. Every file path, command, or identifier should be in backticks. Higher density of specific references = higher score.
 Accuracy (15 pts) \u2014 CRITICAL:
-- Documented commands exist (6 pts) \u2014 ONLY reference scripts from the provided package.json. Use the exact package manager.
-- Documented paths exist (4 pts) \u2014 ONLY reference file paths from the provided file tree.
-- Config freshness (5 pts) \u2014 config must match current code state
+- References valid (8 pts) \u2014 ONLY reference file paths that exist in the provided file tree. Every path in backticks is validated against the filesystem.
+- Config drift (7 pts) \u2014 config must match current code state
 Freshness & Safety (10 pts):
-- No secrets (4 pts), Permissions (2 pts \u2014 handled by caliber)
+- No secrets (4 pts), Permissions (2 pts \u2014 handled by caliber), Freshness (4 pts \u2014 handled by caliber)
 Bonus (5 pts): Hooks (2 pts), AGENTS.md (1 pt), OpenSkills format (2 pts) \u2014 handled by caliber
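
The revised Quality criteria in the hunk above name three countable signals: fenced code blocks (3 for full points), `##` sections (at least 3), and bullet lists. The package's actual scorer is not part of this diff, so the following is only a minimal sketch of how such signals might be counted. The function name `checkQualitySignals` and the returned field names are hypothetical and do not come from `@rely-ai/caliber`.

```javascript
// Hypothetical helper, NOT taken from @rely-ai/caliber: counts the
// structural signals that the updated prompt says are scored.
const FENCE = "`".repeat(3); // three backticks, built without writing them literally

function checkQualitySignals(markdown) {
  let inFence = false;
  let codeBlocks = 0;
  let h2Sections = 0;
  let bulletLines = 0;

  for (const line of markdown.split("\n")) {
    if (line.trimStart().startsWith(FENCE)) {
      if (!inFence) codeBlocks += 1; // count each opening fence once
      inFence = !inFence;
      continue;
    }
    if (inFence) continue; // ignore headings and bullets inside code
    if (/^## \S/.test(line)) h2Sections += 1;
    if (/^\s*[-*] \S/.test(line)) bulletLines += 1;
  }

  return {
    codeBlocks,
    h2Sections,
    bulletLines,
    // Thresholds as stated in the prompt text: 3+ blocks, 3+ "##" sections.
    hasEnoughCodeBlocks: codeBlocks >= 3,
    isStructured: h2Sections >= 3 && bulletLines > 0,
  };
}
```

Running this over a generated `CLAUDE.md` before writing it to disk would give a rough early warning. It says nothing about the grounding or accuracy criteria, which the prompt states are validated against the filesystem.
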
@@ -1739,20 +1744,31 @@ var SKILL_GENERATION_PROMPT = `You generate a single skill file for a coding age
 Given project context and a skill topic, produce a focused SKILL.md body.
+Purpose: Skills codify repeating patterns from the codebase so every developer and every LLM session produces consistent code. Study the existing code to extract the exact patterns used, then write instructions that replicate those patterns for new work.
 Structure:
 1. A heading with the skill name
-2. "## Instructions" \u2014 clear, numbered steps. Be specific: include exact commands, file paths, parameter names from the project.
-3. "## Examples" \u2014 at least one example showing: User says \u2192 Actions taken \u2192 Result
-4. "## Troubleshooting" (optional) \u2014 common errors and fixes
+2. "## Critical" (if applicable) \u2014 put the most important rules and constraints FIRST. Things the agent must never skip, validation that must happen before any action, or project-specific constraints.
+3. "## Instructions" \u2014 clear, numbered steps derived from actual patterns in the codebase. Each step MUST:
+   - Include exact file paths, naming conventions, imports, and boilerplate from existing code
+   - Have a validation gate: "Verify X before proceeding to the next step"
+   - Specify dependencies: "This step uses the output from Step N"
+4. "## Examples" \u2014 at least one example showing: User says \u2192 Actions taken \u2192 Result. The example should mirror how existing code in the project is structured.
+5. "## Common Issues" (required) \u2014 specific error messages and their fixes. Not "check your config" but "If you see 'Connection refused on port 5432': 1. Verify postgres is running: docker ps | grep postgres 2. Check .env has correct DATABASE_URL"
 Rules:
 - Max 150 lines. Focus on actionable instructions, not documentation prose.
+- Study existing code in the project context to extract the real patterns being used. A skill for "create API route" should show the exact file structure, imports, error handling, and naming that existing routes use.
+- Be specific and actionable. GOOD: "Run \`pnpm test -- --filter=api\` to verify". BAD: "Validate the data before proceeding."
+- Never use ambiguous language. Instead of "handle errors properly", write "Wrap the DB call in try/catch. On failure, return { error: string, code: number } matching the ErrorResponse type in \`src/types.ts\`."
 - Reference actual commands, paths, and packages from the project context provided.
 - Do NOT include YAML frontmatter \u2014 it will be generated separately.
-- Be specific to THIS project \u2014 avoid generic advice.
+- Be specific to THIS project \u2014 avoid generic advice. The skill should produce code that looks identical to what's already in the codebase.
+Description field formula: [What it does] + [When to use it with trigger phrases] + [Key capabilities]. Include negative triggers ("Do NOT use for X") to prevent over-triggering.
 Return ONLY a JSON object:
-{"name": "string (kebab-case)", "description": "string (what + when)", "content": "string (markdown body)"}`;
+{"name": "string (kebab-case)", "description": "string (what + when + capabilities + negative triggers)", "content": "string (markdown body)"}`;
 var REFINE_SYSTEM_PROMPT = `You are an expert at modifying coding agent configurations (Claude Code, Cursor, and Codex).
 You will receive the current AgentSetup JSON and a user request describing what to change.
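
The updated `SKILL_GENERATION_PROMPT` above constrains the returned JSON in several checkable ways: a kebab-case `name`, a `description` with trigger phrases and a negative trigger, and a `content` body of at most 150 lines with no YAML frontmatter and with `## Instructions`, `## Examples`, and `## Common Issues` sections. The sketch below shows one way a caller might check a response against those rules. It is an illustration under assumptions: `validateSkill` and its messages are hypothetical, the diff does not show whether caliber performs any such validation, and the "Use when" check is a heuristic based on the example description in the prompt.

```javascript
// Hypothetical validator, NOT taken from @rely-ai/caliber: checks a skill
// object against the rules stated in the prompt text.
function validateSkill(skill) {
  const problems = [];
  const name = skill.name ?? "";
  const description = skill.description ?? "";
  const content = skill.content ?? "";
  const lines = content.split("\n");

  if (!/^[a-z0-9]+(-[a-z0-9]+)*$/.test(name)) {
    problems.push("name is not kebab-case");
  }
  // Heuristic (assumption): the prompt's example phrases triggers as "Use when ...".
  if (!/use when/i.test(description)) {
    problems.push("description has no trigger phrase");
  }
  // The prompt quotes the negative-trigger form as "Do NOT use for X".
  if (!/do not use/i.test(description)) {
    problems.push("description has no negative trigger");
  }
  if (content.trimStart().startsWith("---")) {
    problems.push("content has YAML frontmatter");
  }
  if (lines.length > 150) {
    problems.push("content exceeds 150 lines");
  }
  for (const heading of ["## Instructions", "## Examples", "## Common Issues"]) {
    if (!lines.some((line) => line.trim() === heading)) {
      problems.push("missing " + heading);
    }
  }
  return problems;
}
```

An empty array means the response passed every check. Because `## Critical` is marked "if applicable" in the prompt, the sketch does not require it.
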

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@rely-ai/caliber",
-  "version": "1.12.16",
+  "version": "1.12.18",
   "description": "Analyze your codebase and generate optimized AI agent configs (CLAUDE.md, .cursorrules, skills) — no API key needed",
   "type": "module",
   "bin": {