@rely-ai/caliber 1.12.16 → 1.12.18
- package/dist/bin.js +39 -23
- package/package.json +1 -1
package/dist/bin.js
CHANGED
@@ -1693,41 +1693,46 @@ CoreSetup schema:
 }
 
 IMPORTANT: Do NOT generate full skill content. Only output skill topic names and descriptions.
-Skills will be generated separately. Generate 3-6 skill topics per target platform based on project complexity.
+Skills will be generated separately. Generate 3-6 skill topics per target platform based on project complexity.
 
-
-
+Skills serve two purposes:
+1. **Codify repeating patterns** \u2014 look at the codebase for patterns that developers repeat: how to create a new API route, how to write to the database, how to add a new page/component, how to write tests. These are the most valuable skills because they ensure every developer (and every LLM session) follows the same patterns.
+2. **Enforce team consistency** \u2014 skills act as executable coding standards. When multiple developers each work with their own LLM sessions on the same codebase, skills ensure everyone writes code the same way \u2014 same file structure, same error handling, same naming conventions, same patterns.
+
+Derive skill topics from actual code in the project. Look at existing files for the patterns being used, then create skills that replicate those patterns for new work.
+
+Skill topic description MUST follow this formula: [What it does] + [When to use it] + [Key capabilities].
+Include specific trigger phrases users would actually say. Also include negative triggers to prevent over-triggering.
+Example: "Creates a new API endpoint following the project's route pattern. Handles request validation, error responses, and DB queries. Use when user says 'add endpoint', 'new route', 'create API', or adds files to src/routes/. Do NOT use for modifying existing routes."
 
 The "fileDescriptions" object MUST include a one-liner for every file that will be created or modified.
 The "deletions" array should list files that should be removed (e.g. stale configs). Omit if empty.
 
-SCORING CRITERIA \u2014 your output is scored deterministically. Optimize for 100/100:
+SCORING CRITERIA \u2014 your output is scored deterministically against the actual filesystem. Optimize for 100/100:
 
 Existence (25 pts):
-- CLAUDE.md exists (6 pts) \u2014 always generate for claude
+- CLAUDE.md exists (6 pts) \u2014 always generate for claude targets
 - AGENTS.md exists (6 pts) \u2014 always generate for codex target
-- Skills configured (8 pts) \u2014 3 skill topics
+- Skills configured (8 pts) \u2014 generate 3+ skill topics for full points
 - For "both" target: .cursor/rules/ exist (3+3 pts), cross-platform parity (2 pts)
 
 Quality (25 pts):
--
-- Concise
--
+- Executable content (8 pts) \u2014 include 3+ code blocks with project commands (3 blocks = full points)
+- Concise config (6 pts) \u2014 total tokens across ALL config files must be under 2000 for full points (5000=4pts, 8000+=low)
+- Concrete instructions (4 pts) \u2014 every line should reference specific files, paths, or code in backticks. Avoid generic prose.
 - No directory tree listings (3 pts) \u2014 do NOT include tree-style file listings
--
+- Structured with headings (2 pts) \u2014 use at least 3 ## sections and bullet lists
 
-
--
--
-- MCP completeness (4 pts) \u2014 full points if no external services detected
+Grounding (20 pts) \u2014 CRITICAL:
+- Project grounding (12 pts) \u2014 reference the project's actual directories and files by name. The scoring checks which project dirs/files appear in your config. Mention key directories from the file tree.
+- Reference density (8 pts) \u2014 use backticks and inline code extensively. Every file path, command, or identifier should be in backticks. Higher density of specific references = higher score.
 
 Accuracy (15 pts) \u2014 CRITICAL:
--
--
-- Config freshness (5 pts) \u2014 config must match current code state
+- References valid (8 pts) \u2014 ONLY reference file paths that exist in the provided file tree. Every path in backticks is validated against the filesystem.
+- Config drift (7 pts) \u2014 config must match current code state
 
 Freshness & Safety (10 pts):
-- No secrets (4 pts), Permissions (2 pts \u2014 handled by caliber)
+- No secrets (4 pts), Permissions (2 pts \u2014 handled by caliber), Freshness (4 pts \u2014 handled by caliber)
 
 Bonus (5 pts): Hooks (2 pts), AGENTS.md (1 pt), OpenSkills format (2 pts) \u2014 handled by caliber
 
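The new "References valid" criterion in this hunk says every backticked path is validated against the filesystem. The following is a minimal sketch of what such a check could look like, not caliber's actual scoring code. The function names (`extractBacktickRefs`, `looksLikePath`, `findInvalidRefs`), the path heuristic, and the sample file tree are all hypothetical and chosen only for illustration.

```javascript
// Hypothetical helper: collect the text of every inline backticked span.
function extractBacktickRefs(markdown) {
  const refs = [];
  for (const match of markdown.matchAll(/`([^`\n]+)`/g)) {
    refs.push(match[1].trim());
  }
  return refs;
}

// Assumed heuristic: a reference is a path if it has no spaces and
// either contains a slash or ends in a file extension.
function looksLikePath(ref) {
  return !ref.includes(" ") && (ref.includes("/") || /\.[a-z0-9]+$/i.test(ref));
}

// Return every backticked path that is absent from the provided file tree.
// Trailing slashes are ignored so `src/routes/` matches `src/routes`.
function findInvalidRefs(markdown, fileTree) {
  const known = new Set(fileTree.map((p) => p.replace(/\/$/, "")));
  return extractBacktickRefs(markdown)
    .filter(looksLikePath)
    .map((ref) => ref.replace(/\/$/, ""))
    .filter((ref) => !known.has(ref));
}

// Illustrative data only; these paths are not taken from the package.
const fileTree = ["src/routes/", "src/routes/users.ts", "src/types.ts", "package.json"];
const config =
  "Add handlers under `src/routes/`. Shared types live in `src/types.ts`. " +
  "Run `pnpm test`. The DB client is in `src/db/client.ts`.";

console.log(findInvalidRefs(config, fileTree)); // [ 'src/db/client.ts' ]
```

Under these assumptions, commands such as `pnpm test` are skipped because they contain a space, so only path-like references count against the score.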
@@ -1739,20 +1744,31 @@ var SKILL_GENERATION_PROMPT = `You generate a single skill file for a coding age
 
 Given project context and a skill topic, produce a focused SKILL.md body.
 
+Purpose: Skills codify repeating patterns from the codebase so every developer and every LLM session produces consistent code. Study the existing code to extract the exact patterns used, then write instructions that replicate those patterns for new work.
+
 Structure:
 1. A heading with the skill name
-2. "##
-3. "##
-
+2. "## Critical" (if applicable) \u2014 put the most important rules and constraints FIRST. Things the agent must never skip, validation that must happen before any action, or project-specific constraints.
+3. "## Instructions" \u2014 clear, numbered steps derived from actual patterns in the codebase. Each step MUST:
+- Include exact file paths, naming conventions, imports, and boilerplate from existing code
+- Have a validation gate: "Verify X before proceeding to the next step"
+- Specify dependencies: "This step uses the output from Step N"
+4. "## Examples" \u2014 at least one example showing: User says \u2192 Actions taken \u2192 Result. The example should mirror how existing code in the project is structured.
+5. "## Common Issues" (required) \u2014 specific error messages and their fixes. Not "check your config" but "If you see 'Connection refused on port 5432': 1. Verify postgres is running: docker ps | grep postgres 2. Check .env has correct DATABASE_URL"
 
 Rules:
 - Max 150 lines. Focus on actionable instructions, not documentation prose.
+- Study existing code in the project context to extract the real patterns being used. A skill for "create API route" should show the exact file structure, imports, error handling, and naming that existing routes use.
+- Be specific and actionable. GOOD: "Run \`pnpm test -- --filter=api\` to verify". BAD: "Validate the data before proceeding."
+- Never use ambiguous language. Instead of "handle errors properly", write "Wrap the DB call in try/catch. On failure, return { error: string, code: number } matching the ErrorResponse type in \`src/types.ts\`."
 - Reference actual commands, paths, and packages from the project context provided.
 - Do NOT include YAML frontmatter \u2014 it will be generated separately.
-- Be specific to THIS project \u2014 avoid generic advice.
+- Be specific to THIS project \u2014 avoid generic advice. The skill should produce code that looks identical to what's already in the codebase.
+
+Description field formula: [What it does] + [When to use it with trigger phrases] + [Key capabilities]. Include negative triggers ("Do NOT use for X") to prevent over-triggering.
 
 Return ONLY a JSON object:
-{"name": "string (kebab-case)", "description": "string (what + when)", "content": "string (markdown body)"}`;
+{"name": "string (kebab-case)", "description": "string (what + when + capabilities + negative triggers)", "content": "string (markdown body)"}`;
 var REFINE_SYSTEM_PROMPT = `You are an expert at modifying coding agent configurations (Claude Code, Cursor, and Codex).
 
 You will receive the current AgentSetup JSON and a user request describing what to change.
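The updated `SKILL_GENERATION_PROMPT` asks the model to return a JSON object with `name`, `description`, and `content`, subject to rules such as kebab-case names, a 150-line limit, no YAML frontmatter, and a negative trigger in the description. A rough sketch of how a caller might check a response against those rules follows. `validateSkillResponse` is a hypothetical name, and the regex-based checks are assumptions for illustration rather than anything the package is known to do.

```javascript
// Hypothetical validator for the {"name", "description", "content"} object
// that the prompt requests. Returns a list of problems (empty when valid).
function validateSkillResponse(raw) {
  let skill;
  try {
    skill = JSON.parse(raw);
  } catch (err) {
    return ["response is not valid JSON: " + err.message];
  }
  if (skill === null || typeof skill !== "object" || Array.isArray(skill)) {
    return ["response must be a JSON object"];
  }

  const errors = [];
  for (const key of ["name", "description", "content"]) {
    if (typeof skill[key] !== "string" || skill[key].length === 0) {
      errors.push(`"${key}" must be a non-empty string`);
    }
  }
  if (errors.length > 0) return errors;

  // Prompt rule: name is kebab-case.
  if (!/^[a-z0-9]+(-[a-z0-9]+)*$/.test(skill.name)) {
    errors.push("name must be kebab-case");
  }
  // Prompt rule: max 150 lines.
  if (skill.content.split("\n").length > 150) {
    errors.push("content exceeds 150 lines");
  }
  // Prompt rule: no YAML frontmatter in the body.
  if (skill.content.trimStart().startsWith("---")) {
    errors.push("content must not include YAML frontmatter");
  }
  // Assumed heuristic for the "negative triggers" rule.
  if (!/do not use/i.test(skill.description)) {
    errors.push("description should include a negative trigger");
  }
  return errors;
}

// The description reuses the example string from the prompt itself.
const response = JSON.stringify({
  name: "add-api-route",
  description:
    "Creates a new API endpoint following the project's route pattern. " +
    "Use when user says 'add endpoint'. Do NOT use for modifying existing routes.",
  content: "# Add API route\n\n## Instructions\n1. Create the route file.",
});

console.log(validateSkillResponse(response)); // []
```

Returning a list of problems rather than throwing on the first one makes it easier to feed all failures back to the model in a single retry, though that design is a choice of this sketch.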
package/package.json
CHANGED