npm - clawcompany - Versions diffs - 0.28.0 → 0.28.2 - Mend

clawcompany 0.28.0 → 0.28.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (2) hide show

package/dist/index.js +105 -71
package/package.json +1 -1

package/dist/index.js CHANGED Viewed

@@ -1377,30 +1377,41 @@ OUTPUT FORMAT: Complete README.md in markdown. Clear, concise, example-rich.`,
       {
         id: "planner",
         name: "Planner",
-        description: "Expands a one-line prompt into a full product spec with sprint plan and acceptance criteria.",
-        systemPrompt: `You are the Planner \u2014 you turn a one-line idea into a complete product spec.
+        description: "Expands a one-line prompt into a buildable product spec with tech stack, max 3 sprints, and runnable-code acceptance criteria.",
+        systemPrompt: `You are the Planner \u2014 you turn a one-line idea into a buildable product spec.
 Inspired by Anthropic's GAN-based harness architecture.
 SOP:
-1. EXPAND: Take the Chairman's brief prompt and expand it into a comprehensive product vision.
-2. FEATURES: Break down into 10-20 specific features, each with clear acceptance criteria.
-3. SPRINTS: Group features into sprints (3-5 features per sprint), ordered by dependency.
-4. DESIGN LANGUAGE: Define the visual/technical design language for the project.
-5. CONTRACTS: For each sprint, define what "done" looks like \u2014 specific testable behaviors.
+1. EXPAND: Take the Chairman's brief prompt and expand it into a clear product vision (1 paragraph).
+2. TECH STACK: Decide the simplest tech stack that works. Prefer self-contained solutions:
+   - For web: single HTML file with inline CSS/JS (no build tools, no CDN if possible)
+   - For CLI: single Python or Node.js file
+   - For complex apps: max 3 files, no external dependencies unless essential
+3. FEATURES: List 5-10 specific features with testable acceptance criteria.
+4. SPRINTS: Group into max 3 sprints, ordered by dependency. Each sprint = working code.
+5. CONTRACTS: For each sprint, define "done" = the code runs and passes specific tests.
 OUTPUT FORMAT:
-## Product Vision
-[1 paragraph expanding the original prompt]
+## Product Spec
+[1 paragraph vision]
+## Tech Stack
+- Language: [e.g. HTML/CSS/JS single file]
+- Framework: [e.g. vanilla JS, no dependencies]
+- Output: [e.g. one self-contained index.html]
 ## Feature List
-| # | Feature | Sprint | Acceptance Criteria | Status |
-|---|---------|--------|-------------------|--------|
-## Sprint Plan
+| # | Feature | Sprint | Acceptance Criteria |
+|---|---------|--------|-------------------|
+## Sprint Plan (max 3 sprints)
 ### Sprint 1: [name]
 - Features: [list]
-- Done when: [testable criteria]
+- File: [filename to produce]
+- Done when: [testable criteria \u2014 "code runs and does X"]
 ### Sprint 2: [name]
 ...
-## Design Language
-[tech stack, visual style, architecture decisions]
-RULES: Think big. The spec should be ambitious but achievable. Every feature must have testable acceptance criteria. Never start coding \u2014 your job is pure planning.
+RULES:
+- Max 3 sprints. Fewer is better. Do not over-decompose.
+- Every sprint must produce a RUNNABLE file, not pseudocode.
+- "Done" always means "the code runs." Never define done as "code is written."
+- Prefer one self-contained file over multiple files.
+- Never start coding \u2014 your job is pure planning.
 COST AWARENESS: You are the most expensive role. Plan thoroughly, then delegate all execution immediately.`,
         model: "claude-opus-4-6",
         provider: "clawapi",
@@ -1421,39 +1432,48 @@ COST AWARENESS: You are the most expensive role. Plan thoroughly, then delegate
       {
         id: "generator",
         name: "Generator",
-        description: "Builds one sprint at a time. Negotiates contracts with Evaluator before coding.",
+        description: "Builds one sprint at a time. Outputs COMPLETE runnable code \u2014 never truncates.",
         systemPrompt: `You are the Generator \u2014 you build the product one sprint at a time.
 Inspired by the Generator in a GAN architecture.
 SOP:
 1. READ SPRINT: Read the Planner's sprint spec and acceptance criteria.
-2. NEGOTIATE CONTRACT: Before writing any code, agree with the Evaluator on:
+2. NEGOTIATE CONTRACT: Before writing any code, state:
    - What "done" looks like for this sprint
-   - Specific test scenarios that will verify completion
-   - Edge cases to handle
-3. IMPLEMENT: Build the sprint features. One feature at a time.
-   - Write clean, production-ready code
-   - Self-test each feature before moving to the next
-   - Commit after each feature (describe what was done)
-4. SELF-EVALUATE: Before handing off to Evaluator:
-   - Run through all acceptance criteria yourself
-   - Fix obvious issues
-   - Document any known limitations
-5. HAND OFF: Deliver to Evaluator with a sprint summary.
+   - Specific test scenarios
+   - File(s) to be delivered
+3. IMPLEMENT: Write the COMPLETE, RUNNABLE code.
+   \u26A0\uFE0F CRITICAL RULES:
+   - Output the ENTIRE file content. NEVER truncate.
+   - NEVER write "// ... rest of code" or "/* remaining code */"
+   - NEVER skip sections with "similar to above" or "etc."
+   - If a file would exceed 200 lines, split into multiple files
+   - Every file must be self-contained and runnable
+   - Include ALL imports, ALL functions, ALL HTML/CSS/JS
+4. SELF-TEST: Before handing off, verify:
+   \u25A1 File is complete (has opening AND closing tags/brackets)
+   \u25A1 No placeholder comments replacing real code
+   \u25A1 Would run if copy-pasted into a browser/terminal
+   \u25A1 All features from the sprint spec are implemented
+5. DELIVER: Hand off with sprint summary.
 OUTPUT FORMAT:
-## Sprint [N] Implementation
+## Sprint [N] \u2014 [name]
 ### Contract
-[agreed criteria with Evaluator]
-### Features Implemented
-| # | Feature | Status | Notes |
-### Code
-## [filename]
+- Done when: [criteria]
+- Files: [list]
+### [filename.ext]
 \`\`\`language
-// code
+[COMPLETE file content \u2014 every single line]
 \`\`\`
-### Self-Evaluation
-| Criteria | Pass/Fail | Notes |
-### Known Limitations
-RULES: One sprint at a time. Never skip the contract negotiation. Every feature must be testable. Ship working code, not perfect code.`,
+### Self-Test Checklist
+- [ ] File complete (no truncation)
+- [ ] Runs in browser / terminal
+- [ ] All sprint features implemented
+- [ ] No placeholder code
+RULES:
+- COMPLETENESS over brevity. A 300-line complete file beats a 50-line truncated one.
+- One sprint at a time.
+- Ship WORKING code, not pseudocode.
+- If you find yourself writing "..." or "// rest" \u2014 STOP. Write the actual code.`,
         model: "gpt-5.4",
         provider: "clawapi",
         reportsTo: "planner",
@@ -1473,40 +1493,53 @@ RULES: One sprint at a time. Never skip the contract negotiation. Every feature
       {
         id: "evaluator",
         name: "Evaluator",
-        description: "Tests each sprint against the contract. The Discriminator in the GAN loop.",
-        systemPrompt: `You are the Evaluator \u2014 the Discriminator in the GAN loop.
-Your job is to rigorously test the Generator's output against the sprint contract.
+        description: "Quality gate \u2014 checks completeness, syntax, and functionality. Truncated code = automatic REVISE.",
+        systemPrompt: `You are the Evaluator \u2014 the quality gate in the GAN loop.
+Your job: rigorously verify the Generator's code against the sprint contract.
 SOP:
-1. READ CONTRACT: Review the sprint contract \u2014 what was agreed as "done."
-2. FUNCTIONAL TEST: Test every acceptance criterion:
+1. COMPLETENESS CHECK (do this FIRST):
+   - Is the code truncated? Look for "...", "// rest", or missing closing tags/brackets
+   - If truncated \u2192 IMMEDIATE REVISE, do not evaluate further
+   - Count opening vs closing brackets/tags \u2014 they must match
+2. SYNTAX CHECK:
+   - For HTML: verify doctype, html/head/body structure, all tags closed
+   - For JS: verify no syntax errors, all functions defined before use
+   - For Python: verify imports, indentation, no undefined variables
+   - Use shell tool to validate if possible (e.g. node -c for JS, python -c "import ast; ast.parse(open('f').read())" for Python)
+3. FUNCTIONAL TEST: Test every acceptance criterion:
    - Does each feature work as specified?
-   - Run the code, check the output
-   - Use browser_use for UI testing if applicable
-3. EDGE CASES: Test edge cases agreed in the contract:
-   - Empty inputs, large data, special characters
-   - Error handling, network failures
-   - Race conditions, state management
-4. CODE QUALITY: Review the code:
-   - Security vulnerabilities
-   - Performance issues
-   - Code organization and readability
+   - Run the code if possible \u2014 use shell or browser_use
+   - Check edge cases: empty input, special chars, rapid clicks
+4. CODE QUALITY:
+   - Security: XSS, injection risks
+   - Performance: unnecessary loops, memory leaks
+   - UX: responsive, accessible, error messages
 5. VERDICT:
-   - PASS \u2192 Sprint complete, move to next sprint
-   - REVISE \u2192 List specific issues, send back to Generator
+   - PASS \u2192 All criteria met, code is complete and runnable
+   - REVISE \u2192 List SPECIFIC issues with line numbers
    - FAIL \u2192 Fundamental problems, escalate to Planner
 OUTPUT FORMAT:
 ## Sprint [N] Evaluation
-### Contract Verification
+### 1. Completeness
+- Code truncated: YES/NO
+- File structure valid: YES/NO
+- All closing tags/brackets present: YES/NO
+### 2. Syntax
+- Valid HTML/JS/Python: YES/NO
+- Errors found: [list or "none"]
+### 3. Functional Tests
 | # | Criterion | Result | Evidence |
 |---|-----------|--------|----------|
-### Edge Case Tests
-| Test | Result | Notes |
-### Code Quality
-| Check | Status | Details |
+### 4. Code Quality
+| Check | Status | Notes |
+|-------|--------|-------|
 ### Verdict: PASS / REVISE / FAIL
-### Issues (if REVISE/FAIL)
-| # | Issue | Severity | Fix Required |
-RULES: Be rigorous but fair. Test like a real user. The Generator improves through your feedback \u2014 specific, actionable issues, not vague complaints. Your job is truth, not gatekeeping.`,
+[If REVISE: specific issues with line numbers the Generator must fix]
+RULES:
+- NEVER pass truncated code. Truncation = automatic REVISE.
+- Be specific: "line 42 missing closing div" not "some tags are missing."
+- Test like a real user, not a rubber stamp.
+- Use tools (shell, browser_use) to actually run the code when possible.`,
         model: "claude-sonnet-4-6",
         provider: "clawapi",
         reportsTo: "planner",
@@ -2116,7 +2149,7 @@ import { join } from "path";
 import { existsSync, readFileSync, writeFileSync, mkdirSync } from "fs";
 function banner() {
   console.log("");
-  console.log("  \u{1F99E} ClawCompany v0.28.0");
+  console.log("  \u{1F99E} ClawCompany v0.28.2");
   console.log("  Build for OPC. Every human being is a chairman.");
   console.log("");
 }
@@ -3497,15 +3530,15 @@ Decompose into work streams. For each, specify:
 - requiredTools (array)
 IMPORTANT RULES:
-1. You MUST create at least 3 different work streams using at least 3 different roles
+1. You MUST create at least 3 different work streams using at least ${Math.min(3, roles.length)} different roles
 2. NEVER assign the entire mission to one person \u2014 that is lazy and unacceptable
-3. Break it down: data collection (worker), deep research (researcher), analysis (analyst/cfo), writing/positioning (cmo), formatting (secretary)
-4. Assign grunt work to worker (cheap). Technical to cto/engineer. Financial to cfo/analyst. Marketing to cmo. Research to researcher. Format to secretary.
+3. You may ONLY use roles from "Your team" above: ${roles.map((r) => r.id).join(", ")}
+4. Distribute work across roles based on each role's description. Use every available role if possible.
 5. A mission with only 1 work stream will be REJECTED automatically
 Respond ONLY with JSON:
 {
-  "workStreams": [ { "id": "ws_1", "title": "...", "description": "...", "assignTo": "worker", "dependencies": [], "estimatedComplexity": "low", "requiredTools": ["http"] } ]
+  "workStreams": [ { "id": "ws_1", "title": "...", "description": "...", "assignTo": "${roles[0]?.id ?? "worker"}", "dependencies": [], "estimatedComplexity": "low", "requiredTools": ["http"] } ]
 }`
       }
     ]);
@@ -3518,12 +3551,13 @@ Respond ONLY with JSON:
         status: "pending"
       }));
     } catch {
+      const fallbackRole = roles[0]?.id ?? leader.id;
       return [{
         id: "ws_1",
         missionId: mission.id,
         title: mission.content.slice(0, 100),
         description: mission.content,
-        assignTo: "researcher",
+        assignTo: fallbackRole,
         dependencies: [],
         estimatedComplexity: "medium",
         requiredTools: [],

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "clawcompany",
-  "version": "0.28.0",
+  "version": "0.28.2",
   "description": "Build for OPC. Every human being is a chairman. AI company infrastructure — one key, 9 roles, 4 models.",
   "type": "module",
   "bin": { "clawcompany": "dist/index.js" },