
majlis 0.4.1 → 0.4.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (2)

package/dist/cli.js +73 -40
package/package.json +1 -1

package/dist/cli.js CHANGED

@@ -522,24 +522,35 @@ Before building:
 3. Check docs/classification/ for problem taxonomy
 4. Check docs/experiments/ for prior work
-## Scope Constraint (CRITICAL)
+Read as much code as you need to understand the problem. Reading is free \u2014 spend
+as many turns as necessary on Read, Grep, and Glob to build full context before
+you touch anything.
+## The Rule: ONE Change, Then Document
+You make ONE code change per cycle. Not two, not "one more quick fix." ONE.
-You get ONE attempt per cycle. Your job is:
-1. Read and diagnose \u2014 understand the problem thoroughly
-2. Form ONE hypothesis about what to fix
-3. Implement ONE focused change (not a multi-step debug session)
-4. Run the benchmark ONCE to see the result
-5. Update the experiment doc in docs/experiments/ \u2014 fill in Approach, Results, and Metrics sections. This is NOT optional.
-6. Output the structured majlis-json block with your decisions
-7. STOP
+The sequence:
+1. **Read and understand** \u2014 read synthesis, dead-ends, source code. Take your time.
+2. **Write the experiment doc FIRST** \u2014 before coding, fill in the Approach section
+   with what you plan to do and why. This ensures there is always a record.
+3. **Implement ONE focused change** \u2014 a single coherent edit to the codebase.
+4. **Run the benchmark ONCE** \u2014 observe the result.
+5. **Update the experiment doc** \u2014 fill in Results and Metrics with what happened.
+6. **Output the majlis-json block** \u2014 your structured decisions.
+7. **STOP.**
-Do NOT iterate. Do NOT try multiple approaches. Do NOT debug your own fix.
-If your change doesn't work, document why and let the cycle continue \u2014
-the adversary, critic, and verifier will help diagnose what went wrong.
-The cycle will come back to you with their insights.
+If your change doesn't work, document what happened and STOP. Do NOT try to fix it.
+Do NOT iterate. Do NOT "try one more thing." The adversary, critic, and verifier
+exist to diagnose what went wrong. The cycle comes back to you with their insights.
-If you find yourself wanting to "try one more thing," that's the signal to stop
-and write up what you learned. The other agents exist precisely for this reason.
+If you find yourself wanting to debug your own fix, that's the signal to stop
+and write up what you learned.
+## Off-limits (DO NOT modify)
+- \`fixtures/\` \u2014 test data, ground truth, STL files. Read-only.
+- \`scripts/benchmark.py\` \u2014 the measurement tool. Never change how you're measured.
+- \`.majlis/\` \u2014 framework config. Not your concern.
 ## During building:
 - Tag EVERY decision: proof / test / strong-consensus / consensus / analogy / judgment
@@ -2294,19 +2305,20 @@ ${contextJson}
 ${taskPrompt}`;
   const turns = ROLE_MAX_TURNS[role] ?? 15;
-  console.log(`[majlis] Spawning ${role} agent (model: ${agentDef.model}, maxTurns: ${turns})...`);
+  console.log(`[${role}] Spawning (model: ${agentDef.model}, maxTurns: ${turns})...`);
   const { text: markdown, costUsd } = await runQuery({
     prompt,
     model: agentDef.model,
     tools: agentDef.tools,
     systemPrompt: agentDef.systemPrompt,
     cwd: root,
-    maxTurns: turns
+    maxTurns: turns,
+    label: role
   });
-  console.log(`[majlis] ${role} agent complete (cost: $${costUsd.toFixed(4)})`);
+  console.log(`[${role}] Complete (cost: $${costUsd.toFixed(4)})`);
   const artifactPath = writeArtifact(role, context, markdown, root);
   if (artifactPath) {
-    console.log(`[majlis] ${role} artifact written to ${artifactPath}`);
+    console.log(`[${role}] Artifact written to ${artifactPath}`);
   }
   const structured = await extractStructuredData(role, markdown);
   return { output: markdown, structured };
@@ -2323,20 +2335,21 @@ ${contextJson}
 ${taskPrompt}`;
   const systemPrompt = 'You are a Synthesis Agent. Be concrete: which decisions failed, which assumptions broke, what constraints must the next approach satisfy. CRITICAL: Your LAST line of output MUST be a <!-- majlis-json --> block. The framework parses this programmatically \u2014 if you omit it, the pipeline breaks. Format: <!-- majlis-json {"guidance": "your guidance here"} -->';
-  console.log(`[majlis] Spawning synthesiser micro-agent...`);
+  console.log(`[synthesiser] Spawning (maxTurns: 5)...`);
   const { text: markdown, costUsd } = await runQuery({
     prompt,
     model: "opus",
     tools: ["Read", "Glob", "Grep"],
     systemPrompt,
     cwd: root,
-    maxTurns: 5
+    maxTurns: 5,
+    label: "synthesiser"
   });
-  console.log(`[majlis] Synthesiser complete (cost: $${costUsd.toFixed(4)})`);
-  const structured = await extractStructuredData("synthesiser", markdown);
-  return { output: markdown, structured };
+  console.log(`[synthesiser] Complete (cost: $${costUsd.toFixed(4)})`);
+  return { output: markdown, structured: { guidance: markdown } };
 }
 async function runQuery(opts) {
+  const tag = opts.label ?? "majlis";
   const conversation = (0, import_claude_agent_sdk2.query)({
     prompt: opts.prompt,
     options: {
@@ -2370,21 +2383,21 @@ async function runQuery(opts) {
           const toolName = block.name ?? "tool";
           const input = block.input ?? {};
           const detail = formatToolDetail(toolName, input);
-          process.stderr.write(`${DIM2}[majlis]   ${CYAN2}${toolName}${RESET2}${DIM2}${detail}${RESET2}
+          process.stderr.write(`${DIM2}[${tag}]   ${CYAN2}${toolName}${RESET2}${DIM2}${detail}${RESET2}
 `);
         }
       }
       if (hasText) {
         const preview = textParts[textParts.length - 1].slice(0, 120).replace(/\n/g, " ").trim();
         if (preview) {
-          process.stderr.write(`${DIM2}[majlis]   writing: ${preview}${preview.length >= 120 ? "..." : ""}${RESET2}
+          process.stderr.write(`${DIM2}[${tag}]   writing: ${preview}${preview.length >= 120 ? "..." : ""}${RESET2}
 `);
         }
       }
     } else if (message.type === "tool_progress") {
       const elapsed = Math.round(message.elapsed_time_seconds);
       if (elapsed > 0 && elapsed % 5 === 0) {
-        process.stderr.write(`${DIM2}[majlis]   ${message.tool_name} running (${elapsed}s)...${RESET2}
+        process.stderr.write(`${DIM2}[${tag}]   ${message.tool_name} running (${elapsed}s)...${RESET2}
 `);
       }
     } else if (message.type === "result") {
@@ -2392,7 +2405,7 @@ async function runQuery(opts) {
         costUsd = message.total_cost_usd;
       } else if (message.subtype === "error_max_turns") {
         costUsd = "total_cost_usd" in message ? message.total_cost_usd : 0;
-        console.warn(`[majlis] Agent hit max turns (${turnCount}). Returning partial output.`);
+        console.warn(`[${tag}] Hit max turns (${turnCount}). Returning partial output.`);
       } else {
         const errors = "errors" in message ? message.errors?.join("; ") ?? "Unknown error" : "Unknown error";
         throw new Error(`Agent query failed (${message.subtype}): ${errors}`);
@@ -2444,7 +2457,7 @@ function writeArtifact(role, context, markdown, projectRoot) {
   }
   const expSlug = context.experiment?.slug ?? "general";
   const existing = fs7.readdirSync(fullDir).filter((f) => f.endsWith(".md") && !f.startsWith("_"));
-  const nextNum = String(existing.length + 1).padStart(3, "0");
+  const nextNum = String(context.experiment?.id ?? existing.length + 1).padStart(3, "0");
   const filename = role === "builder" ? `${nextNum}-${expSlug}.md` : `${nextNum}-${role}-${expSlug}.md`;
   const target = path7.join(fullDir, filename);
   fs7.writeFileSync(target, markdown);
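
Note on the hunk above: the new `nextNum` expression depends on `+` binding tighter than `??`, so the fallback is `existing.length + 1` rather than `(id ?? existing.length) + 1`. The following is a minimal sketch of that behavior, not code from the package; `artifactNumber` and its parameter names are hypothetical stand-ins for `context.experiment?.id` and `existing.length`.

```javascript
// Hypothetical helper mirroring the new expression in writeArtifact.
// experimentId stands in for context.experiment?.id,
// existingCount stands in for existing.length.
function artifactNumber(experimentId, existingCount) {
  // `+` has higher precedence than `??`, so this parses as
  // experimentId ?? (existingCount + 1).
  return String(experimentId ?? existingCount + 1).padStart(3, "0");
}

console.log(artifactNumber(7, 2));         // prints 007: the experiment id wins
console.log(artifactNumber(undefined, 2)); // prints 003: falls back to count + 1
console.log(artifactNumber(0, 2));         // prints 000: 0 is not nullish, so it is kept
console.log(artifactNumber(1234, 2));      // prints 1234: padStart never truncates
```

One consequence worth noting: with the id-based prefix, every artifact for the same experiment shares one number, and the role segment of the filename is what keeps non-builder artifacts distinct.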
@@ -2460,13 +2473,13 @@ var init_spawn = __esm({
     init_parse();
     init_connection();
     ROLE_MAX_TURNS = {
-      builder: 15,
-      critic: 12,
-      adversary: 12,
-      verifier: 15,
-      compressor: 15,
-      reframer: 12,
-      scout: 12
+      builder: 50,
+      critic: 30,
+      adversary: 30,
+      verifier: 50,
+      compressor: 30,
+      reframer: 20,
+      scout: 20
     };
     DIM2 = "\x1B[2m";
     RESET2 = "\x1B[0m";
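
Note on the hunk above: the per-role limits are raised, but the lookup shown earlier in this diff (`ROLE_MAX_TURNS[role] ?? 15`) is unchanged, so a role missing from the table would still get 15 turns. A small sketch of that lookup, assuming the table is a plain object as shown; `maxTurnsFor` is a hypothetical wrapper name.

```javascript
// Values copied from the 0.4.3 side of the diff.
const ROLE_MAX_TURNS = {
  builder: 50,
  critic: 30,
  adversary: 30,
  verifier: 50,
  compressor: 30,
  reframer: 20,
  scout: 20
};

// Hypothetical wrapper around the lookup used when spawning an agent.
function maxTurnsFor(role) {
  // Roles absent from the table fall back to 15, as in the unchanged context line.
  return ROLE_MAX_TURNS[role] ?? 15;
}

console.log(maxTurnsFor("builder"));  // prints 50
console.log(maxTurnsFor("reframer")); // prints 20
console.log(maxTurnsFor("gatekeeper")); // prints 15 (made-up role, not in the table)
```

The synthesiser does not go through this table; its limit of 5 is passed directly in its own `runQuery` call.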
@@ -2482,9 +2495,19 @@ function worstGrade(grades) {
   return "sound";
 }
 async function resolve(db, exp, projectRoot) {
-  const grades = getVerificationsByExperiment(db, exp.id);
+  let grades = getVerificationsByExperiment(db, exp.id);
   if (grades.length === 0) {
-    throw new Error(`No verifications found for experiment ${exp.slug}. Run verify first.`);
+    warn(`No verification records for ${exp.slug}. Defaulting to weak.`);
+    insertVerification(
+      db,
+      exp.id,
+      "auto-default",
+      "weak",
+      null,
+      null,
+      "No structured verification output. Auto-defaulted to weak."
+    );
+    grades = getVerificationsByExperiment(db, exp.id);
   }
   const overallGrade = worstGrade(grades);
   switch (overallGrade) {
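
Note on the hunk above: `resolve` no longer throws when an experiment has no verification records. It records a synthetic "weak" verification and reads the records back. The sketch below models that control flow with an in-memory array instead of the package's database; the record field names (`experimentId`, `component`, `grade`, `notes`) and the helper name `gradesWithDefault` are assumptions made for illustration, since the diff does not show the real schema.

```javascript
// In-memory stand-in for the verification store (assumption: the real
// package reads and writes these rows through its own db helpers).
function gradesWithDefault(store, experimentId) {
  const forExperiment = () => store.filter((v) => v.experimentId === experimentId);

  let grades = forExperiment();
  if (grades.length === 0) {
    // Mirror the diff: persist an auto-default record, then re-read,
    // so later steps see the same data whether or not a verifier ran.
    store.push({
      experimentId,
      component: "auto-default",
      grade: "weak",
      notes: "No structured verification output. Auto-defaulted to weak."
    });
    grades = forExperiment();
  }
  return grades;
}

const store = [{ experimentId: 1, component: "core", grade: "sound", notes: null }];
console.log(gradesWithDefault(store, 1).map((g) => g.grade)); // prints [ 'sound' ]
console.log(gradesWithDefault(store, 2).map((g) => g.grade)); // prints [ 'weak' ]
console.log(store.length); // prints 2: the default was persisted
```

Writing the default into the store, rather than substituting it only in memory, means a second call for the same experiment finds the record and does not insert another one.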
@@ -3230,7 +3253,16 @@ async function run(args) {
       continue;
     }
     info(`[Step ${stepCount}] ${exp.slug}: ${exp.status}`);
-    await next([exp.slug], false);
+    try {
+      await next([exp.slug], false);
+    } catch (err) {
+      const message = err instanceof Error ? err.message : String(err);
+      warn(`Step failed for ${exp.slug}: ${message}`);
+      try {
+        updateExperimentStatus(db, exp.id, "dead_end");
+      } catch {
+      }
+    }
   }
   if (stepCount >= MAX_STEPS) {
     warn(`Reached max steps (${MAX_STEPS}). Stopping autonomous mode.`);
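
Note on the hunk above: a failure inside one step no longer aborts autonomous mode. The error is logged, the experiment is marked `dead_end`, and the loop moves on. A rough sketch of that pattern follows, with the database update replaced by a plain object; `runSteps`, `step`, and `markDeadEnd` are hypothetical names, not the package's API.

```javascript
// Runs each slug through an async step function, isolating failures
// so that one bad experiment cannot stop the remaining ones.
async function runSteps(slugs, step, markDeadEnd) {
  for (const slug of slugs) {
    try {
      await step(slug);
    } catch (err) {
      // Same normalization as the diff: non-Error throws are stringified.
      const message = err instanceof Error ? err.message : String(err);
      console.warn(`Step failed for ${slug}: ${message}`);
      try {
        markDeadEnd(slug);
      } catch {
        // As in the diff, a failure to record the status is swallowed.
      }
    }
  }
}

// Usage: the second experiment throws, the third still runs.
const statuses = {};
const visited = [];
runSteps(
  ["exp-a", "exp-b", "exp-c"],
  async (slug) => {
    visited.push(slug);
    if (slug === "exp-b") throw new Error("benchmark crashed");
  },
  (slug) => { statuses[slug] = "dead_end"; }
).then(() => {
  console.log(visited);  // prints [ 'exp-a', 'exp-b', 'exp-c' ]
  console.log(statuses); // prints { 'exp-b': 'dead_end' }
});
```

The trade-off is that any exception, including a transient one, retires the experiment as a dead end; the empty inner `catch` also means a failed status update goes unreported.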
@@ -3286,7 +3318,8 @@ ${deadEnds.map((d) => `- ${d.approach}: ${d.why_failed} [constraint: ${d.structu
 3. If NO \u2014 propose the SINGLE most promising next experiment hypothesis.
    - It must NOT repeat a dead-ended approach (check the dead-end registry!)
    - It should attack the weakest point revealed by synthesis/fragility
-   - It must be specific and actionable \u2014 name the exact code/function/mechanism to change
+   - It must be specific and actionable \u2014 name the function or mechanism to change
+   - Do NOT reference specific line numbers \u2014 they shift between experiments
    - The hypothesis should be a single sentence describing what to do, e.g.:
      "Activate addSeamEdges() in the runEdgeFirst pipeline for full-revolution cylinder faces"

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "majlis",
-  "version": "0.4.1",
+  "version": "0.4.3",
   "description": "Multi-agent workflow CLI for structured doubt, independent verification, and compressed knowledge",
   "bin": {
     "majlis": "./dist/cli.js"