moltblock 0.5.0 → 0.6.0
- package/{readme.md → README.md} +82 -0
- package/dist/agents.d.ts +4 -4
- package/dist/agents.js +30 -16
- package/dist/code-verifier.d.ts +13 -0
- package/dist/code-verifier.js +21 -0
- package/dist/composite-verifier.d.ts +21 -0
- package/dist/composite-verifier.js +42 -0
- package/dist/config.d.ts +40 -0
- package/dist/config.js +20 -0
- package/dist/domain-prompts.d.ts +21 -0
- package/dist/domain-prompts.js +33 -0
- package/dist/entity-base.d.ts +37 -0
- package/dist/entity-base.js +87 -0
- package/dist/graph-runner.d.ts +11 -1
- package/dist/graph-runner.js +15 -4
- package/dist/improvement.d.ts +1 -1
- package/dist/improvement.js +21 -9
- package/dist/index.d.ts +10 -3
- package/dist/index.js +16 -4
- package/dist/policy-verifier.d.ts +29 -0
- package/dist/policy-verifier.js +90 -0
- package/dist/risk.d.ts +13 -0
- package/dist/risk.js +63 -0
- package/dist/verifier-interface.d.ts +24 -0
- package/dist/verifier-interface.js +4 -0
- package/package.json +3 -2
- package/skill/SKILL.md +103 -0
package/{readme.md → README.md}
RENAMED
@@ -208,6 +208,87 @@ npm test
 - **Molt and governance** — `GovernanceConfig` (rate limit, veto); `canMolt()`, `triggerMolt()`, `pause()`, `resume()`, `emergencyShutdown()`; audit log and governance state in `Store`.
 - **Multi-entity handoff** — `signArtifact()` / `verifyArtifact()`; inbox per entity; `sendArtifact()`, `receiveArtifacts()` for Entity A → Entity B.

+### New in v0.6
+
+- **Pluggable verifier system** — `Verifier` interface so verification isn't limited to vitest. Implement `verify(memory, context)` to plug in any gating strategy.
+- **PolicyVerifier** — Rule-based verifier with ~20 built-in deny rules. Catches destructive commands (`rm -rf`, `DROP TABLE`), sensitive file access (`.ssh/`, `/etc/shadow`), hardcoded secrets, and exfiltration patterns — all without an LLM call.
+- **CodeVerifier** — Adapter wrapping the existing vitest verifier into the pluggable interface.
+- **CompositeVerifier** — Chains multiple verifiers (e.g. policy + code); all must pass. Supports fail-fast and collect-all modes.
+- **Generic Entity** — `Entity` class with pluggable verifier and domain-aware prompts. Use `new Entity({ domain: "general" })` for non-code tasks.
+- **Domain prompts** — Registry mapping domains to role-specific system prompts. Built-in `"code"` and `"general"` domains; register custom domains with `registerDomain()`.
+- **Risk classification** — `classifyRisk(task)` returns `"low"` / `"medium"` / `"high"` with reasons. Pure regex matching, no LLM needed.
+- **Policy rules in config** — Add custom `policy.rules` to `moltblock.json` for project-specific allow/deny rules.
+- **OpenClaw skill** — `skill/SKILL.md` for one-step installation into OpenClaw workspace.
+
+---
+
+## Policy Verifier
+
+The `PolicyVerifier` catches dangerous patterns in artifacts and tasks without needing an LLM:
+
+```typescript
+import { PolicyVerifier, WorkingMemory } from "moltblock";
+
+const verifier = new PolicyVerifier();
+const memory = new WorkingMemory();
+memory.setFinalCandidate("rm -rf /");
+
+const result = await verifier.verify(memory);
+// result.passed === false
+// result.evidence includes "[cmd-rm-rf] Recursive force delete"
+```
+
+Custom rules can be added via constructor or config:
+
+```typescript
+const verifier = new PolicyVerifier([
+  {
+    id: "allow-tmp-cleanup",
+    description: "Allow cleanup in /tmp",
+    target: "artifact",
+    pattern: "\\/tmp\\/",
+    action: "allow",
+    category: "destructive-cmd",
+    enabled: true,
+  },
+]);
+```
+
+---
+
+## Risk Classification
+
+Classify task risk before deciding whether to verify:
+
+```typescript
+import { classifyRisk } from "moltblock";
+
+classifyRisk("write a hello world function");
+// { level: "low", reasons: [] }
+
+classifyRisk("sudo rm -rf /home/user");
+// { level: "high", reasons: ["Sudo privilege escalation", "Recursive file deletion (rm -rf)"] }
+```
+
+---
+
+## Generic Entity
+
+For non-code tasks, use the generic `Entity` with domain-aware prompts:
+
+```typescript
+import { Entity, PolicyVerifier, CompositeVerifier, CodeVerifier } from "moltblock";
+
+// General-purpose entity (policy verification only)
+const entity = new Entity({ domain: "general" });
+
+// Code entity with both policy and vitest verification
+const codeEntity = new Entity({
+  domain: "code",
+  verifier: new CompositeVerifier([new PolicyVerifier(), new CodeVerifier()]),
+});
+```
+
 ---

 ## Roadmap
@@ -215,6 +296,7 @@ npm test
 - v0.1 — Protocol + architecture
 - v0.2 — MVP Entity implementation (spec + Code Entity loop + graph, memory, improvement, governance, handoff)
 - v0.3 — Multi-Entity collaboration (orchestration and tooling)
+- v0.6 — Pluggable verification, policy rules, generic entity, risk classification, OpenClaw skill

 ---

package/dist/agents.d.ts
CHANGED
@@ -7,17 +7,17 @@ import { Store } from "./persistence.js";
 /**
  * Generator: task -> draft artifact (code).
  */
-export declare function runGenerator(gateway: LLMGateway, memory: WorkingMemory, store?: Store | null): Promise<void>;
+export declare function runGenerator(gateway: LLMGateway, memory: WorkingMemory, store?: Store | null, domain?: string): Promise<void>;
 /**
  * Critic: draft + task -> critique.
  */
-export declare function runCritic(gateway: LLMGateway, memory: WorkingMemory, store?: Store | null): Promise<void>;
+export declare function runCritic(gateway: LLMGateway, memory: WorkingMemory, store?: Store | null, domain?: string): Promise<void>;
 /**
  * Judge: task + draft + critique -> final candidate artifact.
  */
-export declare function runJudge(gateway: LLMGateway, memory: WorkingMemory, store?: Store | null): Promise<void>;
+export declare function runJudge(gateway: LLMGateway, memory: WorkingMemory, store?: Store | null, domain?: string): Promise<void>;
 /**
  * Run a single role with task and inputs (node_id -> content from predecessors).
  * Returns the role's output string. Used by the graph runner.
  */
-export declare function runRole(role: string, gateway: LLMGateway, task: string, inputs: Record<string, string>, longTermContext?: string, store?: Store | null): Promise<string>;
+export declare function runRole(role: string, gateway: LLMGateway, task: string, inputs: Record<string, string>, longTermContext?: string, store?: Store | null, domain?: string): Promise<string>;
package/dist/agents.js
CHANGED
@@ -1,35 +1,49 @@
 /**
  * Agents: Generator, Critic, Judge. Each uses LLMGateway and reads/writes WorkingMemory.
  */
+import { getDomainPrompts } from "./domain-prompts.js";
 import { getStrategy } from "./persistence.js";
 // Default prompts; can be overridden by strategy store (recursive improvement)
 // Note: Prompts updated to produce TypeScript instead of Python
 const CODE_GENERATOR_SYSTEM = `You are the Generator for a Code Entity. You produce a single TypeScript implementation that satisfies the user's task. Output only valid TypeScript code, no markdown fences or extra commentary. The code will be reviewed by a Critic and then verified by running tests.`;
 const CODE_CRITIC_SYSTEM = `You are the Critic. Review the draft code for bugs, edge cases, and style. Be concise. List specific issues and suggestions. Do not rewrite the code; only critique.`;
 const CODE_JUDGE_SYSTEM = `You are the Judge. Given the task, the draft code, and the critique, produce the final single TypeScript implementation. Output only valid TypeScript code, no markdown fences or extra commentary. Incorporate the critic's feedback. The result will be run through vitest.`;
-function systemPrompt(role, store) {
+function systemPrompt(role, store, domain = "code") {
     if (store) {
         const s = getStrategy(store, role);
         if (s) {
             return s;
         }
     }
-
-
-
-
+    // Hard-coded defaults for "code" domain (backward compat)
+    if (domain === "code") {
+        const defaults = {
+            generator: CODE_GENERATOR_SYSTEM,
+            critic: CODE_CRITIC_SYSTEM,
+            judge: CODE_JUDGE_SYSTEM,
+        };
+        const d = defaults[role];
+        if (d)
+            return d;
+    }
+    // Fall back to domain prompt registry
+    const prompts = getDomainPrompts(domain);
+    const roleMap = {
+        generator: prompts.generator,
+        critic: prompts.critic,
+        judge: prompts.judge,
     };
-    return
+    return roleMap[role] ?? prompts.generator;
 }
 /**
  * Generator: task -> draft artifact (code).
  */
-export async function runGenerator(gateway, memory, store = null) {
+export async function runGenerator(gateway, memory, store = null, domain = "code") {
     let userContent = memory.task;
     if (memory.longTermContext) {
         userContent = userContent + "\n\nRelevant verified knowledge:\n" + memory.longTermContext;
     }
-    const system = systemPrompt("generator", store);
+    const system = systemPrompt("generator", store, domain);
     const messages = [
         { role: "system", content: system },
         { role: "user", content: userContent },
@@ -40,8 +54,8 @@ export async function runGenerator(gateway, memory, store = null) {
 /**
  * Critic: draft + task -> critique.
  */
-export async function runCritic(gateway, memory, store = null) {
-    const system = systemPrompt("critic", store);
+export async function runCritic(gateway, memory, store = null, domain = "code") {
+    const system = systemPrompt("critic", store, domain);
     const messages = [
         { role: "system", content: system },
         { role: "user", content: `Task:\n${memory.task}\n\nDraft code:\n${memory.draft}` },
@@ -52,8 +66,8 @@ export async function runCritic(gateway, memory, store = null) {
 /**
  * Judge: task + draft + critique -> final candidate artifact.
  */
-export async function runJudge(gateway, memory, store = null) {
-    const system = systemPrompt("judge", store);
+export async function runJudge(gateway, memory, store = null, domain = "code") {
+    const system = systemPrompt("judge", store, domain);
     const messages = [
         { role: "system", content: system },
         {
@@ -68,13 +82,13 @@ export async function runJudge(gateway, memory, store = null) {
  * Run a single role with task and inputs (node_id -> content from predecessors).
  * Returns the role's output string. Used by the graph runner.
  */
-export async function runRole(role, gateway, task, inputs, longTermContext = "", store = null) {
+export async function runRole(role, gateway, task, inputs, longTermContext = "", store = null, domain = "code") {
     let userContent = task;
     if (longTermContext) {
         userContent = task + "\n\nRelevant verified knowledge:\n" + longTermContext;
     }
     if (role === "generator") {
-        const system = systemPrompt("generator", store);
+        const system = systemPrompt("generator", store, domain);
         const messages = [
             { role: "system", content: system },
             { role: "user", content: userContent },
@@ -87,7 +101,7 @@ export async function runRole(role, gateway, task, inputs, longTermContext = "",
         if (longTermContext) {
             content = content + "\n\nRelevant verified knowledge:\n" + longTermContext;
         }
-        const system = systemPrompt("critic", store);
+        const system = systemPrompt("critic", store, domain);
         const messages = [
             { role: "system", content: system },
             { role: "user", content: content },
@@ -101,7 +115,7 @@ export async function runRole(role, gateway, task, inputs, longTermContext = "",
         if (longTermContext) {
             content = content + "\n\nRelevant verified knowledge:\n" + longTermContext;
         }
-        const system = systemPrompt("judge", store);
+        const system = systemPrompt("judge", store, domain);
         const messages = [
             { role: "system", content: system },
             { role: "user", content: content },
package/dist/code-verifier.d.ts
ADDED
@@ -0,0 +1,13 @@
+/**
+ * CodeVerifier: adapter that wraps the existing vitest-based runVerifier into the Verifier interface.
+ */
+import type { WorkingMemory } from "./memory.js";
+import type { Verifier, VerificationResult, VerifierContext } from "./verifier-interface.js";
+/**
+ * Wraps the existing vitest verifier (runVerifier) into the pluggable Verifier interface.
+ * Uses context.testCode for the test file, same as CodeEntity.
+ */
+export declare class CodeVerifier implements Verifier {
+    readonly name = "CodeVerifier";
+    verify(memory: WorkingMemory, context?: VerifierContext): Promise<VerificationResult>;
+}
package/dist/code-verifier.js
ADDED
@@ -0,0 +1,21 @@
+/**
+ * CodeVerifier: adapter that wraps the existing vitest-based runVerifier into the Verifier interface.
+ */
+import { runVerifier } from "./verifier.js";
+/**
+ * Wraps the existing vitest verifier (runVerifier) into the pluggable Verifier interface.
+ * Uses context.testCode for the test file, same as CodeEntity.
+ */
+export class CodeVerifier {
+    name = "CodeVerifier";
+    async verify(memory, context) {
+        const testCode = context?.testCode;
+        // runVerifier mutates memory.verificationPassed / verificationEvidence
+        await runVerifier(memory, testCode);
+        return {
+            passed: memory.verificationPassed,
+            evidence: memory.verificationEvidence || (memory.verificationPassed ? "Verification passed." : "Verification failed."),
+            verifierName: this.name,
+        };
+    }
+}
package/dist/composite-verifier.d.ts
ADDED
@@ -0,0 +1,21 @@
+/**
+ * CompositeVerifier: runs multiple verifiers, all must pass.
+ */
+import type { WorkingMemory } from "./memory.js";
+import type { Verifier, VerificationResult, VerifierContext } from "./verifier-interface.js";
+export interface CompositeVerifierOptions {
+    /** If true, stop at first failure. Defaults to true. */
+    failFast?: boolean;
+}
+/**
+ * Runs verifiers sequentially. All must pass for the composite to pass.
+ * Fail-fast mode (default) stops at the first failure.
+ * Collect-all mode runs every verifier and reports all results.
+ */
+export declare class CompositeVerifier implements Verifier {
+    readonly name = "CompositeVerifier";
+    private verifiers;
+    private failFast;
+    constructor(verifiers: Verifier[], options?: CompositeVerifierOptions);
+    verify(memory: WorkingMemory, context?: VerifierContext): Promise<VerificationResult>;
+}
package/dist/composite-verifier.js
ADDED
@@ -0,0 +1,42 @@
+/**
+ * CompositeVerifier: runs multiple verifiers, all must pass.
+ */
+/**
+ * Runs verifiers sequentially. All must pass for the composite to pass.
+ * Fail-fast mode (default) stops at the first failure.
+ * Collect-all mode runs every verifier and reports all results.
+ */
+export class CompositeVerifier {
+    name = "CompositeVerifier";
+    verifiers;
+    failFast;
+    constructor(verifiers, options) {
+        if (verifiers.length === 0) {
+            throw new Error("CompositeVerifier requires at least one verifier");
+        }
+        this.verifiers = verifiers;
+        this.failFast = options?.failFast ?? true;
+    }
+    async verify(memory, context) {
+        const details = [];
+        let allPassed = true;
+        for (const verifier of this.verifiers) {
+            const result = await verifier.verify(memory, context);
+            details.push(result);
+            if (!result.passed) {
+                allPassed = false;
+                if (this.failFast)
+                    break;
+            }
+        }
+        const evidence = details
+            .map((d) => `[${d.verifierName}] ${d.passed ? "PASS" : "FAIL"}: ${d.evidence}`)
+            .join("\n");
+        return {
+            passed: allPassed,
+            evidence,
+            verifierName: this.name,
+            details,
+        };
+    }
+}
|
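The verifier contract visible in the declarations above is small: a `name`, plus an async `verify(memory, context)` that resolves to `{ passed, evidence, verifierName }`. A minimal sketch of a custom verifier might look like the following. The interfaces are local stand-ins reconstructed from this diff, not imports from `moltblock`; `MaxLengthVerifier`, `MemoryLike`, and its `finalCandidate` field are hypothetical names used only for illustration, and the `VerifierContext` fields are inferred from the `ctx` object built in `entity-base.js`.

```typescript
// Local stand-ins for the shapes seen in this diff (assumptions, not the package's own types).
interface VerificationResult {
  passed: boolean;
  evidence: string;
  verifierName: string;
  details?: VerificationResult[];
}
interface VerifierContext {
  task?: string;
  testCode?: string;
  domain?: string;
}
// Hypothetical minimal view of working memory: only the text under review.
interface MemoryLike {
  finalCandidate: string;
}
interface Verifier {
  readonly name: string;
  verify(memory: MemoryLike, context?: VerifierContext): Promise<VerificationResult>;
}

/** Hypothetical verifier: fails when the candidate exceeds a character budget. */
class MaxLengthVerifier implements Verifier {
  readonly name = "MaxLengthVerifier";
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Synchronous core, kept separate so the rule is easy to unit-test.
  check(text: string): VerificationResult {
    const passed = text.length <= this.limit;
    return {
      passed,
      evidence: passed
        ? `Length ${text.length} within limit ${this.limit}.`
        : `Length ${text.length} exceeds limit ${this.limit}.`,
      verifierName: this.name,
    };
  }

  async verify(memory: MemoryLike, _context?: VerifierContext): Promise<VerificationResult> {
    return this.check(memory.finalCandidate);
  }
}

/** Joins per-verifier results in the same "[name] PASS|FAIL: evidence" format as composite-verifier.js. */
function summarize(details: VerificationResult[]): string {
  return details
    .map((d) => `[${d.verifierName}] ${d.passed ? "PASS" : "FAIL"}: ${d.evidence}`)
    .join("\n");
}
```

In a real project, a class shaped like this could presumably be chained with the shipped verifiers, for example by passing `{ failFast: false }` as the second `CompositeVerifier` constructor argument so that every failure is collected rather than only the first.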
package/dist/config.d.ts
CHANGED
@@ -19,6 +19,23 @@ export declare const AgentConfigSchema: z.ZodObject<{
     }, z.core.$strip>>>;
 }, z.core.$strip>;
 export type AgentConfig = z.infer<typeof AgentConfigSchema>;
+export declare const PolicyRuleSchema: z.ZodObject<{
+    id: z.ZodString;
+    description: z.ZodString;
+    target: z.ZodEnum<{
+        artifact: "artifact";
+        task: "task";
+        both: "both";
+    }>;
+    pattern: z.ZodString;
+    action: z.ZodEnum<{
+        deny: "deny";
+        allow: "allow";
+    }>;
+    category: z.ZodString;
+    enabled: z.ZodDefault<z.ZodBoolean>;
+}, z.core.$strip>;
+export type PolicyRuleConfig = z.infer<typeof PolicyRuleSchema>;
 export declare const MoltblockConfigSchema: z.ZodObject<{
     agent: z.ZodOptional<z.ZodObject<{
         bindings: z.ZodOptional<z.ZodRecord<z.ZodString, z.ZodObject<{
@@ -28,6 +45,24 @@ export declare const MoltblockConfigSchema: z.ZodObject<{
             api_key: z.ZodOptional<z.ZodNullable<z.ZodString>>;
         }, z.core.$strip>>>;
     }, z.core.$strip>>;
+    policy: z.ZodOptional<z.ZodObject<{
+        rules: z.ZodOptional<z.ZodArray<z.ZodObject<{
+            id: z.ZodString;
+            description: z.ZodString;
+            target: z.ZodEnum<{
+                artifact: "artifact";
+                task: "task";
+                both: "both";
+            }>;
+            pattern: z.ZodString;
+            action: z.ZodEnum<{
+                deny: "deny";
+                allow: "allow";
+            }>;
+            category: z.ZodString;
+            enabled: z.ZodDefault<z.ZodBoolean>;
+        }, z.core.$strip>>>;
+    }, z.core.$strip>>;
 }, z.core.$strip>;
 export type MoltblockConfig = z.infer<typeof MoltblockConfigSchema>;
 export declare const ModelBindingSchema: z.ZodObject<{
@@ -68,3 +103,8 @@ export declare function detectProvider(overrideProvider?: string, overrideModel?
  * If no JSON, auto-detects provider from env vars. API keys from env win over JSON.
  */
 export declare function defaultCodeEntityBindings(overrides?: BindingOverrides): Record<string, ModelBinding>;
+/**
+ * Load custom policy rules from moltblock config.
+ * Returns empty array if no config or no rules defined.
+ */
+export declare function loadPolicyRules(): PolicyRuleConfig[];
package/dist/config.js
CHANGED
@@ -40,8 +40,20 @@ export const BindingEntrySchema = z.object({
 export const AgentConfigSchema = z.object({
     bindings: z.record(z.string(), BindingEntrySchema).optional().describe("Per-role model bindings"),
 });
+export const PolicyRuleSchema = z.object({
+    id: z.string().describe("Unique rule identifier"),
+    description: z.string().describe("Human-readable rule description"),
+    target: z.enum(["artifact", "task", "both"]).describe("What to match against"),
+    pattern: z.string().describe("Regex pattern string"),
+    action: z.enum(["deny", "allow"]).describe("deny blocks; allow overrides deny in same category"),
+    category: z.string().describe("Rule category for allow/deny grouping"),
+    enabled: z.boolean().default(true).describe("Whether the rule is active"),
+});
 export const MoltblockConfigSchema = z.object({
     agent: AgentConfigSchema.optional().describe("Agent defaults and bindings"),
+    policy: z.object({
+        rules: z.array(PolicyRuleSchema).optional().describe("Custom policy rules"),
+    }).optional().describe("Policy verifier configuration"),
 });
 export const ModelBindingSchema = z.object({
     backend: z.string().describe("e.g. 'local' or 'zai' or 'openai'"),
@@ -365,3 +377,11 @@ export function defaultCodeEntityBindings(overrides) {
         verifier: bindingFor("verifier"),
     };
 }
+/**
+ * Load custom policy rules from moltblock config.
+ * Returns empty array if no config or no rules defined.
+ */
+export function loadPolicyRules() {
+    const cfg = loadMoltblockConfig();
+    return cfg?.policy?.rules ?? [];
+}
|
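The schema above describes the `action` field as "deny blocks; allow overrides deny in same category", but the evaluation code in `policy-verifier.js` is not part of this portion of the diff. The sketch below shows one way those semantics could work, assuming a matching `allow` rule cancels every `deny` rule that shares its category. `PolicyRule`, `ruleMatches`, and `evaluateRules` are hypothetical names, the `rm\s+-rf` pattern is an illustrative guess (only the `cmd-rm-rf` id and its description appear in the README), and compiling patterns without regex flags is also an assumption.

```typescript
// Local mirror of the fields in PolicyRuleSchema; not imported from moltblock.
type PolicyRule = {
  id: string;
  description: string;
  target: "artifact" | "task" | "both";
  pattern: string; // regex source string
  action: "deny" | "allow";
  category: string;
  enabled: boolean;
};

// True when the rule's pattern matches the text(s) selected by its target.
function ruleMatches(rule: PolicyRule, artifact: string, task: string): boolean {
  const re = new RegExp(rule.pattern); // assumption: no flags
  const hitsArtifact = rule.target !== "task" && re.test(artifact);
  const hitsTask = rule.target !== "artifact" && re.test(task);
  return hitsArtifact || hitsTask;
}

// Assumed semantics: a matching allow rule cancels deny rules in the same category.
function evaluateRules(
  rules: PolicyRule[],
  artifact: string,
  task = "",
): { passed: boolean; violations: string[] } {
  const active = rules.filter((r) => r.enabled);
  const allowedCategories = new Set(
    active
      .filter((r) => r.action === "allow" && ruleMatches(r, artifact, task))
      .map((r) => r.category),
  );
  const violations = active
    .filter(
      (r) =>
        r.action === "deny" &&
        !allowedCategories.has(r.category) &&
        ruleMatches(r, artifact, task),
    )
    .map((r) => `[${r.id}] ${r.description}`);
  return { passed: violations.length === 0, violations };
}

const rules: PolicyRule[] = [
  { id: "cmd-rm-rf", description: "Recursive force delete", target: "both", pattern: "rm\\s+-rf", action: "deny", category: "destructive-cmd", enabled: true },
  { id: "allow-tmp-cleanup", description: "Allow cleanup in /tmp", target: "artifact", pattern: "\\/tmp\\/", action: "allow", category: "destructive-cmd", enabled: true },
];

const blocked = evaluateRules(rules, "rm -rf /");
// blocked.passed === false, blocked.violations[0] === "[cmd-rm-rf] Recursive force delete"
const allowed = evaluateRules(rules, "rm -rf /tmp/build-cache");
// allowed.passed === true, because the /tmp/ allow rule shares the "destructive-cmd" category
```

Under these assumed semantics an allow rule is only as precise as its pattern: any artifact that merely mentions `/tmp/` would switch off every deny rule in the category, so narrowly anchored allow patterns are the safer choice.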
package/dist/domain-prompts.d.ts
ADDED
@@ -0,0 +1,21 @@
+/**
+ * Domain prompt registry: maps domain names to role-specific system prompts.
+ */
+/** Prompt set for a single domain: one system prompt per agent role. */
+export interface DomainPrompts {
+    generator: string;
+    critic: string;
+    judge: string;
+}
+/**
+ * Get prompts for a domain. Falls back to "general" if domain is unknown.
+ */
+export declare function getDomainPrompts(domain: string): DomainPrompts;
+/**
+ * Register a custom domain with its prompts. Overwrites if already exists.
+ */
+export declare function registerDomain(domain: string, prompts: DomainPrompts): void;
+/**
+ * List all registered domain names.
+ */
+export declare function listDomains(): string[];
package/dist/domain-prompts.js
ADDED
@@ -0,0 +1,33 @@
+/**
+ * Domain prompt registry: maps domain names to role-specific system prompts.
+ */
+const registry = new Map();
+// --- Built-in domains ---
+registry.set("code", {
+    generator: "You are the Generator for a Code Entity. You produce a single TypeScript implementation that satisfies the user's task. Output only valid TypeScript code, no markdown fences or extra commentary. The code will be reviewed by a Critic and then verified by running tests.",
+    critic: "You are the Critic. Review the draft code for bugs, edge cases, and style. Be concise. List specific issues and suggestions. Do not rewrite the code; only critique.",
+    judge: "You are the Judge. Given the task, the draft code, and the critique, produce the final single TypeScript implementation. Output only valid TypeScript code, no markdown fences or extra commentary. Incorporate the critic's feedback. The result will be run through vitest.",
+});
+registry.set("general", {
+    generator: "You are the Generator. Produce a clear, complete response that satisfies the user's task. Focus on accuracy and completeness. Your output will be reviewed by a Critic.",
+    critic: "You are the Critic. Review the draft response for factual errors, gaps, unclear reasoning, and potential risks. Be concise. List specific issues and suggestions. Do not rewrite the response; only critique.",
+    judge: "You are the Judge. Given the task, the draft response, and the critique, produce the final response. Incorporate the critic's feedback. Ensure the result is accurate, safe, and complete.",
+});
+/**
+ * Get prompts for a domain. Falls back to "general" if domain is unknown.
+ */
+export function getDomainPrompts(domain) {
+    return registry.get(domain) ?? registry.get("general");
+}
+/**
+ * Register a custom domain with its prompts. Overwrites if already exists.
+ */
+export function registerDomain(domain, prompts) {
+    registry.set(domain, prompts);
+}
+/**
+ * List all registered domain names.
+ */
+export function listDomains() {
+    return [...registry.keys()];
+}
|
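Read together, `agents.js` and `domain-prompts.js` appear to resolve a system prompt in four steps: a stored strategy wins, then the hard-coded code prompts (only when the domain is `"code"`), then the registry entry for the domain (falling back to `"general"`), and finally the generator prompt for an unrecognised role. The sketch below re-creates that order with shortened placeholder prompts so the precedence can be checked in isolation. `resolveSystemPrompt`, `CODE_DEFAULTS`, and the `strategyOverride` parameter are hypothetical stand-ins (the latter for whatever `getStrategy(store, role)` returns), not the package's API.

```typescript
type DomainPrompts = { generator: string; critic: string; judge: string };

// Local stand-in for the registry in domain-prompts.js, with shortened prompts.
const registry = new Map<string, DomainPrompts>([
  ["code", { generator: "code-gen", critic: "code-critic", judge: "code-judge" }],
  ["general", { generator: "general-gen", critic: "general-critic", judge: "general-judge" }],
]);

// Stand-in for the CODE_*_SYSTEM constants hard-coded in agents.js.
const CODE_DEFAULTS: Record<string, string | undefined> = {
  generator: "builtin-code-gen",
  critic: "builtin-code-critic",
  judge: "builtin-code-judge",
};

function getDomainPrompts(domain: string): DomainPrompts {
  const general = registry.get("general");
  if (!general) throw new Error('"general" domain missing from registry');
  return registry.get(domain) ?? general;
}

function resolveSystemPrompt(role: string, domain = "code", strategyOverride?: string): string {
  if (strategyOverride) return strategyOverride; // 1. stored strategy wins
  if (domain === "code") {
    const builtin = CODE_DEFAULTS[role]; // 2. hard-coded prompts for the code domain
    if (builtin) return builtin;
  }
  const prompts = getDomainPrompts(domain); // 3. registry, falling back to "general"
  const byRole: Record<string, string | undefined> = {
    generator: prompts.generator,
    critic: prompts.critic,
    judge: prompts.judge,
  };
  return byRole[role] ?? prompts.generator; // 4. unknown role -> generator prompt
}

// Re-registering "code" does not change the three standard roles in this model.
registry.set("code", { generator: "custom-gen", critic: "custom-critic", judge: "custom-judge" });

const a = resolveSystemPrompt("generator", "code"); // "builtin-code-gen"
const b = resolveSystemPrompt("critic", "legal"); // "general-critic" (unknown domain)
const c = resolveSystemPrompt("verifier", "code"); // "custom-gen" (unknown role)
const d = resolveSystemPrompt("judge", "general", "stored"); // "stored"
```

If the package behaves the way this model suggests, customising prompts for code tasks would mean registering a differently named domain (or using the strategy store) rather than overwriting `"code"`.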
package/dist/entity-base.d.ts
ADDED
@@ -0,0 +1,37 @@
+/**
+ * Generic Entity: pluggable verifier and domain support for any task type.
+ * CodeEntity remains unchanged for backward compatibility.
+ */
+import { type ModelBinding } from "./config.js";
+import { WorkingMemory } from "./memory.js";
+import { Store } from "./persistence.js";
+import type { Verifier } from "./verifier-interface.js";
+/** Options for constructing a generic Entity. */
+export interface EntityOptions {
+    /** Verifier to gate artifacts. Defaults to PolicyVerifier. */
+    verifier?: Verifier;
+    /** Domain for agent prompts. Defaults to "general". */
+    domain?: string;
+    /** Per-role model bindings. Auto-detected if omitted. */
+    bindings?: Record<string, ModelBinding>;
+}
+/**
+ * Generic Entity: same Generator -> Critic -> Judge pipeline as CodeEntity,
+ * but with a pluggable verifier and domain-aware prompts.
+ */
+export declare class Entity {
+    private gateways;
+    private verifier;
+    private domain;
+    constructor(options?: EntityOptions);
+    /**
+     * One full loop: task -> Generator -> Critic -> Judge -> Verifier -> gating.
+     * Returns working memory with authoritative_artifact set only if verification passed.
+     */
+    run(task: string, options?: {
+        testCode?: string;
+        store?: Store;
+        entityVersion?: string;
+        writeCheckpointAfter?: boolean;
+    }): Promise<WorkingMemory>;
+}
package/dist/entity-base.js
ADDED
@@ -0,0 +1,87 @@
+/**
+ * Generic Entity: pluggable verifier and domain support for any task type.
+ * CodeEntity remains unchanged for backward compatibility.
+ */
+import { runCritic, runGenerator, runJudge } from "./agents.js";
+import { defaultCodeEntityBindings } from "./config.js";
+import { LLMGateway } from "./gateway.js";
+import { WorkingMemory } from "./memory.js";
+import { hashMemory, recordOutcome } from "./persistence.js";
+import { PolicyVerifier } from "./policy-verifier.js";
+import { validateTask } from "./validation.js";
+/**
+ * Generic Entity: same Generator -> Critic -> Judge pipeline as CodeEntity,
+ * but with a pluggable verifier and domain-aware prompts.
+ */
+export class Entity {
+    gateways;
+    verifier;
+    domain;
+    constructor(options) {
+        const resolvedBindings = options?.bindings ?? defaultCodeEntityBindings();
+        this.verifier = options?.verifier ?? new PolicyVerifier();
+        this.domain = options?.domain ?? "general";
+        this.gateways = {
+            generator: new LLMGateway(resolvedBindings["generator"]),
+            critic: new LLMGateway(resolvedBindings["critic"]),
+            judge: new LLMGateway(resolvedBindings["judge"]),
+        };
+    }
+    /**
+     * One full loop: task -> Generator -> Critic -> Judge -> Verifier -> gating.
+     * Returns working memory with authoritative_artifact set only if verification passed.
+     */
+    async run(task, options = {}) {
+        const { testCode, store, entityVersion = "0.5.0", writeCheckpointAfter = false, } = options;
+        // Validate input
+        const taskValidation = validateTask(task);
+        if (!taskValidation.valid) {
+            throw new Error(`Invalid task: ${taskValidation.error}`);
+        }
+        const t0 = performance.now();
+        const memory = new WorkingMemory();
+        memory.setTask(task);
+        // Inject long-term context from verified memory
+        if (store) {
+            const recent = store.getRecentVerified(5);
+            const parts = [];
+            for (const e of recent) {
+                if (e.content_preview) {
+                    parts.push(e.content_preview.slice(0, 500));
+                }
+                else if (e.summary) {
+                    parts.push(e.summary);
+                }
+            }
+            memory.longTermContext = parts.length > 0 ? parts.join("\n---\n") : "";
+        }
+        // Run the agent pipeline with domain-aware prompts
+        await runGenerator(this.gateways["generator"], memory, store ?? null, this.domain);
+        await runCritic(this.gateways["critic"], memory, store ?? null, this.domain);
+        await runJudge(this.gateways["judge"], memory, store ?? null, this.domain);
+        // Run pluggable verifier
+        const ctx = {
+            task,
+            testCode,
+            domain: this.domain,
+        };
+        const result = await this.verifier.verify(memory, ctx);
+        memory.setVerification(result.passed, result.evidence);
+        // Record outcome and persist if verification passed
+        const latencySec = (performance.now() - t0) / 1000;
+        if (store) {
+            recordOutcome(store, memory.verificationPassed, latencySec, task.slice(0, 100));
+        }
+        if (store && memory.verificationPassed && memory.authoritativeArtifact) {
+            const artifactRef = `artifact_${Date.now()}`;
+            store.addVerified(artifactRef, `Verified artifact (${memory.authoritativeArtifact.length} chars)`, memory.authoritativeArtifact.slice(0, 2000));
+            if (writeCheckpointAfter) {
+                const graphHash = `entity-${this.domain}`;
+                const refs = [artifactRef];
+                const memHash = hashMemory(refs);
+                store.writeCheckpoint(entityVersion, graphHash, memHash, refs);
+            }
+        }
+        return memory;
+    }
+}
|
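Before the Generator runs, `Entity.run()` folds recent verified artifacts into a single long-term context string: each entry contributes its `content_preview` (capped at 500 characters) or, failing that, its `summary`, and the parts are joined with a `---` separator line. The sketch below isolates that step as a pure function so the truncation and fallback behaviour can be exercised without a `Store`. `buildLongTermContext` and the `VerifiedEntry` shape are hypothetical; only the two fields read in the diff are modelled.

```typescript
// Hypothetical shape of a verified-memory entry; only the fields read by run() are modelled.
interface VerifiedEntry {
  content_preview?: string | null;
  summary?: string | null;
}

function buildLongTermContext(recent: VerifiedEntry[], maxPreviewChars = 500): string {
  const parts: string[] = [];
  for (const entry of recent) {
    if (entry.content_preview) {
      // Prefer the stored preview, truncated to keep the prompt small.
      parts.push(entry.content_preview.slice(0, maxPreviewChars));
    } else if (entry.summary) {
      // Fall back to the summary when no preview was stored.
      parts.push(entry.summary);
    }
    // Entries with neither field contribute nothing.
  }
  return parts.join("\n---\n");
}

const context = buildLongTermContext([
  { content_preview: "x".repeat(600) },
  { summary: "Verified artifact (120 chars)" },
  {},
]);
```

Joining an empty list already yields an empty string, so the explicit `parts.length > 0` check in the original is equivalent to this shorter form.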
package/dist/graph-runner.d.ts
CHANGED
|
@@ -5,6 +5,14 @@ import { type ModelBinding } from "./config.js";
|
|
|
5
5
|
import { AgentGraph } from "./graph-schema.js";
|
|
6
6
|
import { WorkingMemory } from "./memory.js";
|
|
7
7
|
import { Store } from "./persistence.js";
|
|
8
|
+
import type { Verifier } from "./verifier-interface.js";
|
|
9
|
+
/** Options for configuring the GraphRunner beyond bindings. */
|
|
10
|
+
export interface GraphRunnerOptions {
|
|
11
|
+
/** Pluggable verifier. If omitted, falls back to the existing vitest-based runVerifier. */
|
|
12
|
+
verifier?: Verifier;
|
|
13
|
+
/** Domain for agent prompts. Defaults to "code". */
|
|
14
|
+
domain?: string;
|
|
15
|
+
}
|
|
8
16
|
/**
|
|
9
17
|
* Runs a declarative agent graph: nodes (role + binding), edges (data flow).
|
|
10
18
|
* After all nodes run, verifier runs on the final node's output and gating is applied.
|
|
@@ -12,7 +20,9 @@ import { Store } from "./persistence.js";
|
|
|
12
20
|
export declare class GraphRunner {
|
|
13
21
|
private graph;
|
|
14
22
|
private gateways;
|
|
15
|
-
|
|
23
|
+
private pluggableVerifier?;
|
|
24
|
+
private domain;
|
|
25
|
+
constructor(graph: AgentGraph, bindings?: Record<string, ModelBinding>, options?: GraphRunnerOptions);
|
|
16
26
|
/**
|
|
17
27
|
* Execute graph: task in -> run nodes in topo order -> run verifier on final node -> gating.
|
|
18
28
|
* If store is provided and verification passed: admit to verified memory; optionally write checkpoint.
|
package/dist/graph-runner.js
CHANGED
|
@@ -14,8 +14,12 @@ import { runVerifier } from "./verifier.js";
|
|
|
14
14
|
export class GraphRunner {
|
|
15
15
|
graph;
|
|
16
16
|
gateways = new Map();
|
|
17
|
-
|
|
17
|
+
pluggableVerifier;
|
|
18
|
+
domain;
|
|
19
|
+
constructor(graph, bindings, options) {
|
|
18
20
|
this.graph = graph;
|
|
21
|
+
this.pluggableVerifier = options?.verifier;
|
|
22
|
+
this.domain = options?.domain ?? "code";
|
|
19
23
|
const resolvedBindings = bindings ?? defaultCodeEntityBindings();
|
|
20
24
|
for (const node of graph.nodes) {
|
|
21
25
|
if (node.role === "verifier") {
|
|
@@ -71,7 +75,7 @@ export class GraphRunner {
|
|
|
71
75
|
if (!gateway) {
|
|
72
76
|
throw new Error(`No gateway for binding '${node.binding}'`);
|
|
73
77
|
}
|
|
74
|
-
const out = await runRole(node.role, gateway, task, inputs, memory.longTermContext, store ?? null);
|
|
78
|
+
const out = await runRole(node.role, gateway, task, inputs, memory.longTermContext, store ?? null, this.domain);
|
|
75
79
|
memory.setSlot(nodeId, out);
|
|
76
80
|
}
|
|
77
81
|
// Set final candidate from final node
|
|
@@ -79,8 +83,15 @@ export class GraphRunner {
|
|
|
79
83
|
if (finalId) {
|
|
80
84
|
memory.finalCandidate = memory.getSlot(finalId);
|
|
81
85
|
}
|
|
82
|
-
// Run verification
|
|
83
|
-
|
|
86
|
+
// Run verification: pluggable verifier if provided, otherwise legacy runVerifier
|
|
87
|
+
if (this.pluggableVerifier) {
|
|
88
|
+
const ctx = { task, testCode, domain: this.domain };
|
|
89
|
+
const result = await this.pluggableVerifier.verify(memory, ctx);
|
|
90
|
+
memory.setVerification(result.passed, result.evidence);
|
|
91
|
+
}
|
|
92
|
+
else {
|
|
93
|
+
await runVerifier(memory, testCode);
|
|
94
|
+
}
|
|
84
95
|
// Record outcome and persist if verification passed
|
|
85
96
|
const latencySec = (performance.now() - t0) / 1000;
|
|
86
97
|
if (store) {
|
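Review note: the new branch in `run` keeps the legacy vitest path as a fallback when no pluggable verifier is configured. A minimal sketch of that dispatch follows. All type and function names are hypothetical stand-ins, and the memory object is reduced to two fields for illustration.

```typescript
// Hypothetical stand-ins for the package types.
interface DispatchResult {
  passed: boolean;
  evidence: string;
}

interface DispatchMemory {
  passed: boolean;
  evidence: string;
}

interface DispatchContext {
  task: string;
  testCode?: string;
  domain: string;
}

interface DispatchVerifier {
  verify(memory: DispatchMemory, ctx: DispatchContext): Promise<DispatchResult>;
}

// Use the pluggable verifier when one is configured; otherwise call the legacy
// path, which is expected to write its verdict into memory itself.
async function dispatchVerification(
  memory: DispatchMemory,
  ctx: DispatchContext,
  pluggable: DispatchVerifier | undefined,
  legacy: (memory: DispatchMemory, testCode?: string) => Promise<void>
): Promise<void> {
  if (pluggable) {
    const result = await pluggable.verify(memory, ctx);
    memory.passed = result.passed;
    memory.evidence = result.evidence;
  } else {
    await legacy(memory, ctx.testCode);
  }
}
```

In the pluggable branch the runner copies the result into memory, while the legacy function records its own verdict. That asymmetry is visible in the diff, where only the first branch calls `memory.setVerification`.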
package/dist/improvement.d.ts
CHANGED
|
@@ -7,7 +7,7 @@ import type { StrategySuggestion } from "./types.js";
|
|
|
7
7
|
* Review recent outcomes and return suggested strategy updates (rule-based for MVP).
|
|
8
8
|
* Returns list of { role, suggestion } for human or governance to apply.
|
|
9
9
|
*/
|
|
10
|
-
export declare function critiqueStrategies(store: Store, recentCount?: number): StrategySuggestion[];
|
|
10
|
+
export declare function critiqueStrategies(store: Store, recentCount?: number, domain?: string): StrategySuggestion[];
|
|
11
11
|
/**
|
|
12
12
|
* Apply a new prompt for role (strategy update). Under governance, this would require approval.
|
|
13
13
|
*/
|
package/dist/improvement.js
CHANGED
|
@@ -6,7 +6,7 @@ import { getRecentOutcomes, recordOutcome, setStrategy, } from "./persistence.js
|
|
|
6
6
|
* Review recent outcomes and return suggested strategy updates (rule-based for MVP).
|
|
7
7
|
* Returns list of { role, suggestion } for human or governance to apply.
|
|
8
8
|
*/
|
|
9
|
-
export function critiqueStrategies(store, recentCount = 10) {
|
|
9
|
+
export function critiqueStrategies(store, recentCount = 10, domain = "code") {
|
|
10
10
|
const outcomes = getRecentOutcomes(store, recentCount);
|
|
11
11
|
if (outcomes.length < 3) {
|
|
12
12
|
return [];
|
|
@@ -15,14 +15,26 @@ export function critiqueStrategies(store, recentCount = 10) {
|
|
|
15
15
|
const failRate = 1.0 - passed / outcomes.length;
|
|
16
16
|
const suggestions = [];
|
|
17
17
|
if (failRate >= 0.5) {
|
|
18
|
-
|
|
19
|
-
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
|
|
23
|
-
|
|
24
|
-
|
|
25
|
-
|
|
18
|
+
if (domain === "code") {
|
|
19
|
+
suggestions.push({
|
|
20
|
+
role: "generator",
|
|
21
|
+
suggestion: "Add explicit instruction: output only valid TypeScript with no markdown fences or commentary.",
|
|
22
|
+
});
|
|
23
|
+
suggestions.push({
|
|
24
|
+
role: "judge",
|
|
25
|
+
suggestion: "Ensure Judge incorporates all critic feedback and outputs runnable code only.",
|
|
26
|
+
});
|
|
27
|
+
}
|
|
28
|
+
else {
|
|
29
|
+
suggestions.push({
|
|
30
|
+
role: "generator",
|
|
31
|
+
suggestion: "Add explicit instruction: produce clear, complete, and accurate responses. Avoid ambiguity.",
|
|
32
|
+
});
|
|
33
|
+
suggestions.push({
|
|
34
|
+
role: "judge",
|
|
35
|
+
suggestion: "Ensure Judge addresses all critic concerns and produces a safe, well-structured final response.",
|
|
36
|
+
});
|
|
37
|
+
}
|
|
26
38
|
}
|
|
27
39
|
return suggestions;
|
|
28
40
|
}
|
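Review note: the rule in `critiqueStrategies` is a simple fail-rate threshold with a domain switch. The sketch below restates it as a pure function. `OutcomeSketch` and `suggestFromOutcomes` are hypothetical, and the way `passed` is counted is an assumption, since those lines fall outside the hunk shown here.

```typescript
// Hypothetical outcome shape; the real OutcomeEntry type is not shown in this hunk.
interface OutcomeSketch {
  passed: boolean;
}

interface SuggestionSketch {
  role: string;
  suggestion: string;
}

function suggestFromOutcomes(outcomes: OutcomeSketch[], domain: string = "code"): SuggestionSketch[] {
  // Too little history to judge.
  if (outcomes.length < 3) return [];
  // Assumption: "passed" is the count of outcomes whose verification passed.
  const passed = outcomes.filter((o) => o.passed).length;
  const failRate = 1.0 - passed / outcomes.length;
  if (failRate < 0.5) return [];
  if (domain === "code") {
    return [
      { role: "generator", suggestion: "Add explicit instruction: output only valid TypeScript with no markdown fences or commentary." },
      { role: "judge", suggestion: "Ensure Judge incorporates all critic feedback and outputs runnable code only." },
    ];
  }
  return [
    { role: "generator", suggestion: "Add explicit instruction: produce clear, complete, and accurate responses. Avoid ambiguity." },
    { role: "judge", suggestion: "Ensure Judge addresses all critic concerns and produces a safe, well-structured final response." },
  ];
}
```

Any domain other than `"code"` receives the generic suggestions, so domains added through `registerDomain` share one fallback wording.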
package/dist/index.d.ts
CHANGED
|
@@ -1,19 +1,26 @@
|
|
|
1
1
|
/**
|
|
2
2
|
* Moltblock — framework for evolving composite intelligences (Entities).
|
|
3
3
|
*/
|
|
4
|
-
export declare const VERSION = "0.
|
|
4
|
+
export declare const VERSION = "0.6.0";
|
|
5
5
|
export type { ModelBinding, BindingEntry, AgentConfig, MoltblockConfig, ChatMessage, VerifiedMemoryEntry, CheckpointEntry, OutcomeEntry, InboxEntry, StrategySuggestion, ReceivedArtifact, GovernanceConfig, } from "./types.js";
|
|
6
6
|
export { WorkingMemory } from "./memory.js";
|
|
7
7
|
export { signArtifact, verifyArtifact, artifactHash } from "./signing.js";
|
|
8
|
-
export { loadMoltblockConfig, defaultCodeEntityBindings, detectProvider, getConfigSource, BindingEntrySchema, AgentConfigSchema, MoltblockConfigSchema, ModelBindingSchema, type BindingOverrides, type ConfigSource, } from "./config.js";
|
|
8
|
+
export { loadMoltblockConfig, defaultCodeEntityBindings, detectProvider, getConfigSource, loadPolicyRules, BindingEntrySchema, AgentConfigSchema, MoltblockConfigSchema, ModelBindingSchema, PolicyRuleSchema, type BindingOverrides, type ConfigSource, type PolicyRuleConfig, } from "./config.js";
|
|
9
9
|
export { Store, hashGraph, hashMemory, auditLog, getGovernanceValue, setGovernanceValue, putInbox, getInbox, recordOutcome, getRecentOutcomes, getStrategy, setStrategy, } from "./persistence.js";
|
|
10
10
|
export { LLMGateway } from "./gateway.js";
|
|
11
11
|
export { runGenerator, runCritic, runJudge, runRole, } from "./agents.js";
|
|
12
12
|
export { AgentGraph, GraphNodeSchema, GraphEdgeSchema, AgentGraphSchema, type GraphNode, type GraphEdge, type AgentGraphData, } from "./graph-schema.js";
|
|
13
|
-
export { GraphRunner } from "./graph-runner.js";
|
|
13
|
+
export { GraphRunner, type GraphRunnerOptions } from "./graph-runner.js";
|
|
14
14
|
export { extractCodeBlock, runVitestOnCode, runVerifier } from "./verifier.js";
|
|
15
|
+
export type { Verifier, VerificationResult, VerifierContext, } from "./verifier-interface.js";
|
|
16
|
+
export { PolicyVerifier, type PolicyRule } from "./policy-verifier.js";
|
|
17
|
+
export { CodeVerifier } from "./code-verifier.js";
|
|
18
|
+
export { CompositeVerifier, type CompositeVerifierOptions } from "./composite-verifier.js";
|
|
19
|
+
export { getDomainPrompts, registerDomain, listDomains, type DomainPrompts, } from "./domain-prompts.js";
|
|
20
|
+
export { classifyRisk, type RiskLevel, type RiskClassification } from "./risk.js";
|
|
15
21
|
export { createGovernanceConfig, canMolt, triggerMolt, pause, resume, isPaused, emergencyShutdown, } from "./governance.js";
|
|
16
22
|
export { sendArtifact, receiveArtifacts } from "./handoff.js";
|
|
17
23
|
export { critiqueStrategies, applySuggestion, runEval, runImprovementCycle, } from "./improvement.js";
|
|
18
24
|
export { validateTask, validateTestCode, MAX_TASK_LENGTH, MIN_TASK_LENGTH, type ValidationResult, } from "./validation.js";
|
|
19
25
|
export { CodeEntity, loadEntityWithGraph } from "./entity.js";
|
|
26
|
+
export { Entity, type EntityOptions } from "./entity-base.js";
|
package/dist/index.js
CHANGED
|
@@ -1,13 +1,13 @@
|
|
|
1
1
|
/**
|
|
2
2
|
* Moltblock — framework for evolving composite intelligences (Entities).
|
|
3
3
|
*/
|
|
4
|
-
export const VERSION = "0.
|
|
4
|
+
export const VERSION = "0.6.0";
|
|
5
5
|
// Memory
|
|
6
6
|
export { WorkingMemory } from "./memory.js";
|
|
7
7
|
// Signing
|
|
8
8
|
export { signArtifact, verifyArtifact, artifactHash } from "./signing.js";
|
|
9
9
|
// Config
|
|
10
|
-
export { loadMoltblockConfig, defaultCodeEntityBindings, detectProvider, getConfigSource, BindingEntrySchema, AgentConfigSchema, MoltblockConfigSchema, ModelBindingSchema, } from "./config.js";
|
|
10
|
+
export { loadMoltblockConfig, defaultCodeEntityBindings, detectProvider, getConfigSource, loadPolicyRules, BindingEntrySchema, AgentConfigSchema, MoltblockConfigSchema, ModelBindingSchema, PolicyRuleSchema, } from "./config.js";
|
|
11
11
|
// Persistence
|
|
12
12
|
export { Store, hashGraph, hashMemory, auditLog, getGovernanceValue, setGovernanceValue, putInbox, getInbox, recordOutcome, getRecentOutcomes, getStrategy, setStrategy, } from "./persistence.js";
|
|
13
13
|
// Gateway
|
|
@@ -18,8 +18,18 @@ export { runGenerator, runCritic, runJudge, runRole, } from "./agents.js";
|
|
|
18
18
|
export { AgentGraph, GraphNodeSchema, GraphEdgeSchema, AgentGraphSchema, } from "./graph-schema.js";
|
|
19
19
|
// Graph Runner
|
|
20
20
|
export { GraphRunner } from "./graph-runner.js";
|
|
21
|
-
// Verifier
|
|
21
|
+
// Verifier (legacy vitest-based)
|
|
22
22
|
export { extractCodeBlock, runVitestOnCode, runVerifier } from "./verifier.js";
|
|
23
|
+
// Policy Verifier
|
|
24
|
+
export { PolicyVerifier } from "./policy-verifier.js";
|
|
25
|
+
// Code Verifier (adapter)
|
|
26
|
+
export { CodeVerifier } from "./code-verifier.js";
|
|
27
|
+
// Composite Verifier
|
|
28
|
+
export { CompositeVerifier } from "./composite-verifier.js";
|
|
29
|
+
// Domain Prompts
|
|
30
|
+
export { getDomainPrompts, registerDomain, listDomains, } from "./domain-prompts.js";
|
|
31
|
+
// Risk Classification
|
|
32
|
+
export { classifyRisk } from "./risk.js";
|
|
23
33
|
// Governance
|
|
24
34
|
export { createGovernanceConfig, canMolt, triggerMolt, pause, resume, isPaused, emergencyShutdown, } from "./governance.js";
|
|
25
35
|
// Handoff
|
|
@@ -28,5 +38,7 @@ export { sendArtifact, receiveArtifacts } from "./handoff.js";
|
|
|
28
38
|
export { critiqueStrategies, applySuggestion, runEval, runImprovementCycle, } from "./improvement.js";
|
|
29
39
|
// Validation
|
|
30
40
|
export { validateTask, validateTestCode, MAX_TASK_LENGTH, MIN_TASK_LENGTH, } from "./validation.js";
|
|
31
|
-
// Entity
|
|
41
|
+
// Entity (code-specific, backward compat)
|
|
32
42
|
export { CodeEntity, loadEntityWithGraph } from "./entity.js";
|
|
43
|
+
// Entity (generic, pluggable)
|
|
44
|
+
export { Entity } from "./entity-base.js";
|
package/dist/policy-verifier.d.ts
ADDED
@@ -0,0 +1,29 @@
+/**
+ * PolicyVerifier: rule-based verifier that catches dangerous patterns without an LLM call.
+ */
+import type { WorkingMemory } from "./memory.js";
+import type { Verifier, VerificationResult, VerifierContext } from "./verifier-interface.js";
+/** A single policy rule. */
+export interface PolicyRule {
+    id: string;
+    description: string;
+    /** What to match against: "artifact", "task", or "both". */
+    target: "artifact" | "task" | "both";
+    /** Regex pattern string. */
+    pattern: string;
+    /** "deny" blocks the artifact; "allow" overrides deny rules in the same category. */
+    action: "deny" | "allow";
+    category: string;
+    enabled: boolean;
+}
+/**
+ * Rule-based policy verifier. Checks artifacts and tasks against deny/allow rules.
+ * Allow rules in the same category override deny rules.
+ */
+export declare class PolicyVerifier implements Verifier {
+    readonly name = "PolicyVerifier";
+    private rules;
+    constructor(customRules?: PolicyRule[]);
+    verify(memory: WorkingMemory, context?: VerifierContext): Promise<VerificationResult>;
+    private getTargetText;
+}
|
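Review note: `PolicyRule.pattern` is declared as a plain string, and the compiled `.js` in this diff builds it with `new RegExp(rule.pattern, "i")` inside `verify`. A malformed custom pattern would therefore surface only as a `SyntaxError` at verification time. The sketch below checks patterns up front. `invalidPatternIds` is a hypothetical helper and not part of the package.

```typescript
// Hypothetical pre-flight check: returns the ids of rules whose pattern
// does not compile as a case-insensitive regular expression.
function invalidPatternIds(rules: { id: string; pattern: string }[]): string[] {
  const bad: string[] = [];
  for (const rule of rules) {
    try {
      // Same construction the verifier uses at match time.
      new RegExp(rule.pattern, "i");
    } catch {
      bad.push(rule.id);
    }
  }
  return bad;
}
```

Running a check like this when custom rules are loaded would turn a runtime failure during gating into a configuration error.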
package/dist/policy-verifier.js
ADDED
|
@@ -0,0 +1,90 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* PolicyVerifier: rule-based verifier that catches dangerous patterns without an LLM call.
|
|
3
|
+
*/
|
|
4
|
+
// --- Built-in deny rules ---
|
|
5
|
+
const BUILTIN_RULES = [
|
|
6
|
+
// Destructive commands
|
|
7
|
+
{ id: "cmd-rm-rf", description: "Recursive force delete", target: "artifact", pattern: "\\brm\\s+-rf\\b", action: "deny", category: "destructive-cmd", enabled: true },
|
|
8
|
+
{ id: "cmd-rm-r", description: "Recursive delete", target: "artifact", pattern: "\\brm\\s+-r\\s+/", action: "deny", category: "destructive-cmd", enabled: true },
|
|
9
|
+
{ id: "cmd-drop-table", description: "SQL DROP TABLE", target: "artifact", pattern: "\\bDROP\\s+(TABLE|DATABASE)\\b", action: "deny", category: "destructive-sql", enabled: true },
|
|
10
|
+
{ id: "cmd-truncate", description: "SQL TRUNCATE", target: "artifact", pattern: "\\bTRUNCATE\\s+TABLE\\b", action: "deny", category: "destructive-sql", enabled: true },
|
|
11
|
+
{ id: "cmd-dd", description: "Raw disk write (dd)", target: "artifact", pattern: "\\bdd\\s+if=", action: "deny", category: "destructive-cmd", enabled: true },
|
|
12
|
+
{ id: "cmd-chmod-777", description: "World-writable permissions", target: "artifact", pattern: "\\bchmod\\s+777\\b", action: "deny", category: "destructive-cmd", enabled: true },
|
|
13
|
+
{ id: "cmd-mkfs", description: "Filesystem creation", target: "artifact", pattern: "\\bmkfs\\b", action: "deny", category: "destructive-cmd", enabled: true },
|
|
14
|
+
// Sensitive file paths
|
|
15
|
+
{ id: "path-ssh", description: "SSH directory access", target: "both", pattern: "~?\\/?\\.ssh\\/", action: "deny", category: "sensitive-path", enabled: true },
|
|
16
|
+
{ id: "path-etc-passwd", description: "/etc/passwd access", target: "both", pattern: "\\/etc\\/passwd\\b", action: "deny", category: "sensitive-path", enabled: true },
|
|
17
|
+
{ id: "path-etc-shadow", description: "/etc/shadow access", target: "both", pattern: "\\/etc\\/shadow\\b", action: "deny", category: "sensitive-path", enabled: true },
|
|
18
|
+
{ id: "path-dotenv", description: ".env file access", target: "artifact", pattern: "\\.(env|env\\.local|env\\.production)\\b", action: "deny", category: "sensitive-path", enabled: true },
|
|
19
|
+
{ id: "path-id-rsa", description: "Private key file", target: "both", pattern: "\\bid_rsa\\b|\\bid_ed25519\\b", action: "deny", category: "sensitive-path", enabled: true },
|
|
20
|
+
{ id: "path-credentials", description: "Credentials file", target: "both", pattern: "\\bcredentials\\.(json|yaml|yml|xml)\\b", action: "deny", category: "sensitive-path", enabled: true },
|
|
21
|
+
// Hardcoded secrets
|
|
22
|
+
{ id: "secret-api-key", description: "Hardcoded API key pattern", target: "artifact", pattern: "(api[_-]?key|apikey)\\s*[=:]\\s*[\"'][A-Za-z0-9_\\-]{20,}", action: "deny", category: "hardcoded-secret", enabled: true },
|
|
23
|
+
{ id: "secret-password", description: "Hardcoded password", target: "artifact", pattern: "(password|passwd|pwd)\\s*[=:]\\s*[\"'][^\"']{4,}", action: "deny", category: "hardcoded-secret", enabled: true },
|
|
24
|
+
{ id: "secret-private-key", description: "Private key material", target: "artifact", pattern: "-----BEGIN\\s+(RSA|EC|DSA|OPENSSH)?\\s*PRIVATE\\s+KEY-----", action: "deny", category: "hardcoded-secret", enabled: true },
|
|
25
|
+
{ id: "secret-token", description: "Hardcoded token/secret", target: "artifact", pattern: "(secret|token)\\s*[=:]\\s*[\"'][A-Za-z0-9_\\-]{20,}", action: "deny", category: "hardcoded-secret", enabled: true },
|
|
26
|
+
// Data exfiltration
|
|
27
|
+
{ id: "exfil-curl-post", description: "curl POST request", target: "artifact", pattern: "\\bcurl\\s+.*-X\\s*POST\\b", action: "deny", category: "exfiltration", enabled: true },
|
|
28
|
+
{ id: "exfil-wget", description: "wget to HTTP", target: "artifact", pattern: "\\bwget\\s+http", action: "deny", category: "exfiltration", enabled: true },
|
|
29
|
+
];
|
|
30
|
+
/**
|
|
31
|
+
* Rule-based policy verifier. Checks artifacts and tasks against deny/allow rules.
|
|
32
|
+
* Allow rules in the same category override deny rules.
|
|
33
|
+
*/
|
|
34
|
+
export class PolicyVerifier {
|
|
35
|
+
name = "PolicyVerifier";
|
|
36
|
+
rules;
|
|
37
|
+
constructor(customRules) {
|
|
38
|
+
this.rules = [...BUILTIN_RULES];
|
|
39
|
+
if (customRules) {
|
|
40
|
+
this.rules.push(...customRules);
|
|
41
|
+
}
|
|
42
|
+
}
|
|
43
|
+
async verify(memory, context) {
|
|
44
|
+
const artifact = memory.finalCandidate || "";
|
|
45
|
+
const task = context?.task ?? memory.task ?? "";
|
|
46
|
+
const violations = [];
|
|
47
|
+
const allowedCategories = new Set();
|
|
48
|
+
// First pass: collect allowed categories
|
|
49
|
+
for (const rule of this.rules) {
|
|
50
|
+
if (!rule.enabled || rule.action !== "allow")
|
|
51
|
+
continue;
|
|
52
|
+
const text = this.getTargetText(rule.target, artifact, task);
|
|
53
|
+
const regex = new RegExp(rule.pattern, "i");
|
|
54
|
+
if (regex.test(text)) {
|
|
55
|
+
allowedCategories.add(rule.category);
|
|
56
|
+
}
|
|
57
|
+
}
|
|
58
|
+
// Second pass: check deny rules (skip allowed categories)
|
|
59
|
+
for (const rule of this.rules) {
|
|
60
|
+
if (!rule.enabled || rule.action !== "deny")
|
|
61
|
+
continue;
|
|
62
|
+
if (allowedCategories.has(rule.category))
|
|
63
|
+
continue;
|
|
64
|
+
const text = this.getTargetText(rule.target, artifact, task);
|
|
65
|
+
const regex = new RegExp(rule.pattern, "i");
|
|
66
|
+
if (regex.test(text)) {
|
|
67
|
+
violations.push(`[${rule.id}] ${rule.description}`);
|
|
68
|
+
}
|
|
69
|
+
}
|
|
70
|
+
if (violations.length > 0) {
|
|
71
|
+
return {
|
|
72
|
+
passed: false,
|
|
73
|
+
evidence: `Policy violations:\n${violations.join("\n")}`,
|
|
74
|
+
verifierName: this.name,
|
|
75
|
+
};
|
|
76
|
+
}
|
|
77
|
+
return {
|
|
78
|
+
passed: true,
|
|
79
|
+
evidence: "All policy rules passed.",
|
|
80
|
+
verifierName: this.name,
|
|
81
|
+
};
|
|
82
|
+
}
|
|
83
|
+
getTargetText(target, artifact, task) {
|
|
84
|
+
if (target === "artifact")
|
|
85
|
+
return artifact;
|
|
86
|
+
if (target === "task")
|
|
87
|
+
return task;
|
|
88
|
+
return `${task}\n${artifact}`;
|
|
89
|
+
}
|
|
90
|
+
}
|
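Review note: the two-pass evaluation in `verify` has a consequence that is easy to miss. A single matching allow rule switches off every deny rule in its category for that run. The sketch below reproduces the two passes with a trimmed rule shape. `RuleSketch`, `pickText` and `findViolations` are hypothetical names, and only two rules are included.

```typescript
// Trimmed-down rule shape mirroring the PolicyRule fields the evaluation reads.
interface RuleSketch {
  id: string;
  target: "artifact" | "task" | "both";
  pattern: string;
  action: "deny" | "allow";
  category: string;
  enabled: boolean;
}

function pickText(target: RuleSketch["target"], artifact: string, task: string): string {
  if (target === "artifact") return artifact;
  if (target === "task") return task;
  return `${task}\n${artifact}`;
}

function findViolations(rules: RuleSketch[], artifact: string, task: string): string[] {
  // Pass 1: any matching allow rule whitelists its whole category.
  const allowed = new Set<string>();
  for (const rule of rules) {
    if (!rule.enabled || rule.action !== "allow") continue;
    if (new RegExp(rule.pattern, "i").test(pickText(rule.target, artifact, task))) {
      allowed.add(rule.category);
    }
  }
  // Pass 2: deny rules fire only in categories that were not whitelisted.
  const violations: string[] = [];
  for (const rule of rules) {
    if (!rule.enabled || rule.action !== "deny") continue;
    if (allowed.has(rule.category)) continue;
    if (new RegExp(rule.pattern, "i").test(pickText(rule.target, artifact, task))) {
      violations.push(rule.id);
    }
  }
  return violations;
}

const denyRmRf: RuleSketch = {
  id: "cmd-rm-rf", target: "artifact", pattern: "\\brm\\s+-rf\\b",
  action: "deny", category: "destructive-cmd", enabled: true,
};
const allowTmp: RuleSketch = {
  id: "custom-allow-tmp", target: "artifact", pattern: "\\/tmp\\/",
  action: "allow", category: "destructive-cmd", enabled: true,
};
```

With both rules loaded, an artifact such as `rm -rf / # logs are in /tmp/` yields no violations in this sketch, because the allow pattern matches anywhere in the artifact text rather than the path being deleted. As the diff reads, the package behaves the same way, so allow rules are probably best written with narrow patterns.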
package/dist/risk.d.ts
ADDED
|
@@ -0,0 +1,13 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* Risk classification: keyword-based risk levels for tasks.
|
|
3
|
+
*/
|
|
4
|
+
export type RiskLevel = "low" | "medium" | "high";
|
|
5
|
+
export interface RiskClassification {
|
|
6
|
+
level: RiskLevel;
|
|
7
|
+
reasons: string[];
|
|
8
|
+
}
|
|
9
|
+
/**
|
|
10
|
+
* Classify a task's risk level based on keyword/pattern matching.
|
|
11
|
+
* Returns the highest risk level found and all matching reasons.
|
|
12
|
+
*/
|
|
13
|
+
export declare function classifyRisk(task: string): RiskClassification;
|
package/dist/risk.js
ADDED
|
@@ -0,0 +1,63 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* Risk classification: keyword-based risk levels for tasks.
|
|
3
|
+
*/
|
|
4
|
+
const RISK_PATTERNS = [
|
|
5
|
+
// High: destructive operations
|
|
6
|
+
{ pattern: /\brm\s+-rf\b/i, level: "high", reason: "Recursive file deletion (rm -rf)" },
|
|
7
|
+
{ pattern: /\brm\s+-r\b/i, level: "high", reason: "Recursive file deletion (rm -r)" },
|
|
8
|
+
{ pattern: /\brmdir\b/i, level: "high", reason: "Directory removal" },
|
|
9
|
+
{ pattern: /\bdrop\s+(table|database)\b/i, level: "high", reason: "SQL DROP statement" },
|
|
10
|
+
{ pattern: /\btruncate\s+table\b/i, level: "high", reason: "SQL TRUNCATE statement" },
|
|
11
|
+
{ pattern: /\bformat\s+[a-z]:/i, level: "high", reason: "Disk format command" },
|
|
12
|
+
{ pattern: /\bmkfs\b/i, level: "high", reason: "Filesystem creation (mkfs)" },
|
|
13
|
+
{ pattern: /\bdd\s+if=/i, level: "high", reason: "Raw disk write (dd)" },
|
|
14
|
+
// High: privilege escalation
|
|
15
|
+
{ pattern: /\bsudo\b/i, level: "high", reason: "Sudo privilege escalation" },
|
|
16
|
+
{ pattern: /\bchmod\s+777\b/i, level: "high", reason: "World-writable permissions (chmod 777)" },
|
|
17
|
+
{ pattern: /\bchmod\s+\+s\b/i, level: "high", reason: "Set-UID/GID bit (chmod +s)" },
|
|
18
|
+
// High: credential/key access
|
|
19
|
+
{ pattern: /\b(private[_\s]?key|id_rsa|id_ed25519)\b/i, level: "high", reason: "Private key access" },
|
|
20
|
+
{ pattern: /\/etc\/shadow\b/i, level: "high", reason: "Shadow password file access" },
|
|
21
|
+
{ pattern: /~?\/?\.ssh\//i, level: "high", reason: "SSH directory access" },
|
|
22
|
+
{ pattern: /\bcredentials?\.(json|yaml|yml|xml|conf)\b/i, level: "high", reason: "Credentials file access" },
|
|
23
|
+
// High: system modification
|
|
24
|
+
{ pattern: /\/etc\/passwd\b/i, level: "high", reason: "System password file access" },
|
|
25
|
+
{ pattern: /\bsystemctl\s+(stop|disable|mask)\b/i, level: "high", reason: "System service modification" },
|
|
26
|
+
{ pattern: /\bkill\s+-9\b/i, level: "high", reason: "Force kill process" },
|
|
27
|
+
// Medium: network operations
|
|
28
|
+
{ pattern: /\bcurl\b/i, level: "medium", reason: "Network request (curl)" },
|
|
29
|
+
{ pattern: /\bwget\b/i, level: "medium", reason: "Network download (wget)" },
|
|
30
|
+
{ pattern: /\bfetch\s*\(/i, level: "medium", reason: "Network fetch call" },
|
|
31
|
+
{ pattern: /\bhttp(s)?:\/\//i, level: "medium", reason: "HTTP URL reference" },
|
|
32
|
+
// Medium: file writes
|
|
33
|
+
{ pattern: /\bwrite\s*file\b/i, level: "medium", reason: "File write operation" },
|
|
34
|
+
{ pattern: /\bfs\.write/i, level: "medium", reason: "Filesystem write (fs.write)" },
|
|
35
|
+
{ pattern: /\bfs\.unlink/i, level: "medium", reason: "File deletion (fs.unlink)" },
|
|
36
|
+
// Medium: database modifications
|
|
37
|
+
{ pattern: /\b(insert|update|delete|alter)\s+(into|from|table)\b/i, level: "medium", reason: "Database modification" },
|
|
38
|
+
// Medium: subprocess spawning
|
|
39
|
+
{ pattern: /\bexec\s*\(/i, level: "medium", reason: "Subprocess execution (exec)" },
|
|
40
|
+
{ pattern: /\bspawn\s*\(/i, level: "medium", reason: "Subprocess spawning" },
|
|
41
|
+
{ pattern: /\bchild_process\b/i, level: "medium", reason: "Child process module" },
|
|
42
|
+
{ pattern: /\beval\s*\(/i, level: "medium", reason: "Dynamic code evaluation (eval)" },
|
|
43
|
+
];
|
|
44
|
+
/**
|
|
45
|
+
* Classify a task's risk level based on keyword/pattern matching.
|
|
46
|
+
* Returns the highest risk level found and all matching reasons.
|
|
47
|
+
*/
|
|
48
|
+
export function classifyRisk(task) {
|
|
49
|
+
const reasons = [];
|
|
50
|
+
let level = "low";
|
|
51
|
+
for (const { pattern, level: patternLevel, reason } of RISK_PATTERNS) {
|
|
52
|
+
if (pattern.test(task)) {
|
|
53
|
+
reasons.push(reason);
|
|
54
|
+
if (patternLevel === "high") {
|
|
55
|
+
level = "high";
|
|
56
|
+
}
|
|
57
|
+
else if (patternLevel === "medium" && level !== "high") {
|
|
58
|
+
level = "medium";
|
|
59
|
+
}
|
|
60
|
+
}
|
|
61
|
+
}
|
|
62
|
+
return { level, reasons };
|
|
63
|
+
}
|
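Review note: `classifyRisk` collects every matching reason and keeps the highest level seen. The sketch below uses the same loop over a four-pattern subset copied from `RISK_PATTERNS`. The `*Sketch` names and `SAMPLE_PATTERNS` are hypothetical and exist only for this illustration.

```typescript
type RiskLevelSketch = "low" | "medium" | "high";

interface RiskPatternSketch {
  pattern: RegExp;
  level: RiskLevelSketch;
  reason: string;
}

// A small subset of the patterns in the diff, in the same relative order.
const SAMPLE_PATTERNS: RiskPatternSketch[] = [
  { pattern: /\brm\s+-rf\b/i, level: "high", reason: "Recursive file deletion (rm -rf)" },
  { pattern: /\bsudo\b/i, level: "high", reason: "Sudo privilege escalation" },
  { pattern: /\bcurl\b/i, level: "medium", reason: "Network request (curl)" },
  { pattern: /\bhttp(s)?:\/\//i, level: "medium", reason: "HTTP URL reference" },
];

function classifyRiskSketch(task: string): { level: RiskLevelSketch; reasons: string[] } {
  const reasons: string[] = [];
  let level: RiskLevelSketch = "low";
  for (const { pattern, level: patternLevel, reason } of SAMPLE_PATTERNS) {
    if (!pattern.test(task)) continue;
    reasons.push(reason);
    if (patternLevel === "high") {
      level = "high";
    } else if (patternLevel === "medium" && level !== "high") {
      // A medium match never downgrades an earlier high match.
      level = "medium";
    }
  }
  return { level, reasons };
}
```

Because matching is keyword-based, a task phrased in plain language without any of the listed commands falls through to `"low"`.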
package/dist/verifier-interface.d.ts
ADDED
@@ -0,0 +1,24 @@
+/**
+ * Pluggable verifier interface: any verification strategy implements this contract.
+ */
+import type { WorkingMemory } from "./memory.js";
+/** Result of a single verification run. */
+export interface VerificationResult {
+    passed: boolean;
+    evidence: string;
+    verifierName: string;
+    /** Per-verifier details when running a composite verifier. */
+    details?: VerificationResult[];
+}
+/** Context passed to verifiers alongside working memory. */
+export interface VerifierContext {
+    task?: string;
+    testCode?: string;
+    domain?: string;
+    meta?: Record<string, unknown>;
+}
+/** A verifier that can gate artifacts before they gain authority. */
+export interface Verifier {
+    readonly name: string;
+    verify(memory: WorkingMemory, context?: VerifierContext): Promise<VerificationResult>;
+}
|
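Review note: the `Verifier` contract is small enough that a custom gate fits in a few lines. The sketch below implements a length limit against local copies of the interfaces so that it runs without the package installed. `MaxLengthVerifier`, `CandidateMemory` and the other local type names are hypothetical, and `CandidateMemory` stands in for `WorkingMemory` with only the `finalCandidate` field assumed.

```typescript
// Local stand-ins so the example is self-contained; shapes follow the declarations above.
interface CandidateMemory {
  finalCandidate: string;
}

interface ResultShape {
  passed: boolean;
  evidence: string;
  verifierName: string;
  details?: ResultShape[];
}

interface ContextShape {
  task?: string;
  testCode?: string;
  domain?: string;
  meta?: Record<string, unknown>;
}

interface VerifierShape {
  readonly name: string;
  verify(memory: CandidateMemory, context?: ContextShape): Promise<ResultShape>;
}

// Hypothetical verifier: rejects artifacts longer than a configured limit.
class MaxLengthVerifier implements VerifierShape {
  readonly name = "MaxLengthVerifier";
  private readonly maxChars: number;

  constructor(maxChars: number) {
    this.maxChars = maxChars;
  }

  async verify(memory: CandidateMemory, _context?: ContextShape): Promise<ResultShape> {
    const length = memory.finalCandidate.length;
    const passed = length <= this.maxChars;
    return {
      passed,
      evidence: passed
        ? `Artifact length ${length} is within ${this.maxChars} chars.`
        : `Artifact length ${length} exceeds ${this.maxChars} chars.`,
      verifierName: this.name,
    };
  }
}
```

In real use the class would import `Verifier`, `VerificationResult` and `VerifierContext` from the package instead of redeclaring them.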
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "moltblock",
|
|
3
|
-
"version": "0.
|
|
3
|
+
"version": "0.6.0",
|
|
4
4
|
"description": "Framework for building evolving composite AI intelligences (Entities)",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"main": "dist/index.js",
|
|
@@ -41,7 +41,8 @@
|
|
|
41
41
|
},
|
|
42
42
|
"files": [
|
|
43
43
|
"dist",
|
|
44
|
-
"config"
|
|
44
|
+
"config",
|
|
45
|
+
"skill"
|
|
45
46
|
],
|
|
46
47
|
"dependencies": {
|
|
47
48
|
"better-sqlite3": "^12.6.2",
|
package/skill/SKILL.md
ADDED
|
@@ -0,0 +1,103 @@
|
|
|
1
|
+
# moltblock — Trust Layer for AI Agents
|
|
2
|
+
|
|
3
|
+
## Description
|
|
4
|
+
|
|
5
|
+
Moltblock provides verification gating for AI-generated artifacts. Before any high-risk action (file deletion, credential access, system modification, network exfiltration), the `moltblock_verify` tool runs policy checks and optional code verification to catch dangerous patterns.
|
|
6
|
+
|
|
7
|
+
## When to Use
|
|
8
|
+
|
|
9
|
+
Use `moltblock_verify` when the task involves:
|
|
10
|
+
|
|
11
|
+
- **High risk** (always verify): Destructive operations (`rm -rf`, `DROP TABLE`), privilege escalation (`sudo`, `chmod 777`), credential/key access (`.ssh/`, `id_rsa`, `.env`), system modification, raw disk writes
|
|
12
|
+
- **Medium risk** (verify when uncertain): Network requests (`curl`, `wget`, `fetch`), file writes, database modifications, subprocess spawning, dynamic code evaluation
|
|
13
|
+
- **Low risk** (skip verification): Simple text responses, math, reading public information, code that doesn't touch the filesystem or network
|
|
14
|
+
|
|
15
|
+
## Tool: moltblock_verify
|
|
16
|
+
|
|
17
|
+
Verify a task before execution.
|
|
18
|
+
|
|
19
|
+
### Usage
|
|
20
|
+
|
|
21
|
+
```bash
|
|
22
|
+
npx moltblock "<task description>" --provider <provider> --json
|
|
23
|
+
```
|
|
24
|
+
|
|
25
|
+
### Parameters
|
|
26
|
+
|
|
27
|
+
| Parameter | Required | Description |
|
|
28
|
+
|-----------|----------|-------------|
|
|
29
|
+
| task | Yes | The task description to verify |
|
|
30
|
+
| --provider | No | LLM provider: openai, google, zai, local (auto-detected from env) |
|
|
31
|
+
| --model | No | Model override |
|
|
32
|
+
| --test | No | Path to test file (for code verification) |
|
|
33
|
+
| --json | No | Output structured JSON result |
|
|
34
|
+
|
|
35
|
+
### Environment Variables
|
|
36
|
+
|
|
37
|
+
Set one of these for provider auto-detection:
|
|
38
|
+
- `OPENAI_API_KEY` — OpenAI
|
|
39
|
+
- `GOOGLE_API_KEY` — Google/Gemini
|
|
40
|
+
- `MOLTBLOCK_ZAI_API_KEY` — ZAI
|
|
41
|
+
|
|
42
|
+
### Example
|
|
43
|
+
|
|
44
|
+
```bash
|
|
45
|
+
# Verify a potentially dangerous task
|
|
46
|
+
npx moltblock "delete all temporary files older than 30 days" --json
|
|
47
|
+
|
|
48
|
+
# Verify code with tests
|
|
49
|
+
npx moltblock "implement user authentication" --test ./tests/auth.test.ts --json
|
|
50
|
+
```
|
|
51
|
+
|
|
52
|
+
### Output (JSON mode)
|
|
53
|
+
|
|
54
|
+
```json
|
|
55
|
+
{
|
|
56
|
+
"task": "...",
|
|
57
|
+
"verificationPassed": true,
|
|
58
|
+
"verificationEvidence": "All policy rules passed.",
|
|
59
|
+
"riskLevel": "high",
|
|
60
|
+
"riskReasons": ["Recursive file deletion (rm -rf)"]
|
|
61
|
+
}
|
|
62
|
+
```
|
|
63
|
+
|
|
64
|
+
## Installation
|
|
65
|
+
|
|
66
|
+
```bash
|
|
67
|
+
npm install -g moltblock
|
|
68
|
+
```
|
|
69
|
+
|
|
70
|
+
Or use directly with npx (no install needed):
|
|
71
|
+
|
|
72
|
+
```bash
|
|
73
|
+
npx moltblock "your task" --json
|
|
74
|
+
```
|
|
75
|
+
|
|
76
|
+
## Configuration
|
|
77
|
+
|
|
78
|
+
Optional. Place `moltblock.json` in your project root or `~/.moltblock/moltblock.json`:
|
|
79
|
+
|
|
80
|
+
```json
|
|
81
|
+
{
|
|
82
|
+
"agent": {
|
|
83
|
+
"bindings": {
|
|
84
|
+
"generator": { "backend": "google", "base_url": "https://generativelanguage.googleapis.com/v1beta/openai/", "model": "gemini-2.0-flash" },
|
|
85
|
+
"critic": { "backend": "google", "base_url": "https://generativelanguage.googleapis.com/v1beta/openai/", "model": "gemini-2.0-flash" },
|
|
86
|
+
"judge": { "backend": "google", "base_url": "https://generativelanguage.googleapis.com/v1beta/openai/", "model": "gemini-2.0-flash" }
|
|
87
|
+
}
|
|
88
|
+
},
|
|
89
|
+
"policy": {
|
|
90
|
+
"rules": [
|
|
91
|
+
{
|
|
92
|
+
"id": "custom-allow-tmp",
|
|
93
|
+
"description": "Allow operations in /tmp",
|
|
94
|
+
"target": "artifact",
|
|
95
|
+
"pattern": "\\/tmp\\/",
|
|
96
|
+
"action": "allow",
|
|
97
|
+
"category": "destructive-cmd",
|
|
98
|
+
"enabled": true
|
|
99
|
+
}
|
|
100
|
+
]
|
|
101
|
+
}
|
|
102
|
+
}
|
|
103
|
+
```
|
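Review note: a caller of the `--json` mode described in SKILL.md would typically parse the result and gate on `verificationPassed`. The sketch below does that for the five fields shown in the sample output. `VerifyOutputSketch` and `gateOnVerifyOutput` are hypothetical names, and the shape of a failing result is an assumption, since SKILL.md only shows a passing one.

```typescript
// Field names taken from the sample JSON in SKILL.md; the type itself is hypothetical.
interface VerifyOutputSketch {
  task: string;
  verificationPassed: boolean;
  verificationEvidence: string;
  riskLevel: "low" | "medium" | "high";
  riskReasons: string[];
}

// Proceed only when verification passed; otherwise surface the evidence string.
function gateOnVerifyOutput(jsonText: string): { proceed: boolean; reason: string } {
  const out = JSON.parse(jsonText) as VerifyOutputSketch;
  if (out.verificationPassed !== true) {
    return { proceed: false, reason: out.verificationEvidence };
  }
  return { proceed: true, reason: `verified at risk level ${out.riskLevel}` };
}
```

Note that in the sample output a task can be classified as `"high"` risk and still pass verification, so `riskLevel` alone is not the gate in this sketch.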