@kradle/cli 0.0.6 → 0.0.8
- package/README.md +66 -40
- package/dist/commands/evaluation/{init.d.ts → create.d.ts} +1 -1
- package/dist/commands/evaluation/{init.js → create.js} +26 -5
- package/dist/commands/init.d.ts +0 -1
- package/dist/commands/init.js +27 -25
- package/dist/lib/api-client.d.ts +1 -0
- package/dist/lib/api-client.js +3 -0
- package/dist/lib/schemas.d.ts +4 -4
- package/dist/lib/schemas.js +2 -2
- package/oclif.manifest.json +5 -13
- package/package.json +1 -1
package/README.md
CHANGED
@@ -2,19 +2,29 @@
 
 Kradle's CLI for managing Minecraft challenges, evaluations, agents, and more!
 
+* [Installation](#installation)
+* [Autocomplete](#autocomplete)
+* [Configuration](#configuration)
+* [Challenge](#challenge-commands)
+* [Evaluations](#evaluation-commands)
+* [Publishing a New Version](#publishing-a-new-version)
+* [Development](#development)
+* [Architecture](#architecture)
+
 ## Installation
 
 1. Install Kradle's CLI globally
 ```
 npm i -g @kradle/cli
 ```
-2. Initialize a new
+2. Initialize a new directory to store challenges and evaluations
 ```
 kradle init
 ```
-3. Congrats 🎉 You can now create a new challenge:
+3. Congrats 🎉 You can now create a new challenge or a new evaluation:
 ```
 kradle challenge create <challenge-name>
+kradle evaluation create <evaluation-name>
 ```
 
 In addition, you can enable [autocomplete](#Autocomplete).
@@ -35,19 +45,6 @@ After setup, you will be able to use Tab to autocomplete:
 kradle challenge <TAB> # Shows: build, create, list, run, upload, watch, etc.
 ```
 
-## Configuration
-
-The `.env` should have the following variables:
-
-```env
-WEB_API_URL=https://api.kradle.ai
-WEB_URL=https://kradle.ai
-STUDIO_API_URL=http://localhost:8080
-STUDIO_URL=kradle-studio://
-KRADLE_API_KEY=your-api-key
-KRADLE_CHALLENGES_PATH=~/Documents/kradle-studio/challenges
-```
-
 ## Challenge Commands
 
 ### Create Challenge
@@ -125,28 +122,60 @@ kradle challenge multi-upload
 
 Provides an interactive UI to select multiple challenges and uploads them in parallel.
 
-##
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+## Evaluation Commands
+
+Evaluations allow you to run batches of challenge runs with different agents and configurations, then analyze the results. This is useful for benchmarking agents, testing challenge difficulty, or gathering statistics across many runs.
+
+### Concepts
+
+**Evaluation**: A named collection of run configurations defined in a `config.ts` file. Each evaluation lives in `evaluations/<name>/`.
+
+**Iteration**: A snapshot of an evaluation execution. When you run an evaluation, it creates an iteration containing:
+- A copy of the `config.ts` at that point in time
+- A `manifest.json` with the generated list of runs
+- A `progress.json` tracking the status of each run
+
+Iterations are stored in `evaluations/<name>/iterations/001/`, `002/`, etc. This allows you to:
+- Resume an interrupted evaluation from where it left off
+- Re-run the same evaluation with `--new` to create a fresh iteration
+- Compare results across different iterations
+
+### Create Evaluation
+
+Create a new evaluation with a template config file:
+
+```bash
+kradle evaluation create <name>
+```
+
+This creates `evaluations/<name>/config.ts` with a template that you can customize. The config exports a `main()` function that returns a manifest with:
+- `runs`: Array of run configurations (challenge + participants)
+- `tags`: Optional tags applied to all runs for filtering in analytics
+
+### Run Evaluation
+
+Execute or resume an evaluation:
+
+```bash
+kradle evaluation run <name> # Resume current iteration or create first one
+kradle evaluation run <name> --new # Start a new iteration
+kradle evaluation run <name> --max-concurrent 10 # Control parallelism (default: 5)
+```
+
+The run command:
+1. Creates a new iteration (or resumes the current one)
+2. Generates a manifest by executing `config.ts`
+3. Displays an interactive TUI showing run progress
+4. Saves progress periodically (allows resuming if interrupted)
+5. Opens Metabase dashboard with results when complete
+
+### List Evaluations
+
+List all local evaluations:
+
+```bash
+kradle evaluation list
+```
 
 ## Publishing a New Version
 
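The README text added in this hunk says `config.ts` exports a `main()` function that returns a manifest with `runs` and `tags`. A minimal sketch of what such a file might look like is below. The type names, the `challenge_slug` and `agent` field names, and the placeholder slug and agent names are all assumptions made for illustration; the template that the CLI actually ships is not part of this diff.

```typescript
// Hypothetical manifest shape: only `runs` and `tags` are named in the README.
interface Participant {
  agent: string; // assumed field: identifier of the agent taking part in a run
}

interface RunConfig {
  challenge_slug: string; // assumed field name for the challenge reference
  participants: Participant[]; // "challenge + participants" per the README
}

interface Manifest {
  runs: RunConfig[];
  tags?: string[]; // optional tags applied to all runs
}

const CHALLENGE = "team-kradle/example-challenge"; // placeholder slug
const AGENTS = ["agent-a", "agent-b"]; // placeholder agent names
const RUNS_PER_AGENT = 3;

// In a real config.ts this function would be exported.
function main(): Manifest {
  const runs: RunConfig[] = [];
  for (const agent of AGENTS) {
    for (let i = 0; i < RUNS_PER_AGENT; i++) {
      runs.push({ challenge_slug: CHALLENGE, participants: [{ agent: agent }] });
    }
  }
  return { runs: runs, tags: ["benchmark-sketch"] };
}
```

Generating the run list in code rather than listing each run by hand is what makes batches such as "every agent, N times" cheap to express.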
@@ -161,9 +190,6 @@ The CLI uses GitHub Actions for automated releases. To publish a new version:
 4. **Review and merge** the automatically created PR
 5. **Done!** The package is automatically published to npm when the PR is merged
 
-### Setup (one-time)
-
-For the publish workflow to work, we're using [NPM Trusted Publishers](https://docs.npmjs.com/trusted-publishers).
 
 ## Development
 
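The Concepts section added to the README describes numbered iteration directories (`001/`, `002/`, and so on) and a `progress.json` that lets an interrupted evaluation resume. The sketch below shows one way that bookkeeping could work. `nextIterationName`, `pendingRuns`, the `ProgressEntry` shape, and the status values are hypothetical; the real file formats are not shown in this diff.

```typescript
// Hypothetical status values; the actual progress.json schema is not in this diff.
type RunStatus = "pending" | "running" | "completed" | "failed";

interface ProgressEntry {
  runId: string;
  status: RunStatus;
}

// Given existing iteration directory names ("001", "002", ...), return the next one.
function nextIterationName(existing: string[]): string {
  let max = 0;
  for (const name of existing) {
    const n = parseInt(name, 10);
    if (!isNaN(n) && n > max) max = n; // ignore entries that are not numbered
  }
  const next = String(max + 1);
  // Zero-pad to three digits to match the `001/`, `002/` layout.
  return next.length >= 3 ? next : ("000" + next).slice(-3);
}

// Runs that still need work when resuming: everything not yet completed.
function pendingRuns(progress: ProgressEntry[]): string[] {
  return progress
    .filter((p) => p.status !== "completed")
    .map((p) => p.runId);
}
```

Under these assumptions, resuming means feeding `pendingRuns` back into the scheduler, while `--new` would call `nextIterationName` and start from an empty progress list.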
package/dist/commands/evaluation/{init.js → create.js}
@@ -2,11 +2,13 @@ import { exec } from "node:child_process";
 import fs from "node:fs/promises";
 import path from "node:path";
 import { Args, Command } from "@oclif/core";
+import enquirer from "enquirer";
 import pc from "picocolors";
+import { ApiClient } from "../../lib/api-client.js";
 import { loadConfig } from "../../lib/config.js";
 import { getStaticResourcePath } from "../../lib/utils.js";
-export default class
-    static description = "
+export default class Create extends Command {
+    static description = "Create a new evaluation";
     static examples = ["<%= config.bin %> <%= command.id %> my-evaluation"];
     static args = {
         name: Args.string({
@@ -15,7 +17,7 @@ export default class Init extends Command {
         }),
     };
     async run() {
-        const { args } = await this.parse(
+        const { args } = await this.parse(Create);
         loadConfig(); // Validate config is available
         const evaluationDir = path.resolve(process.cwd(), "evaluations", args.name);
         const configPath = path.join(evaluationDir, "config.ts");
@@ -29,9 +31,28 @@ export default class Init extends Command {
         }
         // Create evaluation directory
         await fs.mkdir(evaluationDir, { recursive: true });
-        //
+        // Ask for the slug of the challenge to evaluate
+        const config = loadConfig();
+        const api = new ApiClient(config);
+        const [kradleChallenges, cloudChallenges] = await Promise.all([api.listKradleChallenges(), api.listChallenges()]);
+        const choices = [...kradleChallenges, ...cloudChallenges]
+            .map((c) => c.slug)
+            .toSorted()
+            .map((s) => ({
+                name: s,
+                message: s,
+            }));
+        const response = await enquirer.prompt({
+            type: "select",
+            name: "challenge",
+            message: "Select the challenge to evaluate",
+            choices: choices,
+        });
+        // Read template file and fill in the challenge slug, then write to config file
         const templatePath = getStaticResourcePath("evaluation_template.ts");
-        await fs.
+        const template = await fs.readFile(templatePath, "utf-8");
+        const filledTemplate = template.replace("[INSERT CHALLENGE SLUG HERE]", response.challenge);
+        await fs.writeFile(configPath, filledTemplate);
         this.log(pc.green(`✓ Created evaluation '${args.name}'`));
         this.log(pc.dim(` Config: ${configPath}`));
         // Offer to open in editor on macOS
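The new `create.js` code merges two challenge lists, sorts their slugs into prompt choices, and substitutes the selected slug into the template. A minimal sketch of those two steps as pure functions follows. The helper names `buildChoices` and `fillTemplate` and the `ChallengeRef` type are made up for illustration. Two details deliberately differ from the diff and are assumptions, not the package's behavior: duplicate slugs are dropped (in case both lists can contain the same challenge), and every occurrence of the placeholder is replaced, whereas `String.prototype.replace` with a string pattern replaces only the first match.

```typescript
interface ChallengeRef {
  slug: string;
}

interface Choice {
  name: string;
  message: string;
}

// Placeholder string as it appears in the diff.
const PLACEHOLDER = "[INSERT CHALLENGE SLUG HERE]";

function buildChoices(kradle: ChallengeRef[], cloud: ChallengeRef[]): Choice[] {
  // map() returns a fresh array, so sorting it in place does not touch the inputs.
  const sorted = kradle.concat(cloud).map((c) => c.slug).sort();
  // Drop adjacent duplicates (assumption: the two lists may overlap).
  const unique = sorted.filter((s, i) => i === 0 || s !== sorted[i - 1]);
  return unique.map((s) => ({ name: s, message: s }));
}

function fillTemplate(template: string, slug: string): string {
  // split/join replaces every occurrence and inserts the slug literally,
  // so sequences such as "$&" in a slug are not treated as replacement patterns.
  return template.split(PLACEHOLDER).join(slug);
}
```

Keeping these steps free of I/O makes them easy to unit-test separately from the `enquirer` prompt and the file writes.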
package/dist/commands/init.d.ts
CHANGED
@@ -4,7 +4,6 @@ export default class Init extends Command {
     static examples: string[];
     static flags: {
         name: import("@oclif/core/interfaces").OptionFlag<string | undefined, import("@oclif/core/interfaces").CustomOptions>;
-        dev: import("@oclif/core/interfaces").BooleanFlag<boolean>;
         "api-key": import("@oclif/core/interfaces").OptionFlag<string | undefined, import("@oclif/core/interfaces").CustomOptions>;
     };
     run(): Promise<void>;
package/dist/commands/init.js
CHANGED
@@ -14,11 +14,11 @@ export default class Init extends Command {
             description: "Project name",
             required: false,
         }),
-        dev: Flags.boolean({
-
-
-
-        }),
+        // dev: Flags.boolean({
+        // char: "d",
+        // description: "Use Kradle's development environment instead of production",
+        // required: false,
+        // }),
         "api-key": Flags.string({
             char: "k",
             description: "Kradle API key",
@@ -34,10 +34,10 @@ export default class Init extends Command {
         const nonHiddenFiles = files.filter((f) => !f.startsWith("."));
         const useCurrentDir = nonHiddenFiles.length === 0;
         if (useCurrentDir) {
-            this.log(pc.yellow("Current directory is empty, it will be used
+            this.log(pc.yellow("Current directory is empty, it will be used to store challenges and evaluations."));
         }
         else {
-            this.log(pc.yellow("Current directory is not empty, a subdirectory will be created
+            this.log(pc.yellow("Current directory is not empty, a subdirectory will be created to store challenges and evaluations."));
         }
         let projectName;
         if (flags.name) {
@@ -51,34 +51,36 @@ export default class Init extends Command {
             const { name } = await enquirer.prompt({
                 type: "input",
                 name: "name",
-                message: "
+                message: "What should the directory be called?",
                 initial: initial,
             });
             projectName = name;
         }
-        let useDev = flags.dev;
-        if (!useDev) {
-
-
-
-
-
-
-
-        }
-        if (useDev) {
-
-        }
-
-
-
+        // let useDev = flags.dev;
+        // if (!useDev) {
+        // const { confirm } = await enquirer.prompt<{ confirm: boolean }>({
+        // type: "confirm",
+        // name: "confirm",
+        // message: "Do you want to use Kradle's development environment?",
+        // initial: false,
+        // });
+        // useDev = confirm;
+        // }
+        // if (useDev) {
+        // this.log(pc.yellow("Using Kradle's development environment."));
+        // } else {
+        // this.log(pc.green("Using Kradle's production environment."));
+        // }
+        this.log();
+        this.log(pc.yellow("Cloud Analytics are only available in the development environment for now. Development environment will be used."));
+        const useDev = true;
         const domain = useDev ? "dev.kradle.ai" : "kradle.ai";
         let apiKey;
         if (flags["api-key"]) {
             apiKey = flags["api-key"];
         }
         else {
-            this.log(pc.dim(
+            this.log(pc.dim(`Get your API key at: https://${domain}/settings#api-keys`));
             const { key } = await enquirer.prompt({
                 type: "password",
                 name: "key",
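In this version `useDev` is hard-coded to `true`, so the API-key hint printed by `init` always points at the development domain. The derivation from the hunk above can be isolated as a pair of pure helpers. `kradleDomain` and `apiKeyUrl` are hypothetical names that do not appear in the package; the two domains and the `/settings#api-keys` path are taken from the diff.

```typescript
// Domain selection as written in init.js: dev when useDev is true, production otherwise.
function kradleDomain(useDev: boolean): string {
  return useDev ? "dev.kradle.ai" : "kradle.ai";
}

// URL shown to the user when no --api-key flag is supplied.
function apiKeyUrl(useDev: boolean): string {
  return `https://${kradleDomain(useDev)}/settings#api-keys`;
}

// With the 0.0.8 behavior (useDev fixed to true) the hint resolves to the dev site.
const hintUrl = apiKeyUrl(true);
```

Keeping the boolean parameter, even though only one value is used today, leaves room for the commented-out `--dev` flag to return without further changes.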
package/dist/lib/api-client.d.ts
CHANGED
@@ -29,6 +29,7 @@ export declare class ApiClient {
     getHuman(): Promise<z.infer<typeof HumanSchema>>;
     listChallenges(): Promise<ChallengeSchemaType[]>;
     listKradleAgents(): Promise<AgentSchemaType[]>;
+    listKradleChallenges(): Promise<ChallengeSchemaType[]>;
     getChallenge(challengeId: string): Promise<ChallengeSchemaType>;
     /**
      * Check if a challenge exists in the cloud.
package/dist/lib/api-client.js
CHANGED
@@ -117,6 +117,9 @@ export class ApiClient {
     async listKradleAgents() {
         return this.listResource("humans/team-kradle/agents", "agents", AgentsResponseSchema);
     }
+    async listKradleChallenges() {
+        return this.listResource("humans/team-kradle/challenges", "challenges", ChallengesResponseSchema);
+    }
     async getChallenge(challengeId) {
         const url = `challenges/${challengeId}`;
         return this.get("web", url, {}, ChallengeSchema);
package/dist/lib/schemas.d.ts
CHANGED
@@ -21,9 +21,9 @@ export declare const ChallengeSchema: z.ZodObject<{
         }>;
     }, z.core.$strip>;
     description: z.ZodOptional<z.ZodString>;
-    task: z.ZodString
+    task: z.ZodOptional<z.ZodString>;
     roles: z.ZodRecord<z.ZodString, z.ZodObject<{
-        description: z.ZodString
+        description: z.ZodOptional<z.ZodString>;
         specificTask: z.ZodString;
         minParticipants: z.ZodOptional<z.ZodNumber>;
         maxParticipants: z.ZodOptional<z.ZodNumber>;
@@ -63,9 +63,9 @@ export declare const ChallengesResponseSchema: z.ZodObject<{
             }>;
         }, z.core.$strip>;
         description: z.ZodOptional<z.ZodString>;
-        task: z.ZodString
+        task: z.ZodOptional<z.ZodString>;
         roles: z.ZodRecord<z.ZodString, z.ZodObject<{
-            description: z.ZodString
+            description: z.ZodOptional<z.ZodString>;
             specificTask: z.ZodString;
             minParticipants: z.ZodOptional<z.ZodNumber>;
             maxParticipants: z.ZodOptional<z.ZodNumber>;
package/dist/lib/schemas.js
CHANGED
@@ -12,9 +12,9 @@ export const ChallengeSchema = z.object({
         gameMode: z.enum(["survival", "creative", "adventure", "spectator"]),
     }),
     description: z.string().optional(),
-    task: z.string(),
+    task: z.string().optional(),
     roles: z.record(z.string(), z.object({
-        description: z.string(),
+        description: z.string().optional(),
         specificTask: z.string(),
         minParticipants: z.number().optional(),
         maxParticipants: z.number().optional(),
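With `task` and each role's `description` now optional in `ChallengeSchema`, code that consumes parsed challenges has to tolerate `undefined` for both. A small sketch of that handling follows, using plain TypeScript types that mirror only the fields visible in this hunk. `ChallengeLike`, `RoleLike`, `summarizeChallenge`, and the fallback choices are hypothetical and are not taken from the package.

```typescript
// Plain types mirroring the fields shown in the schema hunk (no zod dependency).
interface RoleLike {
  description?: string; // optional as of 0.0.8
  specificTask: string;
  minParticipants?: number;
  maxParticipants?: number;
}

interface ChallengeLike {
  description?: string;
  task?: string; // optional as of 0.0.8
  roles: { [name: string]: RoleLike };
}

// Build display lines, falling back when the optional fields are absent.
function summarizeChallenge(c: ChallengeLike): string[] {
  const lines = ["task: " + (c.task ?? "(none)")];
  for (const name of Object.keys(c.roles).sort()) {
    const role = c.roles[name];
    // Assumed fallback: show the role's specificTask when it has no description.
    lines.push(name + ": " + (role.description ?? role.specificTask));
  }
  return lines;
}
```

Callers that previously read `challenge.task` as a guaranteed string would need a similar fallback, or a type error will surface once they compile against the updated declarations.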
package/oclif.manifest.json
CHANGED
@@ -17,14 +17,6 @@
           "multiple": false,
           "type": "option"
         },
-        "dev": {
-          "char": "d",
-          "description": "Use Kradle's development environment instead of production",
-          "name": "dev",
-          "required": false,
-          "allowNo": false,
-          "type": "boolean"
-        },
         "api-key": {
           "char": "k",
           "description": "Kradle API key",
@@ -305,7 +297,7 @@
         "watch.js"
       ]
     },
-    "evaluation:
+    "evaluation:create": {
       "aliases": [],
       "args": {
         "name": {
@@ -314,14 +306,14 @@
           "required": true
         }
       },
-      "description": "
+      "description": "Create a new evaluation",
       "examples": [
         "<%= config.bin %> <%= command.id %> my-evaluation"
       ],
       "flags": {},
       "hasDynamicHelp": false,
       "hiddenAliases": [],
-      "id": "evaluation:
+      "id": "evaluation:create",
       "pluginAlias": "@kradle/cli",
       "pluginName": "@kradle/cli",
       "pluginType": "core",
@@ -332,7 +324,7 @@
         "dist",
         "commands",
         "evaluation",
-        "
+        "create.js"
       ]
     },
     "evaluation:list": {
@@ -409,5 +401,5 @@
       ]
     }
   },
-  "version": "0.0.
+  "version": "0.0.8"
 }