@profoundlogic/coderflow-server 0.2.1
- package/LICENSE.txt +322 -0
- package/README.md +158 -0
- package/dist/LICENSE.txt +322 -0
- package/dist/README.md +158 -0
- package/dist/base-image/Dockerfile +184 -0
- package/dist/base-image/agent-wrapper.sh +143 -0
- package/dist/base-image/apply-local-state.sh +357 -0
- package/dist/base-image/coder-git-credential-helper +307 -0
- package/dist/base-image/entrypoint.sh +942 -0
- package/dist/base-image/ssh_config_template +41 -0
- package/dist/base-image/start-code-server.sh +76 -0
- package/dist/base-image/sync-repos.sh +170 -0
- package/dist/base-image/vscode-extensions.txt +10 -0
- package/dist/base-image/vscode-settings.json +41 -0
- package/dist/coder-server.js +2 -0
- package/dist/config/cli-models.json +45 -0
- package/dist/config/imported-skills.schema.json +83 -0
- package/dist/config/skill-catalog.json +18 -0
- package/dist/config/skill-catalog.schema.json +140 -0
- package/dist/config.js +1 -0
- package/dist/examples/oidc.json.example +11 -0
- package/dist/lib/agent-keepalive.js +1 -0
- package/dist/lib/api-keys.js +1 -0
- package/dist/lib/apiKeys.js +1 -0
- package/dist/lib/auto-judge.js +1 -0
- package/dist/lib/basic-auth.js +1 -0
- package/dist/lib/build-history.js +1 -0
- package/dist/lib/build-output-service.js +1 -0
- package/dist/lib/build-scheduler.js +1 -0
- package/dist/lib/build-service.js +1 -0
- package/dist/lib/claude-oauth-refresh.js +1 -0
- package/dist/lib/cli/build.js +1 -0
- package/dist/lib/cli/config-command.js +1 -0
- package/dist/lib/cli/config.js +1 -0
- package/dist/lib/cli/create-user.js +1 -0
- package/dist/lib/cli/init.js +1 -0
- package/dist/lib/cli/jira.js +1 -0
- package/dist/lib/cli/license.js +1 -0
- package/dist/lib/cli/server-manager.js +1 -0
- package/dist/lib/container-tokens.js +1 -0
- package/dist/lib/data-dir.js +1 -0
- package/dist/lib/deployment-history.js +1 -0
- package/dist/lib/deployment-service.js +1 -0
- package/dist/lib/docker-utils.js +1 -0
- package/dist/lib/email.js +1 -0
- package/dist/lib/emailTemplates.js +1 -0
- package/dist/lib/entitlement.js +1 -0
- package/dist/lib/fetch-utils.js +1 -0
- package/dist/lib/git-provider-service.js +1 -0
- package/dist/lib/git-provider-setup/assets/coderflow_github_app.png +0 -0
- package/dist/lib/git-provider-setup/github-setup-handler.js +1 -0
- package/dist/lib/git-provider-setup/index.js +1 -0
- package/dist/lib/git-provider-setup/setup-factory.js +1 -0
- package/dist/lib/git-provider-setup/setup-interface.js +1 -0
- package/dist/lib/git-providers/azure-devops-provider.js +1 -0
- package/dist/lib/git-providers/github-app-provider.js +1 -0
- package/dist/lib/git-providers/index.js +1 -0
- package/dist/lib/git-providers/provider-factory.js +1 -0
- package/dist/lib/git-providers/provider-interface.js +1 -0
- package/dist/lib/jira-client.js +1 -0
- package/dist/lib/logger.js +1 -0
- package/dist/lib/model-fetcher.js +1 -0
- package/dist/lib/notifications.js +1 -0
- package/dist/lib/oidc-auth.js +1 -0
- package/dist/lib/oidc-device-flow.js +1 -0
- package/dist/lib/passwordTokens.js +1 -0
- package/dist/lib/pin-cascade.js +1 -0
- package/dist/lib/provider-accounts.js +1 -0
- package/dist/lib/provider-oauth.js +1 -0
- package/dist/lib/provider-profile.js +1 -0
- package/dist/lib/provider-token-refresh.js +1 -0
- package/dist/lib/roles.js +1 -0
- package/dist/lib/secrets.js +1 -0
- package/dist/lib/state-capture.js +1 -0
- package/dist/lib/static-files.js +1 -0
- package/dist/lib/task-name-generator.js +1 -0
- package/dist/lib/users.js +1 -0
- package/dist/middleware/requireAuth.js +1 -0
- package/dist/middleware/requireInit.js +1 -0
- package/dist/middleware/requirePermission.js +1 -0
- package/dist/package-lock.json +4151 -0
- package/dist/package.json +50 -0
- package/dist/routes/apiKeys.js +1 -0
- package/dist/routes/auth-oidc.js +1 -0
- package/dist/routes/auth.js +1 -0
- package/dist/routes/build.js +1 -0
- package/dist/routes/containers.js +1 -0
- package/dist/routes/deploy-task.js +1 -0
- package/dist/routes/environment-management.js +1 -0
- package/dist/routes/environments.js +1 -0
- package/dist/routes/external-skills.js +1 -0
- package/dist/routes/git-credentials.js +1 -0
- package/dist/routes/git-provider-setup.js +1 -0
- package/dist/routes/health.js +1 -0
- package/dist/routes/jira.js +1 -0
- package/dist/routes/objective-management.js +1 -0
- package/dist/routes/password.js +1 -0
- package/dist/routes/prompt.js +1 -0
- package/dist/routes/provider-auth.js +1 -0
- package/dist/routes/qa.js +1 -0
- package/dist/routes/settings.js +1 -0
- package/dist/routes/skill-management.js +1 -0
- package/dist/routes/skills.js +1 -0
- package/dist/routes/tasks.js +2 -0
- package/dist/routes/templates.js +1 -0
- package/dist/routes/test-task.js +1 -0
- package/dist/routes/test.js +1 -0
- package/dist/routes/users.js +1 -0
- package/dist/routes/visualizations.js +1 -0
- package/dist/schemas/template-metadata.schema.json +178 -0
- package/dist/scripts/create-user.js +2 -0
- package/dist/shipped-skills/environment-instructions/SKILL.md +154 -0
- package/dist/shipped-skills/environment-templates/SKILL.md +282 -0
- package/dist/shipped-skills/objective-management/SKILL.md +238 -0
- package/dist/shipped-skills/skill-editor/SKILL.md +326 -0
- package/dist/start.js +2 -0
- package/dist/web-ui/public/activity-detail-modal.js +1 -0
- package/dist/web-ui/public/activity-feed.js +1 -0
- package/dist/web-ui/public/activity-formatters.js +1 -0
- package/dist/web-ui/public/agent-event-parser.js +1 -0
- package/dist/web-ui/public/app.js +1 -0
- package/dist/web-ui/public/approve-dialog.js +1 -0
- package/dist/web-ui/public/coderflow-logo-reversed.svg +46 -0
- package/dist/web-ui/public/coderflow-logo.svg +46 -0
- package/dist/web-ui/public/comments-widget.js +1 -0
- package/dist/web-ui/public/docs/.nojekyll +0 -0
- package/dist/web-ui/public/docs/README.md +26 -0
- package/dist/web-ui/public/docs/_sidebar.md +47 -0
- package/dist/web-ui/public/docs/admin/ai-providers.md +132 -0
- package/dist/web-ui/public/docs/admin/email-notifications.md +69 -0
- package/dist/web-ui/public/docs/admin/environments.md +215 -0
- package/dist/web-ui/public/docs/admin/git-providers.md +147 -0
- package/dist/web-ui/public/docs/admin/installation.md +313 -0
- package/dist/web-ui/public/docs/admin/skills.md +35 -0
- package/dist/web-ui/public/docs/admin/sso.md +241 -0
- package/dist/web-ui/public/docs/admin/users-and-roles.md +57 -0
- package/dist/web-ui/public/docs/code/cli.md +102 -0
- package/dist/web-ui/public/docs/code/files-and-editing.md +86 -0
- package/dist/web-ui/public/docs/code/terminal-access.md +110 -0
- package/dist/web-ui/public/docs/code/vscode-extension.md +58 -0
- package/dist/web-ui/public/docs/getting-started/core-concepts.md +129 -0
- package/dist/web-ui/public/docs/getting-started/overview.md +46 -0
- package/dist/web-ui/public/docs/index.html +151 -0
- package/dist/web-ui/public/docs/integrations/custom.md +58 -0
- package/dist/web-ui/public/docs/integrations/ibmi/overview.md +58 -0
- package/dist/web-ui/public/docs/integrations/overview.md +48 -0
- package/dist/web-ui/public/docs/objectives/qa-mode.md +90 -0
- package/dist/web-ui/public/docs/objectives/staged-tasks.md +60 -0
- package/dist/web-ui/public/docs/objectives/working-with-objectives.md +102 -0
- package/dist/web-ui/public/docs/tasks/approval-and-deployment.md +83 -0
- package/dist/web-ui/public/docs/tasks/creating-tasks.md +111 -0
- package/dist/web-ui/public/docs/tasks/judging.md +114 -0
- package/dist/web-ui/public/docs/tasks/providing-feedback.md +41 -0
- package/dist/web-ui/public/docs/tasks/task-groups.md +73 -0
- package/dist/web-ui/public/docs/tasks/winner-selection.md +75 -0
- package/dist/web-ui/public/docs/templates/batch-processing.md +152 -0
- package/dist/web-ui/public/docs/templates/task-templates.md +44 -0
- package/dist/web-ui/public/docs/templates/template-examples.md +93 -0
- package/dist/web-ui/public/docs/testing/profound-automated-testing.md +77 -0
- package/dist/web-ui/public/docs/testing/task-visualizations.md +42 -0
- package/dist/web-ui/public/docs/testing/testing-menu.md +118 -0
- package/dist/web-ui/public/environments.css +3942 -0
- package/dist/web-ui/public/environments.html +1791 -0
- package/dist/web-ui/public/environments.js +1 -0
- package/dist/web-ui/public/favicon-16.png +0 -0
- package/dist/web-ui/public/favicon-32.png +0 -0
- package/dist/web-ui/public/favicon.ico +0 -0
- package/dist/web-ui/public/feedback-widget.css +3133 -0
- package/dist/web-ui/public/feedback-widget.js +1 -0
- package/dist/web-ui/public/git-history.css +2663 -0
- package/dist/web-ui/public/git-history.html +272 -0
- package/dist/web-ui/public/git-history.js +1 -0
- package/dist/web-ui/public/git-status.js +1 -0
- package/dist/web-ui/public/index.html +1459 -0
- package/dist/web-ui/public/index.js +1 -0
- package/dist/web-ui/public/login.html +346 -0
- package/dist/web-ui/public/login.js +1 -0
- package/dist/web-ui/public/markdown-editor.js +1 -0
- package/dist/web-ui/public/markdown-file-editor.js +1 -0
- package/dist/web-ui/public/modal-maximize.js +1 -0
- package/dist/web-ui/public/notifications.js +1 -0
- package/dist/web-ui/public/server-health.js +1 -0
- package/dist/web-ui/public/settings.css +761 -0
- package/dist/web-ui/public/settings.html +1044 -0
- package/dist/web-ui/public/settings.js +1 -0
- package/dist/web-ui/public/setup-password.html +355 -0
- package/dist/web-ui/public/setup-password.js +1 -0
- package/dist/web-ui/public/skills.css +1949 -0
- package/dist/web-ui/public/skills.html +820 -0
- package/dist/web-ui/public/skills.js +1 -0
- package/dist/web-ui/public/sse-client.js +1 -0
- package/dist/web-ui/public/sse-shared-worker.js +1 -0
- package/dist/web-ui/public/styles.css +18614 -0
- package/dist/web-ui/public/task.html +1779 -0
- package/dist/web-ui/public/task.js +1 -0
- package/dist/web-ui/public/terminal.html +45 -0
- package/dist/web-ui/public/terminal.js +1 -0
- package/dist/web-ui/public/theme.js +1 -0
- package/dist/web-ui/public/users.html +298 -0
- package/dist/web-ui/public/users.js +1 -0
- package/dist/web-ui/public/variant-grouping.js +1 -0
- package/package.json +63 -0
|
@@ -0,0 +1,111 @@
|
|
|
1
|
+
# Creating Tasks
|
|
2
|
+
|
|
3
|
+
A task is a unit of work executed by an AI agent. When you create a task, CoderFlow launches an isolated container, runs the agent with your instructions, and captures the results for your review.
|
|
4
|
+
|
|
5
|
+
## Tasks vs. Objectives
|
|
6
|
+
|
|
7
|
+
You can create tasks in two ways:
|
|
8
|
+
|
|
9
|
+
- **Direct tasks**: Create and launch immediately when you know exactly what needs to be done. Best for simple, well-defined work.
|
|
10
|
+
|
|
11
|
+
- **Tasks from objectives**: Launch a task from an existing objective. Best when requirements need refinement, work is complex, or you expect multiple iterations. The objective serves as the parent record, tracking all tasks launched from it.
|
|
12
|
+
|
|
13
|
+
For straightforward fixes or small changes, direct tasks are efficient. For larger initiatives where you'll iterate on requirements, start with an objective and launch tasks from it.
|
|
14
|
+
|
|
15
|
+
## Writing Instructions
|
|
16
|
+
|
|
17
|
+
Instructions tell the agent what you want to accomplish. Clear, specific instructions lead to better results.
|
|
18
|
+
|
|
19
|
+
### Best Practices
|
|
20
|
+
|
|
21
|
+
- **Be specific**: Instead of "fix the bug," describe the symptom, expected behavior, and relevant files
|
|
22
|
+
- **Include context**: Reference related code, error messages, or test failures
|
|
23
|
+
- **Break down complex work**: Large tasks should be decomposed into logical steps
|
|
24
|
+
- **Specify constraints**: Note performance requirements, style guides, or compatibility needs
|
|
25
|
+
- **Define success criteria**: What does "done" look like? What tests should pass?
|
|
26
|
+
|
|
27
|
+
### Example
|
|
28
|
+
|
|
29
|
+
Instead of:
|
|
30
|
+
> "Fix the login issue"
|
|
31
|
+
|
|
32
|
+
Try:
|
|
33
|
+
> "Users clicking the login button on /auth/login see a spinning loader that never resolves. The expected behavior is a redirect to /dashboard after successful authentication. Check the authentication service for timeout issues—recent logs show 504 errors from the identity provider."
|
|
34
|
+
|
|
35
|
+
## Attachments
|
|
36
|
+
|
|
37
|
+
Attach files and screenshots to provide additional context.
|
|
38
|
+
|
|
39
|
+
### Supported Types
|
|
40
|
+
|
|
41
|
+
- **Images**: Screenshots, mockups, diagrams (PNG, JPG, GIF, WebP)
|
|
42
|
+
- **Code files**: Source code, configuration, logs
|
|
43
|
+
- **Documents**: Text files, markdown, documentation
|
|
44
|
+
|
|
45
|
+
Attachments are placed in the container where the agent can access them. Reference attachments in your instructions so the agent knows to use them.
|
|
46
|
+
|
|
47
|
+
### Adding Attachments
|
|
48
|
+
|
|
49
|
+
1. Click **Attachments** in the task form
|
|
50
|
+
2. Select files or paste images from your clipboard
|
|
51
|
+
3. Add up to 10 files, 50 MB total
|
|
52
|
+
|
|
53
|
+
Alternatively, drag and drop files directly into the instructions field.
|
|
54
|
+
|
|
55
|
+
You can also paste screenshots directly into the instructions box using Ctrl+V (Cmd+V on Mac).
|
|
56
|
+
|
|
57
|
+
## Selecting Agents
|
|
58
|
+
|
|
59
|
+
Choose which AI agent will work on the task.
|
|
60
|
+
|
|
61
|
+
### Available Agents
|
|
62
|
+
|
|
63
|
+
- **Claude**: Strong at complex reasoning and multi-step engineering tasks. Good default for most work.
|
|
64
|
+
- **Codex**: Fast at translating specifications into code. Use for straightforward coding tasks.
|
|
65
|
+
- **Gemini**: Large context window for tasks requiring deep file understanding.
|
|
66
|
+
|
|
67
|
+
### Running Multiple Agents
|
|
68
|
+
|
|
69
|
+
Select multiple agents to run the same task in parallel, creating a **task group**. Each agent works independently in its own container. You can then compare their approaches and select the best result.
|
|
70
|
+
|
|
71
|
+
This is especially powerful for complex tasks where different agents might find different solutions.
|
|
72
|
+
|
|
73
|
+
## Skills in Tasks
|
|
74
|
+
|
|
75
|
+
Tasks automatically inherit the skills assigned to the selected environment. Those skills are injected into the task container at launch, so the agent can invoke them immediately.
|
|
76
|
+
|
|
77
|
+
If a task needs additional skills, update the environment's skill assignments before launching the task.
|
|
78
|
+
|
|
79
|
+
## Selecting Branches
|
|
80
|
+
|
|
81
|
+
Whether your environment has one or multiple repositories, you can specify which branch to use for each.
|
|
82
|
+
|
|
83
|
+
- **Default branch**: Used unless you select another
|
|
84
|
+
- **Branch restrictions**: Some repositories may be locked to specific branches
|
|
85
|
+
- **New branches**: You can create new branches during the approval step
|
|
86
|
+
|
|
87
|
+
The agent checks out your selected branches before starting work. During approval, you choose whether to push to the same branch, create a new one, or commit without pushing.
|
|
88
|
+
|
|
89
|
+
## Running the Task
|
|
90
|
+
|
|
91
|
+
Once you've entered instructions and selected options:
|
|
92
|
+
|
|
93
|
+
1. Click **Launch Task** (or press Ctrl+Enter)
|
|
94
|
+
2. Task enters the queue if all agent slots are full
|
|
95
|
+
3. Agent starts when a slot is available
|
|
96
|
+
4. Live updates appear in the Activity Feed
|
|
97
|
+
5. Task completes when the agent finishes
|
|
98
|
+
6. Review results and approve, or send follow-up instructions
|
|
99
|
+
|
|
100
|
+
### Task States
|
|
101
|
+
|
|
102
|
+
- **Pending**: Created, waiting to be queued
|
|
103
|
+
- **Queued**: Waiting for an available agent slot
|
|
104
|
+
- **Running**: Agent is actively working
|
|
105
|
+
- **Completed**: Agent finished successfully
|
|
106
|
+
- **Failed**: Agent encountered an error
|
|
107
|
+
- **Staged**: Container ready, waiting for you to start the agent
|
|
108
|
+
|
|
109
|
+
### Queue Management
|
|
110
|
+
|
|
111
|
+
When all slots are occupied, tasks wait in a first-in-first-out queue. Your position is displayed in the UI. Tasks run automatically when slots open.
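The queueing behavior described above can be sketched as a toy model. This is only an illustration, not CoderFlow's implementation: the `SlotQueue` class, its method names, and the slot count are all hypothetical.

```python
from collections import deque
from typing import Optional


class SlotQueue:
    """Toy model: a fixed number of agent slots fed by a FIFO queue."""

    def __init__(self, slots: int) -> None:
        self.slots = slots        # hypothetical: number of agent slots
        self.running = set()      # task IDs currently holding a slot
        self.waiting = deque()    # task IDs waiting, oldest first

    def submit(self, task_id: str) -> str:
        """Start the task if a slot is free, otherwise queue it."""
        if len(self.running) < self.slots:
            self.running.add(task_id)
            return "running"
        self.waiting.append(task_id)
        return "queued"

    def position(self, task_id: str) -> int:
        """1-based queue position; raises ValueError if the task is not queued."""
        return self.waiting.index(task_id) + 1

    def finish(self, task_id: str) -> Optional[str]:
        """Free a slot and promote the oldest waiting task, if any."""
        self.running.discard(task_id)
        if self.waiting and len(self.running) < self.slots:
            promoted = self.waiting.popleft()
            self.running.add(promoted)
            return promoted
        return None
```

With two slots, a third submitted task is queued at position 1 and starts as soon as either running task finishes.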
|
|
@@ -0,0 +1,114 @@
|
|
|
1
|
+
# Judging
|
|
2
|
+
|
|
3
|
+
When you run multiple agents on the same task, judge agents can automatically evaluate the results and help you identify the best solution. Judges analyze code quality, correctness, and completeness—providing objective feedback that saves you time reviewing variants.
|
|
4
|
+
|
|
5
|
+
## What Are Judge Tasks?
|
|
6
|
+
|
|
7
|
+
A judge task is a special task that evaluates other tasks in a group. Unlike regular tasks that modify your code, judges:
|
|
8
|
+
|
|
9
|
+
- Run **after** primary tasks complete
|
|
10
|
+
- Have **read-only access** to primary task results
|
|
11
|
+
- Analyze code quality, correctness, and completeness
|
|
12
|
+
- Produce evaluation notes and scoring
|
|
13
|
+
- **Do not modify** your repositories
|
|
14
|
+
|
|
15
|
+
Judge tasks appear in the task group alongside other variants.
|
|
16
|
+
|
|
17
|
+
## How Judges Evaluate
|
|
18
|
+
|
|
19
|
+
When a judge task runs, it:
|
|
20
|
+
|
|
21
|
+
1. Reads all primary task results—patches, summaries, exit codes, logs
|
|
22
|
+
2. Analyzes the code changes each agent made
|
|
23
|
+
3. Reviews test results and error messages
|
|
24
|
+
4. Evaluates each variant on multiple dimensions
|
|
25
|
+
5. Generates a detailed report with scores and recommendations
|
|
26
|
+
|
|
27
|
+
### Evaluation Dimensions
|
|
28
|
+
|
|
29
|
+
Judges score variants on:
|
|
30
|
+
|
|
31
|
+
- **Correctness**: Does the code solve the problem? Are edge cases handled? Do tests pass?
|
|
32
|
+
- **Code quality**: Is it readable, maintainable, and following good patterns?
|
|
33
|
+
- **Completeness**: Are all requirements addressed? Is anything missing?
|
|
34
|
+
- **Performance**: Is the implementation efficient? (when applicable)
|
|
35
|
+
|
|
36
|
+
Each dimension receives a score, and judges provide detailed notes explaining their reasoning.
|
|
37
|
+
|
|
38
|
+
Judges may use or add other dimensions based on the task context.
|
|
39
|
+
|
|
40
|
+
## Automatic Judging
|
|
41
|
+
|
|
42
|
+
You can configure task groups to automatically launch judge tasks when primary agents finish.
|
|
43
|
+
|
|
44
|
+
### Configuring Auto-Judge
|
|
45
|
+
|
|
46
|
+
When creating a task group, select which agents should serve as judges.
|
|
47
|
+
|
|
48
|
+
Multiple judges provide independent evaluations, reducing bias and increasing confidence in the results.
|
|
49
|
+
|
|
50
|
+
### When Auto-Judge Launches
|
|
51
|
+
|
|
52
|
+
Judge tasks launch automatically when:
|
|
53
|
+
|
|
54
|
+
- All primary tasks have completed
|
|
55
|
+
- At least two variants finished successfully
|
|
56
|
+
- Multiple variants made file changes
|
|
57
|
+
- No follow-up instructions are pending
|
|
58
|
+
|
|
59
|
+
If conditions aren't met, auto-judge is skipped—but you can always launch judges manually.
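As a rough sketch, the four launch conditions can be read as a single predicate. The `Variant` fields and status strings below are hypothetical, and two interpretations are assumptions: "completed" is taken to mean "reached a terminal state", and "multiple variants" is taken to mean at least two.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    # All field names are hypothetical, chosen for this sketch.
    status: str                      # e.g. "completed", "failed", "running"
    files_changed: int               # number of files the agent modified
    pending_follow_up: bool = False  # follow-up instructions not yet processed


def should_auto_judge(variants: list[Variant]) -> bool:
    """Return True only when all four documented conditions hold."""
    terminal = {"completed", "failed"}  # assumption: both count as finished
    all_done = all(v.status in terminal for v in variants)
    succeeded = sum(1 for v in variants if v.status == "completed")
    with_changes = sum(1 for v in variants if v.files_changed > 0)
    no_pending = not any(v.pending_follow_up for v in variants)
    return all_done and succeeded >= 2 and with_changes >= 2 and no_pending
```

If any one condition fails, the function returns `False`, which corresponds to auto-judge being skipped.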
|
|
60
|
+
|
|
61
|
+
## Manual Judge Launch
|
|
62
|
+
|
|
63
|
+
You can launch judge tasks at any time:
|
|
64
|
+
|
|
65
|
+
1. Open the task group
|
|
66
|
+
2. Click the **Judge** icon
|
|
67
|
+
3. Select which agents to use as judges
|
|
68
|
+
4. Judge tasks are created and queued
|
|
69
|
+
|
|
70
|
+
This is useful when auto-judge conditions weren't met, or when you want additional evaluation after making changes.
|
|
71
|
+
|
|
72
|
+
## Judge Consensus
|
|
73
|
+
|
|
74
|
+
When multiple judges evaluate the same variants:
|
|
75
|
+
|
|
76
|
+
- Each judge scores independently
|
|
77
|
+
- Results can be compared side-by-side
|
|
78
|
+
- Consensus emerges when judges agree on the best variant
|
|
79
|
+
- Disagreements highlight areas worth closer review
|
|
80
|
+
|
|
81
|
+
If two out of three judges recommend the same variant, that's a strong signal. If judges disagree significantly, you may want to review their reasoning before deciding.
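One way to make the "two out of three" rule concrete is a strict-majority check over each judge's recommended variant. This is a minimal sketch under that assumption; the function name and the judge and variant labels are hypothetical, and the documentation does not say the product computes consensus this way.

```python
from collections import Counter
from typing import Optional


def consensus(recommendations: dict[str, str]) -> Optional[str]:
    """Return the variant recommended by a strict majority of judges, else None.

    `recommendations` maps a judge label to the variant that judge recommends.
    """
    if not recommendations:
        return None
    variant, votes = Counter(recommendations.values()).most_common(1)[0]
    # Strict majority: more than half of all judges agree.
    return variant if votes * 2 > len(recommendations) else None
```

A `None` result corresponds to the disagreement case, where reading each judge's reasoning is the better next step.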
|
|
82
|
+
|
|
83
|
+
## Using Judge Feedback
|
|
84
|
+
|
|
85
|
+
Judge feedback isn't just for picking a winner—it helps you improve the code.
|
|
86
|
+
|
|
87
|
+
### Common Issues Judges Identify
|
|
88
|
+
|
|
89
|
+
- **Test failures**: Some tests aren't passing
|
|
90
|
+
- **Edge cases**: Boundary conditions not handled
|
|
91
|
+
- **Error handling**: Missing validation or exception handling
|
|
92
|
+
- **Code style**: Inconsistent naming or formatting
|
|
93
|
+
- **Incomplete implementation**: Features not fully implemented
|
|
94
|
+
|
|
95
|
+
### Feedback Loops
|
|
96
|
+
|
|
97
|
+
After reviewing judge feedback:
|
|
98
|
+
|
|
99
|
+
1. Identify specific issues mentioned in the evaluation
|
|
100
|
+
2. Send follow-up instructions to the winning variant addressing those issues
|
|
101
|
+
3. The agent resumes and implements improvements
|
|
102
|
+
4. Optionally re-run judges to verify the improvements
|
|
103
|
+
|
|
104
|
+
This creates an automated refinement cycle where judges catch issues that agents then fix.
|
|
105
|
+
|
|
106
|
+
## Judges Don't Approve
|
|
107
|
+
|
|
108
|
+
Important: Judge tasks provide **feedback and recommendations only**. They do not:
|
|
109
|
+
|
|
110
|
+
- Automatically approve changes
|
|
111
|
+
- Commit or push code
|
|
112
|
+
- Mark tasks as winners
|
|
113
|
+
|
|
114
|
+
You make the final decision on winner selection and approval. Judges inform your decision—they don't make it for you.
|
|
@@ -0,0 +1,41 @@
|
|
|
1
|
+
# Providing Feedback
|
|
2
|
+
|
|
3
|
+
After a task starts running, you can provide follow-up instructions to guide the agent toward better results. Feedback helps the agent adjust course, fix errors, or add missing features.
|
|
4
|
+
|
|
5
|
+
## Follow-Up Instructions (R Hotkey)
|
|
6
|
+
|
|
7
|
+
The fastest way to send feedback while viewing a task.
|
|
8
|
+
|
|
9
|
+
### Keyboard Shortcut
|
|
10
|
+
|
|
11
|
+
Press **R** while viewing a task detail to open the follow-up input under the Latest Update section. Alternatively, use the follow-up box at the bottom of the activity feed.
|
|
12
|
+
|
|
13
|
+
This works on completed tasks that have an active container. If the container is stopped due to inactivity, you can start it first by clicking the "Start" button.
|
|
14
|
+
|
|
15
|
+
|
|
16
|
+
### How Follow-Ups Work
|
|
17
|
+
|
|
18
|
+
1. Type your feedback in the markdown editor
|
|
19
|
+
2. Paste screenshots or images, or attach files and logs, if needed
|
|
20
|
+
3. Press Cmd/Ctrl+Enter or click "Submit"
|
|
21
|
+
4. Agent resumes work, incorporating your feedback, and the status changes back to "running"
|
|
22
|
+
5. View new activity in the feed as the agent processes your instructions
|
|
23
|
+
|
|
24
|
+
## Using the Feedback Widget
|
|
25
|
+
|
|
26
|
+
If your environment has an Application Server with Launch URLs enabled, you can use the Testing Menu to launch a copy of your application in a separate browser tab. This runs the application either directly within the container or through a proxy. Both methods support the Feedback widget.
|
|
27
|
+
|
|
28
|
+
### Widget UI
|
|
29
|
+
|
|
30
|
+
The feedback widget appears as a floating icon in the corner of your application. Click the icon to open the feedback panel. The panel includes:
|
|
31
|
+
- **Latest Agent Activity Feed** that shows the most recent agent actions, updated live as the agent runs
|
|
32
|
+
- **Markdown Editor** for formatted feedback, including the ability to paste images and drag/drop attachments
|
|
33
|
+
- **Screenshot Tool** to take annotated screenshots of the application you are running
|
|
34
|
+
- **Element Selector Tool** to provide context-based information by selecting HTML elements on the screen
|
|
35
|
+
- **Context Selection** to choose what context is sent to the agent: DOM, Rich Display data, 5250 buffer data, etc., depending on the application type
|
|
36
|
+
|
|
37
|
+
## Feedback from Judge Tasks
|
|
38
|
+
|
|
39
|
+
Judge tasks can identify issues and generate automated feedback for execution tasks. When a completed judge task is selected in the Judge Panel, it may present suggested feedback for the associated execution task along with a severity level.
|
|
40
|
+
|
|
41
|
+
You can click the **Send as Feedback...** button to send this feedback directly to the execution task. Before the feedback is sent, you have the option to edit or augment it in the markdown editor that appears. Once you confirm, the feedback is submitted, and the execution task will resume work based on the provided instructions.
|
|
@@ -0,0 +1,73 @@
|
|
|
1
|
+
# Task Groups & Variants
|
|
2
|
+
|
|
3
|
+
Task groups let you run the same task with multiple AI agents in parallel, comparing their approaches and selecting the best result.
|
|
4
|
+
|
|
5
|
+
## What is a Task Group?
|
|
6
|
+
|
|
7
|
+
A task group is a collection of related tasks that share:
|
|
8
|
+
|
|
9
|
+
- **Same instructions** and context
|
|
10
|
+
- **Same branch selections** for repositories
|
|
11
|
+
- **Linked visibility** in the UI (see all variants together)
|
|
12
|
+
|
|
13
|
+
Each task in the group is a **variant**—a different agent's approach to the same problem.
|
|
14
|
+
|
|
15
|
+
## Creating a Task Group
|
|
16
|
+
|
|
17
|
+
Groups are created automatically when you submit a task with multiple agents selected. Select Claude, Codex, and Gemini, and you get three variants running in parallel.
|
|
18
|
+
|
|
19
|
+
You can also add variants to an existing task later (see below).
|
|
20
|
+
|
|
21
|
+
## Parallel Execution
|
|
22
|
+
|
|
23
|
+
When your server has multiple queue slots available, variants run simultaneously:
|
|
24
|
+
|
|
25
|
+
1. All selected agents start at roughly the same time
|
|
26
|
+
2. Each works independently in its own container
|
|
27
|
+
3. Results become available as each agent finishes
|
|
28
|
+
4. You can review completed variants while others are still running
|
|
29
|
+
|
|
30
|
+
This parallel approach means you get multiple solutions in about the same time it takes to get one.
|
|
31
|
+
|
|
32
|
+
## Viewing Variants
|
|
33
|
+
|
|
34
|
+
In the task detail view:
|
|
35
|
+
|
|
36
|
+
- All variants appear as tabs or panels
|
|
37
|
+
- Each variant shows its own activity feed
|
|
38
|
+
- Switch between variants to see progress, logs, and changes
|
|
39
|
+
- Status indicators show which are running, completed, or failed
|
|
40
|
+
|
|
41
|
+
## Comparing Variants
|
|
42
|
+
|
|
43
|
+
Click the comparison view (trophy icon) to see variants side-by-side:
|
|
44
|
+
|
|
45
|
+
- **Code changes**: Diffs from each agent
|
|
46
|
+
- **Summaries**: What each agent reported doing
|
|
47
|
+
- **Exit status**: Success or failure
|
|
48
|
+
- **Files modified**: Count and list of changes
|
|
49
|
+
|
|
50
|
+
This view helps you quickly assess which approach looks best before diving into details.
|
|
51
|
+
|
|
52
|
+
## Adding Agents to Existing Groups
|
|
53
|
+
|
|
54
|
+
You can add more variants to a completed task:
|
|
55
|
+
|
|
56
|
+
1. Open the task detail
|
|
57
|
+
2. Click the **Resubmit** icon
|
|
58
|
+
3. Select additional agent(s)
|
|
59
|
+
4. Ensure "Add as variant to existing group" is checked
|
|
60
|
+
5. New variants are created and grouped with the original
|
|
61
|
+
|
|
62
|
+
This is useful when you want another agent's perspective after seeing initial results, or when you want to try a different agent on a task that didn't produce satisfactory results.
|
|
63
|
+
|
|
64
|
+
## When to Use Task Groups
|
|
65
|
+
|
|
66
|
+
Task groups are most valuable when:
|
|
67
|
+
|
|
68
|
+
- **Uncertain requirements**: Different agents may interpret ambiguous instructions differently
|
|
69
|
+
- **Complex problems**: Multiple approaches might all be valid
|
|
70
|
+
- **Learning agent strengths**: Compare how Claude, Codex, and Gemini handle your codebase
|
|
71
|
+
- **High-stakes changes**: Get multiple opinions before committing
|
|
72
|
+
|
|
73
|
+
For simple, well-defined tasks, a single agent is often sufficient. Use groups when the extra perspectives are worth the compute time.
|
|
@@ -0,0 +1,75 @@
|
|
|
1
|
+
# Winner Selection
|
|
2
|
+
|
|
3
|
+
When multiple agents work on the same task, you need to choose which variant to approve and deploy. Winner selection is how you mark your preferred solution.
|
|
4
|
+
|
|
5
|
+
## Winners and Losers
|
|
6
|
+
|
|
7
|
+
Each variant in a task group can be marked as:
|
|
8
|
+
|
|
9
|
+
- **Winner**: The best variant—approved changes will come from this one
|
|
10
|
+
- **Loser**: This variant had issues and should be excluded
|
|
11
|
+
- **Neutral**: Neither winner nor loser (default state)
|
|
12
|
+
|
|
13
|
+
Only one variant can be the winner at a time. Marking a new winner automatically clears the previous one.
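The single-winner rule can be pictured as a small state model. This is a hypothetical sketch: the class, method names, and mark strings are invented for illustration, and it assumes a cleared winner returns to the neutral state.

```python
class TaskGroup:
    """Toy model of variant marks within a task group."""

    def __init__(self, variant_ids):
        # Every variant starts in the default neutral state.
        self.marks = {vid: "neutral" for vid in variant_ids}

    def mark_winner(self, variant_id):
        """Mark one variant as winner, clearing any previous winner."""
        for vid, mark in self.marks.items():
            if mark == "winner":
                self.marks[vid] = "neutral"  # assumption: cleared means neutral
        self.marks[variant_id] = "winner"

    def mark_loser(self, variant_id):
        self.marks[variant_id] = "loser"

    def unmark(self, variant_id):
        """Return a variant to the neutral state."""
        self.marks[variant_id] = "neutral"
```

Calling `mark_winner` twice with different variants leaves exactly one winner, which matches the behavior described above.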
|
|
14
|
+
|
|
15
|
+
### Why Mark Variants?
|
|
16
|
+
|
|
17
|
+
- **Clarity**: When multiple agents succeeded, marking a winner makes your choice explicit
|
|
18
|
+
- **Approval flow**: The winner is what gets committed when you approve
|
|
19
|
+
- **Exclusion**: Marking losers removes them from consideration
|
|
20
|
+
- **Audit trail**: Track which variants were considered and why
|
|
21
|
+
|
|
22
|
+
## Selecting a Winner
|
|
23
|
+
|
|
24
|
+
1. Open the task group (click the group in your task list)
|
|
25
|
+
2. Review each variant's changes, summary, and test results
|
|
26
|
+
3. Click the **Mark as Winner** icon on your preferred variant
|
|
27
|
+
4. The group updates to show your selection
|
|
28
|
+
|
|
29
|
+
You can change your mind at any time by marking a different variant as winner.
|
|
30
|
+
|
|
31
|
+
## Reviewing Variants
|
|
32
|
+
|
|
33
|
+
Before selecting a winner, review what each agent produced:
|
|
34
|
+
|
|
35
|
+
### Code Changes
|
|
36
|
+
|
|
37
|
+
View the diff for each variant to see exactly what changed. Look for:
|
|
38
|
+
|
|
39
|
+
- Correct implementation of requirements
|
|
40
|
+
- Clean, readable code
|
|
41
|
+
- Appropriate error handling
|
|
42
|
+
- No unintended side effects
|
|
43
|
+
|
|
44
|
+
### Summary
|
|
45
|
+
|
|
46
|
+
Each agent produces a summary explaining what it did. Compare summaries to understand different approaches and any issues encountered.
|
|
47
|
+
|
|
48
|
+
### Test Results
|
|
49
|
+
|
|
50
|
+
Check whether tests passed for each variant. A variant with failing tests may need follow-up work before it's ready to approve.
|
|
51
|
+
|
|
52
|
+
### Exit Status
|
|
53
|
+
|
|
54
|
+
Variants that completed successfully show exit code 0. Non-zero exit codes indicate the agent encountered an error—review the logs to understand what happened.
|
|
55
|
+
|
|
56
|
+
## Marking Losers
|
|
57
|
+
|
|
58
|
+
If a variant is clearly unsuitable, mark it as a loser:
|
|
59
|
+
|
|
60
|
+
1. Click the **Mark as Loser** icon on the variant
|
|
61
|
+
2. The variant is visually de-emphasized
|
|
62
|
+
3. It's excluded from comparisons and judge evaluations
|
|
63
|
+
|
|
64
|
+
You can unmark a loser if you change your mind.
|
|
65
|
+
|
|
66
|
+
## After Selection
|
|
67
|
+
|
|
68
|
+
Once you've selected a winner:
|
|
69
|
+
|
|
70
|
+
- **Review in detail**: Open the winner to examine changes closely
|
|
71
|
+
- **Run the application**: Test the modified code in the container
|
|
72
|
+
- **Provide feedback**: If adjustments are needed, send follow-up instructions
|
|
73
|
+
- **Approve**: When satisfied, approve to commit the changes
|
|
74
|
+
|
|
75
|
+
See **Approval & Deployment** for the next steps after selecting a winner.
|
|
@@ -0,0 +1,152 @@
|
|
|
1
|
+
# Batch Processing
|
|
2
|
+
|
|
3
|
+
Batch processing lets you run the same template against many inputs at once. Instead of manually running a template 200 times, select 200 items and let the system create and manage all the tasks automatically.
|
|
4
|
+
|
|
5
|
+
## Why Batch Processing?
|
|
6
|
+
|
|
7
|
+
Consider a modernization project with 500 RPG programs to convert. Without batch processing:
|
|
8
|
+
|
|
9
|
+
- Run the template manually for each program
|
|
10
|
+
- Wait for completion, review, approve
|
|
11
|
+
- Repeat 500 times over weeks or months
|
|
12
|
+
|
|
13
|
+
With batch processing:
|
|
14
|
+
|
|
15
|
+
- Select all 500 programs
|
|
16
|
+
- Click **Run All**
|
|
17
|
+
- Tasks execute in parallel
|
|
18
|
+
- Review and approve as they complete
|
|
19
|
+
- Turn months or years of work into weeks
|
|
20
|
+
|
|
21
|
+
The template ensures every program gets the same careful treatment. The batch operation handles the scale.
|
|
22
|
+
|
|
23
|
+
## How It Works
|
|
24
|
+
|
|
25
|
+
### Multi-Select Parameters
|
|
26
|
+
|
|
27
|
+
Templates can mark parameters as supporting multiple selection. When you encounter such a parameter:
|
|
28
|
+
|
|
29
|
+
1. The prompt allows selecting multiple items (files, options, etc.)
|
|
30
|
+
2. A counter shows how many items you've selected
|
|
31
|
+
3. The **Run All** button shows the total tasks that will be created
|
|
32
|
+
|
|
33
|
+
### Task Creation
|
|
34
|
+
|
|
35
|
+
When you click **Run All**:
|
|
36
|
+
|
|
37
|
+
1. The system creates one task per selected item
|
|
38
|
+
2. Each task receives the template with that item's value substituted
|
|
39
|
+
3. Tasks enter the queue and execute as slots become available
|
|
40
|
+
4. Parallel execution is limited by your configured queue slots
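
Because parallelism is capped by queue slots, a rough estimate of batch duration can help with planning. The following is a back-of-envelope sketch only: the slot count and per-task duration are hypothetical values, and it assumes every task takes about the same time, which real batches will not.

```bash
#!/usr/bin/env bash
# Rough batch duration estimate. All three inputs are hypothetical.
tasks=500              # items selected for the batch
slots=4                # configured queue slots (assumed value)
minutes_per_task=15    # assumed average run time per task

# Ceiling division: how many "waves" of tasks run back to back.
waves=$(( (tasks + slots - 1) / slots ))
total_minutes=$(( waves * minutes_per_task ))

echo "$waves waves, about $(( total_minutes / 60 )) hours"
```

With these numbers, 500 tasks on 4 slots run in 125 waves. Doubling the slots roughly halves the elapsed time, while the review effort per task stays the same.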
|
|
41
|
+
|
|
42
|
+
### Monitoring Batch Progress
|
|
43
|
+
|
|
44
|
+
After launching a batch:
|
|
45
|
+
|
|
46
|
+
- View all created tasks in the Tasks list
|
|
47
|
+
- Filter by template name to see just your batch
|
|
48
|
+
- Monitor progress as tasks complete
|
|
49
|
+
- Pin tasks that need attention
|
|
50
|
+
|
|
51
|
+
## Running via CLI
|
|
52
|
+
|
|
53
|
+
You can also run templates from the command line using the `coder` CLI:
|
|
54
|
+
|
|
55
|
+
```bash
# Run a template with a single value
coder run convert-to-modern-rpg --source_file=MYPGM.rpgle

# Run with multiple values (creates multiple tasks)
coder run convert-to-modern-rpg --source_file=PGM1.rpgle --source_file=PGM2.rpgle

# Specify environment and branch
coder run convert-to-modern-rpg --environment=pjs-dev --branch=feature-xyz --source_file=MYPGM.rpgle
```
|
|
65
|
+
|
|
66
|
+
CLI execution is useful for:
|
|
67
|
+
|
|
68
|
+
- **Scripted workflows**: Integrate template runs into shell scripts or automation
|
|
69
|
+
- **CI/CD pipelines**: Trigger batch processing from build systems
|
|
70
|
+
- **Programmatic input**: Generate the list of items to process dynamically
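
The "programmatic input" case could look something like the sketch below. It builds one `--source_file=` flag per item, as in the multi-value example above, and prints the resulting command rather than executing it. The program names and the `build_run_args` helper are hypothetical; only the `coder run` form with a repeated `--source_file` flag comes from this page.

```bash
#!/usr/bin/env bash
# Build the argument list for a batch run from a list of items.
# build_run_args is a hypothetical helper, not part of the coder CLI.
build_run_args() {
  local template="$1"
  shift
  RUN_ARGS=(run "$template")
  local item
  for item in "$@"; do
    RUN_ARGS+=("--source_file=$item")
  done
}

# Hypothetical input; in practice this list might come from a file or a query.
programs=(PGM1.rpgle PGM2.rpgle PGM3.rpgle)
build_run_args convert-to-modern-rpg "${programs[@]}"

# Print the command for inspection instead of running it.
echo "coder ${RUN_ARGS[*]}"
```

Once the printed command looks right, replacing the final `echo` with `coder "${RUN_ARGS[@]}"` would run it, with each flag passed as a separate argument.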
|
|
71
|
+
|
|
72
|
+
See **CLI** in the Working with Code section for the full command reference.
|
|
73
|
+
|
|
74
|
+
## Cartesian Products
|
|
75
|
+
|
|
76
|
+
When multiple parameters support multi-select, you get a cartesian product—every combination of selections.
|
|
77
|
+
|
|
78
|
+
### Example
|
|
79
|
+
|
|
80
|
+
A template with two multi-select parameters:
|
|
81
|
+
|
|
82
|
+
- **Source files**: Select 10 files
|
|
83
|
+
- **Target formats**: Select 3 formats
|
|
84
|
+
|
|
85
|
+
Result: 30 tasks (10 files × 3 formats)
|
|
86
|
+
|
|
87
|
+
Each task processes one file in one format. Every combination is covered.
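
To make the arithmetic concrete, here is a small sketch that enumerates the combinations with two nested loops. The file and format names are invented for illustration, and the code only mirrors the counting. It says nothing about how the server creates tasks internally.

```bash
#!/usr/bin/env bash
# Enumerate every (file, format) pair for 10 files and 3 formats.
files=()
for ((i = 1; i <= 10; i++)); do
  printf -v name 'PGM%02d.rpgle' "$i"   # hypothetical file names
  files+=("$name")
done
formats=(format-a format-b format-c)    # hypothetical format names

tasks=()
for f in "${files[@]}"; do
  for fmt in "${formats[@]}"; do
    tasks+=("$f:$fmt")                  # one task per combination
  done
done

echo "${#tasks[@]} tasks"               # prints: 30 tasks
```

Adding a third multi-select parameter multiplies the count again, which is why the totals grow quickly.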
|
|
88
|
+
|
|
89
|
+
### Practical Limits
|
|
90
|
+
|
|
91
|
+
The UI warns you when combinations exceed practical limits. Creating thousands of tasks is technically possible but may not be the best approach—consider breaking the work into smaller batches.
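
One way to break a large selection into smaller batches from the command line is to slice the item list into fixed-size chunks. The sketch below is illustrative: the batch size and program names are hypothetical, and it only collects the `coder run` commands as strings so they can be reviewed before anything is launched.

```bash
#!/usr/bin/env bash
# Split a list of items into batches of at most batch_size items.
items=(PGM1.rpgle PGM2.rpgle PGM3.rpgle PGM4.rpgle PGM5.rpgle PGM6.rpgle PGM7.rpgle)
batch_size=3   # hypothetical; choose a size you can review comfortably

commands=()
for ((start = 0; start < ${#items[@]}; start += batch_size)); do
  # Array slice: up to batch_size items beginning at index start.
  chunk=("${items[@]:start:batch_size}")
  cmd="coder run convert-to-modern-rpg"
  for item in "${chunk[@]}"; do
    cmd+=" --source_file=$item"
  done
  commands+=("$cmd")
done

# One command per batch, printed for review.
printf '%s\n' "${commands[@]}"
```

Seven items with a batch size of three produce three commands, the last covering the single remaining item.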
|
|
92
|
+
|
|
93
|
+
## Batch Workflow
|
|
94
|
+
|
|
95
|
+
### 1. Validate the Template First
|
|
96
|
+
|
|
97
|
+
Before running at scale:
|
|
98
|
+
|
|
99
|
+
1. Run the template with a single representative item
|
|
100
|
+
2. Review the results carefully
|
|
101
|
+
3. Verify the output meets expectations
|
|
102
|
+
4. Fix any issues in the template
|
|
103
|
+
|
|
104
|
+
Never batch-process hundreds of items with an untested template.
|
|
105
|
+
|
|
106
|
+
### 2. Start with a Small Batch
|
|
107
|
+
|
|
108
|
+
Once the template works for one item:
|
|
109
|
+
|
|
110
|
+
1. Select 5-10 similar items
|
|
111
|
+
2. Run the batch
|
|
112
|
+
3. Review all results
|
|
113
|
+
4. Confirm consistency across the batch
|
|
114
|
+
|
|
115
|
+
This catches edge cases the single-item test might miss.
|
|
116
|
+
|
|
117
|
+
### 3. Scale Up
|
|
118
|
+
|
|
119
|
+
When confident in the template:
|
|
120
|
+
|
|
121
|
+
1. Select the full set of items
|
|
122
|
+
2. Run the batch
|
|
123
|
+
3. Monitor progress
|
|
124
|
+
4. Review and approve as tasks complete
|
|
125
|
+
|
|
126
|
+
### 4. Handle Failures
|
|
127
|
+
|
|
128
|
+
Some tasks may fail due to:
|
|
129
|
+
|
|
130
|
+
- Edge cases the template doesn't handle
|
|
131
|
+
- Unusual input that needs special treatment
|
|
132
|
+
- Transient errors (retry these)
|
|
133
|
+
|
|
134
|
+
Review failed tasks individually. You may need to:
|
|
135
|
+
|
|
136
|
+
- Adjust the template and re-run failed items
|
|
137
|
+
- Handle exceptions manually
|
|
138
|
+
- Exclude certain items from batch processing
|
|
139
|
+
|
|
140
|
+
## Best Practices
|
|
141
|
+
|
|
142
|
+
**Test before scaling**: Always validate with single items and small batches first.
|
|
143
|
+
|
|
144
|
+
**Use meaningful names**: Template names and parameters should make batch results easy to identify.
|
|
145
|
+
|
|
146
|
+
**Monitor queue depth**: Large batches may take time if queue slots are limited. Plan accordingly.
|
|
147
|
+
|
|
148
|
+
**Review incrementally**: Don't wait for all tasks to complete. Review and approve as they finish.
|
|
149
|
+
|
|
150
|
+
**Group related work**: Batch similar items together. Mixing very different inputs may produce inconsistent results.
|
|
151
|
+
|
|
152
|
+
**Plan for failures**: Expect some percentage of tasks to need individual attention. Build this into your timeline.
|
|
@@ -0,0 +1,44 @@
|
|
|
1
|
+
# Task Templates
|
|
2
|
+
|
|
3
|
+
Task templates are reusable task definitions with parameters. Define a template once, then run it with different inputs—whether for a single item or hundreds at once.
|
|
4
|
+
|
|
5
|
+
## What Are Templates?
|
|
6
|
+
|
|
7
|
+
A template combines:
|
|
8
|
+
|
|
9
|
+
- **Instructions**: The task description agents will follow
|
|
10
|
+
- **Parameters**: Variables that customize the task for specific inputs (like which file to process)
|
|
11
|
+
- **Prompts**: UI elements that let you select parameter values (file browsers, dropdown lists, etc.)
|
|
12
|
+
|
|
13
|
+
Templates are configured per environment by administrators. When you run a template, you provide values for its parameters, and the system creates tasks with those values filled in.
|
|
14
|
+
|
|
15
|
+
## Running a Template
|
|
16
|
+
|
|
17
|
+
1. Open the **Templates** section from the home page
|
|
18
|
+
2. Select your environment and template
|
|
19
|
+
3. Fill in the required parameters using the provided prompts
|
|
20
|
+
4. Click **Run** to create a task
|
|
21
|
+
|
|
22
|
+
The template renders with your parameter values and launches as a normal task. You can monitor progress, review results, and approve changes just like any other task.
|
|
23
|
+
|
|
24
|
+
## Batch Processing
|
|
25
|
+
|
|
26
|
+
Templates become especially powerful when you need to process many items. Select multiple values for a parameter and run the template against all of them at once—turning weeks of manual work into overnight batch operations.
|
|
27
|
+
|
|
28
|
+
See **Batch Processing** for details on multi-select parameters, cartesian products, and workflow best practices.
|
|
29
|
+
|
|
30
|
+
## Common Use Cases
|
|
31
|
+
|
|
32
|
+
Templates are commonly used for code modernization, refactoring, testing, and documentation tasks.
|
|
33
|
+
|
|
34
|
+
See **Template Examples** for specific use cases including RPG modernization, language migration, and batch refactoring patterns.
|
|
35
|
+
|
|
36
|
+
## Configuration
|
|
37
|
+
|
|
38
|
+
Templates are defined in your environment's configuration. Administrators create templates with:
|
|
39
|
+
|
|
40
|
+
- Markdown instructions containing parameter placeholders
|
|
41
|
+
- Parameter definitions specifying prompts and validation
|
|
42
|
+
- Multi-select settings for batch processing
|
|
43
|
+
|
|
44
|
+
See **Environments** in the Administration section for configuration details.
|