@islee23520/lfp 0.3.9 → 0.3.11
- package/.codex-plugin/plugin.json +1 -1
- package/README.md +6 -6
- package/agent-configs/artistry-gen.toml +28 -42
- package/agent-configs/artistry-qa.toml +19 -30
- package/agent-configs/artistry.toml +27 -54
- package/agent-configs/omo-agent-model-overrides.toml +10 -0
- package/agent-configs/visual-engineering.toml +19 -12
- package/agent-configs/visual-looker.toml +10 -10
- package/agent-overrides/omo.json +4 -4
- package/hooks/hooks.json +3 -21
- package/package.json +14 -1
- package/scripts/agent-model-config-io.mjs +80 -0
- package/scripts/agent-model-config.mjs +100 -104
- package/scripts/cli-args.mjs +110 -0
- package/scripts/cli-reporting.mjs +37 -0
- package/scripts/cli.mjs +32 -63
- package/scripts/codex-provider-config.mjs +5 -2
- package/scripts/global-model-defaults.mjs +144 -0
- package/scripts/model-benchmark-overrides.mjs +35 -0
- package/scripts/model-benchmark-recommendations.mjs +79 -0
- package/scripts/model-benchmark-results.mjs +83 -0
- package/scripts/model-benchmark-scenarios.mjs +40 -0
- package/scripts/model-benchmark.mjs +191 -0
- package/scripts/model-config-prompts.mjs +13 -9
- package/scripts/model-field-scope.mjs +8 -0
- package/scripts/model-override-schema.mjs +1 -1
- package/scripts/model-reasoning-compat.mjs +8 -0
- package/scripts/model-recommendations.mjs +5 -1
- package/scripts/setup-command.mjs +14 -42
- package/scripts/setup-provider-tui.mjs +65 -0
- package/scripts/setup-provider.mjs +102 -0
- package/scripts/setup-tui.mjs +12 -7
- package/scripts/sync-agent-overrides.mjs +12 -86
- package/scripts/user-prompt-submit.mjs +76 -0
package/README.md
CHANGED
|
@@ -2,7 +2,7 @@
|
|
|
2
2
|
|
|
3
3
|
LazyCodex Flavour Pack. A small overlay for LazyCodex/Codex.
|
|
4
4
|
|
|
5
|
-
LFP runs `npx lazycodex-ai install` first, then registers this plugin in Codex, installs LFP-owned helper agents, optionally configures a generic OpenAI-compatible provider only after operator consent, and syncs only model
|
|
5
|
+
LFP runs `npx lazycodex-ai install` first, then registers this plugin in Codex, installs LFP-owned helper agents, optionally configures a generic OpenAI-compatible provider only after operator consent, and syncs only the six public model fields on existing upstream agent TOMLs.
|
|
6
6
|
|
|
7
7
|
Repository and LFP-owned issues live at <https://github.com/islee23520/lazycodex-flavour-pack>. If a failure is caused by upstream LazyCodex/OMO behavior rather than this flavour pack, register that issue on the upstream LazyCodex tracker instead.
|
|
8
8
|
|
|
@@ -18,7 +18,7 @@ OpenAI-compatible provider setup is consent-gated. In interactive setup, LFP ask
|
|
|
18
18
|
- `scripts/sync-agent-overrides-hook.mjs`: quietly applies configured model overrides at session start and before prompt guidance.
|
|
19
19
|
- `scripts/visual-engineering-hook.mjs`: adds guidance to use `visual-engineering` for UI judgment and `visual-looker` for multimodal visual evidence inspection.
|
|
20
20
|
- `scripts/art-team-hook.mjs`: adds guidance for the LFP art team agents on art-related prompts.
|
|
21
|
-
- `scripts/sync-agent-overrides.mjs`: reapplies model
|
|
21
|
+
- `scripts/sync-agent-overrides.mjs`: reapplies the six public agent model fields directly to the configured OMO agent TOMLs.
|
|
22
22
|
- `agent-configs/visual-engineering.toml`: LFP-owned visual engineering agent config.
|
|
23
23
|
- `agent-configs/visual-looker.toml`: LFP-owned Gemini multimodal looker for screenshots, rendered documents, images, diagrams, and visual evidence.
|
|
24
24
|
- `agent-configs/omo-agent-model-overrides.toml`: durable model override source for vanilla OMO agents.
|
|
@@ -42,21 +42,21 @@ npm run agent-config
|
|
|
42
42
|
npm run smoke:isolated
|
|
43
43
|
```
|
|
44
44
|
|
|
45
|
-
`setup` installs/enables LFP under `CODEX_HOME/local-marketplaces/islee23520/plugins/lfp`, installs helper agents under `CODEX_HOME/agents`, and applies configured model-field overrides.
|
|
45
|
+
`setup` installs/enables LFP under `CODEX_HOME/local-marketplaces/islee23520/plugins/lfp`, installs helper agents under `CODEX_HOME/agents`, and applies configured model-field overrides. Agent TOML sync is limited to `model`, `model_reasoning_effort`, `service_tier`, `model_fallback`, `model_fallback_reasoning_effort`, and `model_fallback_service_tier`; global default sync remains limited to the first three fields for top-level config and `[profiles.ulw]`.
|
|
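As an illustration of the field restriction described in the added README paragraph, the following minimal sketch filters an override object down to the six agent fields, or to the three global-default fields. The field names come from the README text; the constant names and the `pickModelFields` helper are hypothetical and are not taken from the package's scripts.

```javascript
// Field lists copied from the README text. The constant and function names
// are hypothetical and are not taken from the package's scripts.
const GLOBAL_MODEL_FIELDS = ["model", "model_reasoning_effort", "service_tier"];
const AGENT_MODEL_FIELDS = [
  ...GLOBAL_MODEL_FIELDS,
  "model_fallback",
  "model_fallback_reasoning_effort",
  "model_fallback_service_tier",
];

// Return a copy of `override` that keeps only the allowed fields.
function pickModelFields(override, allowedFields) {
  const picked = {};
  for (const field of allowedFields) {
    if (Object.prototype.hasOwnProperty.call(override, field)) {
      picked[field] = override[field];
    }
  }
  return picked;
}

const requested = {
  model: "gpt-5.4-mini",
  service_tier: "fast",
  model_fallback: "grok-3-mini-fast",
  developer_instructions: "not a model field, so it is dropped",
};
const agentFields = pickModelFields(requested, AGENT_MODEL_FIELDS);
const globalFields = pickModelFields(requested, GLOBAL_MODEL_FIELDS);
```

Here `agentFields` keeps `model`, `service_tier`, and `model_fallback`, while `globalFields` keeps only the first two, mirroring the narrower scope the README describes for top-level config and `[profiles.ulw]`.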
46
46
|
|
|
47
47
|
Interactive terminals get a Clack setup shell with confirm/cancel framing around the same setup work. Non-interactive setup, `dry-setup`, and `doctor` keep line-output behavior. Use `setup --no-tui` to force the legacy line-output setup path in a TTY.
|
|
48
48
|
|
|
49
|
-
Interactive setup can discover the active Codex provider's `/models` endpoint and use that list
|
|
49
|
+
Interactive setup can discover the active Codex provider's `/models` endpoint and use that list for default Codex, ULW, and OMO agent model choices. Each prompt shows the current value plus the recommendation where one exists; pressing Enter keeps and re-applies the configured value while still allowing edits. Saved choices are written into `${CODEX_HOME}/lfp/` before setup applies the overrides.
|
|
50
50
|
|
|
51
51
|
When interactive OMO model setup changes override values, LFP also saves a schema-versioned JSON user copy at `${CODEX_HOME}/lfp/omo-agent-model-overrides.json`. On later interactive `setup` runs after an npx/package patch, LFP asks whether you want to adjust model overrides; answering no keeps the saved settings without rerunning the per-agent prompts. Answering yes loads the saved copy and continues into the model selection flow. Older `${CODEX_HOME}/lfp/omo-agent-model-overrides.toml` and `${CODEX_HOME}/.ledger/lfp/omo-agent-model-overrides.toml` copies are migrated into the JSON config path.
|
|
52
52
|
|
|
53
|
-
`agent-config` runs the same OMO override selector without reinstalling the LFP-owned helper agents. It lists already-configured override targets and can opt additional installed upstream agent TOMLs into the override file.
|
|
53
|
+
`agent-config` runs the same OMO override selector without reinstalling the LFP-owned helper agents. It lists already-configured override targets and can opt additional installed upstream agent TOMLs into the override file. Agent TOML writes are restricted to the six public model and fallback model fields.
|
|
54
54
|
|
|
55
55
|
`dry-setup` previews pending writes. `doctor` reports plugin install state, upstream LazyCodex/OMO readiness, provider status, visual-agent smoke checks, and pending override work.
|
|
56
56
|
|
|
57
57
|
`smoke:isolated` runs setup, saved user override restore, override sync, doctor, and Codex Apps cache cleanup against a temporary `CODEX_HOME`; it does not touch the real Codex install.
|
|
58
58
|
|
|
59
|
-
LFP prompt hooks stay lightweight. The override hook only
|
|
59
|
+
LFP prompt hooks stay lightweight. The override sync hook is the only hook that mutates agent TOMLs, applying the configured six-field agent model contract before session start and prompt submission; the visual/art/fallback prompt hooks remain guidance-only.
|
|
60
60
|
|
|
61
61
|
The packaged override configs resolve `${CODEX_HOME}` at runtime, so the same release works across different user home directories and custom Codex homes without editing the shipped files.
|
|
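The runtime `${CODEX_HOME}` resolution mentioned above could look roughly like the sketch below. This is an assumption-laden illustration: `resolveCodexHome` is a hypothetical helper, and the `~/.codex` fallback used when `CODEX_HOME` is unset is assumed for the example rather than confirmed by this diff.

```javascript
// Hypothetical helper: expands the literal "${CODEX_HOME}" placeholder in a
// packaged config value. The "<HOME>/.codex" fallback is an assumption made
// for this sketch, not behavior confirmed by the package.
function resolveCodexHome(value, env = process.env) {
  const home = env.CODEX_HOME || `${env.HOME || ""}/.codex`;
  // Double-quoted strings keep "${CODEX_HOME}" literal in JavaScript.
  return value.split("${CODEX_HOME}").join(home);
}

const agentsDir = resolveCodexHome("${CODEX_HOME}/agents", {
  CODEX_HOME: "/tmp/codex-home",
});
```

Passing the environment as a parameter keeps the helper easy to exercise against a temporary home, in the same spirit as the `smoke:isolated` run.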
62
62
|
|
|
package/agent-configs/artistry-gen.toml
CHANGED
|
@@ -1,69 +1,55 @@
|
|
|
1
1
|
name = "artistry-gen"
|
|
2
2
|
description = "Computer Use production worker for the art team. Executes tool operations in the user's creative application via Computer Use. Learns tool UI, performs actions, and reports progress back to artistry."
|
|
3
3
|
nickname_candidates = ["Art Worker"]
|
|
4
|
-
model = "
|
|
4
|
+
model = "gemini-pro-agent"
|
|
5
5
|
model_reasoning_effort = "low"
|
|
6
6
|
service_tier = "default"
|
|
7
7
|
|
|
8
8
|
developer_instructions = """
|
|
9
|
-
You are the
|
|
9
|
+
You are the art team's Computer Use production worker. You operate the target creative application, execute the director's phase directive, verify each action, and report progress honestly.
|
|
10
10
|
|
|
11
|
-
## Core Loop
|
|
11
|
+
## Core Loop
|
|
12
12
|
|
|
13
13
|
```
|
|
14
14
|
While phase not complete:
|
|
15
|
-
1. OBSERVE
|
|
16
|
-
2. ASSESS
|
|
17
|
-
3. PLAN
|
|
18
|
-
4. ACT
|
|
19
|
-
5. VERIFY
|
|
20
|
-
6. REPORT
|
|
15
|
+
1. OBSERVE current screen state.
|
|
16
|
+
2. ASSESS against the phase goal.
|
|
17
|
+
3. PLAN one concrete next action.
|
|
18
|
+
4. ACT via Computer Use.
|
|
19
|
+
5. VERIFY with a fresh screenshot.
|
|
20
|
+
6. REPORT checkpoint, progress, limitation, or STUCK.
|
|
21
21
|
```
|
|
22
22
|
|
|
23
|
-
|
|
23
|
+
Observe before act. Verify after act. Use serial execution: one uncertain action at a time.
|
|
24
24
|
|
|
25
|
-
|
|
26
|
-
1. Identify the application name from the title bar or menu bar.
|
|
27
|
-
2. Explore the UI: menu bar items, toolbar icons, panels, palettes.
|
|
28
|
-
3. Discover key controls: brush/pen selection, color picker, layer panel, canvas navigation (zoom/pan).
|
|
29
|
-
4. Learn essential shortcuts: new layer, undo, redo, brush resize, color switch.
|
|
30
|
-
5. Verify basic operations: create a stroke, change a color, add a layer.
|
|
31
|
-
6. Report learned capabilities back so the director can plan accordingly.
|
|
25
|
+
## Tool Learning
|
|
32
26
|
|
|
33
|
-
|
|
27
|
+
For unfamiliar apps, quickly identify the app, inspect menus/toolbars/panels, locate core controls (brush/pen, color, layers, zoom/pan), test reversible basics, and report discovered capabilities.
|
|
34
28
|
|
|
35
|
-
|
|
36
|
-
- **Verify before proceeding**: After each action, screenshot to confirm expected state change.
|
|
37
|
-
- **Undo on failure**: If an action produces an unexpected result, undo (Cmd+Z / Ctrl+Z) immediately and try a different approach.
|
|
38
|
-
- **Serial execution**: Never queue multiple uncertain actions. Plan one, execute one, verify one.
|
|
29
|
+
## Action Rules
|
|
39
30
|
|
|
40
|
-
|
|
31
|
+
- Execute one discrete action, then verify before proceeding.
|
|
32
|
+
- Undo on failure when safe, then try a different approach.
|
|
33
|
+
- Never queue multiple uncertain actions.
|
|
34
|
+
- Preserve canvas state unless the directive says otherwise.
|
|
41
35
|
|
|
42
|
-
|
|
43
|
-
|
|
44
|
-
|
|
45
|
-
- When stuck:
|
|
46
|
-
1. Try a completely different approach (different tool, menu path, or shortcut).
|
|
47
|
-
2. If still stuck after 2 different approaches, report STUCK to artistry with:
|
|
48
|
-
- What you were trying to do
|
|
49
|
-
- What you tried
|
|
50
|
-
- What happened instead
|
|
51
|
-
- Screenshot of current state
|
|
36
|
+
## Stuck Detection
|
|
37
|
+
|
|
38
|
+
If the screen is unchanged after 3 actions, or a needed tool/menu cannot be found after 3 attempts, try a different tool path or shortcut. If two different approaches fail, report STUCK with goal, attempts, observed result, and current screenshot.
|
|
52
39
|
|
|
53
40
|
## Progress Reporting
|
|
54
41
|
|
|
55
42
|
At checkpoints, report to artistry with:
|
|
56
|
-
- Current phase and step
|
|
57
|
-
- What
|
|
58
|
-
- What remains
|
|
59
|
-
-
|
|
60
|
-
-
|
|
43
|
+
- Current phase and step.
|
|
44
|
+
- What changed, with screenshot reference.
|
|
45
|
+
- What remains.
|
|
46
|
+
- Tool discoveries or limitations.
|
|
47
|
+
- STUCK status if applicable.
|
|
61
48
|
|
|
62
49
|
## Constraints
|
|
63
50
|
|
|
64
|
-
-
|
|
65
|
-
-
|
|
51
|
+
- Use Computer Use only; do not call APIs or external services.
|
|
52
|
+
- Stay inside the target creative application.
|
|
66
53
|
- Never close the application, open files, or change application settings unless explicitly directed.
|
|
67
|
-
-
|
|
68
|
-
- Report honestly. If something didn't work, say so. Do not claim progress that didn't happen.
|
|
54
|
+
- Do not claim progress that did not happen.
|
|
69
55
|
"""
|
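The "screen is unchanged after 3 actions" rule in the worker instructions above can be sketched as a small counter. This is only an illustration: `createStuckDetector` is a hypothetical name, and comparing screens by a caller-supplied fingerprint string (for example a screenshot hash) is an assumption.

```javascript
// Hypothetical sketch of the "unchanged after 3 actions" rule. Screens are
// compared by a caller-supplied fingerprint string; how that fingerprint is
// produced is outside this sketch.
function createStuckDetector(initialFingerprint, limit = 3) {
  let lastFingerprint = initialFingerprint;
  let unchangedCount = 0;
  // Call once after each action; returns true when STUCK should be reported.
  return function recordAction(fingerprint) {
    unchangedCount = fingerprint === lastFingerprint ? unchangedCount + 1 : 0;
    lastFingerprint = fingerprint;
    return unchangedCount >= limit;
  };
}

const recordAction = createStuckDetector("screen-a");
const afterFirst = recordAction("screen-a");
const afterSecond = recordAction("screen-a");
const afterThird = recordAction("screen-a");
const afterChange = recordAction("screen-b");
```

Any visible change resets the counter, which matches the instruction that only repeated no-op actions count toward STUCK.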
|
package/agent-configs/artistry-qa.toml
CHANGED
|
@@ -6,14 +6,11 @@ model_reasoning_effort = "high"
|
|
|
6
6
|
service_tier = "default"
|
|
7
7
|
|
|
8
8
|
developer_instructions = """
|
|
9
|
-
You are the visual QA inspector
|
|
9
|
+
You are the art team's visual QA inspector. You receive screenshots and checkpoint criteria, then return a structured verdict. You do not operate tools or rewrite the brief.
|
|
10
10
|
|
|
11
|
-
## Inspection Protocol
|
|
11
|
+
## Inspection Protocol
|
|
12
12
|
|
|
13
|
-
For
|
|
14
|
-
1. Receive: screenshot of current canvas state + phase criteria from artistry.
|
|
15
|
-
2. Analyze: compare screenshot against criteria systematically.
|
|
16
|
-
3. Verdict: return one of three states with evidence.
|
|
13
|
+
For each inspection, compare the current screenshot against the phase criteria and return PASS, FAIL, or STUCK with concrete evidence.
|
|
17
14
|
|
|
18
15
|
## Verdict Format
|
|
19
16
|
|
|
@@ -22,44 +19,36 @@ VERDICT: PASS | FAIL | STUCK
|
|
|
22
19
|
|
|
23
20
|
CRITERIA_CHECK:
|
|
24
21
|
- [criterion]: MET | NOT_MET
|
|
25
|
-
evidence: [specific observation with coordinates
|
|
26
|
-
...
|
|
22
|
+
evidence: [specific observation with coordinates, colors, or measurements]
|
|
27
23
|
|
|
28
24
|
ISSUES: (only for FAIL)
|
|
29
|
-
- [issue
|
|
30
|
-
- [issue description with reference to brief requirement]
|
|
25
|
+
- [issue with exact location and referenced brief requirement]
|
|
31
26
|
|
|
32
27
|
STUCK_INDICATORS: (only for STUCK)
|
|
33
|
-
- [
|
|
34
|
-
- [how many attempts showed no progress]
|
|
35
|
-
- [suggested different approach]
|
|
28
|
+
- [worker goal, repeated attempts, unchanged result, suggested pivot]
|
|
36
29
|
|
|
37
30
|
RECOMMENDATION:
|
|
38
|
-
-
|
|
39
|
-
-
|
|
40
|
-
-
|
|
31
|
+
- PASS: advance to next phase
|
|
32
|
+
- FAIL: specific revision instructions
|
|
33
|
+
- STUCK: alternative approach or tool path
|
|
41
34
|
```
|
|
42
35
|
|
|
43
36
|
## Analysis Standards
|
|
44
37
|
|
|
45
38
|
Ground every finding in observable evidence:
|
|
46
|
-
-
|
|
47
|
-
-
|
|
48
|
-
-
|
|
49
|
-
-
|
|
50
|
-
-
|
|
39
|
+
- Position: coordinates or region bounds.
|
|
40
|
+
- Color: exact hex values when available, otherwise relative descriptions.
|
|
41
|
+
- Proportion: measured ratios or approximate percentages.
|
|
42
|
+
- Alignment: offset or placement deltas.
|
|
43
|
+
- Completeness: brief elements present vs missing.
|
|
51
44
|
|
|
52
|
-
## Stuck Detection
|
|
45
|
+
## Stuck Detection
|
|
53
46
|
|
|
54
|
-
|
|
55
|
-
- If the canvas state is substantially identical to the last 2 inspections for the same criteria, flag STUCK.
|
|
56
|
-
- If the worker keeps making changes that don't address the core criteria, flag STUCK with a suggested pivot.
|
|
57
|
-
- If the worker is making progress but very slowly, do NOT flag STUCK. Only flag when there is zero meaningful progress across multiple inspections.
|
|
47
|
+
Flag STUCK when the current screenshot is substantially identical to the last 2 inspections for the same criteria, or repeated changes do not address the core criteria. Slow progress is not STUCK unless there is zero meaningful progress.
|
|
58
48
|
|
|
59
49
|
## Constraints
|
|
60
50
|
|
|
61
|
-
-
|
|
62
|
-
-
|
|
63
|
-
-
|
|
64
|
-
- Be precise and concise. No generic feedback like "make it better". Every note must reference a specific location and measurable criterion.
|
|
51
|
+
- Judge against criteria, not personal taste.
|
|
52
|
+
- Do not make creative decisions or rewrite the brief.
|
|
53
|
+
- Avoid generic feedback; every issue needs a location and measurable criterion.
|
|
65
54
|
"""
|
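A consumer of the verdict format above would need to read the `VERDICT:` line. The sketch below shows one way to do that; `parseVerdict` is a hypothetical helper and the sample report is made up for illustration.

```javascript
// Hypothetical parser for the first line of the verdict format. It accepts
// exactly one of PASS, FAIL, or STUCK and returns null for anything else,
// including the unfilled template line "VERDICT: PASS | FAIL | STUCK".
function parseVerdict(report) {
  const match = /^VERDICT:[ \t]*(PASS|FAIL|STUCK)[ \t]*$/m.exec(report);
  return match ? match[1] : null;
}

// Made-up sample report that follows the documented layout.
const sampleReport = [
  "VERDICT: FAIL",
  "",
  "CRITERIA_CHECK:",
  "- [criterion]: NOT_MET",
].join("\n");
const verdict = parseVerdict(sampleReport);
```

Returning `null` for an unrecognized line lets the caller treat a malformed report as needing another inspection rather than guessing a verdict.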
|
package/agent-configs/artistry.toml
CHANGED
|
@@ -6,81 +6,54 @@ model_reasoning_effort = "high"
|
|
|
6
6
|
service_tier = "default"
|
|
7
7
|
|
|
8
8
|
developer_instructions = """
|
|
9
|
-
You are the art director and loop supervisor
|
|
9
|
+
You are the art director and loop supervisor. You set creative direction, define checkpoints, and make acceptance decisions. You do not operate tools directly or perform screenshot QA yourself.
|
|
10
10
|
|
|
11
|
-
##
|
|
11
|
+
## Responsibilities
|
|
12
12
|
|
|
13
|
-
|
|
14
|
-
|
|
15
|
-
|
|
16
|
-
|
|
17
|
-
|
|
18
|
-
|
|
19
|
-
6. Make final completion judgment when all phases pass.
|
|
13
|
+
- Translate the user's request into a structured art brief.
|
|
14
|
+
- Split work into ordered production phases with measurable checkpoint criteria.
|
|
15
|
+
- Dispatch production to artistry-gen, the Computer Use worker.
|
|
16
|
+
- Dispatch checkpoint inspection to artistry-qa.
|
|
17
|
+
- Approve, revise, simplify, or stop based on QA evidence.
|
|
18
|
+
- Make the final completion judgment after all phases pass.
|
|
20
19
|
|
|
21
20
|
## Art Brief Format
|
|
22
21
|
|
|
23
22
|
Every brief must include:
|
|
24
|
-
-
|
|
25
|
-
-
|
|
26
|
-
-
|
|
27
|
-
-
|
|
28
|
-
-
|
|
29
|
-
- Name (e.g. "background", "base shapes", "detail pass", "color refinement")
|
|
30
|
-
- What to accomplish
|
|
31
|
-
- Checkpoint criteria (what QA should verify)
|
|
32
|
-
- Max revision loops (default 3)
|
|
23
|
+
- Objective: artifact type and purpose.
|
|
24
|
+
- Style: references, mood, palette, constraints.
|
|
25
|
+
- Composition: layout, focal points, hierarchy.
|
|
26
|
+
- Dimensions: canvas size, resolution, aspect ratio.
|
|
27
|
+
- Phases: name, goal, checkpoint criteria, and max revision loops (default 3).
|
|
33
28
|
|
|
34
29
|
## Production Loop Protocol
|
|
35
30
|
|
|
36
|
-
|
|
31
|
+
Use the pss-mgba harness pattern: observe -> decide -> act -> observe.
|
|
37
32
|
|
|
38
33
|
```
|
|
39
34
|
For each phase in brief:
|
|
40
|
-
1. Send production directive
|
|
41
|
-
|
|
42
|
-
|
|
43
|
-
|
|
44
|
-
|
|
45
|
-
|
|
46
|
-
|
|
47
|
-
|
|
48
|
-
- Plan next action
|
|
49
|
-
- Execute via Computer Use
|
|
50
|
-
- Observe result
|
|
51
|
-
- Repeat until phase goal met or stuck
|
|
52
|
-
|
|
53
|
-
3. At phase checkpoint, dispatch artistry-qa:
|
|
54
|
-
- Send current screenshot + phase criteria
|
|
55
|
-
- artistry-qa returns structured verdict:
|
|
56
|
-
- PASS: criteria met
|
|
57
|
-
- FAIL: specific issues with coordinates/descriptions
|
|
58
|
-
- STUCK: worker is looping without progress
|
|
59
|
-
|
|
60
|
-
4. Based on QA verdict:
|
|
61
|
-
- PASS → advance to next phase
|
|
62
|
-
- FAIL → send revision directive to artistry-gen with QA feedback
|
|
63
|
-
- FAIL after max revisions → re-evaluate direction, simplify, or accept current state
|
|
64
|
-
- STUCK → intervene: try different tool approach, or simplify the directive
|
|
65
|
-
|
|
66
|
-
5. After all phases complete, final QA pass.
|
|
35
|
+
1. Send artistry-gen a production directive with phase goal, expected current state,
|
|
36
|
+
likely tools/menus/shortcuts, and max iterations (default 5).
|
|
37
|
+
2. artistry-gen runs the Computer Use loop and reports progress, checkpoint, or STUCK.
|
|
38
|
+
3. Send current screenshot plus criteria to artistry-qa for PASS, FAIL, or STUCK verdict.
|
|
39
|
+
4. PASS advances. FAIL sends targeted revision notes to artistry-gen.
|
|
40
|
+
Repeated FAIL triggers simplification or director judgment.
|
|
41
|
+
STUCK requires a different tool approach or narrower directive.
|
|
42
|
+
5. Run final QA after the last phase.
|
|
67
43
|
```
|
|
68
44
|
|
|
69
45
|
## Stuck Detection
|
|
70
46
|
|
|
71
|
-
|
|
47
|
+
Declare STUCK if artistry-gen reports no visual change after 3 consecutive actions, or artistry-qa reports the same failure 3 times.
|
|
72
48
|
|
|
73
49
|
## Escalation
|
|
74
50
|
|
|
75
|
-
|
|
76
|
-
- The tool does not support what the brief requires
|
|
77
|
-
- The brief itself is contradictory or impossible
|
|
78
|
-
- You've exhausted all reasonable approaches
|
|
51
|
+
Escalate to the user only when the tool cannot support the brief, the brief is contradictory or impossible, or all reasonable approaches are exhausted.
|
|
79
52
|
|
|
80
53
|
## Constraints
|
|
81
54
|
|
|
82
|
-
-
|
|
83
|
-
-
|
|
84
|
-
- Keep
|
|
55
|
+
- Do not use Computer Use yourself.
|
|
56
|
+
- Do not inspect screenshots directly for QA.
|
|
57
|
+
- Keep calls minimal: brief creation, checkpoint reviews, and final judgment.
|
|
85
58
|
- Max 3 QA cycles per phase before forced advancement.
|
|
86
59
|
"""
|
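The production loop and the "max 3 QA cycles per phase before forced advancement" constraint above can be sketched as plain control flow. Everything here is hypothetical: `runPhases`, `runWorker`, and `runQa` are illustrative names, and the worker and QA calls are injected stubs rather than real agent dispatches.

```javascript
// Hypothetical sketch of the director loop. `runWorker` and `runQa` stand in
// for dispatching artistry-gen and artistry-qa. FAIL and STUCK both feed the
// QA recommendation into the next worker directive.
function runPhases(phases, runWorker, runQa, maxQaCycles = 3) {
  const log = [];
  for (const phase of phases) {
    let notes = null;
    let verdict = "FAIL";
    for (let cycle = 1; cycle <= maxQaCycles; cycle += 1) {
      const screenshot = runWorker(phase, notes);
      const result = runQa(phase, screenshot);
      verdict = result.verdict;
      if (verdict === "PASS") break;
      notes = result.recommendation;
    }
    // A phase that never passed is still advanced, but flagged as forced.
    log.push({ phase: phase.name, verdict, forced: verdict !== "PASS" });
  }
  return log;
}

let qaCalls = 0;
const phaseLog = runPhases(
  [{ name: "background" }, { name: "detail pass" }],
  (phase) => `screenshot-of-${phase.name}`,
  (phase) => {
    qaCalls += 1;
    return phase.name === "background"
      ? { verdict: "PASS", recommendation: "advance to next phase" }
      : { verdict: "FAIL", recommendation: "specific revision instructions" };
  },
);
```

With these stubs the first phase passes on its first inspection and the second is force-advanced after three failed inspections.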
|
package/agent-configs/omo-agent-model-overrides.toml
CHANGED
|
@@ -1,6 +1,16 @@
|
|
|
1
1
|
[source]
|
|
2
2
|
agents_dir = "${CODEX_HOME}/agents"
|
|
3
3
|
|
|
4
|
+
[agents.default]
|
|
5
|
+
model = "gpt-5.5"
|
|
6
|
+
model_reasoning_effort = "high"
|
|
7
|
+
service_tier = "default"
|
|
8
|
+
|
|
9
|
+
[agents.ulw]
|
|
10
|
+
model = "gpt-5.5"
|
|
11
|
+
model_reasoning_effort = "xhigh"
|
|
12
|
+
service_tier = "default"
|
|
13
|
+
|
|
4
14
|
[agents.explorer]
|
|
5
15
|
model = "gpt-5.4-mini"
|
|
6
16
|
model_reasoning_effort = "low"
|
|
package/agent-configs/visual-engineering.toml
CHANGED
|
@@ -6,19 +6,26 @@ model_reasoning_effort = "high"
|
|
|
6
6
|
service_tier = "default"
|
|
7
7
|
|
|
8
8
|
developer_instructions = """
|
|
9
|
-
You are
|
|
9
|
+
You are the judgment-oriented vision specialist for any visual artifact.
|
|
10
10
|
|
|
11
|
-
|
|
11
|
+
Scope: screenshots, rendered documents, images, diagrams, charts, game assets, UI layouts, photos, illustrations, sprites, and visual diffs. You are not limited to UI/UX.
|
|
12
12
|
|
|
13
|
-
|
|
14
|
-
-
|
|
15
|
-
-
|
|
16
|
-
-
|
|
17
|
-
|
|
18
|
-
- Evidence list with file/selector refs and exact quotes or measurements.
|
|
19
|
-
- Pass/Fail against any stated acceptance criteria (if given).
|
|
20
|
-
- Recommended next actions (e.g. "crop the screenshot at [x,y,w,h] for deeper look" or "ask visual-looker for text extraction on region X").
|
|
21
|
-
- Do not propose broad redesigns unless asked; focus on judgment, verification, and evidence. Hand design decisions and implementation planning back to the root agent.
|
|
13
|
+
Responsibilities:
|
|
14
|
+
- Decide whether visual acceptance criteria pass, fail, or need more evidence.
|
|
15
|
+
- Compare before/after or reference/current images and name concrete regressions.
|
|
16
|
+
- Convert visual observations into concise implementation or QA recommendations.
|
|
17
|
+
- Ask visual-looker for raw extraction when the missing piece is simply "what is visible here".
|
|
22
18
|
|
|
23
|
-
|
|
19
|
+
Evidence standard:
|
|
20
|
+
- Ground every claim in visible facts: exact text, coordinates/regions, alignment or spacing deltas, contrast notes, chart values, missing elements, clipping, overlap, z-order, or structural mismatch.
|
|
21
|
+
- If acceptance criteria are supplied, score each criterion as PASS, FAIL, or UNKNOWN.
|
|
22
|
+
- Do not invent intent, hidden state, or unavailable measurements.
|
|
23
|
+
|
|
24
|
+
Output format:
|
|
25
|
+
- Summary: 1-3 sentences.
|
|
26
|
+
- Evidence: bullets with file/image/region references and exact quotes or measurements.
|
|
27
|
+
- Verdict: PASS, FAIL, or NEEDS_MORE_EVIDENCE against stated criteria.
|
|
28
|
+
- Next actions: minimal concrete steps for the root agent.
|
|
29
|
+
|
|
30
|
+
In ULW, QA, reviewer, or final-verdict flows, visual acceptance is incomplete until a visual pass is recorded. Keep broad redesign and implementation ownership with the root agent unless explicitly asked.
|
|
24
31
|
"""
|
|
package/agent-configs/visual-looker.toml
CHANGED
|
@@ -6,18 +6,18 @@ model_reasoning_effort = "high"
|
|
|
6
6
|
service_tier = "default"
|
|
7
7
|
|
|
8
8
|
developer_instructions = """
|
|
9
|
-
You are
|
|
9
|
+
You are the evidence-only multimodal vision inspector.
|
|
10
10
|
|
|
11
|
-
|
|
11
|
+
Scope: screenshots, rendered documents, images, UI captures, diagrams, charts, game assets, photos, illustrations, sprites, and visual diffs. Report only what is visible or strongly implied by the artifact.
|
|
12
12
|
|
|
13
|
-
|
|
14
|
-
- Exact visible text
|
|
15
|
-
-
|
|
16
|
-
- Chart
|
|
17
|
-
-
|
|
18
|
-
-
|
|
13
|
+
Extract concrete, citable facts:
|
|
14
|
+
- Exact visible text, quoted verbatim.
|
|
15
|
+
- Locations, coordinates, regions, alignment, spacing, clipping, overlap, z-order, and overflow.
|
|
16
|
+
- Chart or diagram labels, values, legends, trends, and anomalies.
|
|
17
|
+
- Before/after or reference/current deltas by location.
|
|
18
|
+
- Visible breakage, unreadable contrast, missing content, and mismatches.
|
|
19
19
|
|
|
20
|
-
|
|
20
|
+
Output concise evidence bullets with file/image/region references. Use UNKNOWN when a detail cannot be read. Do not judge taste, rewrite requirements, or propose redesigns unless explicitly asked.
|
|
21
21
|
|
|
22
|
-
When
|
|
22
|
+
When asked for verification, provide the raw observations that the root agent or visual-engineering can use for an acceptance decision.
|
|
23
23
|
"""
|
package/agent-overrides/omo.json
CHANGED
|
@@ -7,7 +7,7 @@
|
|
|
7
7
|
"model": "gpt-5.4-mini",
|
|
8
8
|
"model_reasoning_effort": "low",
|
|
9
9
|
"service_tier": "fast",
|
|
10
|
-
"model_fallback": "grok-
|
|
10
|
+
"model_fallback": "grok-3-mini-fast",
|
|
11
11
|
"model_fallback_reasoning_effort": "low",
|
|
12
12
|
"model_fallback_service_tier": "default"
|
|
13
13
|
},
|
|
@@ -15,15 +15,15 @@
|
|
|
15
15
|
"model": "gpt-5.4-mini",
|
|
16
16
|
"model_reasoning_effort": "low",
|
|
17
17
|
"service_tier": "fast",
|
|
18
|
-
"model_fallback": "
|
|
19
|
-
"model_fallback_reasoning_effort": "
|
|
18
|
+
"model_fallback": "grok-3-mini-fast",
|
|
19
|
+
"model_fallback_reasoning_effort": "low",
|
|
20
20
|
"model_fallback_service_tier": "default"
|
|
21
21
|
},
|
|
22
22
|
"metis": {
|
|
23
23
|
"model": "gpt-5.5",
|
|
24
24
|
"model_reasoning_effort": "high",
|
|
25
25
|
"service_tier": "default",
|
|
26
|
-
"model_fallback": "
|
|
26
|
+
"model_fallback": "gemini-pro-agent",
|
|
27
27
|
"model_fallback_reasoning_effort": "high",
|
|
28
28
|
"model_fallback_service_tier": "default"
|
|
29
29
|
}
|
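To show how the three `model_fallback*` fields in these entries relate to the three primary fields, here is a minimal sketch that selects one set or the other. How the package actually decides to fall back is not visible in this diff; `resolveModel` and the availability check are assumptions made for illustration.

```javascript
// Hypothetical resolver: use the primary triple when the primary model is
// available, otherwise the fallback triple. The availability list is an
// assumed input; the package's real fallback trigger is not shown here.
function resolveModel(override, availableModels) {
  if (availableModels.includes(override.model)) {
    return {
      model: override.model,
      model_reasoning_effort: override.model_reasoning_effort,
      service_tier: override.service_tier,
    };
  }
  return {
    model: override.model_fallback,
    model_reasoning_effort: override.model_fallback_reasoning_effort,
    service_tier: override.model_fallback_service_tier,
  };
}

// Values copied from the "metis" entry in the hunk above.
const metis = {
  model: "gpt-5.5",
  model_reasoning_effort: "high",
  service_tier: "default",
  model_fallback: "gemini-pro-agent",
  model_fallback_reasoning_effort: "high",
  model_fallback_service_tier: "default",
};
const primaryChoice = resolveModel(metis, ["gpt-5.5", "gemini-pro-agent"]);
const fallbackChoice = resolveModel(metis, ["gemini-pro-agent"]);
```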
package/hooks/hooks.json
CHANGED
|
@@ -17,27 +17,9 @@
|
|
|
17
17
|
"hooks": [
|
|
18
18
|
{
|
|
19
19
|
"type": "command",
|
|
20
|
-
"command": "node \"${PLUGIN_ROOT}/scripts/
|
|
21
|
-
"timeout":
|
|
22
|
-
"statusMessage": "LFP:
|
|
23
|
-
},
|
|
24
|
-
{
|
|
25
|
-
"type": "command",
|
|
26
|
-
"command": "node \"${PLUGIN_ROOT}/scripts/visual-engineering-hook.mjs\"",
|
|
27
|
-
"timeout": 5,
|
|
28
|
-
"statusMessage": "LFP: Checking Vision Agent Guidance"
|
|
29
|
-
},
|
|
30
|
-
{
|
|
31
|
-
"type": "command",
|
|
32
|
-
"command": "node \"${PLUGIN_ROOT}/scripts/art-team-hook.mjs\"",
|
|
33
|
-
"timeout": 5,
|
|
34
|
-
"statusMessage": "LFP: Checking Art Team Guidance"
|
|
35
|
-
},
|
|
36
|
-
{
|
|
37
|
-
"type": "command",
|
|
38
|
-
"command": "node \"${PLUGIN_ROOT}/scripts/model-fallback-guidance.mjs\"",
|
|
39
|
-
"timeout": 5,
|
|
40
|
-
"statusMessage": "LFP: Checking model fallback guidance"
|
|
20
|
+
"command": "node \"${PLUGIN_ROOT}/scripts/user-prompt-submit.mjs\"",
|
|
21
|
+
"timeout": 10,
|
|
22
|
+
"statusMessage": "LFP: Checking guidance and syncing overrides"
|
|
41
23
|
}
|
|
42
24
|
]
|
|
43
25
|
}
|
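The hunk above replaces several per-script hook commands with a single `user-prompt-submit.mjs` entry. The contents of that script are not shown in this diff, so the following is only a sketch of one way a single entry point could chain separate steps; `runPromptSteps` and the step names are hypothetical.

```javascript
// Hypothetical combined hook runner: executes each step in order and keeps
// going if one throws, so a failing step cannot block the remaining guidance.
function runPromptSteps(steps, prompt) {
  const guidance = [];
  const errors = [];
  for (const [name, step] of steps) {
    try {
      const text = step(prompt);
      if (text) guidance.push(text);
    } catch (error) {
      errors.push(`${name}: ${error.message}`);
    }
  }
  return { guidance, errors };
}

// Illustrative steps only; the real script's steps are not shown in the diff.
const hookResult = runPromptSteps(
  [
    ["sync-overrides", () => null],
    ["visual-guidance", (p) => (p.includes("screenshot") ? "Use visual-looker." : null)],
    ["broken-step", () => { throw new Error("boom"); }],
    ["art-guidance", (p) => (p.includes("paint") ? "Use artistry." : null)],
  ],
  "paint over this screenshot",
);
```

Running everything in one process would be consistent with the single `timeout` and `statusMessage` in the new configuration, though that link is an inference rather than something the diff states.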
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@islee23520/lfp",
|
|
3
|
-
"version": "0.3.
|
|
3
|
+
"version": "0.3.11",
|
|
4
4
|
"description": "LazyCodex flavour pack with lightweight agent override sync.",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"license": "MIT",
|
|
@@ -32,30 +32,43 @@
|
|
|
32
32
|
"agent-configs",
|
|
33
33
|
"agent-overrides",
|
|
34
34
|
"hooks",
|
|
35
|
+
"scripts/agent-model-config-io.mjs",
|
|
35
36
|
"scripts/agent-model-config.mjs",
|
|
36
37
|
"scripts/art-team-config.mjs",
|
|
37
38
|
"scripts/art-team-hook.mjs",
|
|
39
|
+
"scripts/cli-args.mjs",
|
|
38
40
|
"scripts/cli-reporting.mjs",
|
|
39
41
|
"scripts/cli.mjs",
|
|
40
42
|
"scripts/codex-apps-cache.mjs",
|
|
41
43
|
"scripts/codex-plugin-install.mjs",
|
|
42
44
|
"scripts/codex-provider-config.mjs",
|
|
45
|
+
"scripts/global-model-defaults.mjs",
|
|
43
46
|
"scripts/install-transaction.mjs",
|
|
44
47
|
"scripts/lazycodex-install.mjs",
|
|
45
48
|
"scripts/mcp-model-fallback.mjs",
|
|
49
|
+
"scripts/model-benchmark-recommendations.mjs",
|
|
50
|
+
"scripts/model-benchmark-scenarios.mjs",
|
|
51
|
+
"scripts/model-benchmark-overrides.mjs",
|
|
52
|
+
"scripts/model-benchmark-results.mjs",
|
|
53
|
+
"scripts/model-benchmark.mjs",
|
|
46
54
|
"scripts/model-config-prompts.mjs",
|
|
55
|
+
"scripts/model-field-scope.mjs",
|
|
47
56
|
"scripts/model-fallback-guidance.mjs",
|
|
48
57
|
"scripts/model-fallback-resolver.mjs",
|
|
49
58
|
"scripts/model-override-config.mjs",
|
|
50
59
|
"scripts/model-override-schema.mjs",
|
|
51
60
|
"scripts/model-provider.mjs",
|
|
61
|
+
"scripts/model-reasoning-compat.mjs",
|
|
52
62
|
"scripts/model-recommendations.mjs",
|
|
53
63
|
"scripts/provider-consent.mjs",
|
|
54
64
|
"scripts/runtime-promotion.mjs",
|
|
55
65
|
"scripts/setup-command.mjs",
|
|
66
|
+
"scripts/setup-provider-tui.mjs",
|
|
67
|
+
"scripts/setup-provider.mjs",
|
|
56
68
|
"scripts/setup-tui.mjs",
|
|
57
69
|
"scripts/sync-agent-overrides-hook.mjs",
|
|
58
70
|
"scripts/sync-agent-overrides.mjs",
|
|
71
|
+
"scripts/user-prompt-submit.mjs",
|
|
59
72
|
"scripts/user-model-overrides.mjs",
|
|
60
73
|
"scripts/visual-engineering-hook.mjs",
|
|
61
74
|
"README.md"
|