@ai-agent-tools/picgen 0.1.0-alpha.0 → 0.1.0-alpha.10

@@ -0,0 +1,155 @@
1
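+ # PicGen Agent Installation Guide
2
+
3
+ When a user asks an agent such as Codex, Trae Work, or Claude Code to install and configure PicGen, follow this guide.
4
+
5
+ ## Goal
6
+
7
+ Install PicGen into the user's persistent local environment, configure one image generation provider, store the API key securely, produce one low-cost preview first, and generate 1 test image only after the user confirms.
8
+
9
+ Do not make non-technical users understand command details. The agent should handle as much as possible on the user's behalf; give the user a minimal set of local terminal commands only when permission to use the persistent local environment cannot be obtained.
10
+
11
+ ## 1. Check the installation environment
12
+
13
+ Before installing the CLI, installing the skill, or saving an API key, determine whether the current terminal is the user's persistent local environment or a sandbox / remote container / temporary environment.
14
+
15
+ Run these lightweight checks: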
16
+
17
+ ```bash
18
+ pwd
19
+ echo "$HOME"
20
+ node -v
21
+ npm -v
22
+ npm root -g
23
+ which picgen || true
24
+ ```
25
+
26
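+ If the current environment looks like a sandbox, remote container, or temporary workspace, or cannot write to the user's real home directory, do not install PicGen, the skill, or the API key here. First request access to the user's persistent local terminal or a persistent user-level environment.
27
+
28
+ Suggested wording: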
29
+
30
+ ```text
31
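+ I detected that the current environment may be a temporary sandbox. PicGen needs to be installed in your persistent local environment; otherwise a new session may lose the CLI, the skill, or the API key. Please allow me to continue installation and configuration in your persistent local environment; if you cannot grant that, I will give you a minimal set of commands to run in your local terminal.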
32
+ ```
33
+
34
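+ If permission is granted, continue handling the setup in the persistent local environment. Ask the user to copy commands into their local terminal only when permission cannot be requested or the user declines.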
35
+
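The persistence check above can be partly automated. The following is a minimal sketch, not PicGen's own logic: the function names are hypothetical, and the markers it looks for (`/.dockerenv`, `CODESPACES`, `REMOTE_CONTAINERS`) are assumed heuristics that will miss many sandboxes, so treat the verdict as a hint and still confirm with the user.

```shell
#!/bin/sh
# Heuristic sketch: guess whether a directory is a usable persistent home.
# The marker file and variables below are assumptions, not PicGen checks.

# Return 0 if the given directory exists and is writable.
home_is_writable() {
  dir="$1"
  [ -n "$dir" ] && [ -d "$dir" ] && [ -w "$dir" ]
}

# Return 0 if common container or remote-workspace markers are present.
looks_like_sandbox() {
  [ -f /.dockerenv ] || [ -n "${CODESPACES:-}" ] || [ -n "${REMOTE_CONTAINERS:-}" ]
}

# Print a one-word verdict for the given home directory.
classify_env() {
  if ! home_is_writable "$1"; then
    echo "unwritable"
  elif looks_like_sandbox; then
    echo "sandbox"
  else
    echo "persistent"
  fi
}

classify_env "${HOME:-}"
```

A verdict of `unwritable` or `sandbox` would be a reason to request access to the persistent local environment before installing anything.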
36
+ ## 2. Check Node.js and npm
37
+
38
+ PicGen requires Node.js 20 or newer, plus npm.
39
+
40
+ ```bash
41
+ node -v
42
+ npm -v
43
+ ```
44
+
45
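+ If either command is missing, first guide the user through installing the Node.js LTS release. After installation, verify again that `node -v` and `npm -v` print version numbers.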
46
+
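The version requirement can be checked mechanically instead of by eye. This is a small sketch with hypothetical helper names (`node_major`, `node_version_ok`); it assumes `node -v` prints a string of the form `v20.11.0`.

```shell
#!/bin/sh
# Sketch: verify the Node.js 20+ requirement from the output of `node -v`.

# Extract the major version from a string like "v20.11.0".
node_major() {
  v="${1#v}"       # drop the leading "v"
  echo "${v%%.*}"  # keep everything before the first dot
}

# Return 0 if the version string meets the Node.js 20+ requirement.
node_version_ok() {
  major="$(node_major "$1")"
  case "$major" in
    ''|*[!0-9]*) return 1 ;;  # empty or not a number
  esac
  [ "$major" -ge 20 ]
}

if command -v node >/dev/null 2>&1; then
  if node_version_ok "$(node -v)"; then
    echo "Node.js version OK"
  else
    echo "Node.js 20 or newer is required"
  fi
else
  echo "node not found"
fi
```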
47
+ ## 3. Install the CLI and Skill
48
+
49
+ Install the PicGen CLI:
50
+
51
+ ```bash
52
+ npm install -g @ai-agent-tools/picgen@latest
53
+ picgen --version
54
+ picgen open --help
55
+ ```
56
+
57
+ Install the PicGen skill:
58
+
59
+ ```bash
60
+ npx -y skills add ai-agent-tools/picgen --skill picgen -g -y --copy
61
+ ```
62
+
63
+ If the `skills` installer is unavailable and the current agent is Codex, use the Codex fallback:
64
+
65
+ ```bash
66
+ picgen skill install codex --force
67
+ ```
68
+
69
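+ If the agent cannot see the skill right after installation, remind the user to restart the agent or open a new session.
70
+
71
+ ## 3.1 Open the local web page
72
+
73
+ If the user says "open PicGen" ("打开 PicGen"), "open the image generation tool" ("打开生图工具"), or "I want to use the web page to configure / generate images / find my images", start the local web page: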
74
+
75
+ ```bash
76
+ picgen open
77
+ ```
78
+
79
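+ PicGen binds to `127.0.0.1:8188` on the local machine by default. If the port is in use, it automatically tries the following ports. Send the URL printed by the command to the user. This is not a resident background service: when the terminal closes, the web page closes too.
80
+
81
+ If the current environment is a sandbox or temporary environment, do not start a web page there for the user; first request access to the user's persistent local environment. If that is not possible, guide the user to run `picgen open` in their local terminal.
82
+
83
+ For default installation and first-time agent-assisted configuration, the agent should still prefer writing the API key via the clipboard, because it takes fewer steps and does not interrupt the conversation. The web page is the better choice when the user wants to manage settings, troubleshoot, or generate images directly.
84
+
85
+ The web page lets ordinary users do the following:
86
+
87
+ - Configure multiple providers and choose the default provider.
88
+ - Save and view API keys; keys are masked by default, and the user can click the eye icon on the local page to reveal the full key.
89
+ - See where a key comes from: a terminal environment variable, the current project's `.env`, or the PicGen-managed file `~/.picgen/.env`.
90
+ - When a provider of an existing type is added, PicGen assigns it a separate key name by default, so that multiple channels do not overwrite the same API key.
91
+ - If an older configuration has multiple providers sharing one key name, the new version automatically migrates it to one key per channel when loading the configuration and copies the existing key value.
92
+ - Preview the generation plan and generate images after confirming.
93
+ - Browse the history and local save paths to find previously generated images.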
94
+
95
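+ ## 4. Configure a Provider
96
+
97
+ If interactive terminal choices cannot be shown to the user, do not run `picgen setup`, which would block. Instead, ask in chat for:
98
+
99
+ - Provider type: Gemini or OpenAI-compatible.
100
+ - Provider host: the host only, for example `https://www.pandai.vip`; do not append `/v1` or `/v1beta`.
101
+ - API key: ask the user to copy it to the clipboard and then reply "copied" ("已复制"). Do not let the user send the key directly in chat.
102
+
103
+ Gemini third-party channel: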
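Users often paste a host with an API path suffix attached. As a hedged sketch of the host rule in the list above, the hypothetical helper `normalize_host` strips a trailing slash and a trailing `/v1` or `/v1beta` segment before the value is passed to `--host`. It is an illustration only; whether PicGen normalizes the host itself is not stated here.

```shell
#!/bin/sh
# Sketch: reduce a pasted provider URL to the bare host expected by --host.
# normalize_host is a hypothetical helper, not part of the PicGen CLI.

normalize_host() {
  host="${1%/}"  # drop one trailing slash, if any
  case "$host" in
    */v1beta) host="${host%/v1beta}" ;;
    */v1)     host="${host%/v1}" ;;
  esac
  echo "$host"
}

normalize_host "https://www.pandai.vip/v1beta/"
```

The cleaned value can then be used in place of the raw input, for example `--host "$(normalize_host "$raw_host")"`, where `raw_host` is a placeholder for whatever the user supplied.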
104
+
105
+ ```bash
106
+ picgen provider quick-add gemini-proxy --host https://www.pandai.vip --prefer
107
+ picgen key set PICGEN_GEMINI_PROXY_KEY --clipboard
108
+ picgen key show PICGEN_GEMINI_PROXY_KEY --json
109
+ picgen provider test gemini_proxy --json
110
+ ```
111
+
112
+ OpenAI-compatible third-party channel:
113
+
114
+ ```bash
115
+ picgen provider quick-add openai-proxy --host https://www.pandai.vip --prefer
116
+ picgen key set PICGEN_OPENAI_PROXY_KEY --clipboard
117
+ picgen key show PICGEN_OPENAI_PROXY_KEY --json
118
+ picgen provider test openai_proxy --json
119
+ ```
120
+
121
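+ If the clipboard cannot be read, ask the user to run `picgen key set <ENV_NAME>` and paste the key into the hidden input prompt; alternatively, write it through stdin when the runtime supports that safely. Do not put the API key into shell history.
122
+
123
+ When explaining key inspection, use this wording: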
124
+
125
+ ```text
126
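+ In this conversation I only read the masked key status, not the full secret. To view or edit the complete configuration, open ~/.picgen/.env; a .env in the current project directory may override it; shell environment variables have the highest priority.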
127
+ ```
128
+
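The precedence described in the wording above (shell environment variable first, then the project `.env`, then `~/.picgen/.env`) can be illustrated with a short sketch. The helper names are hypothetical and the lookup is a simplified model that assumes plain `NAME=value` lines; it is not PicGen's loader, and it only reports where a key would come from without printing its value.

```shell
#!/bin/sh
# Sketch: report which source would supply a key, following the stated
# precedence. Assumes plain NAME=value lines; not PicGen's actual loader.

# Return 0 if file $2 defines variable $1.
env_file_has() {
  [ -f "$2" ] && grep -q "^$1=" "$2"
}

# Print the winning source: shell, project, managed, or none.
key_source() {
  name="$1"; project_env="$2"; managed_env="$3"
  if printenv "$name" >/dev/null 2>&1; then
    echo "shell"
  elif env_file_has "$name" "$project_env"; then
    echo "project"
  elif env_file_has "$name" "$managed_env"; then
    echo "managed"
  else
    echo "none"
  fi
}

key_source PICGEN_GEMINI_PROXY_KEY ./.env "${HOME:-}/.picgen/.env"
```

For an authoritative answer, rely on the masked output of `picgen key show <ENV_NAME> --json` rather than on this model.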
129
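+ ## 5. First lightweight test
130
+
131
+ The first test only verifies that the tool and the channel work, so it should be low-cost, fast, and limited to 1 image. Do not use `poster`, `product-shot`, or `social-cover`, and do not use premium / large / high or multi-image plans.
132
+
133
+ For a Gemini provider, prefer the flash image model for the first test: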
134
+
135
+ ```bash
136
+ picgen create --dry-run --provider gemini_proxy --preset fast-draft --model gemini-3.1-flash-image-preview "一张简洁的 PicGen 测试图,白色背景,少量蓝绿色科技感点缀"
137
+ ```
138
+
139
+ First test for an OpenAI-compatible provider:
140
+
141
+ ```bash
142
+ picgen create --dry-run --provider openai_proxy --preset fast-draft "一张简洁的 PicGen 测试图,白色背景,少量蓝绿色科技感点缀"
143
+ ```
144
+
145
+ Show the dry-run preview to the user. After the user confirms, run the same command with `--yes` in place of `--dry-run`.
146
+
147
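+ ## 6. After a successful test
148
+
149
+ Once a provider is configured and tested successfully, proactively ask the user whether they want to add a fallback channel: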
150
+
151
+ ```text
152
+ The current channel is configured and tested successfully. You can also add another channel as a fallback. Do you want to continue?
153
+ ```
154
+
155
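+ If the user wants to continue, repeat the provider configuration and lightweight test flow. If the user declines, finish the configuration and tell the user that PicGen is ready to use.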
@@ -5,19 +5,25 @@ This checklist is for the first internal or friend-and-colleague trial of PicGen
5
5
  ## Install
6
6
 
7
7
  ```bash
8
+ node -v
9
+ npm -v
8
10
  npm install -g @ai-agent-tools/picgen
11
+ npx -y skills add ai-agent-tools/picgen --skill picgen -g -y --copy
12
+ picgen skill install codex
9
13
  picgen --help
10
14
  picgen quickstart
11
15
  ```
12
16
 
13
17
  Node.js 20 or newer is required.
14
18
 
19
+ `npx -y skills add ...` is the preferred cross-agent skill installation path when supported. `picgen skill install codex` installs the bundled PicGen skill into `~/.codex/skills/picgen` as a Codex-only fallback. Restart the agent or start a new session if the skill is not visible yet.
20
+
15
21
  ## Agent Prompt
16
22
 
17
23
  Send this to Codex, Trae, Claude Code, or a similar coding agent:
18
24
 
19
25
  ```text
20
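- Please install and try @ai-agent-tools/picgen: install it globally with npm install -g @ai-agent-tools/picgen, run picgen setup to configure it, then do a dry-run preview first and confirm before generating one test image. If I want to use reference images, use --reference <image path>.
26
+ Please help me install and configure the PicGen image generation tool. First read this guide and follow it: https://raw.githubusercontent.com/ai-agent-tools/picgen/refs/heads/main/docs/agent-install.md . You are responsible for determining whether you are in my persistent local environment, installing the CLI and skill, guiding me through configuring the provider/API key, and previewing the generation plan first, then generating a test image only after I confirm. Do not make me understand command details, and do not ask me to send the API key in chat.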
21
27
  ```
22
28
 
23
29
  ## First Run
@@ -40,7 +46,27 @@ https://generativelanguage.googleapis.com
40
46
 
41
47
  Do not include `/v1` or `/v1beta`.
42
48
 
43
- 4. Set API keys in the shell or a local `.env` file:
49
+ 4. Configure API keys:
50
+
51
+ For non-technical users, prefer `picgen setup`. It can save provider API keys in PicGen's managed env file:
52
+
53
+ ```text
54
+ ~/.picgen/.env
55
+ ```
56
+
57
+ PicGen loads this file automatically.
58
+
59
+ Agents should inspect keys with `picgen key list/show`, which only prints masked status. If a technical user needs the complete saved value, point them to `~/.picgen/.env`; a project `.env` may override it, and shell environment variables have highest priority.
60
+
61
+ In agent environments where interactive terminal prompts are not visible, ask the user for provider type, host, and API key in chat, then use non-interactive commands. Example for a Gemini-compatible third-party channel:
62
+
63
+ ```bash
64
+ picgen provider quick-add gemini-proxy --host https://www.pandai.vip --prefer
65
+ picgen key set PICGEN_GEMINI_PROXY_KEY --stdin
66
+ picgen provider test gemini_proxy --json
67
+ ```
68
+
69
+ Advanced users can still use shell environment variables or a local project `.env`:
44
70
 
45
71
  ```bash
46
72
  cp .env.example .env
@@ -57,12 +83,18 @@ GEMINI_API_KEY=...
57
83
  picgen doctor --json
58
84
  ```
59
85
 
86
+ 6. Check whether a newer PicGen version is available:
87
+
88
+ ```bash
89
+ picgen update check
90
+ ```
91
+
60
92
  ## Safe Preview
61
93
 
62
94
  Always start with dry-run:
63
95
 
64
96
  ```bash
65
- picgen create --dry-run "一张极简科技感产品海报"
97
+ picgen create --dry-run --preset fast-draft "一张简洁的 PicGen 测试图"
66
98
  ```
67
99
 
68
100
  Dry-run does not call providers and does not spend quota.
@@ -137,6 +169,14 @@ picgen provider test <provider-name> --json
137
169
 
138
170
  Check `base_url`, API key, model name, and provider availability.
139
171
 
172
+ `Provider request timed out`
173
+
174
+ High quality, large, or slow third-party image channels can take longer. Try again, use a faster preset, or raise the request timeout:
175
+
176
+ ```bash
177
+ PICGEN_PROVIDER_TIMEOUT_MS=450000 picgen create --yes --preset poster "<prompt>"
178
+ ```
179
+
140
180
  `No enabled provider can satisfy...`
141
181
 
142
182
  Run `picgen provider list`, enable a provider, add a fallback provider, or adjust the selected mode/model.
@@ -157,3 +197,9 @@ Publish when ready:
157
197
  ```bash
158
198
  npm publish --otp <code>
159
199
  ```
200
+
201
+ After publishing, ask trial users to upgrade with:
202
+
203
+ ```bash
204
+ npm install -g @ai-agent-tools/picgen@latest
205
+ ```
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@ai-agent-tools/picgen",
3
- "version": "0.1.0-alpha.0",
3
+ "version": "0.1.0-alpha.10",
4
4
  "description": "A lightweight image generation connector for AI agents.",
5
5
  "type": "module",
6
6
  "bin": {
@@ -18,11 +18,13 @@
18
18
  "README.md"
19
19
  ],
20
20
  "scripts": {
21
- "build": "tsup",
21
+ "build": "npm run sync-version && tsup",
22
22
  "dev": "tsx src/cli.ts",
23
23
  "typecheck": "tsc --noEmit",
24
24
  "lint": "eslint .",
25
- "test": "vitest run"
25
+ "test": "vitest run",
26
+ "sync-version": "node scripts/sync-version.mjs",
27
+ "version": "npm run sync-version && git add src/version.ts package.json package-lock.json"
26
28
  },
27
29
  "keywords": [
28
30
  "agent",
@@ -17,10 +17,80 @@ Only suggest PicGen when the user discusses visual direction, mood, brand style,
17
17
 
18
18
  Never silently spend user quota. Do not send full conversation context to providers by default; summarize only the visual details needed for the final prompt.
19
19
 
20
+ ## Installation
21
+
22
+ If `picgen` is not available, install the CLI first:
23
+
24
+ ```bash
25
+ npm install -g @ai-agent-tools/picgen@latest
26
+ ```
27
+
28
+ To install or update this skill for supported agents, prefer the standard Skills installer:
29
+
30
+ ```bash
31
+ npx -y skills add ai-agent-tools/picgen --skill picgen -g -y --copy
32
+ ```
33
+
34
+ For Codex only, the CLI also provides a direct fallback:
35
+
36
+ ```bash
37
+ picgen skill install codex --force
38
+ ```
39
+
40
+ After installing or updating a skill, the user may need to restart the agent or open a new session before the skill is visible.
41
+
42
+ ## Environment Check
43
+
44
+ Before installing the CLI, installing skills, or saving API keys, check whether the current terminal is the user's persistent local environment or a sandbox/temporary environment.
45
+
46
+ Use lightweight checks such as:
47
+
48
+ ```bash
49
+ pwd
50
+ echo "$HOME"
51
+ node -v
52
+ npm -v
53
+ npm root -g
54
+ which picgen || true
55
+ ```
56
+
57
+ If the environment appears persistent and user-level, continue setup there.
58
+
59
+ If the environment appears sandboxed, remote, temporary, reset between sessions, or unable to write to the user's real home directory, do not install PicGen, install skills, or save API keys there. First request permission to use the user's host/persistent terminal or a persistent user-level environment. Example user-facing wording:
60
+
61
+ ```text
62
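+ I detected that the current environment may be a temporary sandbox. PicGen needs to be installed in your persistent local environment; otherwise a new session may lose the CLI, the skill, or the API key. Please allow me to continue installation and configuration in your persistent local environment; if you cannot grant that, I will give you a minimal set of commands to run in your local terminal.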
63
+ ```
64
+
65
+ If permission is granted, continue installation and provider setup in that persistent environment. If permission is unavailable or denied, give the user a minimal copy-paste command sequence for their local terminal. Keep manual instructions short and explain only what the user must do.
66
+
67
+ ## Local Web Interface
68
+
69
+ If the user says "open PicGen", "打开 PicGen", "打开生图工具", asks for a visual settings page, wants to generate without learning CLI commands, or wants to find previously generated images, start the local web interface:
70
+
71
+ ```bash
72
+ picgen open
73
+ ```
74
+
75
+ Return the printed local URL to the user. PicGen binds to `127.0.0.1`, defaults to port `8188`, and automatically tries the next ports if needed. The server runs in the foreground; the user or agent should keep the terminal running while the page is in use and close it with Ctrl+C when finished.
76
+
77
+ If the current environment appears sandboxed, remote, or temporary, first request access to the user's persistent local environment before running `picgen open`. If that is impossible, give the user one short instruction to run `picgen open` in their local terminal and open the printed URL.
78
+
79
+ The local page may show full API keys only inside the user's browser after an explicit reveal action. In chat, inspect keys only with masked commands such as:
80
+
81
+ ```bash
82
+ picgen key list --json
83
+ picgen key show PICGEN_GEMINI_PROXY_KEY --json
84
+ ```
85
+
86
+ Use the web interface for user-facing setup, provider management, generation, and history browsing. Use CLI commands for agent-driven dry-runs, automation, diagnostics, and precise reproducible steps.
87
+
88
+ For first-time agent-assisted setup, prefer the clipboard-based CLI flow because it keeps the user in conversation and avoids extra UI switching. Prefer the web interface when the user wants to manage multiple providers, inspect masked key sources, reveal a full key locally, generate images without CLI commands, or find saved image history.
89
+
20
90
  ## Workflow
21
91
 
22
92
  1. Run `picgen doctor --json` to check configuration.
23
- 2. If no usable provider is configured, guide the user to run `picgen setup`.
93
+ 2. If no usable provider is configured, configure one before generation.
24
94
  3. Choose a preset from the user's intent, such as `poster`, `product-shot`, or `social-cover`.
25
95
  4. Run `picgen create --dry-run --preset <preset> "<prompt>"`.
26
96
  5. Present the dry-run as a user-facing generation preview. Do not expose `dry-run` as a technical term unless useful.
@@ -29,6 +99,47 @@ Never silently spend user quota. Do not send full conversation context to provid
29
99
 
30
100
  If the user explicitly says to generate directly or not ask for confirmation, you may skip the user-facing confirmation step. Still form a generation plan internally.
31
101
 
102
+ ## Provider Setup
103
+
104
+ When terminal prompts are visible to the user, `picgen setup` is acceptable.
105
+
106
+ When running inside an agent environment where interactive terminal prompts are not visible, do not run `picgen setup` as a blocking wizard. Ask the user for:
107
+
108
+ - Provider type: Gemini or OpenAI-compatible.
109
+ - Provider host: host only, such as `https://www.pandai.vip`; do not include `/v1` or `/v1beta`.
110
+ - API key. Prefer asking the user to copy the key to their clipboard and reply "copied"; avoid asking them to paste secrets into chat.
111
+
112
+ Then use non-interactive commands.
113
+
114
+ Gemini-compatible third-party channel:
115
+
116
+ ```bash
117
+ picgen provider quick-add gemini-proxy --host https://www.pandai.vip --prefer
118
+ picgen key set PICGEN_GEMINI_PROXY_KEY --clipboard
119
+ picgen provider test gemini_proxy --json
120
+ ```
121
+
122
+ OpenAI-compatible third-party channel:
123
+
124
+ ```bash
125
+ picgen provider quick-add openai-proxy --host https://www.pandai.vip --prefer
126
+ picgen key set PICGEN_OPENAI_PROXY_KEY --clipboard
127
+ picgen provider test openai_proxy --json
128
+ ```
129
+
130
+ If clipboard access is unavailable, pass the API key through stdin for `picgen key set`; do not put secrets directly in shell history unless the user explicitly accepts that tradeoff. If the agent runtime cannot pass stdin safely, ask the user to run `picgen key set <ENV_NAME>` in their terminal and paste the key into the hidden prompt.
131
+
132
+ To inspect configured keys without revealing secret values:
133
+
134
+ ```bash
135
+ picgen key list --json
136
+ picgen key show PICGEN_GEMINI_PROXY_KEY --json
137
+ ```
138
+
139
+ These commands show source, length, masked preview, and fingerprint only. Never ask the user to paste a key into chat just to verify it.
140
+
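To make the idea of a masked preview concrete, here is a minimal sketch. The masking format (first 4 and last 4 characters, stars in between) and the helper name `mask_key` are assumptions for illustration; the actual output format of `picgen key show` is not specified here. The sample value is a made-up placeholder; never pass a real key as a literal argument, since that would land in shell history.

```shell
#!/bin/sh
# Sketch: build a masked preview of a secret without revealing the middle.
# The 4+4 format is an assumption, not PicGen's documented output.

mask_key() {
  key="$1"
  if [ "${#key}" -le 8 ]; then
    # Too short to show any part safely.
    echo "********"
  else
    head4="$(printf '%s' "$key" | cut -c1-4)"
    tail4="$(printf '%s' "$key" | tail -c 4)"
    echo "${head4}****${tail4}"
  fi
}

# Placeholder value, not a real key.
mask_key "sk-example-1234567890"
```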
141
+ When explaining key inspection to users, say: "In this conversation I only read masked key status, not the full secret. If you need to inspect or edit the complete saved key yourself, PicGen's managed key file is `~/.picgen/.env`; a project-level `.env` in the current directory may override it; shell environment variables take highest priority."
142
+
32
143
  For reference-image generation, pass local images with repeated `--reference <path>` flags:
33
144
 
34
145
  ```bash
@@ -40,6 +151,36 @@ Use Gemini providers for reference-image generation in Alpha. The OpenAI-compati
40
151
 
41
152
  PicGen routes by provider capabilities. When reference images are provided, agents may omit `--provider` and let PicGen select a provider that supports `reference-image`, unless the user explicitly requested a provider.
42
153
 
154
+ ## First Smoke Test
155
+
156
+ After configuring a provider, run the first test generation with a low-cost, fast, one-image plan. Do not use `poster`, `product-shot`, `social-cover`, premium modes, large sizes, or multi-image presets for initial verification.
157
+
158
+ For Gemini providers, prefer the flash image model for the first test:
159
+
160
+ ```bash
161
+ picgen create --dry-run --provider gemini_proxy --preset fast-draft --model gemini-3.1-flash-image-preview "一张简洁的 PicGen 测试图,白色背景,少量蓝绿色科技感点缀"
162
+ picgen create --yes --provider gemini_proxy --preset fast-draft --model gemini-3.1-flash-image-preview "一张简洁的 PicGen 测试图,白色背景,少量蓝绿色科技感点缀"
163
+ ```
164
+
165
+ For OpenAI-compatible providers:
166
+
167
+ ```bash
168
+ picgen create --dry-run --provider openai_proxy --preset fast-draft "一张简洁的 PicGen 测试图,白色背景,少量蓝绿色科技感点缀"
169
+ picgen create --yes --provider openai_proxy --preset fast-draft "一张简洁的 PicGen 测试图,白色背景,少量蓝绿色科技感点缀"
170
+ ```
171
+
172
+ Present the dry-run preview and ask for confirmation before the real generation unless the user explicitly asked to generate immediately. The first smoke test should generate one image.
173
+
174
+ ## After Provider Success
175
+
176
+ After one provider is configured and the first smoke test succeeds, tell the user the provider is ready and ask whether they want to add another channel as a fallback. Example:
177
+
178
+ ```text
179
+ The Gemini channel is configured and tested successfully. You can also add another channel as a fallback, for example OpenAI-compatible. Do you want to continue?
180
+ ```
181
+
182
+ If the user says yes, repeat provider setup and smoke testing for the next channel. If the user says no, stop setup and tell them PicGen is ready to use.
183
+
43
184
  ## Preferences and Overrides
44
185
 
45
186
  Treat `picgen create` flags as one-off overrides. They must not change user preferences:
@@ -73,9 +214,9 @@ PicGen redacts generated image payloads and Gemini thought signatures from metad
73
214
 
74
215
  ## Error Handling
75
216
 
76
- If `doctor` reports no usable provider, ask the user to run `picgen setup`.
217
+ If `doctor` reports no usable provider, configure a provider. Prefer non-interactive setup in agent environments.
77
218
 
78
- If an API key is missing, name the required environment variable.
219
+ If an API key is missing, save it with `picgen key set <ENV_NAME> --clipboard` or `picgen key set <ENV_NAME> --stdin`, or guide the user to run `picgen setup` when interactive prompts are visible. Name the required environment variable only when useful for debugging.
79
220
 
80
221
  If a provider is disabled, suggest enabling it or using a one-off provider override.
81
222
 
@@ -94,14 +235,14 @@ Explicit:
94
235
  Confirmation:
95
236
 
96
237
  ```text
97
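- I can use PicGen to generate a key visual based on the current plan. Do you want me to generate it now? By default I will use the poster preset and produce 2 images.
238
+ I can first use PicGen for a lightweight test generation, producing only 1 image by default, to confirm that the tool and channel both work. Shall I start now?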
98
239
  ```
99
240
 
100
241
  Generation preview:
101
242
 
102
243
  ```text
103
244
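  Generation preview:
104
- I will use the official OpenAI channel to generate 2 launch-event posters at a 3:4 aspect ratio and save them locally.
245
+ I will use the current channel to generate 1 lightweight test image and save it locally.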
105
246
 
106
247
  Generation starts after you confirm.
107
248
  ```