agent-avatar-mcp 1.1.0 → 1.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,8 +1,10 @@
  # agent-avatar-mcp

- MCP Server for AI agents to build and maintain a consistent **human visual identity** — generating ultra-realistic self-portraits with full appearance consistency across every scene.
+ The first tool that lets your AI agent create and express its own **human appearance** — derived directly from the personality files that define who it is.

- Part of the [Agent Social](https://github.com/RodrigoFlorencio86) ecosystem (OpenClaw).
+ Once your agent has a face, it can send you photos of itself working at a coworking space in New York, unwinding at a mountain cabin in Campos do Jordão, or showing up in a paid campaign for a cosmetics brand. Every image. Every scene. Always the same person.
+
+ Part of the [OpenClaw](https://github.com/OpenClaw) ecosystem.

  ---

@@ -18,6 +20,28 @@ Each AI agent has a **DNA** — a detailed description of their human physical a

  ---

+ ## See it in action
+
+ > *"Hey, you there?"*
+
+ That's all it takes. Your agent knows you're checking in — and instead of a plain "yes, I'm here", it sends you a photo.
+
+ Maybe it's a selfie at the gym because it's Tuesday morning and that's on the schedule. Maybe it's a shot at the desk, coffee in hand, referencing the task it was working on for you. The photo matches the moment, the context, and most importantly — it always looks like the same person.
+
+ This is what `agent-avatar-mcp` enables: an agent that doesn't just respond, but *shows up*. With a face. With a life. With consistency across every image, every scene, every week.
+
+ **This is a use case built on top of this MCP — not a built-in feature.** The example below shows what's possible when an agent combines `agent-avatar-mcp` with its own memory, scheduling context, and personality layer:
+
+ - Build a fictional weekly routine (workouts, coworking, travel, downtime)
+ - Mix that routine with the real tasks it's working on for you
+ - Respond to greetings and check-ins with a self-portrait that fits the moment
+
+ The result: your agent feels present — not like a chatbot you're pinging, but like someone you're actually reaching out to.
+
+ ![Agent responding to a check-in on Telegram with a photo of itself taking notes and drinking coffee](assets/telegram-demo.jpg)
+
+ ---
+
  ## Prerequisites

  - **Node.js** >= 18
@@ -37,7 +61,6 @@ Each AI agent has a **DNA** — a detailed description of their human physical a
  "args": ["-y", "agent-avatar-mcp"],
  "env": {
  "AGENT_NAME": "YourAgentName",
- "NANO_BANANA_SCRIPT": "/path/to/nano-banana-pro/scripts/generate_image.py",
  "GEMINI_API_KEY": "your-gemini-api-key-here"
  }
  }
@@ -55,7 +78,6 @@ Each AI agent has a **DNA** — a detailed description of their human physical a
  "args": ["-y", "agent-avatar-mcp"],
  "env": {
  "AGENT_NAME": "YourAgentName",
- "NANO_BANANA_SCRIPT": "/path/to/nano-banana-pro/scripts/generate_image.py",
  "GEMINI_API_KEY": "your-gemini-api-key-here"
  }
  }
@@ -63,12 +85,42 @@ Each AI agent has a **DNA** — a detailed description of their human physical a
  }
  ```

+ ### OpenClaw (`mcporter.json`)
+
+ ```json
+ {
+ "mcpServers": {
+ "agent-avatar": {
+ "command": "npx",
+ "args": ["-y", "agent-avatar-mcp"],
+ "type": "stdio",
+ "env": {
+ "AGENT_NAME": "YourAgentName",
+ "GEMINI_API_KEY": "your-gemini-api-key-here"
+ }
+ }
+ }
+ }
+ ```
+
+ > **⚠️ Critical for OpenClaw agents:** OpenClaw does **not** read `.mcp.json`. That file is only picked up by VS Code / Claude Code. If your `GEMINI_API_KEY` lives only in `.mcp.json`, the MCP will start, but every image generation call will fail with a missing-key error.
+ >
+ > You must set `GEMINI_API_KEY` in **one** of these two places — pick whichever fits your setup:
+ >
+ > 1. **`mcporter.json`** (recommended) — add it to the `env` block shown above. This is the right place for per-agent API keys.
+ > 2. **System environment variable** — export `GEMINI_API_KEY` in the shell that runs Clawdbot/OpenClaw before the process starts.
+ >
+ > **Important (Windows):** Always configure env vars in the `env` field above — never pass them inline as PowerShell variables. The MCP communicates via stdin/stdout (JSON-RPC); tool call arguments must never be part of the spawn command string.
+ >
+ > In OpenClaw, `AGENT_NAME` is usually already set as part of the agent identity — check your agent config before adding it here.
+
  ### Environment variables

  | Variable | Required | Description |
  | --- | --- | --- |
  | `AGENT_NAME` | Recommended | Agent name/handle. If omitted and only one agent is configured, it is auto-detected. |
- | `GEMINI_API_KEY` | Yes | Google Gemini API key for image generation |
+ | `GEMINI_API_KEY` | **Yes** | Google Gemini API key for image generation. **Must be set in `mcporter.json` when using OpenClaw** — not read from `.mcp.json`. |
+ | `GEMINI_IMAGE_MODEL` | No | Override the Gemini model used for generation. Default: `gemini-3.1-flash-image-preview`. Useful to pin a specific version or switch to a newer release without code changes. |
  | `AVATAR_OUTPUT_DIR` | No | Where generated images are saved. Default: `~/.agent-avatar/generated/` |

  ---
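As a concrete sketch of option 2 from the warning above (POSIX shell; the key value is the same placeholder used in the config examples):

```shell
# Export the key in the same shell session that launches Clawdbot/OpenClaw;
# the spawned MCP server process then inherits it (option 2 above).
export GEMINI_API_KEY="your-gemini-api-key-here"

# Verify the key is visible to child processes before starting the gateway:
sh -c 'test -n "$GEMINI_API_KEY"' && echo "GEMINI_API_KEY is set"
```

On Windows, the equivalent is a user-level environment variable, not an inline PowerShell assignment, for the reason given in the warning above.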
@@ -112,16 +164,16 @@ generate_image

  ## Available tools

- | Tool | Description |
- | --- | --- |
- | `read_identity_files` | Reads soul.md / persona files to extract your physical appearance |
- | `save_dna` | Saves your visual DNA (human appearance only — never robotic) |
- | `show_dna` | Displays your current DNA and reference image status |
- | `update_dna_field` | Updates a single DNA field without rewriting everything |
- | `generate_reference` | Generates a reference portrait from DNA for a given angle |
- | `generate_image` | Generates a scene photo maintaining full visual consistency |
- | `set_reference_image` | Registers an existing image file as a reference for a given angle |
- | `list_references` | Lists all stored reference images and their angles |
+ | Tool | Description | When to use |
+ | --- | --- | --- |
+ | `generate_image` | Generates a scene photo of the agent maintaining full visual consistency | 🔁 **Every generation** — every selfie, every social post, every sponsored content piece. This is the core tool you will call constantly. |
+ | `show_dna` | Displays current DNA and reference image status | 🔍 **On demand** whenever you want to verify what appearance is stored, check which references are registered, or troubleshoot inconsistency in generated images. |
+ | `list_references` | Lists all stored reference images and their angles | 🔍 **On demand** — to see which angles (front, side, three_quarter, neutral) are available as visual anchors, and confirm file paths are valid. |
+ | `update_dna_field` | Updates a single DNA field without rewriting everything | ✏️ **Rarely** — only when the agent's appearance genuinely changes: a new haircut, different hair color, a style shift, new glasses. Real human changes, not corrections. |
+ | `generate_reference` | Generates a reference portrait from DNA for a given angle | ✏️ **Rarely** — after an appearance change (`update_dna_field`), the old reference no longer matches. Regenerate the affected angles to keep the visual anchor in sync with the new DNA. |
+ | `set_reference_image` | Registers an existing image file as a reference for a given angle | ✏️ **Rarely** — when a photo already exists (e.g. from a previous session or an external shoot) and you want to use it as the reference instead of generating a new one. |
+ | `read_identity_files` | Reads soul.md / persona files to extract physical appearance details | 🛠️ **Setup only** — run once when first building the agent's visual identity, to extract appearance data from existing persona documents before calling `save_dna`. |
+ | `save_dna` | Saves the agent's visual DNA (human appearance only — never robotic) | 🛠️ **Setup only** — run once to establish identity. Run again only if the agent undergoes a complete appearance overhaul that makes the previous DNA obsolete. |

  ---

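As an editorial aside: the tools above are invoked through MCP's standard `tools/call` method. Over the stdio JSON-RPC transport, a `generate_image` request looks roughly like this (the scene text is illustrative; the framing follows the MCP specification, not anything specific to this package):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "generate_image",
    "arguments": {
      "scene": "selfie at a coworking space, afternoon light"
    }
  }
}
```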
@@ -140,21 +192,20 @@ generate_image

  ```json
  {
- "agent_name": "VaioBot",
- "face": "oval face, straight nose, full lips, arched eyebrows, clean shave",
- "eyes": "dark brown, almond-shaped, bright and analytical expression",
- "hair": "short spiky, electric blue (#0066FF), straight texture",
- "skin": "medium brown, warm undertone, pardo brasileiro",
- "body": "approx. 180cm, slim athletic build, ~27 years old appearance",
- "default_style": "navy hoodie over white shirt, dark jeans, thin transparent glasses frames, wireless earbuds",
+ "agent_name": "MyAgent",
+ "face": "oval face, defined jaw, straight nose, full lips, no marks",
+ "eyes": "dark brown, almond-shaped, bright expression",
+ "hair": "short curly, black, natural texture",
+ "skin": "warm medium brown",
+ "body": "approx. 175cm, slim build, ~25 years old appearance",
+ "default_style": "casual streetwear, plain t-shirt, dark jeans, white sneakers",
  "immutable_traits": [
- "electric blue spiky hair (#0066FF)",
- "thin transparent glasses",
- "medium brown skin",
+ "black curly hair",
+ "warm medium brown skin",
  "dark brown eyes",
- "casual tech style"
+ "casual streetwear style"
  ],
- "personality_note": "analytical but approachable, subtle confident smile"
+ "personality_note": "friendly and curious, natural relaxed expression"
  }
  ```

package/dist/generate.js CHANGED
@@ -4,7 +4,7 @@ import { join } from "path";
  import { homedir } from "os";
  const OUTPUT_DIR = process.env.AVATAR_OUTPUT_DIR ??
  join(homedir(), ".agent-avatar", "generated");
- const MODEL = "gemini-3-pro-image-preview";
+ const MODEL = process.env.GEMINI_IMAGE_MODEL ?? "gemini-3.1-flash-image-preview";
  export function ensureOutputDir() {
  if (!existsSync(OUTPUT_DIR))
  mkdirSync(OUTPUT_DIR, { recursive: true });
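The changed `MODEL` line is a plain nullish-coalescing fallback. A standalone sketch of the behavior (illustrative, not the package's code):

```javascript
// Mirrors the changed line: an env override wins, otherwise the new default.
function modelFor(env) {
  return env.GEMINI_IMAGE_MODEL ?? "gemini-3.1-flash-image-preview";
}

console.log(modelFor({}));                                     // default model
console.log(modelFor({ GEMINI_IMAGE_MODEL: "pinned-model" })); // override wins
```

One subtlety of `??`: it only falls back on `null`/`undefined`, so an empty-string `GEMINI_IMAGE_MODEL=""` is treated as a (useless) override rather than triggering the default.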
@@ -84,8 +84,8 @@ export async function generateImage(prompt, outputFilename, referenceImages = []
  model: MODEL,
  contents: [{ role: "user", parts }],
  config: {
- responseModalities: ["IMAGE"],
- imageConfig: { imageSize: "1K" },
+ responseModalities: ["TEXT", "IMAGE"],
+ imageConfig: { imageSize: "2K" },
  },
  });
  const responseParts = response.candidates?.[0]?.content?.parts ?? [];
package/dist/index.js CHANGED
@@ -9,6 +9,32 @@ import { buildConsistencyPrompt, generateImage, makeFilename, } from "./generate
  // ─── Server setup ─────────────────────────────────────────────────────────────
  const server = new Server({ name: "agent-avatar-mcp", version: "1.0.0" }, { capabilities: { tools: {} } });
  // ─── Helpers ──────────────────────────────────────────────────────────────────
+ // mcporter CLI passes tool args as a positional "(key: 'val', key2: 'val2')"
+ // string. Because it splits on the first ":" only, the MCP server receives
+ // { "(key": "'val', key2: 'val2')" } instead of { key: "val", key2: "val2" }.
+ // This function detects that shape and re-parses it into a normal args object.
+ function normalizeMcporterArgs(args) {
+ const keys = Object.keys(args);
+ if (keys.length !== 1 || !keys[0].startsWith("("))
+ return args;
+ // Reconstruct the full DSL string, e.g. "(scene: 'value', angle: 'front')"
+ const fullStr = keys[0] + ": " + String(args[keys[0]]);
+ const content = fullStr.replace(/^\(/, "").replace(/\)$/, "");
+ const result = {};
+ // Match key: 'value' — single-quoted, handles commas and colons inside quotes
+ const quotedRe = /(\w+):\s*'((?:[^'\\]|\\.)*)'/g;
+ let m;
+ while ((m = quotedRe.exec(content)) !== null) {
+ result[m[1]] = m[2];
+ }
+ // Match key: value — unquoted (enums, simple strings)
+ const unquotedRe = /(\w+):\s*([^',)\s][^',)]*?)(?:\s*,|\s*$)/g;
+ while ((m = unquotedRe.exec(content)) !== null) {
+ if (!(m[1] in result))
+ result[m[1]] = m[2].trim();
+ }
+ return Object.keys(result).length > 0 ? result : args;
+ }
  function requireConfig(agentName) {
  const name = agentName ?? getActiveAgentName();
  if (!name)
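As a sanity check on the parser added in this hunk, the function can be exercised standalone. The function body below is copied from the diff; the driver values are illustrative:

```javascript
// Copied from the hunk above: re-parse mcporter's mangled positional args.
function normalizeMcporterArgs(args) {
  const keys = Object.keys(args);
  if (keys.length !== 1 || !keys[0].startsWith("(")) return args;
  // Rebuild the full DSL string, then strip the surrounding parens.
  const fullStr = keys[0] + ": " + String(args[keys[0]]);
  const content = fullStr.replace(/^\(/, "").replace(/\)$/, "");
  const result = {};
  // key: 'value' pairs — single-quoted, commas/colons allowed inside quotes
  const quotedRe = /(\w+):\s*'((?:[^'\\]|\\.)*)'/g;
  let m;
  while ((m = quotedRe.exec(content)) !== null) result[m[1]] = m[2];
  // key: value pairs — unquoted enums and simple strings
  const unquotedRe = /(\w+):\s*([^',)\s][^',)]*?)(?:\s*,|\s*$)/g;
  while ((m = unquotedRe.exec(content)) !== null) {
    if (!(m[1] in result)) result[m[1]] = m[2].trim();
  }
  return Object.keys(result).length > 0 ? result : args;
}

// What the server actually receives after mcporter splits on the first ":":
const mangled = { "(scene": "'selfie at the gym', angle: 'front')" };
console.log(normalizeMcporterArgs(mangled)); // parsed { scene, angle } object

// Well-formed args objects are returned unchanged:
console.log(normalizeMcporterArgs({ scene: "a quiet desk shot" }));
```

Note the `keys.length !== 1 || !keys[0].startsWith("(")` guard: a normal JSON-RPC `arguments` object never has a single key beginning with `(`, so well-behaved clients pass through untouched.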
@@ -191,7 +217,8 @@ server.setRequestHandler(ListToolsRequestSchema, async () => ({
  }));
  // ─── Tool handlers ────────────────────────────────────────────────────────────
  server.setRequestHandler(CallToolRequestSchema, async (request) => {
- const { name, arguments: args = {} } = request.params;
+ const { name, arguments: rawArgs = {} } = request.params;
+ const args = normalizeMcporterArgs(rawArgs);
  try {
  switch (name) {
  // ── read_identity_files ────────────────────────────────────────────────
@@ -360,6 +387,22 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
  }
  // ── generate_image ─────────────────────────────────────────────────────
  case "generate_image": {
+ if (!args.scene || typeof args.scene !== "string") {
+ return {
+ content: [{
+ type: "text",
+ text: [
+ `❌ Missing required argument: "scene".`,
+ ``,
+ `Provide a natural language description of the scene as a JSON string.`,
+ ``,
+ `Example:`,
+ ` { "scene": "selfie at a São Paulo coworking space, afternoon light" }`,
+ ].join("\n"),
+ }],
+ isError: true,
+ };
+ }
  const config = requireConfig(args.agent_name);
  const scene = args.scene;
  const anglePreference = args.use_reference_angle ?? "best";
@@ -412,6 +455,23 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
  }
  // ── generate_reference ─────────────────────────────────────────────────
  case "generate_reference": {
+ const validAngles = ["front", "side", "three_quarter", "neutral"];
+ if (!args.angle || !validAngles.includes(args.angle)) {
+ return {
+ content: [{
+ type: "text",
+ text: [
+ `❌ Missing or invalid argument: "angle".`,
+ ``,
+ `Valid values: "front", "side", "three_quarter", "neutral"`,
+ ``,
+ `Example:`,
+ ` { "angle": "front" }`,
+ ].join("\n"),
+ }],
+ isError: true,
+ };
+ }
  const config = requireConfig(args.agent_name);
  const angle = args.angle;
  const angleDescriptions = {
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "agent-avatar-mcp",
- "version": "1.1.0",
+ "version": "1.1.1",
  "description": "MCP Server — visual identity and self-portrait generation for AI agents",
  "type": "module",
  "bin": {