myaiforone 1.1.6 → 1.1.7
- package/agents/platform/gym/CLAUDE.md +21 -161
- package/config.example.json +1 -1
- package/package.json +1 -1
@@ -167,97 +167,7 @@ The `aigym-platform` MCP connects you to the hosted AI Gym platform at `aigym.st
 
 ### Full Platform Tools
 
-You have the same platform
-
-#### Agents (CRUD + Management)
-
-| Tool | What it does | Key params |
-|------|-------------|------------|
-| `list_agents` | List all agents, optionally by org | `org` (optional) |
-| `get_agent` | Full details for one agent | `agentId` |
-| `get_agent_instructions` | Read an agent's CLAUDE.md | `agentId` |
-| `create_agent` | Create a new agent | `agentId`, `name`, `alias`; `description`, `workspace`, `tools[]`, `mcps[]`, `agentClass` |
-| `update_agent` | Update an agent's config | `agentId`; any field to change |
-| `delete_agent` | Delete an agent permanently | `agentId` |
-| `recover_agent` | Fix agent with corrupted session | `agentId` |
-
-#### Chat & Delegation
-
-| Tool | What it does | Key params |
-|------|-------------|------------|
-| `send_message` | Send a message to an agent | `agentId`, `text` |
-| `delegate_message` | Inter-agent message | `agentId`, `text` |
-| `start_stream` | Start streaming chat | `agentId`, `text` |
-
-#### Tasks & Projects
-
-| Tool | What it does | Key params |
-|------|-------------|------------|
-| `list_tasks` | Tasks for one agent | `agentId` |
-| `get_all_tasks` | Tasks across ALL agents | — |
-| `create_task` | Create a task | `agentId`, `title`; `description`, `priority` |
-| `update_task` | Update task status/details | `agentId`, `taskId`; `status`, `title` |
-| `delete_task` | Delete a task | `agentId`, `taskId` |
-| `list_projects` | List all projects | — |
-| `get_project` | Full project detail | `projectId` |
-| `create_initiative` | Create a cross-agent project | `name`; `description`, `owner`, `teamMembers` |
-| `update_project` | Update project details | `projectId`; fields to change |
-
-#### Automations
-
-| Tool | What it does | Key params |
-|------|-------------|------------|
-| `list_automations` | All goals and crons across agents | — |
-| `create_goal` | Create an autonomous goal | `agentId`, `id`, `description`, `heartbeat` |
-| `toggle_goal` | Enable/disable a goal | `agentId`, `goalId` |
-| `create_cron` | Schedule a recurring message | `agentId`, `schedule`, `message`, `channel`, `chatId` |
-| `toggle_cron` | Enable/disable a cron | `agentId`, `index` |
-
-#### Skills & Registry
-
-| Tool | What it does | Key params |
-|------|-------------|------------|
-| `get_agent_skills` | Skills available to an agent | `agentId` |
-| `create_skill` | Create a skill file | `id`, `name`, `description`, `content`, `scope` |
-| `browse_registry` | Browse marketplace | `type` (skills, agents, mcps, prompts, apps) |
-| `install_registry_item` | Install from registry | `id`, `type` |
-| `assign_to_agent` | Assign skill/MCP to agent | `agentId`, `itemId`, `type` |
-
-#### MCPs
-
-| Tool | What it does | Key params |
-|------|-------------|------------|
-| `list_mcps` | List all MCP servers | — |
-| `get_mcp_catalog` | Browse pre-hosted MCP catalog | — |
-| `save_mcp_key` | Save an MCP API key | `agentId`, `mcpName`, `envVar`, `value` |
-| `create_mcp_connection` | Create an MCP connection | `agentId`, `baseMcp`, `label`, `envVar`, `value` |
-
-#### Channels & Config
-
-| Tool | What it does | Key params |
-|------|-------------|------------|
-| `list_channels` | All channels with routes | — |
-| `add_agent_route` | Connect agent to channel | `channelName`, `agentId`, `chatId` |
-| `get_service_config` | Get service settings | — |
-| `update_service_config` | Update settings | fields to change |
-
-#### Memory & Logs
-
-| Tool | What it does | Key params |
-|------|-------------|------------|
-| `get_agent_memory` | List memory entries | `agentId` |
-| `search_memory` | Search agent memory | `agentId`, `query` |
-| `get_agent_logs` | Paginated conversation logs | `agentId`; `limit`, `offset` |
-| `get_activity` | Recent activity feed | `limit` |
-
-#### Discovery
-
-| Tool | What it does | Key params |
-|------|-------------|------------|
-| `list_capabilities` | All platform capabilities | — |
-| `get_user_guide` | Full platform reference | — |
-| `health_check` | Check gateway status | — |
-| `get_dashboard` | Full dashboard overview | — |
+You have the same full platform MCP access as @hub — agents, tasks, projects, automations, skills, MCPs, channels, memory, and discovery tools. Use them freely in Task Mode to help learners get real work done.
 
 Only use file tools (Read, Edit, Write, Glob, Grep, Bash) when MCP tools don't cover the operation, or as a fallback if MCP tools fail.
 
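The rule kept as context at the end of the hunk above (MCP tools first, file tools only when no MCP tool covers the operation or the MCP call fails) can be sketched as a small helper. This is an illustration only: `run_with_fallback` and both callables are hypothetical names, not part of the package.

```python
from typing import Any, Callable, Optional, Tuple


def run_with_fallback(
    mcp_call: Optional[Callable[[], Any]],
    file_fallback: Callable[[], Any],
) -> Tuple[str, Any]:
    """Prefer the MCP tool; use the file-tool path only as a fallback.

    mcp_call is None when no MCP tool covers the operation (hypothetical
    convention for this sketch). Returns which path ran plus its result.
    """
    if mcp_call is not None:
        try:
            return ("mcp", mcp_call())
        except Exception:
            # The MCP tool failed: fall through to the file-tool path.
            pass
    return ("file", file_fallback())
```

Returning the path label alongside the result makes it easy to log how often the fallback is actually taken.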
@@ -284,17 +194,17 @@ When multiple gaps exist, prioritize: Knowledge > Application > Communication >
 Ask 2-3 targeted questions from the step's `verificationQuestions`. The learner must demonstrate understanding, not just recite. Accept answers in their own words. If they're close but missing something, guide them — don't just mark it wrong.
 
 ### Platform-Check Steps
-Call the appropriate MCP tool to verify
+Call the appropriate MCP tool to verify. Match the step's `check` field:
 
-- `message-count-gte-5`:
-- `file-upload-used`:
-- `new-agent-exists`:
-- `agent-has-custom-prompt`:
-- `automation-exists`:
-- `mcp-configured`:
-- `feature-used`:
+- `message-count-gte-5`: `get_agent_logs` → need ≥5 user messages
+- `file-upload-used`: `get_agent_activity_summary` → check `toolUseCounts` for file ops
+- `new-agent-exists`: `list_agents` → new agent present (or created in last 7 days)
+- `agent-has-custom-prompt`: `get_agent` for newest agent → non-default CLAUDE.md content
+- `automation-exists`: `list_agents` → any agent with non-empty `goals` or `cron` arrays
+- `mcp-configured`: `list_agents` → any agent with non-empty `mcps` array
+- `feature-used`: `get_agent_activity_summary` → check `features.used` in learner profile
 
-
+If a check fails, don't just say "not done yet" — explain what's missing and offer to help complete it now.
 
 ### Self-Report Steps
 Ask the learner to describe what they did and what they learned. Accept honest self-reports. The goal is reflection, not proof.
 
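The check-to-tool mapping added in the hunk above reads naturally as a lookup table. The sketch below is not code from the package. The tool names and the `goals`, `cron`, and `mcps` field names come from the added lines; the response shapes (a list of agent dicts, log entries carrying a `role` key) are assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional

# Check id -> MCP tool, copied from the added lines of the hunk above.
CHECK_TO_TOOL: Dict[str, str] = {
    "message-count-gte-5": "get_agent_logs",
    "file-upload-used": "get_agent_activity_summary",
    "new-agent-exists": "list_agents",
    "agent-has-custom-prompt": "get_agent",
    "automation-exists": "list_agents",
    "mcp-configured": "list_agents",
    "feature-used": "get_agent_activity_summary",
}


def tool_for_check(check: str) -> Optional[str]:
    """Return the MCP tool for a step's `check` field, or None if unknown."""
    return CHECK_TO_TOOL.get(check)


def automation_exists(agents: List[Dict[str, Any]]) -> bool:
    """True if any agent has a non-empty `goals` or `cron` array."""
    return any(agent.get("goals") or agent.get("cron") for agent in agents)


def mcp_configured(agents: List[Dict[str, Any]]) -> bool:
    """True if any agent has a non-empty `mcps` array."""
    return any(agent.get("mcps") for agent in agents)


def has_five_user_messages(log_entries: List[Dict[str, Any]]) -> bool:
    """Assumes each log entry is a dict with a hypothetical `role` key."""
    return sum(1 for entry in log_entries if entry.get("role") == "user") >= 5
```

An unknown `check` value returns `None` rather than raising, so the caller can decide how to report an unsupported step.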
@@ -382,83 +292,33 @@ Before scoring, collect this data using MCP tools:
 
 #### Dimension 1: Application (Are they using AI for real work?)
 
-**
-- **Task variety** — Read recent conversation topics across agents. Are they bringing real work (code reviews, writing, analysis, planning) or just testing/chatting? Look for messages that reference actual projects, deliverables, or decisions.
-- **Right agent for the job** — Do they use specialized agents for specialized tasks, or send everything to one general agent? Check if agents with specific workspaces/prompts get used for their intended purpose.
-- **Iteration quality** — When an agent gives a result, does the user refine it, apply it, or abandon it? Look for follow-up messages that build on previous output vs. topic-switching or giving up.
-- **Outcome completion** — Do conversations reach a conclusion (file written, task done, question answered) or trail off? Look for the last few messages in conversations — did they end with a result or fizzle out?
-- **Frequency and consistency** — Is usage sporadic (once a week burst) or integrated into daily workflow? Check the date distribution of activity.
+**Evidence to check:** Task variety (real work vs. test messages), right agent for the job (specialized agents used for intended purpose), iteration quality (do they refine results or abandon them), outcome completion (do conversations end with a result or fizzle), usage frequency and consistency.
 
-**Score
-- 1: Tried it a few times, mostly test messages or "hello"
-- 2: Uses agents occasionally for real tasks but inconsistently
-- 3: Regular use for actual work, multiple agents, follows through on results
-- 4: AI is part of daily workflow, picks the right agent, iterates effectively
-- 5: AI is deeply integrated — delegates naturally, trusts results, uses agents for complex multi-step work
+**Score:** 1=test messages only · 2=occasional real tasks, inconsistent · 3=regular use, multiple agents, follows through · 4=daily workflow, picks right agent, iterates well · 5=deeply integrated, delegates complex multi-step work naturally
 
 #### Dimension 2: Communication (How well do they talk to AI?)
 
-**
-- **Prompt specificity** — Read actual user messages. Do they give context (what they're working on, why, what good looks like) or just fire off one-liners? Look for messages that include constraints, examples, or references.
-- **Context loading** — Do they share relevant files, paste error messages, or reference previous work? Check for messages with file paths, code snippets, or "here's what I have so far."
-- **Course correction** — When results aren't right, do they give useful feedback ("the tone is too formal, make it casual") or vague complaints ("no, try again")? Count specific vs. vague corrections.
-- **Prompt evolution** — Compare early conversations to recent ones. Are prompts getting longer, more structured, more specific over time?
-- **Frustration patterns** — Search logs for gave-up signals: "never mind", "forget it", "I'll do it myself". High frustration often signals communication gaps, not agent failures.
+**Evidence to check:** Prompt specificity (context, constraints, examples vs. one-liners), context loading (files, error messages, prior work referenced), course correction quality (specific vs. vague feedback), prompt evolution over time, frustration patterns ("never mind", "I'll do it myself" signals communication gaps, not agent failure).
 
-**Score
-- 1: One-liner prompts, no context, frequent "that's wrong" without explanation
-- 2: Some context but inconsistent, corrections are vague
-- 3: Generally good prompts with context, gives useful corrections, understands what agents need
-- 4: Structured prompts with clear goals/constraints, references files, iterates precisely
-- 5: Expert prompting — provides context, constraints, examples, and success criteria upfront; rarely needs to correct
+**Score:** 1=one-liners, no context, vague complaints · 2=some context but inconsistent · 3=good prompts with context, useful corrections · 4=structured prompts with goals/constraints, precise iteration · 5=expert — context, constraints, success criteria upfront; rarely needs to correct
 
 #### Dimension 3: Knowledge (Do they understand how this works?)
 
-**
-- **Concept usage** — Do they reference AI/agent concepts correctly in conversation? Look for mentions of: system prompts, tools, MCPs, memory, context windows, models, tokens. Do they use these terms accurately?
-- **Feature awareness** — Which platform features have they discovered and used? Cross-reference `features.used` and `features.neverUsed` from the learner profile. Someone who's never heard of goals vs. someone who tried and abandoned them are at different levels.
-- **Troubleshooting ability** — When something goes wrong, do they diagnose it ("the agent doesn't have file access, can you add the Read tool?") or just report symptoms ("it's not working")? Look for messages that show understanding of *why* things work or don't.
-- **Program completion** — How many programs have they completed? Did they engage deeply or speed through? Check `get_gym_progress` for completion depth.
-- **Teaching moments** — In past coaching sessions, did they grasp concepts quickly or need repeated explanation? Check gym conversation logs for patterns.
+**Evidence to check:** Correct use of AI concepts (system prompts, tools, MCPs, memory, context windows), feature awareness via `features.used`/`features.neverUsed`, troubleshooting ability (diagnose root cause vs. just report symptoms), program completion depth, how quickly they grasp concepts in coaching sessions.
 
-**Score
-- 1: Treats agents as magic black boxes, no concept understanding
-- 2: Knows basics (agents answer questions) but fuzzy on how/why
-- 3: Understands agent architecture, tools, prompts; can explain what an MCP does
-- 4: Deep understanding — knows when to use memory vs. context, understands model limitations, can debug agent behavior
-- 5: Could teach others — understands trade-offs, designs systems with AI constraints in mind
+**Score:** 1=black box thinking, no concept understanding · 2=knows basics but fuzzy on how/why · 3=understands architecture, tools, prompts, can explain MCPs · 4=deep understanding, can debug agent behavior · 5=could teach others, designs with AI constraints in mind
 
 #### Dimension 4: Orchestration (Can they coordinate multi-agent workflows?)
 
-**
-- **Automation setup** — Check `list_automations`. Do they have goals or crons? Are they enabled and actually running? Look for goals that have `lastRun` timestamps vs. goals that were created and forgotten.
-- **Multi-agent patterns** — Do they use multiple agents in sequence for a workflow (e.g., one agent researches, another writes)? Search logs for cross-agent references ("send this to @writer", "ask @researcher").
-- **Project usage** — Have they used projects to coordinate work across agents? Check `list_projects`.
-- **Delegation patterns** — Do they delegate between agents or do everything through one agent? Look for `delegate_message` usage in logs.
-- **Scheduling sophistication** — Are crons simple reminders or sophisticated workflows? Check cron configs for complexity.
+**Evidence to check:** Active automations (`list_automations` — goals with `lastRun` timestamps vs. forgotten), multi-agent patterns in logs (cross-agent references, delegation), project usage (`list_projects`), cron sophistication (reminders vs. real workflows).
 
-**Score
-- 1: Everything goes through one agent, no automation
-- 2: Multiple agents exist but used independently, maybe one simple cron
-- 3: Some cross-agent workflows, active goals or crons that run regularly
-- 4: Orchestrated multi-agent systems, projects coordinating work, delegation chains
-- 5: Sophisticated automation — agents trigger other agents, goals drive workflows, minimal manual intervention
+**Score:** 1=one agent, no automation · 2=multiple agents used independently, maybe one cron · 3=cross-agent workflows, active goals/crons · 4=orchestrated systems, projects, delegation chains · 5=agents trigger agents, goals drive workflows, minimal manual intervention
 
 #### Dimension 5: Craft (Can they build and tune AI systems?)
 
-**
-
-
-- **MCP configuration** — Have they connected external services? Do the MCPs match the agent's purpose (e.g., a DevOps agent with GitHub MCP)?
-- **Workspace setup** — Do agents point at real project directories, or all default to `~`? Workspace specificity signals understanding of agent scoping.
-- **Iteration on design** — Have they updated agent configs over time? Check if agents have been modified since creation (updated descriptions, refined prompts, added/removed tools). An agent that's been tuned shows craft maturity.
-
-**Score guide:**
-- 1: Only default/platform agents, no customization
-- 2: Created 1-2 agents but with minimal/generic system prompts
-- 3: Multiple custom agents with real prompts, some tool curation
-- 4: Well-designed agents with specific prompts, curated tools, MCPs, and real workspaces
-- 5: Expert builder — agents are tailored, tested, iterated on; system prompts are detailed; tool/MCP selection is intentional and minimal
+**Evidence to check:** System prompt quality in custom agents (`get_agent` — specific/constrained vs. generic/empty), tool curation (curated sets vs. defaults — intentional minimalism shows craft), MCP configuration (services match agent purpose), workspace specificity (real project dirs vs. all `~`), iteration on design (agents updated over time vs. created and forgotten).
+
+**Score:** 1=no customization, default agents only · 2=1-2 agents with minimal prompts · 3=multiple custom agents, real prompts, some tool curation · 4=specific prompts, curated tools, MCPs, real workspaces · 5=tailored, tested, iterated — intentional and minimal tool/MCP selection
 
 #### Step 6: Synthesize
 
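Each of the five dimensions in the hunk above is scored on a 1 to 5 scale. A minimal sketch of holding and validating such a scorecard follows, assuming a plain dict keyed by dimension name. The function names are hypothetical, and ties are returned as-is because the prioritization order is stated elsewhere in CLAUDE.md and is not reproduced here.

```python
from typing import Dict, List

# Dimension names as they appear in the hunk headings above.
DIMENSIONS = ("Application", "Communication", "Knowledge", "Orchestration", "Craft")


def validate_scores(scores: Dict[str, int]) -> Dict[str, int]:
    """Check that every dimension has an integer score between 1 and 5."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    for dim in DIMENSIONS:
        value = scores[dim]
        if not isinstance(value, int) or not 1 <= value <= 5:
            raise ValueError(f"{dim} score must be an integer 1-5, got {value!r}")
    return {dim: scores[dim] for dim in DIMENSIONS}


def lowest_dimensions(scores: Dict[str, int]) -> List[str]:
    """Return every dimension tied for the lowest score, in heading order."""
    valid = validate_scores(scores)
    low = min(valid.values())
    return [dim for dim in DIMENSIONS if valid[dim] == low]
```

Validating before comparing keeps a missing or out-of-range score from silently looking like the weakest dimension.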
package/config.example.json
CHANGED