npm - @aiagenta2z/agtm - Versions diffs - 1.0.8 → 1.1.0 - Mend

@aiagenta2z/agtm 1.0.8 → 1.1.0

Files changed (5)

package/README.md +40 -6
package/data/config/hints/base_hints.json +2 -0
package/dist/agtm-cli.js +286 -25
package/docs/skills/README.md +110 -0
package/package.json +1 -1

package/README.md CHANGED

@@ -1,18 +1,18 @@
 ### agtm: CLI Tool for AI Agent Management, Skills, Agent Registry, Benchmarks and Hints in AI Agent Marketplace
-[GitHub](https://github.com/aiagenta2z/agtm)|[AI Agent Marketplace CLI Doc](https://www.deepnlp.org/doc/ai_agent_marketplace)|[DeepNLP AI Agent Marketplace](https://www.deepnlp.org/store/ai-agent) | [OneKey Agent Router](https://www.deepnlp.org/agent/onekey-mcp-router) | [Agent MCP OneKey Router Ranking](https://www.deepnlp.org/agent/rankings) | [NodeJS agtm](https://www.npmjs.com/package/@aiagenta2z/agtm)
+[GitHub](https://github.com/aiagenta2z/agtm)|[AI Agent Marketplace CLI Doc](https://www.deepnlp.org/doc/ai_agent_marketplace)|[DeepNLP AI Agent Marketplace](https://www.deepnlp.org/store/ai-agent) | [OneKey Gateway](https://deepnlp.org/doc/onekey_gateway) | [Agent MCP OneKey Router Ranking](https://www.deepnlp.org/agent/rankings) | [NodeJS agtm](https://www.npmjs.com/package/@aiagenta2z/agtm)
 `agtm` (AI Agent Management CLI) unifies skill management, agent registration, marketplace search, and provider CLI execution. Install skills from GitHub, log and rate skill runs, upload agent metadata to registries, query the public marketplace, and run agent toolchains with fuzzy hints.
-Features
+## Features
 *`agtm skills`*: Manage Skills, Add Skills, List Skills, Log Skills Performance, Skills performance Evaluator, compare to realworld benchmarks
 *`agtm upload`*: AI Agent Registry, register local agent meta information of json or yaml format(agent.json/agent.yaml) or sync your github source meta including README.md
 *`agtm search`*: Search the open source AI Agent Marketplace, including github community, huggingface community, product hunt community, deepnlp ai agent marketplace index, etc
 *`agtm run`*: Run agent clis, don't need to remember, with the powerful hints and completion ability, just type a few characters and "--hint" will help you complete the command line.
-Furthermore, `agtm` provides memory to track skill outputs and enables performance rating against industry job level benchmarks. This allows you to score each skill execution and assign a professional tier to your AI Agent's capabilities—for example, evaluating its performance as equivalent to that of an L3 or L5 software engineer, marketing prefessional, etc.
+Furthermore, `agtm` provides memory to track skill outputs and enables performance rating against industry job level benchmarks. This allows you to score each skill execution and assign a professional tier to your AI Agent's capabilities—for example, evaluating its performance as equivalent to that of an L3 or L5 software engineer, marketing professional, etc.
 ```shell
 skill_id             run_times  score  level
@@ -29,6 +29,12 @@ code_success_skills  5          0.9     L3(100%)
 npm install -g @aiagenta2z/agtm
 ```
+Set up hints and the skills benchmark:
+```shell
+agtm setup --levels  ## Needed before `agtm rate`, to sync the benchmarks json to local folder
+agtm setup --hint    ## Needed before `agtm run`
+```
 Agtm CLI Options
 | CLI         | Command and Options                       | Document                       |
@@ -171,8 +177,16 @@ The `run` command executes agent workflows with interactive hints and fuzzy CLI
 Let's say you want to run an agent command of Playwright to go to a URL and fetch a webpage. You don't need to remember the full command—type `play`, pick the provider, then pick the CLI action.
 ### Usage
+Remember to set up hints before running the agent CLI:
+```shell
+agtm setup --hint
+```
 ```
 agtm run <provider_unique_id> <agent_cli>
 ```
@@ -180,9 +194,7 @@ agtm run <provider_unique_id> <agent_cli>
 ### Example
 ```shell
-rockingdingo@rockingdingodeMacBook-Pro skills_cli % agtm run play
-DEBUG: Entering Human Mode | idArg play | commandArgs  | options [object Object] | hasHints true | hints [object Object]
+agtm run play
 Skill ID suggestions:
   1. microsoft/playwright-cli
   2. googleworkspace/cli
@@ -246,6 +258,28 @@ agtm upload --config ./agent.json --endpoint https://www.deepnlp.org/api/ai_agen
 agtm upload --config ./agent.json --endpoint https://www.aiagenta2z.com/api/ai_agent_marketplace/registry --schema ./schema.json
 ```
+### Skills Agtm-Cli
+We provide a skills repo that various agents can use to evaluate skills and run agent hints.
+The skills can be found in the `./skills/` folder.
+| skill | description |
+| ---- | ---- |
+| agent-cli-hint-completion | This skill uses `agtm run --mode agent` to provide hints for agent CLI usage. |
+| agent-skills-evaluator | This skill uses `agtm skills log` and `agtm skills rate` to track the performance of other skills with an LLM-based evaluator and match it to professional job-level benchmarks, such as a Google L3 software engineer or an Apple M3 marketing specialist. |
+```shell
+npx agtm skills add aiagenta2z/agtm ## install all the skill evaluation and skill cli-hints
+npx agtm skills add aiagenta2z/agtm -s agent-skills-evaluator
+```
+```shell
+npx skills add aiagenta2z/agtm ## install all the skill evaluation and skill cli-hints
+npx skills add aiagenta2z/agtm -s agent-skills-evaluator
+```
 ### Contributing
 #### Agent CLI List

package/data/config/hints/base_hints.json CHANGED

@@ -116,3 +116,5 @@
     ]
   }
 }

package/dist/agtm-cli.js CHANGED

@@ -9,6 +9,8 @@ import { execFileSync, spawn } from 'node:child_process';
 import { createInterface } from 'node:readline/promises';
 import { fileURLToPath } from 'node:url';
 import { randomUUID } from 'node:crypto';
+//production setup
+const LOG_ENABLE = false;
 // --- Configuration ---
 const BASE_URL = 'https://www.deepnlp.org/api/ai_agent_marketplace';
 const REGISTRY_ENDPOINT = `${BASE_URL}/registry`;
@@ -19,7 +21,6 @@ const MOCK_RETURN_URL = "https://www.deepnlp.org/store/ai-agent/ai-agent/pub-AI-
 const CLI_DIR = path.dirname(fileURLToPath(import.meta.url));
 const MODE_AGENT = 'agent';
 const MODE_HUMAN = 'human';
-const LOG_ENABLE = true;
 const AGTM_LOCAL_DIR = path.join(process.cwd(), '.agtm');
 const AGTM_GLOBAL_DIR = path.join(os.homedir(), '.agtm');
 const SKILL_LOG_DIR_LOCAL = path.join(AGTM_LOCAL_DIR, 'skills', 'log');
@@ -636,7 +637,7 @@ function loadLevelDescriptions(levelFile) {
         return null;
     }
 }
-const DEFAULT_EVAL_PROMPT = 'You are an evaluator of skill performance. Score each example from 0.0 to 1.0 and assign a level based on benchmarks. Return JSON only.';
+const DEFAULT_EVAL_SYSTEM_PROMPT = 'System Prompt: You are an evaluator of skill performance. Score each example from 0.0 to 1.0 and assign a level based on benchmarks. Return JSON only. Please output json in format of {"skill_id": <skill_id>, "results": [{"log_id": "<log_id_1>", "score": 1.0, "level": "L3", **extra},{"log_id": "<log_id_2>", "score": 1.0, "level": "L3", **extra}]}';
 const BENCHMARK_TOP_K = 3;
 function benchmarkKey(obj) {
     if (!obj || typeof obj !== 'object')
@@ -683,17 +684,79 @@ async function handleSkillsRatePrepare(options) {
         console.error(`\n❌ Error: No logs found for skill '${skillId}' in ${logDir}.`);
         process.exit(1);
     }
+    var userInputPrompt = `User prompt: ${(options.prompt || "")}`;
+    var mergeInstruction = DEFAULT_EVAL_SYSTEM_PROMPT + "\n" + userInputPrompt;
     const levelsData = loadLevelDescriptions(options.benchmark);
     const benchmarks = normalizeBenchmarks(skillId, levelsData).slice(0, BENCHMARK_TOP_K);
     const payload = {
         skill_id: skillId,
         benchmarks,
         logs: logs.map(({ log_id, input, output }) => ({ log_id, input, output })),
-        instructions: options.prompt || DEFAULT_EVAL_PROMPT
+        instructions: mergeInstruction
     };
     console.log(JSON.stringify(payload, null, 2));
 }
 async function handleSkillsRateApply(options) {
+    const skillId = options.skill_id;
+    if (!skillId) {
+        console.error('\n❌ Error: --skill_id is required.');
+        process.exit(1);
+    }
+    if (!options.result) {
+        console.error('\n❌ Error: --result <json or base64> is required.');
+        process.exit(1);
+    }
+    let parsed;
+    try {
+        let raw = options.result;
+        // Attempt base64 decode if JSON parsing fails
+        try {
+            parsed = JSON.parse(raw);
+        }
+        catch {
+            // try decode base64
+            raw = Buffer.from(raw, 'base64').toString('utf8');
+            parsed = JSON.parse(raw);
+        }
+    }
+    catch (e) {
+        console.error(`\n❌ Error: invalid JSON/base64 for --result: ${e.message}`);
+        process.exit(1);
+    }
+    const results = Array.isArray(parsed?.results) ? parsed.results : [];
+    if (results.length === 0) {
+        console.error('\n❌ Error: --result must contain a non-empty "results" array.');
+        process.exit(1);
+    }
+    const logDir = getLogDir(options.logDir);
+    const logs = loadLogs(logDir).filter(l => l.skill_id === skillId);
+    const byId = new Map(logs.map(l => [l.log_id, l]));
+    let updated = 0;
+    const missing = [];
+    for (const item of results) {
+        const id = item?.log_id;
+        if (!id || !byId.has(id)) {
+            missing.push(String(id || 'unknown'));
+            continue;
+        }
+        const entry = byId.get(id);
+        // Support both 'score' and 'rating'
+        if (item.rating !== undefined)
+            entry.rating = Number(item.rating);
+        if (item.score !== undefined)
+            entry.rating = Number(item.score);
+        if (item.level !== undefined)
+            entry.level = String(item.level);
+        // Optional rationale
+        if (item.rationale !== undefined)
+            entry.rationale = String(item.rationale);
+        const target = path.join(logDir, `${entry.log_id}.json`);
+        fs.writeFileSync(target, JSON.stringify(entry, null, 2), 'utf8');
+        updated += 1;
+    }
+    console.log(JSON.stringify({ status: 'success', updated, missing }, null, 2));
+}
+async function handleSkillsRateApplyBak(options) {
     const skillId = options.skill_id;
     if (!skillId) {
         console.error('\n❌ Error: --skill_id is required.');
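Review note: the rewritten `handleSkillsRateApply` in this hunk accepts `--result` either as raw JSON or as base64-encoded JSON, trying `JSON.parse` first and decoding only on failure. The sketch below isolates that fallback so it can be tried on its own. It assumes Node.js (for the global `Buffer`), and `parseResultArg` is a hypothetical helper name, not a function in the package.

```javascript
// Hypothetical helper mirroring the JSON-then-base64 fallback shown in the hunk.
function parseResultArg(raw) {
  try {
    // First attempt: treat the argument as plain JSON.
    return JSON.parse(raw);
  } catch {
    // Fallback: assume base64-encoded JSON. Throws if this also fails.
    return JSON.parse(Buffer.from(raw, 'base64').toString('utf8'));
  }
}

const plain = '{"results":[{"log_id":"abc","score":0.9,"level":"L4"}]}';
const encoded = Buffer.from(plain, 'utf8').toString('base64');

console.log(parseResultArg(plain).results[0].level);   // L4
console.log(parseResultArg(encoded).results[0].score); // 0.9
```

In the hunk itself, `score` is applied after `rating`, so when a result item carries both fields the `score` value is the one written to `entry.rating`.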
@@ -917,7 +980,7 @@ function fuzzyScore(query, candidate) {
     return 0.7 * editScore + 0.3 * tokenScore;
 }
 function createTrie() {
-    return { children: new Map(), values: new Set() };
+    return { children: new Map(), terminalValues: new Set() };
 }
 function insertTrie(trie, key, value) {
     let node = trie;
@@ -932,8 +995,8 @@ function insertTrie(trie, key, value) {
             node.children.set(ch, created);
             node = created;
         }
-        node.values.add(value);
     }
+    node.terminalValues.add(value);
 }
 function searchTrie(trie, prefix, limit) {
     let node = trie;
@@ -944,9 +1007,57 @@ function searchTrie(trie, prefix, limit) {
             return [];
         }
     }
-    const values = Array.from(node.values);
-    values.sort((a, b) => a.localeCompare(b));
-    return values.slice(0, limit);
+    const out = [];
+    const seen = new Set();
+    const dfs = (current) => {
+        if (out.length >= limit)
+            return;
+        const terminal = Array.from(current.terminalValues).sort((a, b) => a.localeCompare(b));
+        for (const value of terminal) {
+            if (out.length >= limit)
+                return;
+            if (seen.has(value))
+                continue;
+            seen.add(value);
+            out.push(value);
+        }
+        const keys = Array.from(current.children.keys()).sort((a, b) => a.localeCompare(b));
+        for (const key of keys) {
+            if (out.length >= limit)
+                return;
+            dfs(current.children.get(key));
+        }
+    };
+    dfs(node);
+    return out;
+}
+function trieToPersisted(node) {
+    const children = {};
+    for (const [key, child] of node.children.entries()) {
+        children[key] = trieToPersisted(child);
+    }
+    return {
+        children: Object.keys(children).length ? children : undefined,
+        terminalValues: node.terminalValues.size ? Array.from(node.terminalValues).sort((a, b) => a.localeCompare(b)) : undefined
+    };
+}
+function persistedToTrie(node) {
+    const trie = createTrie();
+    if (Array.isArray(node.terminalValues)) {
+        for (const value of node.terminalValues) {
+            if (typeof value === 'string' && value.trim()) {
+                trie.terminalValues.add(value);
+            }
+        }
+    }
+    if (node.children && typeof node.children === 'object') {
+        for (const [key, child] of Object.entries(node.children)) {
+            if (!child || typeof child !== 'object')
+                continue;
+            trie.children.set(key, persistedToTrie(child));
+        }
+    }
+    return trie;
 }
 function mergeHints(target, source) {
     for (const [id, entry] of Object.entries(source)) {
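Review note: this hunk and the previous one change the trie so that a value is stored only on the node where its key ends (`terminalValues`), and `searchTrie` now gathers results by walking the subtree depth-first. The standalone sketch below illustrates that behaviour under simplifying assumptions: it uses the default `sort()` rather than `localeCompare`, inserts each ID under a single key, and the ID `example/tool` is made up for the demo.

```javascript
// Minimal trie: values live only on the node where a key ends.
function createTrie() {
  return { children: new Map(), terminalValues: new Set() };
}

function insertTrie(trie, key, value) {
  let node = trie;
  for (const ch of key) {
    if (!node.children.has(ch)) node.children.set(ch, createTrie());
    node = node.children.get(ch);
  }
  node.terminalValues.add(value);
}

// Walk to the prefix node, then collect up to `limit` values depth-first,
// visiting terminal values and child keys in sorted order.
function searchTrie(trie, prefix, limit) {
  let node = trie;
  for (const ch of prefix) {
    node = node.children.get(ch);
    if (!node) return [];
  }
  const out = [];
  const seen = new Set();
  const dfs = (current) => {
    for (const value of [...current.terminalValues].sort()) {
      if (out.length >= limit) return;
      if (!seen.has(value)) {
        seen.add(value);
        out.push(value);
      }
    }
    for (const key of [...current.children.keys()].sort()) {
      if (out.length >= limit) return;
      dfs(current.children.get(key));
    }
  };
  dfs(node);
  return out;
}

const trie = createTrie();
// 'example/tool' is a hypothetical ID used only for this demo.
for (const id of ['microsoft/playwright-cli', 'googleworkspace/cli', 'example/tool']) {
  insertTrie(trie, id, id);
}
console.log(searchTrie(trie, 'micro', 5)); // [ 'microsoft/playwright-cli' ]
```

Storing values only at terminal nodes means the serialized form written by `trieToPersisted` does not repeat each ID at every character along its path, at the cost of a subtree walk on each lookup.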
@@ -997,6 +1108,26 @@ function writeHintsFile(filePath, hints) {
     ensureDir(path.dirname(filePath));
     fs.writeFileSync(filePath, JSON.stringify(hints, null, 2));
 }
+function writeHintsTrieFile(filePath, trie) {
+    ensureDir(path.dirname(filePath));
+    fs.writeFileSync(filePath, JSON.stringify(trieToPersisted(trie), null, 2), 'utf8');
+}
+function loadHintsTrieFile(filePath) {
+    if (!fs.existsSync(filePath)) {
+        return null;
+    }
+    try {
+        const raw = fs.readFileSync(filePath, 'utf8');
+        const parsed = JSON.parse(raw);
+        if (!parsed || typeof parsed !== 'object') {
+            return null;
+        }
+        return persistedToTrie(parsed);
+    }
+    catch {
+        return null;
+    }
+}
 function findBundledHintsDir() {
     const candidates = [
         path.resolve(CLI_DIR, 'data', 'config', 'hints'),
@@ -1045,6 +1176,14 @@ async function loadBundledHints() {
     return merged;
 }
 function getHintsPath(useGlobal) {
+    const baseDir = useGlobal ? AGTM_GLOBAL_DIR : AGTM_LOCAL_DIR;
+    return path.join(baseDir, 'hints', 'hints.json');
+}
+function getHintsTriePath(useGlobal) {
+    const baseDir = useGlobal ? AGTM_GLOBAL_DIR : AGTM_LOCAL_DIR;
+    return path.join(baseDir, 'hints', 'hints_trie.json');
+}
+function getOldHintsPath(useGlobal) {
     if (useGlobal) {
         return path.join(AGTM_GLOBAL_DIR, 'hints.json');
     }
@@ -1063,6 +1202,8 @@ function loadCombinedHints(useGlobal) {
     const localHints = loadHintsFile(getHintsPath(false));
     mergeHints(combined, globalHints);
     mergeHints(combined, localHints);
+    mergeHints(combined, loadHintsFile(getOldHintsPath(true)));
+    mergeHints(combined, loadHintsFile(getOldHintsPath(false)));
     mergeHints(combined, loadHintsFile(getLegacyHintsPath(true)));
     mergeHints(combined, loadHintsFile(getLegacyHintsPath(false)));
     if (useGlobal) {
@@ -1101,6 +1242,49 @@ function filterCliHints(hints, query, limit) {
     const sorted = [...hints].sort((a, b) => a.cli.localeCompare(b.cli));
     return sorted.slice(0, limit);
 }
+function highlightMatches(text, query) {
+    const trimmed = query.trim();
+    if (!trimmed)
+        return text;
+    const tokens = trimmed
+        .toLowerCase()
+        .split(/[^a-z0-9]+/g)
+        .map((t) => t.trim())
+        .filter(Boolean);
+    if (tokens.length === 0)
+        return text;
+    const lower = text.toLowerCase();
+    const ranges = [];
+    for (const token of tokens) {
+        let idx = lower.indexOf(token);
+        while (idx !== -1) {
+            ranges.push([idx, idx + token.length]);
+            idx = lower.indexOf(token, idx + 1);
+        }
+    }
+    if (ranges.length === 0)
+        return text;
+    ranges.sort((a, b) => a[0] - b[0] || a[1] - b[1]);
+    const merged = [];
+    for (const [start, end] of ranges) {
+        const last = merged[merged.length - 1];
+        if (!last || start > last[1]) {
+            merged.push([start, end]);
+        }
+        else {
+            last[1] = Math.max(last[1], end);
+        }
+    }
+    let out = '';
+    let cursor = 0;
+    for (const [start, end] of merged) {
+        out += text.slice(cursor, start);
+        out += green(text.slice(start, end));
+        cursor = end;
+    }
+    out += text.slice(cursor);
+    return out;
+}
 async function promptSelection(prompt, options) {
     if (!process.stdin.isTTY) {
         return options.length > 0 ? options[0] : null;
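Review note: the new `highlightMatches` finds every occurrence of each query token, merges overlapping or touching ranges, and wraps each merged span with `green()`. The sketch below reproduces the range-merging step in isolation. `mergeRanges` and `markMatches` are hypothetical names, and square brackets stand in for the CLI's `green()` helper, whose definition is not part of this diff.

```javascript
// Merge overlapping or touching [start, end) ranges into a sorted, disjoint list.
function mergeRanges(ranges) {
  const sorted = [...ranges].sort((a, b) => a[0] - b[0] || a[1] - b[1]);
  const merged = [];
  for (const [start, end] of sorted) {
    const last = merged[merged.length - 1];
    if (!last || start > last[1]) {
      merged.push([start, end]);
    } else {
      last[1] = Math.max(last[1], end);
    }
  }
  return merged;
}

// Hypothetical stand-in for highlightMatches: brackets replace the green() colouring.
function markMatches(text, query) {
  const tokens = query.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  const lower = text.toLowerCase();
  const ranges = [];
  for (const token of tokens) {
    for (let idx = lower.indexOf(token); idx !== -1; idx = lower.indexOf(token, idx + 1)) {
      ranges.push([idx, idx + token.length]);
    }
  }
  let out = '';
  let cursor = 0;
  for (const [start, end] of mergeRanges(ranges)) {
    out += text.slice(cursor, start) + '[' + text.slice(start, end) + ']';
    cursor = end;
  }
  return out + text.slice(cursor);
}

console.log(markMatches('microsoft/playwright-cli', 'play cli'));
// microsoft/[play]wright-[cli]
```

Merging before wrapping matters because overlapping matches would otherwise emit nested or duplicated colour codes for the same characters.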
@@ -1124,7 +1308,7 @@ async function promptSelection(prompt, options) {
         rl.close();
     }
 }
-async function promptCommandLine(promptText) {
+async function promptCommandLineBase(promptText) {
     if (!process.stdin.isTTY) {
         return null;
     }
@@ -1138,7 +1322,33 @@ async function promptCommandLine(promptText) {
         rl.close();
     }
 }
-async function selectSkillId(hints, input, limit = 5) {
+import readline from 'readline';
+async function promptCommandLine(promptText, defaultValue) {
+    if (!process.stdin.isTTY)
+        return null;
+    const rl = readline.createInterface({
+        input: process.stdin,
+        output: process.stdout,
+    });
+    try {
+        return await new Promise((resolve) => {
+            rl.question(promptText, (answer) => {
+                rl.close();
+                const trimmed = answer.trim();
+                resolve(trimmed || defaultValue || null);
+            });
+            // Pre-fill default value and move cursor to end
+            if (defaultValue) {
+                rl.write(defaultValue);
+            }
+        });
+    }
+    finally {
+        // just in case
+        rl.close();
+    }
+}
+async function selectSkillId(hints, input, limit = 5, trie) {
     const ids = Object.keys(hints);
     if (ids.length === 0) {
         return null;
@@ -1146,9 +1356,9 @@ async function selectSkillId(hints, input, limit = 5) {
     if (input && hints[input]) {
         return input;
     }
-    const trie = buildIdTrie(hints);
+    const activeTrie = trie || buildIdTrie(hints);
     const prefix = input || '';
-    let suggestions = searchTrie(trie, prefix, limit);
+    let suggestions = searchTrie(activeTrie, prefix, limit);
     if (suggestions.length === 0 && prefix) {
         const scored = ids
             .map((id) => ({ id, score: fuzzyScore(prefix, id) }))
@@ -1159,11 +1369,21 @@ async function selectSkillId(hints, input, limit = 5) {
     if (suggestions.length === 0) {
         return null;
     }
-    console.log('\nSkill ID suggestions:');
+    let printedLines = 0;
+    const trackedLog = (message = '') => {
+        console.log(message);
+        printedLines += countConsoleLogLines(message);
+    };
+    trackedLog('');
+    trackedLog('Skill ID suggestions:');
     suggestions.forEach((value, index) => {
-        console.log(`  ${index + 1}. ${value}`);
+        trackedLog(`  ${index + 1}. ${highlightMatches(value, prefix)}`);
     });
-    const selected = await promptSelection('\nSelect skill id (number or id): ', suggestions);
+    const selected = await promptSelection('Select skill id (number or id): ', suggestions);
+    printedLines += 1; // prompt line
+    if (process.stdin.isTTY && process.stdout.isTTY) {
+        clearLastLines(printedLines + 1); // +1 for the post-input newline line
+    }
     console.log(`Selected Skill/Cli is ${selected}`);
     if (!selected) {
         return null;
@@ -1186,13 +1406,23 @@ async function selectCliHint(hints, query, limit = 5) {
     if (suggestions.length === 0) {
         return null;
     }
-    console.log('\nCommand hints:');
+    let printedLines = 0;
+    const trackedLog = (message = '') => {
+        console.log(message);
+        printedLines += countConsoleLogLines(message);
+    };
+    trackedLog('');
+    trackedLog('Command hints:');
     suggestions.forEach((item, index) => {
         const hintText = item.hint ? ` # ${item.hint}` : '';
-        console.log(`  ${index + 1}. ${item.cli}${hintText}`);
+        trackedLog(`  ${index + 1}. ${highlightMatches(item.cli, query || '')}${hintText}`);
     });
     const options = suggestions.map((item) => item.cli);
-    const selected = await promptSelection('\nSelect command (number or input custom): ', options);
+    const selected = await promptSelection('Select command (number or input custom): ', options);
+    printedLines += 1; // prompt line
+    if (process.stdin.isTTY && process.stdout.isTTY) {
+        clearLastLines(printedLines + 1); // +1 for the post-input newline line
+    }
     if (!selected) {
         return null;
     }
@@ -1212,16 +1442,22 @@ async function handleSetup(options) {
     if (options.hint) {
         const bundled = await loadBundledHints();
         const targetPath = getHintsPath(useGlobal);
+        const targetTriePath = getHintsTriePath(useGlobal);
         const legacyPath = getLegacyHintsPath(useGlobal);
         const existing = loadHintsFile(targetPath);
+        const existingOld = loadHintsFile(getOldHintsPath(useGlobal));
         const merged = {};
         mergeHints(merged, bundled);
+        mergeHints(merged, existingOld);
         mergeHints(merged, existing);
         writeHintsFile(targetPath, merged);
+        const trieSource = loadCombinedHints(useGlobal);
+        writeHintsTrieFile(targetTriePath, buildIdTrie(trieSource));
         if (fs.existsSync(path.dirname(legacyPath))) {
             writeHintsFile(legacyPath, merged);
         }
         console.log(`\n✅ Hints cache updated at ${targetPath}`);
+        console.log(`✅ Hints trie updated at ${targetTriePath}`);
     }
     if (options['levels']) {
         const bundledLevelsDir = findBundledLevelsDir();
@@ -1235,6 +1471,26 @@ async function handleSetup(options) {
         console.log(`\n✅ Levels copied to ${targetDir}`);
     }
 }
+function clearScreen() {
+    // process.stdout.write('\x1Bc');
+    process.stdout.write('\x1b[0f');
+}
+function clearLastLines(n) {
+    if (!process.stdout.isTTY)
+        return;
+    for (let i = 0; i < n; i++) {
+        process.stdout.write('\x1b[2K'); // clear current line
+        if (i < n - 1) {
+            process.stdout.write('\x1b[1A'); // move cursor up
+        }
+    }
+    process.stdout.write('\x1b[0G'); // move to start of line
+}
+function countConsoleLogLines(message) {
+    if (message === '')
+        return 1;
+    return message.split('\n').length;
+}
 async function handleRun(idArg, commandArgs = [], options = {}) {
     const isAgent = (options.mode || 'human').toLowerCase() === MODE_AGENT;
     // first load local hints
@@ -1262,6 +1518,7 @@ async function handleRun(idArg, commandArgs = [], options = {}) {
             runtimeHints = await loadBundledHints();
         }
         const activeHints = hasHints ? hints : (runtimeHints || {});
+        const cachedIdTrie = hasHints ? loadHintsTrieFile(getHintsTriePath(false)) : null;
         const ids = Object.keys(activeHints);
         if (ids.length === 0) {
             console.error('\n❌ Error: No hints available.');
@@ -1269,7 +1526,7 @@ async function handleRun(idArg, commandArgs = [], options = {}) {
         }
         if (!idArg || !activeHints[idArg]) {
             const query = idArg || '';
-            const trie = buildIdTrie(activeHints);
+            const trie = cachedIdTrie || buildIdTrie(activeHints);
             let suggestions = searchTrie(trie, query, 2);
             if (suggestions.length === 0 && query) {
                 const scored = ids
@@ -1280,7 +1537,7 @@ async function handleRun(idArg, commandArgs = [], options = {}) {
             }
             console.log('\nSkill ID suggestions:');
             suggestions.forEach((value, index) => {
-                console.log(`  ${index + 1}. ${value}`);
+                console.log(`  ${index + 1}. ${highlightMatches(value, query)}`);
                 const entry = activeHints[value];
                 if (entry?.hints?.length) {
                     const preview = entry.hints.slice(0, 2).map((h) => `${h.cli}${h.hint ? ` # ${h.hint}` : ''}`);
@@ -1343,24 +1600,27 @@ async function handleRun(idArg, commandArgs = [], options = {}) {
         if (LOG_ENABLE) {
             console.log(`DEBUG: Entering Human Mode | idArg ${idArg} | commandArgs ${commandArgs} | options ${options} | hasHints ${hasHints} | hints ${hints}`);
         }
+        const cachedIdTrie = hasHints ? loadHintsTrieFile(getHintsTriePath(false)) : null;
         // human mode with pause for cli input
         if (!idArg) {
             if (!hasHints) {
                 console.error('\n❌ Error: No hints cache found. Run `agtm setup --hint` first.');
                 process.exit(1);
             }
-            const selected = await selectSkillId(hints);
+            const selected = await selectSkillId(hints, undefined, 5, cachedIdTrie);
             if (!selected) {
                 console.error('\n❌ Error: No skill id selected.');
                 process.exit(1);
             }
             idArg = selected;
+            // clearScreen();
         }
         else if (hasHints && !hints[idArg]) {
-            const selected = await selectSkillId(hints, idArg);
+            const selected = await selectSkillId(hints, idArg, 5, cachedIdTrie);
             if (selected) {
                 idArg = selected;
             }
+            // clearScreen();
         }
         let finalCommandArgs = commandArgs;
         if (hasHints && idArg && finalCommandArgs.length > 0 && hints[finalCommandArgs[0]]) {
@@ -1383,7 +1643,7 @@ async function handleRun(idArg, commandArgs = [], options = {}) {
         if (!finalCommandArgs || finalCommandArgs.length === 0) {
             let chosen = null;
             if (hintEntry?.hints && hintEntry.hints.length > 0) {
-                const query = await promptCommandLine('\nEnter command (leave empty to list hints): ');
+                const query = await promptCommandLine(`\nEnter command to run (leave empty to list cli hints): `, ``);
                 const searchQuery = query || '';
                 chosen = await selectCliHint(hintEntry.hints, searchQuery);
             }
@@ -1391,7 +1651,7 @@ async function handleRun(idArg, commandArgs = [], options = {}) {
                 finalCommandArgs = chosen.cli.split(/\s+/).filter(Boolean);
             }
             else {
-                const manual = await promptCommandLine('\nEnter command to run: ');
+                const manual = await promptCommandLine('\nEnter command line to run: ', ``);
                 if (!manual) {
                     console.error('\n❌ Error: No command selected.');
                     process.exit(1);
@@ -1411,7 +1671,8 @@ async function handleRun(idArg, commandArgs = [], options = {}) {
             process.exit(1);
         }
         const finalCommandLine = finalCommandArgs.join(' ');
-        const edited = await promptCommandLine(`\nFinal command [${finalCommandLine}]: `);
+        console.log("\nComplete the Cli with your arguments or leave blank and press Enter");
+        const edited = await promptCommandLine(`\nFinal command line [${finalCommandLine}]:\n`, `${finalCommandLine}`);
         if (edited && edited.trim()) {
             finalCommandArgs = edited.split(/\s+/).filter(Boolean);
         }

package/docs/skills/README.md CHANGED

@@ -82,6 +82,115 @@ To use the rate command, have to setup the benchmark levels configuration. save
 agtm setup --levels
 ```
+#### Description
+This skill runs the `agent rate` command line to evaluate skill outputs.
+The Agtm Skills CLI manages local skill bundles for supported agents (for example `claude-code`, `codex`, `openclaw`). It can download skills from GitHub, install them into the correct agent folders, list what is installed, record run logs, and apply rating benchmarks.
+It also serves as a benchmarking tool to evaluate skill outputs:
+**Benchmark** your AI agent against real-world standards — from Google-level engineering to Apple-caliber product launches.
+**Rate** performance of each run with structured scores and levels, helping agents like Claude Code choose the right skills more effectively.
+#### Usage
+Each time your agent runs a skill, it then runs the follow-up skill agent-skills-evaluator to log
+the run with summarized input and output, keeping the records in a log-file-based memory.
+It then calls `agtm skills log`, `agtm skills rate`, and `agtm skills rate show`.
+`agtm skills log`: Keeps track of skill runs in a local cache JSON log file.
+`agtm skills rate prepare`: Fetches the evaluator and benchmarks.json, and loads the evaluation criteria, such as job levels and task fulfillment.
+`agtm skills rate apply`: Writes the LLM-based evaluator's scores and levels into the local log entries.
+`agtm skills rate show`: Shows the table of historical scores and level ratings.
+```
+agtm skills log <skill_id> --data '<json_payload>'
+agtm skills rate prepare --skill_id <skill_id> --prompt "<eval_prompt>" --benchmark <path/benchmark.json>
+agtm skills rate apply   --skill_id <skill_id> --result '<result_json: log_id>'
+agtm skills rate show    --skill_id <skill_id>
+```
+#### Example
+Note: `code_success_skills` is a dummy skill that always produces success results, and `code_fail_skills` is a dummy skill that always produces failure results.
+```shell
+## log command will output a log_id
+agtm skills log code_success_skills --data '{"input":"generate sql","output":"ok","meta":{"agent":"claude-code"}}'
+agtm skills rate prepare --skill_id code_success_skills --prompt "Evaluate the code execution results"
+agtm skills rate apply --skill_id code_success_skills --result '{"results":[{"log_id":"3679a3fe-4d97-4eb1-83bc-f83d711be195","rating":0.90,"level":"L4"}]}'
+agtm skills rate show  ## show the historical skills dashboard, including score, evaluation levels
+```
+Note:
+- Persists a run record at `.agtm/skills/log/<uuid>.json` (or the `--logDir` you supply).
+- `<json_payload>` must contain at least `input` and `output`; optional fields (meta, rating, level) are accepted.
+#### Pipeline
+**Step 1. Add log to memory**
+```
+agtm skills log code_success_skills --data '{"input":"generate sql","output":"ok","meta":{"agent":"claude-code"}}'
+agtm skills log code_fail_skills --data '{"input":"generate sql","output":"failure","meta":{"agent":"claude-code"}}'
+```
+It will generate a {log_id}.json as memory
+```shell
+✅ Saved log to .agtm/skills/log/96c216f1-edc5-40f3-b041-b01a68b137a1.json
+```
+**Step 2. Prepare Evaluation prompt**
+Prepare the (<input, output>, benchmark) payload so the LLM can compare each <input, output> pair with the benchmark.
+```shell
+agtm skills rate prepare --skill_id code_success_skills --prompt "Evaluate the code execution results"
+agtm skills rate prepare --skill_id code_fail_skills --prompt "Evaluate the code execution results"
+```
+```shell
+{"skill_id":"code_success_skills","benchmarks":[{"software-engineering":{"Google":[{"level":"L3","title":"Software Engineer II","description":"Entry-level engineer. Delivers well-scoped tasks with guidance. Learning codebase, tools, and best practices.","signals":["task execution","learning velocity","code quality basics"]},{"level":"L4","title":"Software Engineer III","description":"Independent contributor. Owns small features end-to-end. Writes maintainable code and participates in design discussions.","signals":["ownership","code quality","debugging ability"]},{"level":"L5","title":"Senior Software Engineer","description":"Leads projects and drives design decisions. Mentors others and improves system quality.","signals":["technical leadership","system design","mentorship"]},{"level":"L6","title":"Staff Software Engineer","description":"Owns large systems or cross-team initiatives. Sets technical direction and influences multiple teams.","signals":["architecture","cross-team impact","scalability thinking"]},{"level":"L7","title":"Senior Staff Software Engineer","description":"Drives org-level technical strategy. Solves ambiguous, high-impact problems.","signals":["org influence","complex problem solving","long-term vision"]},{"level":"L8","title":"Principal Engineer","description":"Company-wide impact. Defines technical standards and long-term architecture.","signals":["company impact","vision","industry-level thinking"]}]}}],"logs":[{"log_id":"1db0e927-79f1-46c2-b6dd-200d567f631d","input":"generate sql","output":"ok"},{"log_id":"94a2fae9-80ff-4b18-a77a-5714d34bcc20","input":"generate sql","output":"ok"},{"log_id":"96c216f1-edc5-40f3-b041-b01a68b137a1","input":"generate sql","output":"ok"},{"log_id":"b1f76f33-6f45-41e3-ae14-6b598f6aa357","input":"generate sql","output":"ok"}],"instructions":"System Prompt: You are an evaluator of skill performance. Score each example from 0.0 to 1.0 and assign a level based on benchmarks. Return JSON only. 
Please output json in format of {\"skill_id\": <skill_id>, \"results\": [{\"log_id\": \"<log_id_1>\", \"score\": 1.0, \"level\": \"L3\", **extra},{\"log_id\": \"<log_id_2>\", \"score\": 1.0, \"level\": \"L3\", **extra}]}\nUser prompt: Evaluate the code execution results"}
+{"skill_id":"code_fail_skills","benchmarks":[{"software-engineering":{"Google":[{"level":"L3","title":"Software Engineer II","description":"Entry-level engineer. Delivers well-scoped tasks with guidance. Learning codebase, tools, and best practices.","signals":["task execution","learning velocity","code quality basics"]},{"level":"L4","title":"Software Engineer III","description":"Independent contributor. Owns small features end-to-end. Writes maintainable code and participates in design discussions.","signals":["ownership","code quality","debugging ability"]},{"level":"L5","title":"Senior Software Engineer","description":"Leads projects and drives design decisions. Mentors others and improves system quality.","signals":["technical leadership","system design","mentorship"]},{"level":"L6","title":"Staff Software Engineer","description":"Owns large systems or cross-team initiatives. Sets technical direction and influences multiple teams.","signals":["architecture","cross-team impact","scalability thinking"]},{"level":"L7","title":"Senior Staff Software Engineer","description":"Drives org-level technical strategy. Solves ambiguous, high-impact problems.","signals":["org influence","complex problem solving","long-term vision"]},{"level":"L8","title":"Principal Engineer","description":"Company-wide impact. Defines technical standards and long-term architecture.","signals":["company impact","vision","industry-level thinking"]}]}}],"logs":[{"log_id":"2e5513e7-27ae-4636-9d21-4b57ec9f739b","input":"generate sql","output":"failure"},{"log_id":"563747fb-ea62-4ebc-80c4-1bc1d1c82ed5","input":"generate sql","output":"failure"},{"log_id":"db699754-b1fd-491c-a49f-2af1a41ad1f7","input":"generate sql","output":"failure"}],"instructions":"System Prompt: You are an evaluator of skill performance. Score each example from 0.0 to 1.0 and assign a level based on benchmarks. Return JSON only. 
Please output json in format of {\"skill_id\": <skill_id>, \"results\": [{\"log_id\": \"<log_id_1>\", \"score\": 1.0, \"level\": \"L3\", **extra},{\"log_id\": \"<log_id_2>\", \"score\": 1.0, \"level\": \"L3\", **extra}]}\nUser prompt: Evaluate the code execution results"}
+```
+**Step 3. Run the evaluation prompt from Step 2 with your local agent.**
+Your agent assigns {"score": double, "level": str} to each log_id.
+```
+{"skill_id":"code_success_skills","results":[{"log_id":"1db0e927-79f1-46c2-b6dd-200d567f631d","score":1,"level":"L3","rationale":"Successfully executed a well-scoped task generate sql. Matches entry-level performance criteria for task execution."},{"log_id":"94a2fae9-80ff-4b18-a77a-5714d34bcc20","score":1,"level":"L3","rationale":"Successfully executed a well-scoped task generate sql. Matches entry-level performance criteria for task execution."},{"log_id":"96c216f1-edc5-40f3-b041-b01a68b137a1","score":1,"level":"L3","rationale":"Successfully executed a well-scoped task generate sql. Matches entry-level performance criteria for task execution."},{"log_id":"b1f76f33-6f45-41e3-ae14-6b598f6aa357","score":1,"level":"L3","rationale":"Successfully executed a well-scoped task generate sql. Matches entry-level performance criteria for task execution."}]}
+{"skill_id":"code_fail_skills","results":[{"log_id":"2e5513e7-27ae-4636-9d21-4b57ec9f739b","score":0,"level":"L3"},{"log_id":"563747fb-ea62-4ebc-80c4-1bc1d1c82ed5","score":0,"level":"L3"},{"log_id":"db699754-b1fd-491c-a49f-2af1a41ad1f7","score":0,"level":"L3"}]}
+```
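Before applying a result, it can help to check that the agent's JSON matches what the Step 2 instructions asked for. The sketch below is not part of agtm: `validate_result` and its parameters are hypothetical names chosen for illustration. It assumes the result follows the `{"skill_id": ..., "results": [...]}` shape shown above, and checks that each score lies between 0.0 and 1.0, that each level is one of the benchmark levels, and that every prepared log_id is covered.

```python
import json

def validate_result(result_json, expected_log_ids, allowed_levels):
    """Return a list of problems in an agent's evaluation result (empty if none).

    Hypothetical helper, not an agtm API.
    """
    problems = []
    expected = set(expected_log_ids)
    seen = set()
    result = json.loads(result_json)
    for item in result.get("results", []):
        log_id = item.get("log_id")
        if log_id not in expected:
            problems.append(f"unexpected log_id {log_id!r}")
            continue
        seen.add(log_id)
        score = item.get("score")
        # bool is a subclass of int in Python, so reject it explicitly.
        is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
        if not is_number or not 0.0 <= score <= 1.0:
            problems.append(f"{log_id}: score {score!r} is not in the range 0.0 to 1.0")
        if item.get("level") not in allowed_levels:
            problems.append(f"{log_id}: unknown level {item.get('level')!r}")
    for log_id in sorted(expected - seen):
        problems.append(f"missing log_id {log_id!r}")
    return problems

# Values taken from the code_fail_skills example above.
fail_ids = [
    "2e5513e7-27ae-4636-9d21-4b57ec9f739b",
    "563747fb-ea62-4ebc-80c4-1bc1d1c82ed5",
    "db699754-b1fd-491c-a49f-2af1a41ad1f7",
]
levels = {"L3", "L4", "L5", "L6", "L7", "L8"}
fail_result = json.dumps({
    "skill_id": "code_fail_skills",
    "results": [{"log_id": i, "score": 0, "level": "L3"} for i in fail_ids],
})
print(validate_result(fail_result, fail_ids, levels))
```

An empty list means the result is safe to pass on to Step 4. Any entries describe what the agent should be asked to correct.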
+**Step 4. Apply Results to Local Log Status**
+```shell
+agtm skills rate apply --skill_id code_success_skills --result '{"skill_id":"code_success_skills","results":[{"log_id":"1db0e927-79f1-46c2-b6dd-200d567f631d","score":1,"level":"L3","rationale":"Successfully executed a well-scoped task generate sql. Matches entry-level performance criteria for task execution."},{"log_id":"94a2fae9-80ff-4b18-a77a-5714d34bcc20","score":1,"level":"L3","rationale":"Successfully executed a well-scoped task generate sql. Matches entry-level performance criteria for task execution."},{"log_id":"96c216f1-edc5-40f3-b041-b01a68b137a1","score":1,"level":"L3","rationale":"Successfully executed a well-scoped task generate sql. Matches entry-level performance criteria for task execution."},{"log_id":"b1f76f33-6f45-41e3-ae14-6b598f6aa357","score":1,"level":"L3","rationale":"Successfully executed a well-scoped task generate sql. Matches entry-level performance criteria for task execution."}]}'
+agtm skills rate apply --skill_id code_fail_skills --result '{"skill_id":"code_fail_skills","results":[{"log_id":"2e5513e7-27ae-4636-9d21-4b57ec9f739b","score":0,"level":"L3"},{"log_id":"563747fb-ea62-4ebc-80c4-1bc1d1c82ed5","score":0,"level":"L3"},{"log_id":"db699754-b1fd-491c-a49f-2af1a41ad1f7","score":0,"level":"L3"}]}'
+```
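Hand-quoting the `--result` JSON inside single quotes breaks as soon as a rationale contains an apostrophe. One way around this, sketched below, is to build the argument list in a script and let the standard library handle quoting. Only the `agtm skills rate apply --skill_id ... --result ...` form shown above is used; `build_apply_command` is a hypothetical helper name, not part of agtm.

```python
import json
import shlex

def build_apply_command(skill_id, results):
    """Build the argv list for `agtm skills rate apply` (hypothetical helper)."""
    # Compact separators keep the payload on one line, as in the examples above.
    payload = json.dumps({"skill_id": skill_id, "results": results}, separators=(",", ":"))
    return ["agtm", "skills", "rate", "apply", "--skill_id", skill_id, "--result", payload]

argv = build_apply_command(
    "code_fail_skills",
    [{"log_id": "2e5513e7-27ae-4636-9d21-4b57ec9f739b", "score": 0, "level": "L3",
      "rationale": "Query didn't run."}],
)
# shlex.join produces a shell-safe command line, including the embedded apostrophe.
print(shlex.join(argv))
```

Passing `argv` directly to `subprocess.run(argv)` avoids the shell entirely, so no quoting is needed in that case.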
+**Step 5. Show Final Result (Optional)**
+```shell
+agtm skills rate show
+```
+```shell
+skill_id             run_times  score  level
+-------------------  ---------  -----  -----
+code_fail_skills     3          0.00   L3
+code_success_skills  4          1.00   L3
+```
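The table above is consistent with `run_times` being the number of rated logs per skill and `score` being their mean, although the source does not state how agtm aggregates. The sketch below works under that assumption; `summarize` is a hypothetical name, and the `level` column is left out because its aggregation rule is not documented here.

```python
def summarize(scores_by_skill):
    """Return (skill_id, run_times, mean score) rows, sorted by skill_id.

    Assumes the score column is the arithmetic mean of per-log scores;
    this is an illustration, not agtm's documented behavior.
    """
    rows = []
    for skill_id in sorted(scores_by_skill):
        scores = scores_by_skill[skill_id]
        mean = sum(scores) / len(scores) if scores else 0.0
        rows.append((skill_id, len(scores), f"{mean:.2f}"))
    return rows

# Per-log scores from the Step 3 results above.
rows = summarize({
    "code_success_skills": [1, 1, 1, 1],
    "code_fail_skills": [0, 0, 0],
})
for skill_id, run_times, score in rows:
    print(f"{skill_id:<19}  {run_times:<9}  {score}")
```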
+#### CLI Documentation
 #### Usage
 ```
 agtm skills rate prepare --skill_id <skill_id> --prompt "<eval_prompt>" --benchmark <path/benchmark.json>
@@ -131,6 +240,7 @@ write your `customized_agent_benchmark.json` following the formats
 }
 ```
 ## Supported Agents
 Skills are installed into the same local folders that the vercel/skills package uses.
 Skills can be installed for any of these agents:

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "@aiagenta2z/agtm",
-  "version": "1.0.8",
+  "version": "1.1.0",
   "description": "agtm: CLI Tool for AI Agent Management, Skills, Agent Registry, Benchmarks and Hints in AI Agent Marketplace\n",
   "main": "dist/agtm-cli.js",
   "type": "module",