npm - hackmyagent - Versions diffs - 0.11.14 → 0.11.15 - Mend

hackmyagent 0.11.14 → 0.11.15

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (68)

package/README.md CHANGED

@@ -3,11 +3,11 @@
 [![npm version](https://img.shields.io/npm/v/hackmyagent.svg)](https://www.npmjs.com/package/hackmyagent)
 [![License: Apache-2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
-[![Tests](https://img.shields.io/badge/tests-1051%20passing-brightgreen)](https://github.com/opena2a-org/hackmyagent)
+[![Tests](https://img.shields.io/badge/tests-1113%20passing-brightgreen)](https://github.com/opena2a-org/hackmyagent)
-**204 security checks for AI agents. Find what can go wrong before an attacker does.**
+**204 security checks + behavioral simulation for AI agents. Find what can go wrong before an attacker does.**
-Security scanner and red-team toolkit for Claude Code, Cursor, VS Code, and any MCP server setup.
+Security scanner, red-team toolkit, and behavioral simulation engine for Claude Code, Cursor, VS Code, and any MCP server setup. NanoMind-powered semantic analysis runs by default when available.
 ```bash
 npx hackmyagent secure
@@ -33,6 +33,10 @@ npx opena2a-cli review
 ## What It Finds
+**Behavioral simulation** (NEW) -- 20-probe simulation battery that observes what skills actually do, not what they look like. Targets < 1% false positive rate vs industry 95.8%. Run with `--deep`.
+**Adaptive red team** (NEW) -- `hackmyagent red-team <file>` generates target-specific attack payloads, observes responses, adapts after failures, and maps all defenses. NanoMind-powered.
 **Attack testing** -- 115 adversarial payloads across 11 categories (prompt injection, data exfiltration, jailbreak, MCP exploitation, supply chain, memory weaponization, A2A protocol attacks, context window attacks).
 **Static analysis** -- 204 security checks across 60 categories covering credentials, MCP configs, OpenClaw/NemoClaw, Unicode steganography, CVE detection, governance, supply chain, memory poisoning, agent identity, and sandbox escape patterns.
@@ -136,6 +140,8 @@ hackmyagent secure --ci     # Non-interactive mode for CI/CD
 ```bash
 hackmyagent secure                            # scan current directory
 hackmyagent secure ./my-project               # scan specific directory
+hackmyagent secure --deep                     # full behavioral simulation (20 probes)
+hackmyagent secure --static-only              # static checks only (fast, CI mode)
 hackmyagent secure --fix                      # auto-fix issues
 hackmyagent secure --fix --dry-run            # preview fixes before applying
 hackmyagent secure --ignore CRED-001,GIT-002  # skip specific checks
@@ -144,6 +150,23 @@ hackmyagent secure --verbose                  # show all checks including passed
 hackmyagent secure --publish                  # push results to OpenA2A Registry
 ```
+### `hackmyagent red-team` -- Adaptive Attack Engine
+```bash
+hackmyagent red-team ./my-skill.md            # red-team a skill file
+hackmyagent red-team ./SOUL.md --iterations 10 # more attack iterations
+hackmyagent red-team ./mcp-config.json --json  # JSON output
+```
+Generates target-specific attacks from the skill's own language and constraints. Iterates up to 5x per attack category, maps all defenses, and produces specific remediation.
+### `hackmyagent explain` -- Finding Explanations
+```bash
+hackmyagent explain CRED-001                  # explain a finding
+hackmyagent explain SKILL-SEMANTIC-007        # explain NanoMind finding
+```
 <details>
 <summary>All 35 security categories</summary>

package/dist/cli.js CHANGED

@@ -1870,6 +1870,34 @@ Examples:
             onProgress,
         });
         const scanDurationMs = Date.now() - scanStartMs;
+        // NanoMind Semantic Compiler: AST-based analysis runs alongside static checks
+        // Defense-in-depth: static findings can NEVER be suppressed, only upgraded
+        if (!isStaticOnly && !options.ci) {
+            try {
+                const { runNanoMindScan } = await Promise.resolve().then(() => __importStar(require('./nanomind-core/scanner-bridge.js')));
+                const existingFindings = result.allFindings || result.findings || [];
+                const nmResult = await runNanoMindScan(targetDir, existingFindings);
+                if (format === 'text' && nmResult.astFindings.length > 0) {
+                    const newFindings = nmResult.astFindings.filter(f => !f.passed);
+                    if (newFindings.length > 0) {
+                        process.stdout.write(`\nNanoMind: ${nmResult.compiledArtifacts} artifact(s) compiled, ${newFindings.length} semantic finding(s) added\n`);
+                    }
+                    if (nmResult.integrityStatus !== 'CLEAN') {
+                        process.stdout.write(`  Integrity: ${nmResult.integrityStatus}\n`);
+                    }
+                }
+                // Merge: AST findings ADD to static (never remove)
+                if (result.allFindings) {
+                    result.allFindings = nmResult.mergedFindings;
+                }
+                if (result.findings) {
+                    result.findings = nmResult.mergedFindings.filter((f) => !f.passed);
+                }
+            }
+            catch {
+                // NanoMind unavailable -- static results are still valid
+            }
+        }
         // Behavioral simulation: auto-runs on --deep, or when NanoMind detects ambiguity
         if (isDeep && format === 'text') {
             try {
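
Note on the hunk above: the comments state that static findings can never be suppressed, only upgraded, and that AST findings add to the static set. The diff does not show how `runNanoMindScan` builds `mergedFindings`, so the following is only a minimal sketch of one merge that would satisfy that invariant. The helper name `mergeFindings`, the `checkId` field, and the reading of "upgraded" as "a passing result may become a failure" are all assumptions made for illustration; only the `passed` flag appears in the diff.

```javascript
// Hypothetical helper, NOT the package's implementation.
// Assumes every finding has a string `checkId` (assumed field) and a boolean `passed`.
function mergeFindings(staticFindings, astFindings) {
  // Copy static findings so the caller's array is never mutated.
  const merged = staticFindings.map((f) => ({ ...f }));
  const indexById = new Map(merged.map((f, i) => [f.checkId, i]));

  for (const ast of astFindings) {
    const i = indexById.get(ast.checkId);
    if (i === undefined) {
      // New semantic finding: append it to the static set.
      indexById.set(ast.checkId, merged.length);
      merged.push({ ...ast });
    } else if (merged[i].passed && !ast.passed) {
      // Upgrade only: a passing static result may become a failure.
      merged[i] = { ...merged[i], ...ast, passed: false };
    }
    // Otherwise keep the static finding: a failing static result is never
    // overwritten by a passing AST result.
  }
  return merged;
}
```

With a merge of this shape, the output is never shorter than the static input, and a check that failed statically still fails afterwards, which is the property the comments in the hunk describe.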
@@ -5223,9 +5251,38 @@ program
         console.log(`\n${trainingCount} training samples exported to NanoMind corpus.`);
     }
 });
-if (process.argv.length <= 2) {
-    program.outputHelp();
-    process.exit(0);
-}
-program.parse();
+// Self-securing: verify own integrity before running any command
+// A security tool that doesn't verify itself is worse than no security tool
+(async () => {
+    try {
+        const { verifyAll } = await Promise.resolve().then(() => __importStar(require('./nanomind-core/security/integrity-verifier.js')));
+        const integrity = await verifyAll();
+        if (integrity.status === 'QUARANTINE') {
+            // Binary tampered -- refuse to run
+            process.stderr.write('\nINTEGRITY CHECK FAILED: HackMyAgent binary may have been tampered with.\n' +
+                'This could indicate a supply chain attack.\n\n' +
+                'Actions:\n' +
+                '  1. Reinstall: npm install -g hackmyagent\n' +
+                '  2. Verify: npm audit signatures\n' +
+                '  3. Report: https://github.com/opena2a-org/hackmyagent/security\n\n');
+            for (const check of integrity.checks.filter(c => !c.passed)) {
+                process.stderr.write(`  Failed: ${check.name} -- ${check.reason}\n`);
+            }
+            process.exit(3); // Exit code 3 = integrity failure
+        }
+        if (integrity.status === 'DEGRADE') {
+            // Model or rules tampered -- warn but continue with fallback
+            process.stderr.write('\nIntegrity warning: some components could not be verified.\n' +
+                'Continuing with baseline analysis (reduced accuracy).\n\n');
+        }
+    }
+    catch {
+        // Integrity check itself failed -- continue (don't block on missing manifest in dev)
+    }
+    if (process.argv.length <= 2) {
+        program.outputHelp();
+        process.exit(0);
+    }
+    program.parse();
+})();
 //# sourceMappingURL=cli.js.map
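
Note on the startup hunk above: the new wrapper branches on `integrity.status`. `QUARANTINE` prints the failed checks and exits with code 3, `DEGRADE` warns and continues, and any other status falls through to normal argument parsing. The decision can be separated from the I/O as a pure function, which makes the branches easy to test. The sketch below is an illustration under assumptions, not code from the package: `decideStartup` and the shape of its return value are hypothetical, while the status strings, the exit code, and the `checks[].passed` field are taken from the diff.

```javascript
// Hypothetical sketch; only the status strings, exit code 3, and the
// `checks[].passed` field come from the diff. Everything else is illustrative.
function decideStartup(integrity) {
  // Collect the checks that did not pass so the caller can report them.
  const failed = (integrity.checks || []).filter((c) => !c.passed);

  switch (integrity.status) {
    case 'QUARANTINE':
      // Tampering suspected: refuse to run, exit code 3 = integrity failure.
      return { proceed: false, exitCode: 3, degraded: false, failed };
    case 'DEGRADE':
      // Some components unverified: continue with baseline analysis.
      return { proceed: true, exitCode: null, degraded: true, failed };
    default:
      // Any other status: continue normally.
      return { proceed: true, exitCode: null, degraded: false, failed };
  }
}
```

A caller would write the warning or failure text to stderr and call `process.exit(decision.exitCode)` only when `proceed` is false. The sketch does not model the `catch` branch in the diff, where a verifier that throws (for example, a missing manifest in development) lets the CLI continue.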