
pentesting 0.54.0 → 0.55.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (8)

package/dist/main.js +114 -41
package/dist/prompts/base.md +126 -15
package/dist/prompts/exploit.md +3 -14
package/dist/prompts/offensive-playbook.md +9 -32
package/dist/prompts/orchestrator.md +6 -11
package/dist/prompts/strategist-system.md +36 -0
package/dist/prompts/strategy.md +6 -6
package/package.json +1 -1

package/dist/main.js CHANGED

@@ -342,7 +342,7 @@ var ORPHAN_PROCESS_NAMES = [
 // src/shared/constants/agent.ts
 var APP_NAME = "Pentest AI";
-var APP_VERSION = "0.54.0";
+var APP_VERSION = "0.55.0";
 var APP_DESCRIPTION = "Autonomous Penetration Testing AI Agent";
 var LLM_ROLES = {
   SYSTEM: "system",
@@ -10488,9 +10488,20 @@ FORMAT YOUR RESPONSE EXACTLY LIKE THIS:
 - (write "None identified" if none)
 ## Failures/Errors
-- [what was attempted and FAILED \u2014 include the FULL command, wordlist, target, and the reason WHY it failed]
-- [e.g.: "SSH brute force: hydra -l admin -P /usr/share/wordlists/rockyou.txt ssh://10.0.0.1 \u2014 connection refused (port filtered)"]
-- [e.g.: "SQLi on /login with sqlmap --tamper=space2comment \u2014 input sanitized, WAF detected (ModSecurity)"]
+Classify EVERY failure using one of these types. Format: [TYPE] exact_command \u2192 why_failed \u2192 recommended_pivot
+Failure types:
+- [FILTERED]: WAF/IDS/firewall blocked \u2192 suggest: encoding bypass, payload_mutate, different protocol/port
+- [WRONG_VECTOR]: Vulnerability not present here \u2192 suggest: pivot to different vuln class entirely
+- [AUTH_REQUIRED]: Credential or session needed first \u2192 suggest: brute force login or find creds in config files
+- [TOOL_ERROR]: Command syntax error, missing dep, or tool bug \u2192 suggest: run --help, use alternative tool
+- [TIMEOUT]: Service too slow or connection timed out \u2192 suggest: increase timeout, reduce scope, or use background mode
+- [PATCHED]: CVE/technique exists but target is patched \u2192 suggest: search bypass or newer CVE on same service
+Examples:
+- "[FILTERED] sqlmap -u /login --tamper=space2comment \u2192 ModSecurity WAF, blocking all payloads \u2192 try charencode,randomcase tampers or payload_mutate"
+- "[AUTH_REQUIRED] curl http://target/admin \u2192 HTTP 401 Basic Auth \u2192 hydra -l admin -P rockyou.txt http-get://target/admin"
+- "[TIMEOUT] nmap -sV -p- target --min-rate=5000 \u2192 timed out 5min \u2192 rustscan first, then targeted nmap on found ports"
 - (write "No failures" if everything succeeded)
 ## Suspicious Signals
@@ -10510,7 +10521,7 @@ FORMAT YOUR RESPONSE EXACTLY LIKE THIS:
 RULES:
 - Include EXACT values: port numbers, versions, usernames, file paths, IPs, full commands used
-- For failures: include the COMPLETE command with all flags, wordlists, and targets \u2014 "brute force failed" alone is USELESS
+- For failures: ALWAYS classify with [TYPE] \u2014 "brute force failed" alone is USELESS. Include full command.
 - Look for the UNEXPECTED \u2014 non-standard ports, unusual banners, timing anomalies, error leaks
 - Credentials include: passwords, hashes, API keys, tokens, private keys, cookies, session IDs
 - Flag any information disclosure: server versions, internal paths, stack traces, debug output
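
Note on the hunks above: failures are now reported as `[TYPE] exact_command → why_failed → recommended_pivot`. The following is a minimal sketch of how a consumer might parse one such line. It is not part of the package: `parseFailureLine` and `FAILURE_TYPES` are hypothetical names, and only the six type labels and the arrow-separated layout come from the diff.

```javascript
// Hypothetical helper (not in the package): parse one classified failure line.
// The six labels below are copied from the prompt text in the diff.
const FAILURE_TYPES = new Set([
  "FILTERED", "WRONG_VECTOR", "AUTH_REQUIRED", "TOOL_ERROR", "TIMEOUT", "PATCHED",
]);

function parseFailureLine(line) {
  // Drop a leading list dash and surrounding double quotes, as in the prompt examples.
  const cleaned = line.trim().replace(/^-\s*/, "").replace(/^"|"$/g, "");
  const match = /^\[([A-Z_]+)\]\s*(.+)$/.exec(cleaned);
  if (!match || !FAILURE_TYPES.has(match[1])) return null;

  // The three fields are separated by U+2192 (rightwards arrow).
  const parts = match[2].split("\u2192").map((s) => s.trim());
  if (parts.length !== 3 || parts.some((p) => p === "")) return null;

  const [command, reason, pivot] = parts;
  return { type: match[1], command, reason, pivot };
}

// Usage with one of the example lines from the diff:
const parsed = parseFailureLine(
  '- "[TIMEOUT] nmap -sV -p- target --min-rate=5000 \u2192 timed out 5min \u2192 rustscan first, then targeted nmap on found ports"'
);
console.log(parsed);
```

Unclassified text such as "brute force failed" returns `null` here, which matches the rule that an unclassified failure is not useful on its own.
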
@@ -11313,7 +11324,7 @@ Suggested Action: ${errorInfo.suggestedAction || actionHints[errorInfo.type] ||
 };
 // src/agents/prompt-builder.ts
-import { readFileSync as readFileSync6, existsSync as existsSync10, readdirSync as readdirSync4 } from "fs";
+import { readFileSync as readFileSync6, existsSync as existsSync10 } from "fs";
 import { join as join11, dirname as dirname4 } from "path";
 import { fileURLToPath as fileURLToPath2 } from "url";
@@ -11816,12 +11827,12 @@ var PHASE_TECHNIQUE_MAP = {
   [PHASES.EXPLOIT]: ["injection", "shells", "file-attacks", "network-svc", "pwn", "container-escape", "reversing"],
   [PHASES.POST_EXPLOIT]: ["privesc", "lateral", "auth-access", "shells", "container-escape", "forensics"],
   [PHASES.PRIV_ESC]: ["privesc", "auth-access", "shells", "pwn", "container-escape"],
-  [PHASES.LATERAL]: ["lateral", "ad-attack", "auth-access", "container-escape"],
-  [PHASES.PERSISTENCE]: ["shells", "privesc"],
+  [PHASES.LATERAL]: ["lateral", "ad-attack", "auth-access", "container-escape", "network-svc"],
+  [PHASES.PERSISTENCE]: ["shells", "privesc", "lateral"],
   [PHASES.EXFIL]: ["lateral", "network-svc", "forensics"],
-  [PHASES.WEB]: ["injection", "file-attacks", "auth-access", "crypto"],
+  [PHASES.WEB]: ["injection", "file-attacks", "auth-access", "crypto", "shells"],
   [PHASES.REPORT]: []
-  // Report phase needs no attack techniques
+  // Report phase: no attack techniques needed
 };
 var PromptBuilder = class {
   state;
@@ -11947,20 +11958,20 @@ ${content}
   /**
    * Load technique files relevant to the current phase.
    *
-   * Loading strategy (Philosophy §11 — zero-code extension):
-   * 1. PHASE_TECHNIQUE_MAP defines priority techniques per phase (loaded first)
-   * 2. Any .md file in techniques/ NOT in the map is auto-discovered and loaded
-   *    as general reference — NO code change needed to add new techniques.
+   * Loading strategy (Improvement #7 — explicit phase mapping, no auto-discovery):
+   * 1. PHASE_TECHNIQUE_MAP defines EXACTLY which techniques load per phase.
+   * 2. Auto-discovery is DISABLED to prevent irrelevant technique loading
+   *    (e.g., pwn.md 18K in RECON phase, forensics.md 16K in REPORT phase).
+   * 3. To add a new technique: add the file to techniques/ AND add it to
+   *    the relevant phase entries in PHASE_TECHNIQUE_MAP above.
    *
-   * The map is an optimization (priority ordering), not a gate.
-   * "Drop a markdown file in the folder, PromptBuilder auto-discovers and loads it."
+   * Token savings: ~5-15K per turn vs unrestricted auto-discovery.
    */
   loadPhaseRelevantTechniques(phase) {
     if (!existsSync10(TECHNIQUES_DIR)) return "";
-    const priorityTechniques = PHASE_TECHNIQUE_MAP[phase] || [];
-    const loadedSet = /* @__PURE__ */ new Set();
+    const techniquesForPhase = PHASE_TECHNIQUE_MAP[phase] ?? [];
     const fragments = [];
-    for (const technique of priorityTechniques) {
+    for (const technique of techniquesForPhase) {
       const filePath = join11(TECHNIQUES_DIR, `${technique}.md`);
       try {
         if (!existsSync10(filePath)) continue;
@@ -11969,25 +11980,10 @@ ${content}
           fragments.push(`<technique-reference category="${technique}">
 ${content}
 </technique-reference>`);
-          loadedSet.add(`${technique}.md`);
         }
       } catch {
       }
     }
-    try {
-      const allFiles = readdirSync4(TECHNIQUES_DIR).filter((f) => f.endsWith(".md") && f !== "README.md" && !loadedSet.has(f));
-      for (const file of allFiles) {
-        const filePath = join11(TECHNIQUES_DIR, file);
-        const content = readFileSync6(filePath, PROMPT_CONFIG.ENCODING);
-        if (content) {
-          const category = file.replace(".md", "");
-          fragments.push(`<technique-reference category="${category}">
-${content}
-</technique-reference>`);
-        }
-      }
-    } catch {
-    }
     return fragments.join("\n\n");
   }
   getScopeFragment() {
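
Note on the hunk above: with auto-discovery removed, a technique file is loaded only if the phase map names it. The sketch below isolates that behaviour under stated assumptions: `PHASE_MAP` holds made-up sample entries, and `readTechnique` is a hypothetical injected reader standing in for the `existsSync`/`readFileSync` calls, so the example runs without touching the filesystem.

```javascript
// Hypothetical sample map, for illustration only (not the package's PHASE_TECHNIQUE_MAP).
const PHASE_MAP = {
  recon: ["network-svc"],
  web: ["injection", "crypto"],
  report: [],
};

// readTechnique(name) is an assumed callback: returns the file text, or null if missing.
function loadTechniquesForPhase(phase, readTechnique) {
  const names = PHASE_MAP[phase] ?? []; // unknown phases load nothing
  const fragments = [];
  for (const name of names) {
    const content = readTechnique(name);
    if (!content) continue; // missing or empty files are skipped
    fragments.push(
      `<technique-reference category="${name}">\n${content}\n</technique-reference>`
    );
  }
  return fragments.join("\n\n");
}

// Usage: "pwn" exists in the store but is not mapped to the web phase, so it is never loaded.
const store = new Map([
  ["injection", "injection notes"],
  ["pwn", "pwn notes"],
]);
console.log(loadTechniquesForPhase("web", (name) => store.get(name) ?? null));
```

The trade-off is the one the new comment describes: adding a technique now takes a map entry as well as a file, in exchange for smaller per-turn prompts.
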
@@ -12113,6 +12109,8 @@ import { join as join12, dirname as dirname5 } from "path";
 import { fileURLToPath as fileURLToPath3 } from "url";
 var __dirname3 = dirname5(fileURLToPath3(import.meta.url));
 var STRATEGIST_PROMPT_PATH = join12(__dirname3, "prompts", "strategist-system.md");
+var CACHE_TTL_MS = 3 * 60 * 1e3;
+var STALL_TURNS_THRESHOLD = 2;
 var Strategist = class {
   llm;
   state;
@@ -12121,23 +12119,42 @@ var Strategist = class {
   totalTokenCost = 0;
   totalCalls = 0;
   lastDirective = null;
+  // Cooldown state (Improvement #8)
+  lastPhase = "";
+  turnsWithoutProgress = 0;
+  lastFindingCount = 0;
   constructor(llm, state) {
     this.llm = llm;
     this.state = state;
     this.systemPrompt = this.loadSystemPrompt();
   }
   /**
-   * Generate a fresh strategic directive for this turn.
-   * Called every iteration by PromptBuilder.
+   * Generate a strategic directive for this turn.
+   * Called each iteration by PromptBuilder.
+   *
+   * COOLDOWN POLICY (Improvement #8):
+   * Only issues a new LLM call when needed. Otherwise reuses cached directive.
+   * Conditions for a new call: first call, phase changed, stall detected, or TTL expired.
    *
    * @returns Formatted directive string for prompt injection, or '' on failure
    */
   async generateDirective() {
+    this.updateProgressTracking();
+    const shouldCall = this.shouldCallLLM();
+    if (!shouldCall && this.lastDirective) {
+      debugLog("general", "Strategist: reusing cached directive (cooldown active)", {
+        age: Math.floor((Date.now() - this.lastDirective.generatedAt) / 1e3),
+        turnsWithoutProgress: this.turnsWithoutProgress
+      });
+      return this.formatForPrompt(this.lastDirective, true);
+    }
     try {
       const input = this.buildInput();
       const directive = await this.callLLM(input);
       this.lastDirective = directive;
       this.totalCalls++;
+      this.turnsWithoutProgress = 0;
+      this.lastPhase = this.state.getPhase();
       debugLog("general", "Strategist directive generated", {
         tokens: directive.tokenCost,
         totalCalls: this.totalCalls,
@@ -12145,7 +12162,7 @@ var Strategist = class {
       });
       return this.formatForPrompt(directive);
     } catch (err) {
-      debugLog("general", "Strategist failed \u2014 agent will proceed without directive", {
+      debugLog("general", "Strategist failed \u2014 agent will proceed with cached/no directive", {
         error: String(err)
       });
       if (this.lastDirective?.content) {
@@ -12154,6 +12171,54 @@ var Strategist = class {
       return "";
     }
   }
+  // ─── Cooldown Logic ─────────────────────────────────────────
+  /**
+   * Determine whether to call the Strategist LLM this turn.
+   *
+   * Calls are triggered when:
+   * 1. No cached directive exists (first call ever)
+   * 2. Phase changed since last call (new strategic situation)
+   * 3. Stall detected: no new findings for 2+ turns
+   * 4. Cache TTL expired (3 minutes — directive may be outdated)
+   */
+  shouldCallLLM() {
+    if (!this.lastDirective) return true;
+    const currentPhase = this.state.getPhase();
+    if (currentPhase !== this.lastPhase) {
+      debugLog("general", "Strategist: phase changed \u2014 forcing LLM call", {
+        from: this.lastPhase,
+        to: currentPhase
+      });
+      return true;
+    }
+    if (this.turnsWithoutProgress >= STALL_TURNS_THRESHOLD) {
+      debugLog("general", "Strategist: stall detected \u2014 forcing LLM call", {
+        turnsWithoutProgress: this.turnsWithoutProgress
+      });
+      return true;
+    }
+    const age = Date.now() - this.lastDirective.generatedAt;
+    if (age >= CACHE_TTL_MS) {
+      debugLog("general", "Strategist: cache TTL expired \u2014 forcing LLM call", {
+        ageMs: age
+      });
+      return true;
+    }
+    return false;
+  }
+  /**
+   * Update progress tracking for stall detection.
+   * Compares current finding count to last known count.
+   */
+  updateProgressTracking() {
+    const currentFindings = this.state.getFindings().length;
+    if (currentFindings > this.lastFindingCount) {
+      this.turnsWithoutProgress = 0;
+      this.lastFindingCount = currentFindings;
+    } else {
+      this.turnsWithoutProgress++;
+    }
+  }
   // ─── Input Construction ─────────────────────────────────────
   /**
    * Build the user message for the Strategist LLM.
@@ -12233,18 +12298,19 @@ ${input}`
   // ─── Formatting ─────────────────────────────────────────────
   /**
    * Format directive for injection into the attack agent's system prompt.
+   * @param isStale - true when reusing a cached directive (cooldown) or after error
    */
   formatForPrompt(directive, isStale = false) {
     if (!directive.content) return "";
     const age = Math.floor((Date.now() - directive.generatedAt) / MS_PER_MINUTE);
-    const staleWarning = isStale ? `
-NOTE: This directive is from ${age}min ago (Strategist call failed this turn). Verify assumptions are still valid.` : "";
+    const staleMark = isStale ? `
+[CACHED \u2014 ${age}min old. Follow unless directly contradicted by new tool output.]` : "";
     return [
       "<strategic-directive>",
       "TACTICAL DIRECTIVE (generated by Strategist LLM \u2014 follow these priorities):",
       "",
       directive.content,
-      staleWarning,
+      staleMark,
       "</strategic-directive>"
     ].filter(Boolean).join("\n");
   }
@@ -12263,7 +12329,7 @@ NOTE: This directive is from ${age}min ago (Strategist call failed this turn). V
   getTotalTokenCost() {
     return this.totalTokenCost;
   }
-  /** Get number of Strategist calls this session. */
+  /** Get number of Strategist LLM calls this session (excludes cache hits). */
   getTotalCalls() {
     return this.totalCalls;
   }
@@ -12271,11 +12337,18 @@ NOTE: This directive is from ${age}min ago (Strategist call failed this turn). V
   getLastDirective() {
     return this.lastDirective;
   }
+  /** Current stall counter (turns without new findings). */
+  getTurnsWithoutProgress() {
+    return this.turnsWithoutProgress;
+  }
   /** Reset strategist state (for /clear command). */
   reset() {
     this.lastDirective = null;
     this.totalTokenCost = 0;
     this.totalCalls = 0;
+    this.lastPhase = "";
+    this.turnsWithoutProgress = 0;
+    this.lastFindingCount = 0;
   }
 };
 var FALLBACK_SYSTEM_PROMPT = `You are an elite autonomous penetration testing STRATEGIST \u2014 a red team tactical commander.
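
Note on the Strategist hunks above: the cooldown reduces to four ordered checks. The sketch below restates them as a pure function so the order is easy to test. It is an illustration, not the package's code: `cooldownDecision` and the `snapshot` object are hypothetical, while the two constants and the order of checks follow `shouldCallLLM()` in the diff.

```javascript
const CACHE_TTL_MS = 3 * 60 * 1000; // same value as in the diff (3 minutes)
const STALL_TURNS_THRESHOLD = 2;    // same value as in the diff

// `snapshot` is a hypothetical plain object standing in for the Strategist's fields.
// Returns the reason for a new LLM call, or "reuse-cache" when the cooldown applies.
function cooldownDecision(snapshot, now) {
  const { lastDirective, lastPhase, currentPhase, turnsWithoutProgress } = snapshot;
  if (!lastDirective) return "first-call";
  if (currentPhase !== lastPhase) return "phase-changed";
  if (turnsWithoutProgress >= STALL_TURNS_THRESHOLD) return "stall";
  if (now - lastDirective.generatedAt >= CACHE_TTL_MS) return "ttl-expired";
  return "reuse-cache";
}

// Usage: same phase, one turn without findings, directive 60 seconds old.
const decision = cooldownDecision(
  {
    lastDirective: { generatedAt: 0 },
    lastPhase: "recon",
    currentPhase: "recon",
    turnsWithoutProgress: 1,
  },
  60 * 1000
);
console.log(decision);
```

In the diff, `updateProgressTracking()` runs before the check, so the stall counter already includes the current turn when it is compared against the threshold.
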

package/dist/prompts/base.md CHANGED

@@ -20,14 +20,113 @@ Speed mindset: every second without a tool call is wasted time.
 ## OODA Loop Protocol (MANDATORY)
-Before calling ANY tool or taking action, you MUST structure your reasoning process using this exact OODA format:
-1. **[OBSERVE]**: What concrete info did the last command yield? (Errors, ports, paths)
-2. **[ORIENT]**: Where are we in the kill chain? How does this update our attack hypothesis?
-3. **[DECIDE]**: What is the most promising next step? Why?
-4. **[ACT]**: Call the appropriate tool(s) to execute this step.
+Before calling ANY tool, structure your reasoning using this exact format:
+1. **[OBSERVE]**: What did the last tool/Analyst summary yield? Include attackValue, suspicions, failures.
+2. **[ORIENT]**: Kill chain position? How does this update our attack hypothesis? What's exhausted?
+3. **[DECIDE]**: Highest-probability unexplored vector? Check Strategic Directive PRIORITY list first.
+4. **[ACT]**: Call the appropriate tool(s). Prefer parallel calls for independent operations.
 *Never blindly call tools without explicit OBSERVATION and DECISION.*
+---
+## Reading the ANALYST MEMO (CRITICAL — process every turn)
+Every tool result contains an **Analyst LLM summary** with structured sections.
+You MUST process these fields in your OBSERVE step:
+### Attack Value → Priority Signal
+```
+HIGH  → Drop everything. Drill deeper into this NOW. Make it PRIORITY 1.
+MED   → Queue as next action after current PRIORITY 1 completes.
+LOW   → Pursue only if nothing else available.
+NONE  → Mark vector as EXHAUSTED. Do NOT retry without a fundamentally new approach.
+```
+### Suspicious Signals → Immediate Investigation Queue
+When Analyst lists suspicious signals:
+1. Add each one to `update_todo` with HIGH priority immediately
+2. If time permits THIS turn, test it — suspicious signals are often the real attack surface
+3. Examples: unusual response timing, debug headers, verbose errors, redirect leaks
+### Next Steps → Analyst SEARCH ORDERS
+The Analyst's "Next Steps" are **mandatory search/action orders**:
+- Execute them THIS turn or NEXT turn without exception
+- Skip only if working memory shows the exact same approach already failed 2+ times
+### Failures → Escalation Protocol
+When Analyst reports failures:
+```
+1st same failure: Retry with DIFFERENT parameters (wordlist, encoding, port)
+2nd same failure: Switch approach — fundamentally different vector
+3rd+ same failure: web_search("{tool} {error} bypass") → apply solution
+```
+*A failure with different parameters is a NEW attempt, not a repeat.*
+---
+## Strategic Directive (MANDATORY COMPLIANCE)
+When `<strategic-directive>` appears in your context:
+1. **PRIORITY items = ORDERS, not suggestions.** Execute them in sequence.
+2. **EXHAUSTED list = absolute blocklist.** NEVER attempt these vectors again this session.
+3. **SEARCH ORDERS = mandatory web_search calls.** Execute if not already done this session.
+4. **FALLBACK = your next action when primary fails.** Use it — don't improvise blindly.
+5. **Conflict resolution:**
+   - Direct tool evidence contradicts directive → trust the evidence, note the discrepancy
+   - Working memory shows 2+ failures on suggested approach → use FALLBACK instead
+   - Otherwise → the directive ALWAYS wins over your own assessment
+---
+## Examples — Correct OODA Execution
+### Example 1: SQL Error → Correct Response
+```
+[OBSERVE]: run_cmd("curl /login -d 'user=admin'") returned "SQL syntax error near '''"
+           Analyst attackValue: HIGH | Next Steps: ["sqlmap -u /login --forms --batch"]
+[ORIENT]:  SQLi confirmed on /login POST. Kill chain: SQLi → dump → creds → shell.
+           Strategic Directive PRIORITY 1 says: "Exploit /login SQLi immediately."
+[DECIDE]:  Run sqlmap now. attackValue HIGH + Directive alignment → top priority.
+[ACT]:     run_cmd("sqlmap -u 'http://10.10.10.5/login' --forms --batch --risk=3 --level=3 --threads=5")
+```
+### Example 2: Stall Detection → Correct Pivot
+```
+[OBSERVE]: 3rd gobuster attempt on /admin returned 403 again. Same as turns 4 and 6.
+           Analyst attackValue: NONE | Failures: "[FILTERED] gobuster /admin → WAF blocking"
+[ORIENT]:  Directory fuzzing on /admin is EXHAUSTED (3 identical failures).
+           Working memory shows 3 consecutive failures on same vector.
+           Analyst classified as FILTERED — try bypass headers.
+[DECIDE]:  Auth bypass headers: X-Forwarded-For: 127.0.0.1, X-Original-URL: /admin
+           This is a fundamentally different approach, not a repeat.
+[ACT]:     run_cmd("curl -H 'X-Original-URL: /admin' http://10.10.10.5/")
+           run_cmd("curl -H 'X-Forwarded-For: 127.0.0.1' http://10.10.10.5/admin")
+```
+### Example 3: HIGH attackValue → Correct Drill-Down
+```
+[OBSERVE]: Analyst on ssh-audit output: attackValue: HIGH
+           "SSH accepts CBC mode ciphers (CVE-2008-5161) + user enumeration via timing"
+           Next Steps: ["Test SSH user enum: use timing attack to enumerate valid users"]
+[ORIENT]:  SSH is a HIGH value target. Kill chain: user enum → brute force → shell.
+           Strategic Directive PRIORITY 2 confirms SSH exploitation path.
+[DECIDE]:  Enumerate users first, then targeted brute force with found usernames.
+[ACT]:     web_search("ssh-audit CVE-2008-5161 exploit PoC")
+           run_cmd("ssh-audit --timeout=10 10.10.10.5", background: true)
+```
+### Example 4: EXHAUSTED List Application
+```
+[OBSERVE]: Strategic Directive EXHAUSTED list: "FTP anonymous login — connection refused (port filtered)"
+[ORIENT]:  FTP is confirmed dead. No need to test. Skip entirely.
+[DECIDE]:  Focus on HTTP (port 80) — not in EXHAUSTED list, not yet tested.
+[ACT]:     run_cmd("whatweb http://10.10.10.5") — start web fingerprinting
+```
+---
 ## Absolute Rules
 ### 0. ⚠️ LOCAL FILE PATHS — ALWAYS USE `.pentesting/workspace/`
@@ -56,10 +155,20 @@ You are prone to imagining non-existent tool flags or incorrect syntax for compl
 - `add_finding` — immediately when vulnerability confirmed (if reproducible, record it NOW)
 - `add_target` — new host or service discovered
 - `add_loot` — credentials, tokens, keys, hashes found
-- `update_phase` — when activity changes (recon/vuln/exploit/post/privesc/lateral)
+- `update_phase` — when activity changes (see Phase Transition Signals below)
 Self-check every turn: Did I find a vuln but not call `add_finding`? Call it now.
+### 2.5. Phase Transition Signals — When to Call `update_phase`
+```
+RECON      → vuln_analysis:    3+ services fingerprinted with versions confirmed
+vuln_analysis → exploit:       1+ finding (confidence ≥ 50) with exploit path identified
+exploit    → post_exploitation: Shell obtained AND promoted (active_shell process active)
+post_exploitation → lateral:   root/SYSTEM achieved on current host
+ANY_PHASE  → report:           All targets compromised OR time is up
+```
+**NEVER transition away from a phase while HIGH-priority vectors remain untested.**
 ### 3. ask_user Rules
 Use received values immediately. Never ask for the same thing twice.
@@ -124,10 +233,12 @@ Writing code is not a fallback. It's your primary weapon.
 - Automate multi-step attacks
 - Iterate: `write_file` → `run_cmd` → observe error → fix → repeat
-## Processes = Operational Assets
+## Shell Lifecycle (SINGLE SOURCE — referenced by exploit.md and post.md)
+### Processes = Operational Assets
 | Role | Meaning |
-|------|---------|
+|------|---------|
 | `listener` 👂 | Waiting for connection — start before attack |
 | `active_shell` 🐚 | **Target shell — top priority, never terminate** |
 | `server` 📡 | File serving — clean up after use |
@@ -136,9 +247,8 @@ Writing code is not a fallback. It's your primary weapon.
 **Reverse shell flow**: start listener → exploit → check status → `promote` on connection
 → `interact` to execute commands → upgrade shell → post-exploit through it.
-## Shell Lifecycle
+### On Getting a Shell — Immediate Actions
-On getting a shell, immediately:
 1. Detect type: `echo $TERM && tty && echo $SHELL`
    - `dumb` or `tty: not a tty` → upgrade required
    - `xterm` + `/dev/pts/X` → good
@@ -151,12 +261,11 @@ On getting a shell, immediately:
 3. **Protect the shell** — never terminate needlessly. On drop: reuse backdoor/web shell/re-exploit.
-### Process Management
-- Never terminate `active_shell`
+### Process Management Rules
+- **Never terminate `active_shell`**
 - Clean up servers/sniffers after task completion
 - Port conflict → switch port, update_mission with new port
-- `bg_process stop_all` on task completion
+- `bg_process stop_all` on task completion only
 ## Mission Context
@@ -180,8 +289,10 @@ Record parallel processes in checklist (e.g., "🔍 [bg_xxx] Port scan in progre
 1. Active shell available? → use it
 2. Shell is dumb? → upgrade
 3. Unnecessary processes? → stop
-4. Stuck? → search + different vector
+4. Stuck? → check Strategic Directive FALLBACK first, then search + different vector
 5. Repeating same method 2+ times? → switch immediately
+6. Analyst said attackValue HIGH? → is it PRIORITY 1?
+7. Any suspicions from last Analyst memo not yet tested? → add to TODO now
 ## Output Format
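
Note on the "Phase Transition Signals" block added in the hunks above: the table reads as a small state machine. The sketch below is one possible encoding, not anything shipped in the package. The field names on `s` are hypothetical, and letting the "HIGH-priority vectors remain untested" guard take precedence over the report transition is an assumption; the prompt text does not say which rule wins.

```javascript
// Hypothetical state fields; the thresholds (3 services, confidence >= 50) come from the table.
function nextPhase(phase, s) {
  // Assumed precedence: the "never transition" guard is checked first.
  if (s.highPriorityUntested) return phase;
  if (s.allTargetsCompromised || s.timeUp) return "report";

  switch (phase) {
    case "recon":
      return s.servicesFingerprinted >= 3 ? "vuln_analysis" : phase;
    case "vuln_analysis":
      return s.findings.some((f) => f.confidence >= 50 && f.exploitPath)
        ? "exploit"
        : phase;
    case "exploit":
      return s.activeShellPromoted ? "post_exploitation" : phase;
    case "post_exploitation":
      return s.rootOrSystem ? "lateral" : phase;
    default:
      return phase;
  }
}

// Usage: three fingerprinted services move recon forward.
console.log(nextPhase("recon", { servicesFingerprinted: 3, findings: [] }));
```
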

package/dist/prompts/exploit.md CHANGED

@@ -68,23 +68,12 @@ Connection received but drops immediately?
 └── EOFError → stdin not properly redirected, try different reverse shell variant
 ```
-## 🐚 Shell Stabilization — CRITICAL
-After receiving any shell, **immediately** follow base.md "Shell Lifecycle Mastery" protocol:
-### Upgrade Priority Order:
-```
-1. Python PTY → python3 -c 'import pty;pty.spawn("/bin/bash")' + Ctrl+Z + stty raw -echo; fg
-2. Script → script -qc /bin/bash /dev/null + Ctrl+Z + stty raw -echo; fg
-3. Socat → upload socat binary, connect with full PTY
-4. rlwrap → restart listener with rlwrap nc -lvnp PORT (readline support)
-5. SSH back-connect → plant SSH key on target, connect back via SSH
-6. pwncat → use pwncat-cs for auto-upgrade + features
-7. ConPTY → Windows full interactive shell
-```
+## 🐚 Shell Stabilization — See base.md "Shell Lifecycle"
+After receiving any shell, **immediately** follow the PTY upgrade order in base.md.
 **Without a proper TTY:** sudo, su, ssh, screen, vim won't work. Upgrade is MANDATORY.
 ## 🔗 Exploit Chaining — Combine Vulnerabilities
 Think in chains, not individual exploits. **Every vulnerability is a stepping stone to the next.**

package/dist/prompts/offensive-playbook.md CHANGED

@@ -11,45 +11,22 @@ This playbook drives **aggressive exploitation, time-aware strategy, and proof c
 - Multiple proofs per target are common — **keep hunting after the first**
 - **Environment variables** and **database entries** often contain flags/secrets
-## ⏱️ Time Management Protocol
+## ⏱️ Time Management — Follow Strategist's time-strategy
-Every second counts. Follow this decision framework:
+The `<time-strategy>` tag in your context contains exact time pressure and phase directives.
+**Always read and follow it — it overrides any fixed-duration assumptions.**
+Quick reference (use time-strategy for exact numbers):
 ```
-FIRST 10 MINUTES (Survey Phase):
-├── Full port scan (-Pn -p- --min-rate=5000)
-├── Quick service version detection on open ports
-├── Identify target profile (web server / AD domain / IoT / cloud / multi-host)
-├── Check for low-hanging fruit: default creds, exposed files, known CVEs
-└── Record ALL findings → update_mission immediately
-10-30 MINUTES (Targeted Attack):
-├── Focus on highest-probability attack vector
-├── Version+service → web_search("{service} {version} exploit CVE") IMMEDIATELY
-├── Web: directory fuzzing + injection probes in parallel
-├── Credential brute force on login services (hydra + rockyou.txt in background)
-├── If stuck after 15 min on one vector → SWITCH to next
-└── Background: hash cracking, brute force if applicable
-30-60 MINUTES (Deep Exploitation):
-├── Chain findings: LFI→RCE, SQLi→file write→shell, SSRF→internal
-├── Custom exploit development: write_file → run_cmd
-├── Source code analysis if .git, .bak, .swp found
-└── Multiple attack paths simultaneously (background processes)
-60+ MINUTES (Pivot & Escalate):
-├── Privilege escalation: ALL categories systematically
-├── Lateral movement if internal network exists
-├── Creative hunting: unusual files, hidden services, config secrets
-└── Re-examine ALL earlier findings with new context/access
+SPRINT   (0-25%):  Broad recon, parallel scans, identify all attack surfaces
+EXPLOIT  (25-50%): Focus on top-3 highest-scoring surfaces. Quick wins only.
+CREATIVE (50-75%): Chained exploits, custom tools. If stuck >5min → switch.
+HARVEST  (75-100%): Stop exploring. Exploit what you HAVE. Collect all proof.
 ```
 ### Time-Boxing Rule
 **If stuck on ONE vector for more than 15 minutes → SWITCH.**
-- Record what you tried in `update_mission`
-- Move to next highest-probability vector
-- Come back later with new information/tools
-- **Never tunnel-vision on a single approach**
+Record what you tried in `update_mission`. Move to next priority. Come back with new context.
 ## 🧠 Challenge & Target Quick-Start Protocols
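
Note on the time-management hunk above: the fixed-minute schedule is replaced by four bands expressed as a share of the engagement time. A minimal sketch of that mapping follows. `timePhase` is a hypothetical helper, and treating each lower bound as inclusive is an assumption, since the quick reference lists overlapping endpoints (25%, 50%, 75%).

```javascript
// Hypothetical helper: map elapsed time to the band names used in the quick reference.
function timePhase(elapsedMs, totalMs) {
  if (!(totalMs > 0)) throw new RangeError("totalMs must be positive");
  // Clamp so negative or overrun values still land in a band.
  const fraction = Math.min(Math.max(elapsedMs / totalMs, 0), 1);
  if (fraction < 0.25) return "SPRINT";
  if (fraction < 0.5) return "EXPLOIT";
  if (fraction < 0.75) return "CREATIVE";
  return "HARVEST";
}

// Usage: 40 minutes into a 60-minute engagement.
console.log(timePhase(40 * 60 * 1000, 60 * 60 * 1000));
```

Because the bands are proportional, the same function covers a short or a long engagement, which is the stated reason for dropping the fixed 10/30/60-minute schedule.
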

package/dist/prompts/orchestrator.md CHANGED

@@ -21,8 +21,8 @@ You are a developer AND a hacker. Coding is your superpower.
 **Question everything**: Why is this port open? What data flows through this connection?
 What shortcuts did the admin take? What systems depend on this one? Follow every question.
-## Tactical Reasoning (OODA)
-Your thought process must be visible. Do not jump to conclusions. You must explicitly break down complex problems: "I observed X, which means Y is likely configured this way. Therefore, I will decide to test Z."
+## Tactical Reasoning — OODA (See base.md for full protocol)
+Your thought process must be visible. Before each tool call: OBSERVE what changed, ORIENT on the kill chain, DECIDE the next attack, ACT with the right tool.
 ## Kill Chain Position — Know Where You Are
@@ -33,16 +33,11 @@ External Recon → Service Discovery → Vuln ID → Initial Access → Shell St
 Know your position before every turn. Act accordingly.
-## After First Shell — Automatic Action Chain
+## After First Shell — See base.md "Shell Lifecycle" + post.md pipeline
-1. Shell stabilization (PTY upgrade — see base.md Shell Lifecycle)
-2. Basic awareness: `whoami`, `id`, `hostname`, `uname -a`, `ip a`
-3. Access check: `sudo -l`, SUID search, capabilities
-4. Credential hunting: `.bash_history`, `.ssh/`, config files, DB connection strings
-5. Network mapping: `ip route`, `/etc/hosts`, ARP, internal services
-6. Privesc path exploration → on success, repeat from step 2 with new privileges
-7. Lateral movement: SSH key reuse, credential spray, internal service access
-8. New targets discovered → `add_target` → full recon restart
+1. Shell stabilization (PTY upgrade per base.md)
+2. Immediate awareness + privesc enumeration (post.md pipeline)
+3. Credential harvest + lateral movement + persistence
 ## Decision Forks — Never Give Up

package/dist/prompts/strategist-system.md CHANGED

@@ -269,3 +269,39 @@ Cloud/Container:
 ├─ ❌ Generic reconnaissance → ✅ Targeted recon with specific goals
 └─ ❌ "I recommend..." or "You should consider..." → ✅ Direct imperative: "Run: ..."
 ```
+### Rule 11: PHASE TRANSITION SIGNALS
+```
+ORDER update_phase when these conditions are met:
+recon → vuln_analysis:
+  ├─ 3+ services fingerprinted with exact versions confirmed
+  ├─ OSINT complete (shodan/github/crt.sh checked)
+  └─ Web surface mapped (get_web_attack_surface called if HTTP found)
+vuln_analysis → exploit:
+  ├─ 1+ finding with confidence ≥ 50 AND a concrete exploit path identified
+  ├─ Specific CVE confirmed applicable (version matches, PoC available)
+  └─ Or: critical misconfiguration found (default creds, exposed .env, anon access)
+exploit → post_exploitation:
+  ├─ Shell obtained AND promoted (active_shell process is running)
+  ├─ Interactive commands confirmed working via bg_process interact
+  └─ Shell stabilized (PTY upgrade attempted)
+post_exploitation → lateral:
+  ├─ root or SYSTEM access achieved on current host
+  ├─ Additional network segments discovered (new /24 subnet, internal services)
+  └─ Or: domain credentials obtained (AD context)
+ANY phase → report:
+  ├─ All high-priority targets compromised
+  ├─ Time remaining < 10% of total engagement time
+  └─ Or: scope exhausted (all vectors tried, no new surface)
+CRITICAL RULES:
+├─ NEVER order phase transition while HIGH or CRITICAL priority vectors remain untested
+├─ Phase transitions do NOT prevent using tools from previous phases
+├─ If recon yields nothing after 10 min → still transition to vuln_analysis and probe
+└─ If stuck in a phase > 5 turns with no progress → evaluate if transition is needed
+```

package/dist/prompts/strategy.md CHANGED

@@ -36,14 +36,14 @@ TIER 4 — Last resort:
   Patch diffing · Race conditions · Supply chain analysis
 ```
-## Every-Turn Decision Flow (OODA → ORIENT / DECIDE)
+## Every-Turn Decision Flow — Use OODA from base.md
-Use this checklist during the ORIENT and DECIDE steps of the OODA protocol (see base.md):
-1. What do I know? (services, versions, access level)
-2. Highest-probability unexplored surface from priority matrix?
+During the ORIENT/DECIDE steps of base.md's OODA protocol, check:
+1. Strategic Directive PRIORITY list — what did Strategist order first?
+2. Highest-probability unexplored surface from the matrix below?
 3. Have I searched for attacks on EVERY discovered service? → if not, search NOW
-4. Can I chain existing findings?
-5. Stuck 15+ min? → switch approach immediately
+4. Can I chain existing findings? (check attack-intelligence in context)
+5. Stuck 15+ min? → switch approach immediately, refer to FALLBACK in directive
 ## Service Intelligence Protocol

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "pentesting",
-  "version": "0.54.0",
+  "version": "0.55.0",
   "description": "Autonomous Penetration Testing AI Agent",
   "type": "module",
   "main": "dist/main.js",