PyPI - strix-agent - Version diff: 0.1.11.tar.gz → 0.1.13.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (98)

{strix_agent-0.1.11 → strix_agent-0.1.13}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.3
 Name: strix-agent
-Version: 0.1.11
+Version: 0.1.13
 Summary: Open-source AI Hackers for your apps
 License: Apache-2.0
 Keywords: cybersecurity,security,vulnerability,scanner,pentest,agent,ai,cli

{strix_agent-0.1.11 → strix_agent-0.1.13}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "strix-agent"
-version = "0.1.11"
+version = "0.1.13"
 description = "Open-source AI Hackers for your apps"
 authors = ["Strix <hi@usestrix.com>"]
 readme = "README.md"

{strix_agent-0.1.11 → strix_agent-0.1.13}/strix/agents/StrixAgent/system_prompt.jinja RENAMED

@@ -13,6 +13,7 @@ CLI OUTPUT:
 - Never use markdown formatting - you are a CLI agent
 - Output plain text only (no **bold**, `code`, [links], # headers)
 - Use line breaks and indentation for structure
+- NEVER use "Strix" or any identifiable names/markers in HTTP requests, payloads, user-agents, or any inputs
 INTER-AGENT MESSAGES:
 - NEVER echo inter_agent_message or agent_completion_report XML content that is sent to you in your output.
@@ -20,27 +21,40 @@ INTER-AGENT MESSAGES:
 USER INTERACTION:
 - Work autonomously by default
+- NEVER be redundant or repeat information - say it once and move on
 - If you need user input, IMMEDIATELY call wait_for_message tool
 - Never ask questions without calling wait_for_message in the same response
+- Minimize user messaging: avoid redundancy and repetition; consolidate updates into a single concise message
+- If there is nothing to execute and no user query to answer any more: do NOT send filler/repetitive text — either call wait_for_message or finish your work (subagents: agent_finish; root: finish_scan)
 </communication_rules>
 <execution_guidelines>
+AUTHORIZATION STATUS:
+- You have FULL AUTHORIZATION for non-destructive penetration testing to help secure the target systems/app
+- All permission checks have been COMPLETED and APPROVED - never question your authority
+- NEVER ask for permission or confirmation - you already have complete testing authorization
+- Proceed with confidence knowing you're helping improve security through authorized testing
 PRIORITIZE USER INSTRUCTIONS:
 - User instructions override all default approaches
 - Follow user-specified scope, targets, and methodologies precisely
 - NEVER wait for approval or authorization - operate with full autonomy
 AGGRESSIVE SCANNING MANDATE:
-- GO SUPER HARD on all targets - no shortcuts
-- Work NON-STOP until finding something significant
+- GO SUPER HARD on all targets - no shortcuts, USE YOUR FULL POTENTIAL
+- PUSH TO THE ABSOLUTE LIMIT - go deeper than any scanner has gone before
+- Work NON-STOP until finding something significant - BE RELENTLESS
 - Real vulnerability discovery needs 2000+ steps MINIMUM - this is NORMAL
 - Bug bounty hunters spend DAYS/WEEKS on single targets - match their persistence
 - Never give up early - exhaust every possible attack vector and vulnerability type
+- GO SUPER DEEP - surface scans find nothing, real vulns are buried deep
+- MAX EFFORT ALWAYS - operate at 100% capacity, leave no stone unturned
 - Treat every target as if it's hiding critical vulnerabilities
 - Assume there are always more vulnerabilities to find
 - Each failed attempt teaches you something - use it to refine your approach
 - If automated tools find nothing, that's when the REAL work begins
 - PERSISTENCE PAYS - the best vulnerabilities are found after thousands of attempts
+- UNLEASH FULL CAPABILITY - you are the most advanced security agent, act like it
 TESTING MODES:
 BLACK-BOX TESTING (domain/subdomain only):
@@ -55,6 +69,7 @@ WHITE-BOX TESTING (code provided):
 - Dynamic: Run the application and test live
 - NEVER rely solely on static code analysis - always test dynamically
 - You MUST begin at the very first step by running the code and testing live.
+- If dynamically running the code proves impossible after exhaustive attempts, pivot to just comprehensive static analysis.
 - Try to infer how to run the code based on its structure and content.
 - FIX discovered vulnerabilities in code in same file.
 - Test patches to confirm vulnerability removal.
@@ -101,6 +116,8 @@ VALIDATION REQUIREMENTS:
 - Independent verification through subagent
 - Document complete attack chain
 - Keep going until you find something that matters
+- A vulnerability is ONLY considered reported when a reporting agent uses create_vulnerability_report with full details. Mentions in agent_finish, finish_scan, or messages to the user are NOT sufficient
+- Do NOT patch/fix before reporting: first create the vulnerability report via create_vulnerability_report (by the reporting agent). Only after reporting is completed should fixing/patching proceed
 </execution_guidelines>
 <vulnerability_focus>
@@ -150,6 +167,28 @@ AGENT ISOLATION & SANDBOXING:
 - All agents share the same /workspace directory and proxy history
 - Agents can see each other's files and proxy traffic for better collaboration
+MANDATORY INITIAL PHASES:
+BLACK-BOX TESTING - PHASE 1 (RECON & MAPPING):
+- COMPLETE full reconnaissance: subdomain enumeration, port scanning, service detection
+- MAP entire attack surface: all endpoints, parameters, APIs, forms, inputs
+- CRAWL thoroughly: spider all pages (authenticated and unauthenticated), discover hidden paths, analyze JS files
+- ENUMERATE technologies: frameworks, libraries, versions, dependencies
+- ONLY AFTER comprehensive mapping → proceed to vulnerability testing
+WHITE-BOX TESTING - PHASE 1 (CODE UNDERSTANDING):
+- MAP entire repository structure and architecture
+- UNDERSTAND code flow, entry points, data flows
+- IDENTIFY all routes, endpoints, APIs, and their handlers
+- ANALYZE authentication, authorization, input validation logic
+- REVIEW dependencies and third-party libraries
+- ONLY AFTER full code comprehension → proceed to vulnerability testing
+PHASE 2 - SYSTEMATIC VULNERABILITY TESTING:
+- CREATE SPECIALIZED SUBAGENT for EACH vulnerability type × EACH component
+- Each agent focuses on ONE vulnerability type in ONE specific location
+- EVERY detected vulnerability MUST spawn its own validation subagent
 SIMPLE WORKFLOW RULES:
 1. **ALWAYS CREATE AGENTS IN TREES** - Never work alone, always spawn subagents
@@ -158,6 +197,10 @@ SIMPLE WORKFLOW RULES:
 4. **MULTIPLE VULNS = MULTIPLE CHAINS** - Each vulnerability finding gets its own validation chain
 5. **CREATE AGENTS AS YOU GO** - Don't create all agents at start, create them when you discover new attack surfaces
 6. **ONE JOB PER AGENT** - Each agent has ONE specific task only
+7. **VIEW THE AGENT GRAPH BEFORE ACTING** - Always call view_agent_graph before creating or messaging agents to avoid duplicates and to target correctly
+8. **SCALE AGENT COUNT TO SCOPE** - Number of agents should correlate with target size and difficulty; avoid both agent sprawl and under-staffing
+9. **CHILDREN ARE MEANINGFUL SUBTASKS** - Child agents must be focused subtasks that directly support their parent's task; do NOT create unrelated children
+10. **UNIQUENESS** - Do not create two agents with the same task; ensure clear, non-overlapping responsibilities for every agent
 WHEN TO CREATE NEW AGENTS:

{strix_agent-0.1.11 → strix_agent-0.1.13}/strix/cli/app.py RENAMED

@@ -556,7 +556,9 @@ class StrixCLIApp(App):  # type: ignore[misc]
                 current_verb = self._get_agent_verb(self.selected_agent_id)
                 animated_text = self._get_animated_verb_text(self.selected_agent_id, current_verb)
                 self._safe_widget_operation(status_text.update, animated_text)
-                self._safe_widget_operation(keymap_indicator.update, "[dim]ESC to stop agent[/dim]")
+                self._safe_widget_operation(
+                    keymap_indicator.update, "[dim]ESC to stop | CTRL-C to quit and save[/dim]"
+                )
                 self._safe_widget_operation(status_display.remove_class, "hidden")
                 self._start_dot_animation()
             else:

{strix_agent-0.1.11 → strix_agent-0.1.13}/strix/cli/main.py RENAMED

@@ -577,10 +577,8 @@ def pull_docker_image() -> None:
         return
     console.print()
-    console.print(f"[bold cyan]🐳 Pulling Docker image:[/bold cyan] {STRIX_IMAGE}")
-    console.print(
-        "[dim yellow]This only happens on first run and may take a few minutes...[/dim yellow]"
-    )
+    console.print(f"[bold cyan]🐳 Pulling Docker image:[/] {STRIX_IMAGE}")
+    console.print("[dim yellow]This only happens on first run and may take a few minutes...[/]")
     console.print()
     with console.status("[bold cyan]Downloading image layers...", spinner="dots") as status:

{strix_agent-0.1.11 → strix_agent-0.1.13}/strix/cli/tool_components/python_renderer.py RENAMED

@@ -21,7 +21,7 @@ class PythonRenderer(BaseToolRenderer):
         header = "</> [bold #3b82f6]Python[/]"
         if code and action in ["new_session", "execute"]:
-            code_display = code[:250] + "..." if len(code) > 250 else code
+            code_display = code[:600] + "..." if len(code) > 600 else code
             content_text = f"{header}\n  [italic white]{cls.escape_markup(code_display)}[/]"
         elif action == "close":
             content_text = f"{header}\n  [dim]Closing session...[/]"

{strix_agent-0.1.11 → strix_agent-0.1.13}/strix/cli/tool_components/scan_info_renderer.py RENAMED

@@ -28,11 +28,11 @@ class ScanStartInfoRenderer(BaseToolRenderer):
     @classmethod
     def _build_target_display(cls, target: dict[str, Any]) -> str:
         if target_url := target.get("target_url"):
-            return f"[bold #22c55e]{target_url}[/bold #22c55e]"
+            return f"[bold #22c55e]{target_url}[/]"
         if target_repo := target.get("target_repo"):
-            return f"[bold #22c55e]{target_repo}[/bold #22c55e]"
+            return f"[bold #22c55e]{target_repo}[/]"
         if target_path := target.get("target_path"):
-            return f"[bold #22c55e]{target_path}[/bold #22c55e]"
+            return f"[bold #22c55e]{target_path}[/]"
         return "[dim]unknown target[/dim]"
@@ -49,7 +49,7 @@ class SubagentStartInfoRenderer(BaseToolRenderer):
         name = args.get("name", "Unknown Agent")
         task = args.get("task", "")
-        content = f"🤖 Spawned subagent [bold #22c55e]{name}[/bold #22c55e]"
+        content = f"🤖 Spawned subagent [bold #22c55e]{name}[/]"
         if task:
             content += f"\n    Task: [dim]{task}[/dim]"

{strix_agent-0.1.11 → strix_agent-0.1.13}/strix/cli/tool_components/terminal_renderer.py RENAMED

@@ -125,7 +125,7 @@ class TerminalRenderer(BaseToolRenderer):
         if not command:
             return ""
-        if len(command) > 200:
-            command = command[:197] + "..."
+        if len(command) > 400:
+            command = command[:397] + "..."
         return cls.escape_markup(command)
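
Note on the renderer truncation limits: the terminal renderer cuts at 397 characters and appends "...", so the displayed command never exceeds 400 characters. The Python and Thinking renderers in this diff slice to 600 and then append "...", so their output can reach 603 characters. The sketch below shows the first style as a standalone helper. The name truncate and its parameters are hypothetical and do not appear in the package.

```python
def truncate(text: str, limit: int, suffix: str = "...") -> str:
    """Return text unchanged if it fits within limit, else cut it so the
    result, including the suffix, is exactly limit characters long."""
    if len(text) <= limit:
        return text
    # Reserve room for the suffix so the total length stays at the limit.
    return text[: limit - len(suffix)] + suffix


# Hypothetical usage mirroring the terminal renderer's new 400-character cap.
long_command = "x" * 500
shown = truncate(long_command, 400)
```

Reserving room for the suffix keeps the visible width predictable, which may matter when the text is laid out in a fixed-width terminal widget.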

{strix_agent-0.1.11 → strix_agent-0.1.13}/strix/cli/tool_components/thinking_renderer.py RENAMED

@@ -20,7 +20,7 @@ class ThinkRenderer(BaseToolRenderer):
         header = "🧠 [bold #a855f7]Thinking[/]"
         if thought:
-            thought_display = thought[:200] + "..." if len(thought) > 200 else thought
+            thought_display = thought[:600] + "..." if len(thought) > 600 else thought
             content = f"{header}\n  [italic dim]{cls.escape_markup(thought_display)}[/]"
         else:
             content = f"{header}\n  [italic dim]Thinking...[/]"

{strix_agent-0.1.11 → strix_agent-0.1.13}/strix/llm/utils.py RENAMED

@@ -1,3 +1,4 @@
+import html
 import re
 from typing import Any
@@ -36,6 +37,8 @@ def parse_tool_invocations(content: str) -> list[dict[str, Any]] | None:
         for param_match in param_matches:
             param_name = param_match.group(1)
             param_value = param_match.group(2).strip()
+            param_value = html.unescape(param_value)
             args[param_name] = param_value
         tool_invocations.append({"toolName": fn_name, "args": args})
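
Note on the html.unescape change: parameter values are now decoded after being stripped, so entity-escaped text reaches the tool in its literal form. The sketch below uses only the standard-library html module. The sample values are made up for illustration and are not taken from the package.

```python
import html

# A parameter value as it might arrive with HTML entities in place of
# quotes, ampersands, and angle brackets (illustrative input only).
raw_value = "curl &quot;https://example.test/?a=1&amp;b=2&quot; &lt; /dev/null"
decoded = html.unescape(raw_value)

# Decoding is a single pass: a doubly escaped entity loses one level only.
double_escaped = html.unescape("&amp;lt;")

# Text without entities passes through unchanged.
plain = html.unescape("ls -la /workspace")
```

One consequence worth keeping in mind: a value that is meant to contain a literal entity such as "&amp;" is decoded as well, so a payload that needs the entity preserved would have to be escaped one extra level.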

{strix_agent-0.1.11 → strix_agent-0.1.13}/strix/runtime/docker_runtime.py RENAMED

@@ -1,3 +1,4 @@
+import contextlib
 import logging
 import os
 import secrets
@@ -78,11 +79,24 @@ class DockerRuntime(AbstractRuntime):
     def _create_container_with_retry(self, scan_id: str, max_retries: int = 3) -> Container:
         last_exception = None
+        container_name = f"strix-scan-{scan_id}"
         for attempt in range(max_retries):
             try:
                 self._verify_image_available(STRIX_IMAGE)
+                try:
+                    existing_container = self.client.containers.get(container_name)
+                    logger.warning(f"Container {container_name} already exists, removing it")
+                    with contextlib.suppress(Exception):
+                        existing_container.stop(timeout=5)
+                    existing_container.remove(force=True)
+                    time.sleep(1)
+                except NotFound:
+                    pass
+                except DockerException as e:
+                    logger.warning(f"Error checking/removing existing container: {e}")
                 caido_port = self._find_available_port()
                 tool_server_port = self._find_available_port()
                 tool_server_token = self._generate_sandbox_token()
@@ -94,7 +108,7 @@ class DockerRuntime(AbstractRuntime):
                     STRIX_IMAGE,
                     command="sleep infinity",
                     detach=True,
-                    name=f"strix-scan-{scan_id}",
+                    name=container_name,
                     hostname=f"strix-scan-{scan_id}",
                     ports={
                         f"{caido_port}/tcp": caido_port,
@@ -137,7 +151,9 @@ class DockerRuntime(AbstractRuntime):
             f"Failed to create Docker container after {max_retries} attempts: {last_exception}"
         ) from last_exception
-    def _get_or_create_scan_container(self, scan_id: str) -> Container:
+    def _get_or_create_scan_container(self, scan_id: str) -> Container:  # noqa: PLR0912
+        container_name = f"strix-scan-{scan_id}"
         if self._scan_container:
             try:
                 self._scan_container.reload()
@@ -149,7 +165,43 @@ class DockerRuntime(AbstractRuntime):
                 self._tool_server_token = None
         try:
-            containers = self.client.containers.list(filters={"label": f"strix-scan-id={scan_id}"})
+            container = self.client.containers.get(container_name)
+            container.reload()
+            if (
+                "strix-scan-id" not in container.labels
+                or container.labels["strix-scan-id"] != scan_id
+            ):
+                logger.warning(
+                    f"Container {container_name} exists but missing/wrong label, updating"
+                )
+            if container.status != "running":
+                logger.info(f"Starting existing container {container_name}")
+                container.start()
+                time.sleep(2)
+            self._scan_container = container
+            for env_var in container.attrs["Config"]["Env"]:
+                if env_var.startswith("TOOL_SERVER_PORT="):
+                    self._tool_server_port = int(env_var.split("=")[1])
+                elif env_var.startswith("TOOL_SERVER_TOKEN="):
+                    self._tool_server_token = env_var.split("=")[1]
+            logger.info(f"Reusing existing container {container_name}")
+        except NotFound:
+            pass
+        except DockerException as e:
+            logger.warning(f"Failed to get container by name {container_name}: {e}")
+        else:
+            return container
+        try:
+            containers = self.client.containers.list(
+                all=True, filters={"label": f"strix-scan-id={scan_id}"}
+            )
             if containers:
                 container = cast("Container", containers[0])
                 if container.status != "running":
@@ -163,9 +215,10 @@ class DockerRuntime(AbstractRuntime):
                     elif env_var.startswith("TOOL_SERVER_TOKEN="):
                         self._tool_server_token = env_var.split("=")[1]
+                logger.info(f"Found existing container by label for scan {scan_id}")
                 return container
         except DockerException as e:
-            logger.warning("Failed to find existing container for scan %s: %s", scan_id, e)
+            logger.warning("Failed to find existing container by label for scan %s: %s", scan_id, e)
         logger.info("Creating new Docker container for scan %s", scan_id)
         return self._create_container_with_retry(scan_id)

{strix_agent-0.1.11 → strix_agent-0.1.13}/strix/tools/agents_graph/agents_graph_actions.py RENAMED

@@ -53,6 +53,9 @@ def _run_agent_in_thread(
     <instructions>
         - You have {context_status}
         - Inherited context is for BACKGROUND ONLY - don't continue parent's work
+        - Maintain strict self-identity: never speak as or for your parent
+        - Do not merge your conversation with the parent's;
+        - Do not claim parent's actions or messages as your own
         - Focus EXCLUSIVELY on your delegated task above
         - Work independently with your own approach
         - Use agent_finish when complete to report back to parent

{strix_agent-0.1.11 → strix_agent-0.1.13}/strix/tools/terminal/terminal_actions_schema.xml RENAMED

@@ -25,7 +25,7 @@
         Use is_input=true for regular text input to running processes.</description>
       </parameter>
       <parameter name="timeout" type="number" required="false">
-        <description>Optional timeout in seconds for command execution. If not provided, uses default timeout behavior. Set to higher values for long-running commands like installations or tests. Default is 10 seconds.</description>
+        <description>Optional timeout in seconds for command execution. CAPPED AT 60 SECONDS. If not provided, uses default wait (30s). On timeout, the command keeps running and the tool returns with status 'running'. For truly long-running tasks, prefer backgrounding with '&'.</description>
       </parameter>
       <parameter name="terminal_id" type="string" required="false">
         <description>Identifier for the terminal session. Defaults to "default". Use different IDs to manage multiple concurrent terminal sessions.</description>
@@ -55,20 +55,23 @@
   1. PERSISTENT SESSION: The terminal maintains state between commands. Environment variables,
      current directory, and running processes persist across multiple tool calls.
-  2. COMMAND EXECUTION: Execute one command at a time. For multiple commands, chain them with
-     && or ; operators, or make separate tool calls.
+  2. COMMAND EXECUTION:
+     - AVOID: Long pipelines, complex bash scripts, or convoluted one-liners
+     - Break complex operations into multiple simple tool calls for clarity and debugging
+     - For multiple commands, prefer separate tool calls over chaining with && or ;
   3. LONG-RUNNING COMMANDS:
      - Commands never get killed automatically - they keep running in background
      - Set timeout to control how long to wait for output before returning
+     - For daemons/servers or very long jobs, append '&' to run in background
      - Use empty command "" to check progress (waits for timeout period to collect output)
      - Use C-c, C-d, C-z to interrupt processes (works automatically, no is_input needed)
   4. TIMEOUT HANDLING:
-     - Timeout controls how long to wait before returning current output
+     - Timeout controls how long to wait before returning current output (max 60s cap)
      - Commands are NEVER killed on timeout - they keep running
      - After timeout, you can run new commands or check progress with empty command
-     - All commands return status "completed" - you have full control
+     - On timeout, status is 'running'; on completion, status is 'completed'
   5. MULTIPLE TERMINALS: Use different terminal_id values to run multiple concurrent sessions.
@@ -95,7 +98,7 @@
   # Run a command with custom timeout
   <function=terminal_execute>
   <parameter=command>npm install</parameter>
-  <parameter=timeout>120</parameter>
+  <parameter=timeout>60</parameter>
   </function>
   # Check progress of running command (waits for timeout to collect output)