opensandbox-cli 0.1.0.dev1__py3-none-any.whl

@@ -0,0 +1,112 @@
+ ---
+ name: troubleshoot-sandbox
+ description: Troubleshoot OpenSandbox issues by running diagnostics (logs, inspect, events, summary) via CLI or HTTP API to diagnose sandbox failures like OOM, crash, image pull errors, network problems, etc.
+ user-invocable: true
+ argument-hint: "[sandbox-id]"
+ ---
+
+ # OpenSandbox Troubleshooting
+
+ Troubleshoot sandbox $ARGUMENTS using the OpenSandbox diagnostics.
+
+ There are two ways to interact with the diagnostics API: **CLI** (if the `opensandbox` CLI is installed) or **HTTP** (curl against the server directly). Use whichever is available. The HTTP approach works regardless of how the sandbox was created (SDK, API, CLI).
+
+ ## Workflow
+
+ ### Step 1: Confirm sandbox state
+
+ **CLI:**
+ ```bash
+ opensandbox sandbox get <sandbox-id>
+ ```
+
+ **HTTP:**
+ ```bash
+ curl http://<server-domain>/v1/sandboxes/<sandbox-id>
+ ```
+
+ If the server requires authentication, add `-H "OPEN-SANDBOX-API-KEY: <your-key>"` to all curl commands.
+
+ Check the sandbox status (Running, Pending, Paused, Failed, etc.). If the sandbox is not found, it may have been deleted or expired.
+
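For scripted checks, the same request can be issued from Python. The following is a minimal sketch using only the standard library. The helper names `build_sandbox_request` and `fetch_sandbox_state`, the timeout value, and the assumption that a missing sandbox surfaces as an HTTP error status are illustrative and not part of OpenSandbox; only the path and the `OPEN-SANDBOX-API-KEY` header come from the text above.

```python
from typing import Optional, Tuple
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def build_sandbox_request(
    base_url: str, sandbox_id: str, api_key: Optional[str] = None
) -> Request:
    """Build the GET request for /v1/sandboxes/<sandbox-id> (hypothetical helper)."""
    url = f"{base_url.rstrip('/')}/v1/sandboxes/{sandbox_id}"
    request = Request(url, method="GET")
    if api_key:
        # Only needed when the server enforces authentication.
        request.add_header("OPEN-SANDBOX-API-KEY", api_key)
    return request


def fetch_sandbox_state(
    base_url: str,
    sandbox_id: str,
    api_key: Optional[str] = None,
    timeout: float = 10.0,
) -> Tuple[int, str]:
    """Return (HTTP status, response body) without raising on 4xx/5xx."""
    request = build_sandbox_request(base_url, sandbox_id, api_key)
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except HTTPError as err:
        # Assumption: a deleted or expired sandbox is reported as an error status.
        return err.code, err.read().decode("utf-8", errors="replace")
```

Calling `fetch_sandbox_state("http://<server-domain>", "<sandbox-id>")` returns the raw body, so the status field still has to be read from whatever format the server responds with.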
+ ### Step 2: Get diagnostics summary (recommended first action)
+
+ **CLI:**
+ ```bash
+ opensandbox devops summary <sandbox-id>
+ ```
+
+ **HTTP:**
+ ```bash
+ curl http://<server-domain>/v1/sandboxes/<sandbox-id>/diagnostics/summary
+ ```
+
+ This returns a combined plain-text view of:
+ - **Inspect**: container/pod details (status, resources, network, labels)
+ - **Events**: state transitions, OOM kills, errors
+ - **Logs**: recent container output
+
+ Read the output carefully and look for common failure patterns listed below.
+
+ ### Step 3: Drill down if needed
+
+ If the summary is not enough, use individual endpoints for more detail:
+
+ **CLI:**
+ ```bash
+ opensandbox devops logs <sandbox-id> --tail 500
+ opensandbox devops logs <sandbox-id> --since 30m
+ opensandbox devops inspect <sandbox-id>
+ opensandbox devops events <sandbox-id> --limit 100
+ ```
+
+ **HTTP:**
+ ```bash
+ # Get more log lines
+ curl "http://<server-domain>/v1/sandboxes/<sandbox-id>/diagnostics/logs?tail=500"
+
+ # Get logs from recent time window
+ curl "http://<server-domain>/v1/sandboxes/<sandbox-id>/diagnostics/logs?since=30m"
+
+ # Detailed container/pod inspection
+ curl "http://<server-domain>/v1/sandboxes/<sandbox-id>/diagnostics/inspect"
+
+ # More events
+ curl "http://<server-domain>/v1/sandboxes/<sandbox-id>/diagnostics/events?limit=100"
+ ```
+
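The curl URLs in Steps 2 and 3 all follow one pattern, which can be sketched as a small URL builder. This is an illustration only: `diagnostics_url` is a hypothetical helper name, and the set of endpoint kinds is taken from the commands shown above rather than from any client library.

```python
from urllib.parse import quote, urlencode

# Endpoint kinds that appear in the curl examples above.
DIAGNOSTIC_KINDS = {"summary", "logs", "inspect", "events"}


def diagnostics_url(base_url: str, sandbox_id: str, kind: str, **params: object) -> str:
    """Build a diagnostics URL such as .../diagnostics/logs?tail=500."""
    if kind not in DIAGNOSTIC_KINDS:
        raise ValueError(f"Unknown diagnostics endpoint: {kind!r}")
    path = f"/v1/sandboxes/{quote(sandbox_id, safe='')}/diagnostics/{kind}"
    # Drop parameters left as None so the server-side defaults apply.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = base_url.rstrip("/") + path
    return f"{url}?{query}" if query else url


if __name__ == "__main__":
    print(diagnostics_url("http://localhost:8080", "sbx-123", "logs", tail=500))
    print(diagnostics_url("http://localhost:8080", "sbx-123", "events", limit=100))
```

Passing `since="30m"` instead of `tail=500` produces the time-window variant of the logs request.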
+ ### Step 4: Diagnose common problems
+
+ | Symptom | What to check | Likely cause |
+ |---------|---------------|--------------|
+ | Status=Pending, no IP | inspect - look for Waiting containers | Image pull failure, insufficient resources, node scheduling |
+ | OOMKilled=true | inspect - check memory limits | Container exceeded its memory limit; increase the memory resource |
+ | Exit Code 137 | events + logs | OOM kill or external SIGKILL |
+ | Exit Code 1 | logs - check application output | Application error; check entrypoint and env vars |
+ | Exit Code 126/127 | logs | Entrypoint command not found or not executable |
+ | Connection refused to sandbox | inspect - check ports and network | Service not started inside sandbox, wrong port, network policy blocking |
+ | Sandbox stuck in Running but unresponsive | logs (tail=200) | Application hung; check for deadlocks or resource exhaustion |
+ | execd health check failing | logs - look for execd errors | execd daemon crashed or port conflict |
+ | ImagePullBackOff (K8s) | events | Wrong image name, missing registry credentials |
+ | CrashLoopBackOff (K8s) | events + logs | Application keeps crashing; check exit code and logs |
+
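Part of the table above can be applied mechanically to the plain-text diagnostics output. The sketch below assumes that tokens such as `OOMKilled=true`, `Exit Code 137`, `ImagePullBackOff` and `CrashLoopBackOff` appear in the output roughly as written in the table; the actual output format is not specified here, so the patterns and the `match_failure_patterns` helper are illustrative and may need adjusting.

```python
import re
from typing import List

# (pattern, likely cause) pairs derived from the symptom table.
# The regular expressions are guesses at how each symptom is rendered in the output.
FAILURE_PATTERNS = [
    (re.compile(r"OOMKilled\s*[=:]\s*true", re.IGNORECASE),
     "Container exceeded its memory limit"),
    (re.compile(r"exit\s*code\W*137\b", re.IGNORECASE),
     "OOM kill or external SIGKILL"),
    (re.compile(r"exit\s*code\W*12[67]\b", re.IGNORECASE),
     "Entrypoint command not found or not executable"),
    (re.compile(r"exit\s*code\W*1\b", re.IGNORECASE),
     "Application error"),
    (re.compile(r"ImagePullBackOff"),
     "Wrong image name or missing registry credentials"),
    (re.compile(r"CrashLoopBackOff"),
     "Application keeps crashing"),
]


def match_failure_patterns(diagnostics_text: str) -> List[str]:
    """Return the likely causes whose pattern occurs in the diagnostics text."""
    return [
        cause
        for pattern, cause in FAILURE_PATTERNS
        if pattern.search(diagnostics_text)
    ]


if __name__ == "__main__":
    sample = "State: Terminated\nOOMKilled=true\nExit Code 137\n"
    for cause in match_failure_patterns(sample):
        print(cause)
```

A match only narrows the search; symptoms such as connection refusals or a hung application still require reading the inspect output and logs by hand.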
+ ### Step 5: Suggest resolution
+
+ Based on the diagnosis, suggest one of:
+ - **Image issue**: Verify image name, check registry access
+ - **OOM**: Increase memory limit in sandbox creation (e.g. `memory=4Gi`)
+ - **Application error**: Fix the entrypoint or application code
+ - **Network**: Check network policy, verify port configuration
+ - **Scheduling (K8s)**: Check node resources, check pool availability
+ - **execd**: Update execd image version, check port conflicts
+
+ ## API Reference
+
+ All diagnostics endpoints return `text/plain` and are available at:
+
+ | Endpoint | Query Params | Description |
+ |----------|-------------|-------------|
+ | `GET /v1/sandboxes/{id}/diagnostics/summary` | `tail` (default 50), `event_limit` (default 20) | Combined inspect + events + logs |
+ | `GET /v1/sandboxes/{id}/diagnostics/logs` | `tail` (default 100), `since` (e.g. 10m, 1h) | Container/pod logs |
+ | `GET /v1/sandboxes/{id}/diagnostics/inspect` | - | Container/pod detailed state |
+ | `GET /v1/sandboxes/{id}/diagnostics/events` | `limit` (default 50) | Container/pod events |
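The query parameters and defaults in the table can be captured as data, which makes it easy to see what a request resolves to when a parameter is omitted. This is a sketch under stated assumptions: `DIAGNOSTIC_DEFAULTS` and `effective_params` are hypothetical names, `since` is modelled as having no default because the table does not list one, and rejecting unknown parameters is a choice made here, not documented server behavior.

```python
from typing import Dict, Optional

# Defaults transcribed from the API reference table; None means "no documented default".
DIAGNOSTIC_DEFAULTS: Dict[str, Dict[str, Optional[object]]] = {
    "summary": {"tail": 50, "event_limit": 20},
    "logs": {"tail": 100, "since": None},
    "inspect": {},
    "events": {"limit": 50},
}


def effective_params(endpoint: str, **overrides: object) -> Dict[str, object]:
    """Merge caller overrides with the documented defaults for one endpoint."""
    if endpoint not in DIAGNOSTIC_DEFAULTS:
        raise ValueError(f"Unknown diagnostics endpoint: {endpoint!r}")
    defaults = DIAGNOSTIC_DEFAULTS[endpoint]
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise ValueError(f"{endpoint} does not accept: {sorted(unknown)}")
    merged = {**defaults, **overrides}
    # Parameters without a value are omitted entirely.
    return {name: value for name, value in merged.items() if value is not None}


if __name__ == "__main__":
    print(effective_params("summary"))
    print(effective_params("logs", since="10m"))
```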
@@ -0,0 +1,145 @@
+ # Copyright 2026 Alibaba Group Holding Ltd.
+ #
+ # Licensed under the Apache License, Version 2.0 (the "License");
+ # you may not use this file except in compliance with the License.
+ # You may obtain a copy of the License at
+ #
+ #     http://www.apache.org/licenses/LICENSE-2.0
+ #
+ # Unless required by applicable law or agreed to in writing, software
+ # distributed under the License is distributed on an "AS IS" BASIS,
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ # See the License for the specific language governing permissions and
+ # limitations under the License.
+
+ """Shared CLI utilities: duration parsing, error handling, key-value parsing."""
+
+ from __future__ import annotations
+
+ import functools
+ import re
+ import sys
+ from datetime import timedelta
+
+ import click
+
+ # ---------------------------------------------------------------------------
+ # Duration parsing (e.g. "10m", "1h30m", "90s", "2h")
+ # ---------------------------------------------------------------------------
+
+ _DURATION_RE = re.compile(
+     r"^(?:(?P<hours>\d+)h)?(?:(?P<minutes>\d+)m)?(?:(?P<seconds>\d+)s)?$"
+ )
+
+
+ def parse_duration(value: str) -> timedelta:
+     """Parse a human-friendly duration string into a ``timedelta``.
+
+     Supported formats: ``10m``, ``1h30m``, ``90s``, ``2h``, ``1h30m45s``.
+     A plain integer is treated as seconds.
+     """
+     value = value.strip()
+     if not value:
+         raise click.BadParameter("Duration cannot be empty")
+
+     # Plain integer → seconds (isdecimal() only accepts digits that int() can parse)
+     if value.isdecimal():
+         return timedelta(seconds=int(value))
+
+     m = _DURATION_RE.match(value)
+     if not m or not m.group(0):
+         raise click.BadParameter(
+             f"Invalid duration '{value}'. Use format like 10m, 1h30m, 90s."
+         )
+
+     hours = int(m.group("hours") or 0)
+     minutes = int(m.group("minutes") or 0)
+     seconds = int(m.group("seconds") or 0)
+     return timedelta(hours=hours, minutes=minutes, seconds=seconds)
+
+
+ class DurationType(click.ParamType):
+     """Click parameter type for duration strings."""
+
+     name = "duration"
+
+     def convert(
+         self, value: str, param: click.Parameter | None, ctx: click.Context | None
+     ) -> timedelta:
+         if isinstance(value, timedelta):
+             return value
+         try:
+             return parse_duration(value)
+         except click.BadParameter:
+             self.fail(
+                 f"Invalid duration '{value}'. Use format like 10m, 1h30m, 90s.",
+                 param,
+                 ctx,
+             )
+
+
+ DURATION = DurationType()
+
+
+ # ---------------------------------------------------------------------------
+ # Key=Value parsing (e.g. --env FOO=bar)
+ # ---------------------------------------------------------------------------
+
+
+ class KeyValueType(click.ParamType):
+     """Click parameter type that parses ``KEY=VALUE`` strings into a tuple."""
+
+     name = "KEY=VALUE"
+
+     def convert(
+         self, value: str, param: click.Parameter | None, ctx: click.Context | None
+     ) -> tuple[str, str]:
+         if isinstance(value, tuple):
+             return value
+         if "=" not in value:
+             self.fail(f"Expected KEY=VALUE format, got '{value}'", param, ctx)
+         key, _, val = value.partition("=")
+         return (key, val)
+
+
+ KEY_VALUE = KeyValueType()
+
+
+ # ---------------------------------------------------------------------------
+ # Error handling decorator
+ # ---------------------------------------------------------------------------
+
+
+ def handle_errors(fn):  # type: ignore[no-untyped-def]
+     """Decorator that catches SDK / HTTP exceptions and prints a friendly message."""
+
+     @functools.wraps(fn)
+     def wrapper(*args, **kwargs):  # type: ignore[no-untyped-def]
+         try:
+             return fn(*args, **kwargs)
+         except (click.exceptions.Exit, click.exceptions.Abort):  # let Click handle exit/abort
+             raise
+         except click.ClickException:
+             raise
+         except Exception as exc:
+             # Import here to avoid circular imports at module level
+             from opensandbox.exceptions import SandboxException
+
+             # Try to get the OutputFormatter from the Click context
+             ctx = click.get_current_context(silent=True)
+             obj = getattr(ctx, "obj", None) if ctx else None
+             output = getattr(obj, "output", None) if obj else None
+
+             if output and hasattr(output, "error_panel"):
+                 if isinstance(exc, SandboxException):
+                     output.error_panel(str(exc), title="Sandbox Error")
+                 else:
+                     output.error_panel(
+                         f"{str(exc)}\n\n[dim]Type: {type(exc).__qualname__}[/]",
+                         title=type(exc).__name__,
+                     )
+             else:
+                 click.secho(f"Error: {exc}", fg="red", err=True)
+             sys.exit(1)
+
+     return wrapper
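
The parsing rules in this module are easiest to see with concrete inputs. The following is a stdlib-only sketch that closely follows the duration and `KEY=VALUE` logic above but raises `ValueError` instead of `click.BadParameter`, so it runs without Click installed. The names `parse_duration_sketch` and `split_key_value` are illustrative and are not part of the package.

```python
import re
from datetime import timedelta
from typing import Tuple

# Same shape as _DURATION_RE: optional hours, minutes, seconds, in that fixed order.
DURATION_RE = re.compile(
    r"^(?:(?P<hours>\d+)h)?(?:(?P<minutes>\d+)m)?(?:(?P<seconds>\d+)s)?$"
)


def parse_duration_sketch(value: str) -> timedelta:
    """Stdlib-only restatement of the duration parsing rules."""
    value = value.strip()
    if not value:
        raise ValueError("Duration cannot be empty")
    if value.isdecimal():
        # A bare integer is interpreted as seconds.
        return timedelta(seconds=int(value))
    match = DURATION_RE.match(value)
    if not match:
        raise ValueError(f"Invalid duration {value!r}")
    # Groups that did not participate in the match are None and count as 0.
    parts = {name: int(number or 0) for name, number in match.groupdict().items()}
    return timedelta(**parts)


def split_key_value(item: str) -> Tuple[str, str]:
    """Split on the first '=' only, so values may themselves contain '='."""
    if "=" not in item:
        raise ValueError(f"Expected KEY=VALUE format, got {item!r}")
    key, _, val = item.partition("=")
    return key, val


if __name__ == "__main__":
    print(parse_duration_sketch("1h30m"))      # 1:30:00
    print(parse_duration_sketch("90"))         # 0:01:30
    print(split_key_value("FOO=bar=baz"))      # ('FOO', 'bar=baz')
```

Because the pattern fixes the unit order, an input such as `30m1h` is rejected rather than reordered, and fractional values such as `1.5h` do not match.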