@stephen-lord/other2 1.0.8 → 1.0.10
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/dist/docs/manus/CN-/346/211/222/345/256/214/345/205/250/347/275/221/346/234/200/345/274/272-AI-/345/233/242/351/230/237/347/232/204-Context-Engineering-/346/224/273/347/225/245/346/210/221/344/273/254/346/200/273/347/273/223/345/207/272/344/272/206/350/277/231-5-/345/244/247/346/226/271/346/263/225-/346/231/272/346/272/220/347/244/276/345/214/272.md +2464 -0
- package/dist/docs/manus/Context-Engineering-for-AI-Agents-Lessons-from-Building-Manus.md +212 -0
- package/dist/docs/manus/Context-Engineering-for-AI-Agents-Part-2.md +96 -0
- package/dist/docs/manus/Industry.md +94 -0
- package/dist/docs/manus/Observability-for-Manus-15-Agents-Logs-Retries-and-Error-Budgets.md +346 -0
- package/dist/docs/manus/OpenManus-Technical-Analysis-Architecture-and-Implementation-of-an-Open-Source-A.md +324 -0
- package/dist/docs/manus/README.md +85 -0
- package/dist/docs/manus/Tech-Constrained-Decoding-Agent-Reliability.md +81 -0
- package/dist/docs/manus/Tech-How-to-build-function-calling-and-JSON-mode.md +43 -0
- package/dist/docs/manus/Tech-Understanding-Logit-Bias-in-LLMs-Medium.md +1354 -0
- package/dist/docs/manus/The-Performance-Reality-KV-Cache-as-the-North-Star.md +155 -0
- package/dist/docs/manus/Why-Context-Engineering.md +125 -0
- package/dist/docs/manus/article_1_raw.md +1 -0
- package/dist/docs/manus/split_articles.py +52 -0
- package/dist/docs/manus//346/235/245/350/207/252-Manus-/347/232/204/344/270/200/346/211/213/345/210/206/344/272/253/345/246/202/344/275/225/346/236/204/345/273/272-AI-Agent-/347/232/204/344/270/212/344/270/213/346/226/207/345/267/245/347/250/213-/346/231/272/346/272/220/347/244/276/345/214/272.md +2180 -0
- package/dist/ui-ux-pro-max/SKILL.md +386 -0
- package/dist/ui-ux-pro-max/data/charts.csv +26 -0
- package/dist/ui-ux-pro-max/data/colors.csv +97 -0
- package/dist/ui-ux-pro-max/data/icons.csv +101 -0
- package/dist/ui-ux-pro-max/data/landing.csv +31 -0
- package/dist/ui-ux-pro-max/data/products.csv +97 -0
- package/dist/ui-ux-pro-max/data/prompts.csv +24 -0
- package/dist/ui-ux-pro-max/data/react-performance.csv +45 -0
- package/dist/ui-ux-pro-max/data/stacks/flutter.csv +53 -0
- package/dist/ui-ux-pro-max/data/stacks/html-tailwind.csv +56 -0
- package/dist/ui-ux-pro-max/data/stacks/jetpack-compose.csv +53 -0
- package/dist/ui-ux-pro-max/data/stacks/nextjs.csv +53 -0
- package/dist/ui-ux-pro-max/data/stacks/nuxt-ui.csv +51 -0
- package/dist/ui-ux-pro-max/data/stacks/nuxtjs.csv +59 -0
- package/dist/ui-ux-pro-max/data/stacks/react-native.csv +52 -0
- package/dist/ui-ux-pro-max/data/stacks/react.csv +54 -0
- package/dist/ui-ux-pro-max/data/stacks/shadcn.csv +61 -0
- package/dist/ui-ux-pro-max/data/stacks/svelte.csv +54 -0
- package/dist/ui-ux-pro-max/data/stacks/swiftui.csv +51 -0
- package/dist/ui-ux-pro-max/data/stacks/vue.csv +50 -0
- package/dist/ui-ux-pro-max/data/styles.csv +59 -0
- package/dist/ui-ux-pro-max/data/typography.csv +58 -0
- package/dist/ui-ux-pro-max/data/ui-reasoning.csv +101 -0
- package/dist/ui-ux-pro-max/data/ux-guidelines.csv +100 -0
- package/dist/ui-ux-pro-max/data/web-interface.csv +31 -0
- package/dist/ui-ux-pro-max/scripts/__pycache__/core.cpython-310.pyc +0 -0
- package/dist/ui-ux-pro-max/scripts/__pycache__/core.cpython-312.pyc +0 -0
- package/dist/ui-ux-pro-max/scripts/__pycache__/design_system.cpython-312.pyc +0 -0
- package/dist/ui-ux-pro-max/scripts/core.py +258 -0
- package/dist/ui-ux-pro-max/scripts/design_system.py +1066 -0
- package/dist/ui-ux-pro-max/scripts/search.py +106 -0
- package/package.json +6 -6
@@ -0,0 +1,346 @@

# Observability for Manus 1.5 Agents: Logs, Retries, and Error Budgets

**Source:** https://skywork.ai/blog/ai-agent/observability-manus-1-5-agents-best-practices/

---

Author: andywang

If you run Manus 1.5–class agent systems in production, you’ve felt the pain: asynchronous chains of tools, variable model latencies, and failure modes that don’t look like traditional microservices. Public docs for Manus 1.5 emphasize speed, precision, and an “unlimited context” multi‑agent design, but detailed observability internals are not published; plan on platform‑agnostic patterns and open standards (see the 2025 overview in the Manus 1.5 introduction).

In practice, three levers determine whether your agent fleet is stable or chaotic: structured logs, safe retries (with circuit breakers), and disciplined error budgets. Below is what has worked repeatedly in multi‑agent, multi‑modal deployments.
---

## 1) Pillar: Structured Logs That Agents Can Live On

Logs are your ground truth when traces get sampled away or a provider times out. The mistake I see most often is unstructured strings with missing correlation. For agentic systems, treat logs as queryable data.

What to implement first:

- Emit JSON logs with a consistent schema
- Include correlation fields for traces and spans
- Capture agent/step/tool context and outcome
- Apply privacy controls (PII/token redaction)
- Set retention and sampling policies per environment

A pragmatic field set (ECS/OTel aligned):

```
{
  "@timestamp": "2025-10-17T04:07:59Z",
  "log.level": "info",
  "service.name": "agent-orchestrator",
  "env": "prod",
  "trace.id": "4bf92f3577b34da6a3ce929d0e0e4736",
  "span.id": "00f067aa0ba902b7",
  "session.id": "u-9f12a",
  "agent.name": "planner",
  "step": "tool_call",
  "tool.name": "web.search",
  "message": "tool_call success",
  "ai.model": "gpt-4o",
  "ai.input_tokens": 120,
  "ai.output_tokens": 300,
  "latency_ms": 842,
  "http.status_code": 200
}
```
Why these fields:

- trace.id/span.id: lets you pivot between logs and traces seamlessly.
- agent.name/step/tool.name: reconstruct agent plans and loops.
- ai.* and latency_ms: correlate cost/latency with quality and retries.
- session.id: privacy‑aware user/session grouping for blast‑radius analysis.

Elastic’s guidance on structured logging and the Elastic Common Schema has remained solid through 2025; it’s a helpful reference for field naming and correlation design, as explained in the Elastic logging best practices (ECS).

A log‑levels policy that actually reduces MTTR:

- debug: local/dev only; redact prompts/responses by default
- info: major agent steps (plan, tool_call, evaluate) and state transitions
- warn: transient provider errors (429, 5xx) with retry intent
- error: final failures after retries; include error.class and the last http.status_code

Tie logs to traces from day one. If you’re new to OTel in LLM systems, this primer shows how to correlate logs/traces and where to add guardrails: LLM observability best practices with OpenTelemetry.
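Emitting log lines with that field set takes a few lines of stdlib Python. The helper below is a minimal sketch (the function name and `service.name` value are illustrative, not part of any Manus SDK); field names mirror the ECS/OTel-aligned sample above.

```python
import json
import sys
import time

def log_event(level, message, **fields):
    """Build and emit one JSON log line using ECS/OTel-style field names."""
    record = {
        "@timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "log.level": level,
        "service.name": "agent-orchestrator",
        **fields,
        "message": message,
    }
    line = json.dumps(record)
    sys.stdout.write(line + "\n")  # one JSON object per line, shippable as-is
    return line

# Pass the active trace/span IDs on every call so log lines pivot to traces.
line = log_event(
    "info",
    "tool_call success",
    **{
        "trace.id": "4bf92f3577b34da6a3ce929d0e0e4736",
        "span.id": "00f067aa0ba902b7",
        "agent.name": "planner",
        "tool.name": "web.search",
        "latency_ms": 842,
    },
)
```

Because each line is a self-contained JSON object, any backend that ingests NDJSON (ELK, OpenSearch, an OTel collector filelog receiver) can index it without parsing rules.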
---

## 2) Pillar: Retries With Guardrails (and Circuit Breakers)

Agents should be resilient to transient faults (429s, timeouts). But indiscriminate retries create retry storms, blow error budgets, and can domino into provider bans. The pattern that consistently works: exponential backoff with jitter, per‑attempt timeouts, capped total attempts, idempotency for mutating actions, and a circuit breaker.

Key parameters to start with (tune from prod telemetry):

- initial delay: 250–750 ms; backoff factor ×2; full jitter
- per‑attempt timeout: 5–10 s (match provider SLA)
- max attempts: 3–5 for read operations; fewer for writes unless idempotent
- only retry on: 429/5xx/timeouts; respect Retry‑After headers
- circuit breaker: trip after N consecutive failures or error rate ≥ threshold; reset after cool‑down

Authoritative cloud guidance lines up with this shape. The AWS Well‑Architected and prescriptive patterns advocate exponential backoff with jitter, idempotency, and circuit breakers; use them as your baseline policy, as summarized in the AWS Well‑Architected Framework guidance.
Python (Tenacity) example:

```
from tenacity import (
    retry, stop_after_attempt, wait_exponential_jitter,
    retry_if_exception_type,
)
import requests

class TransientError(Exception):
    pass

@retry(
    reraise=True,
    stop=stop_after_attempt(4),
    wait=wait_exponential_jitter(initial=0.5, max=30),
    # Retry transient provider errors and request timeouts; nothing else
    retry=retry_if_exception_type((TransientError, requests.Timeout)),
)
def call_provider(payload, timeout=10):
    resp = requests.post("https://api.provider/llm", json=payload, timeout=timeout)
    if resp.status_code in (429, 500, 502, 503, 504):
        # Optionally parse Retry-After and sleep accordingly
        raise TransientError(f"Provider error {resp.status_code}")
    resp.raise_for_status()
    return resp.json()
```
Node.js circuit breaker (Opossum) around a tool call:

```
const CircuitBreaker = require('opossum');

async function toolCall(input) {
  // your HTTP call or SDK invocation
}

const breaker = new CircuitBreaker(toolCall, {
  timeout: 3000,                 // fail fast if too slow
  errorThresholdPercentage: 50,
  volumeThreshold: 10,           // minimum requests before stats apply
  resetTimeout: 30000            // half-open after cool-down
});

breaker.on('open', () => console.warn('circuit_open'));  // log + alert hook
breaker.on('halfOpen', () => console.info('circuit_half_open'));
breaker.on('close', () => console.info('circuit_closed'));
```
If you use OpenAI or Azure OpenAI, implement rate‑limit‑aware retries and parallelism throttling per their official guidance. The OpenAI Cookbook shows working patterns to catch 429s, honor Retry‑After, and throttle workers; see the 2025 update in the OpenAI Cookbook on rate limits.
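The Retry‑After handling can be folded directly into the delay calculation. A minimal sketch (function name and defaults are illustrative, not from any provider SDK): honor the header when present, otherwise fall back to exponential backoff with full jitter.

```python
import random

def backoff_delay(attempt, retry_after=None, base=0.5, cap=30.0):
    """Seconds to sleep before retry `attempt` (0-based).

    If the provider sent a Retry-After header, trust it; otherwise use
    exponential backoff with full jitter, capped at `cap` seconds.
    """
    if retry_after is not None:
        return float(retry_after)  # the provider knows its own limits best
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))
```

Call it inside your retry loop and sleep for the returned duration; full jitter (uniform over the whole window) spreads concurrent retriers better than fixed steps.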
---

## 3) Pillar: Error Budgets That Govern Velocity

Error budgets translate reliability targets into day‑to‑day decisions: ship features vs. fix reliability. In 2025, the canonical implementation uses SLO‑based burn‑rate alerts across multiple windows to detect both fast burn and slow leaks without spam.

A quick refresher (SRE canon):

- SLI: the quantitative measure (e.g., success rate of agent tasks)
- SLO: the target (e.g., 99.9% success over 30 days)
- Error budget: allowable failure in the period: (100% − SLO%) × period

Google’s SRE Workbook provides the standard multi‑window, multi‑burn‑rate approach. A common 99.9%/30‑day policy uses urgent 14.4× over 1h+5m, high 6× over 6h+30m, etc., as detailed in the Google SRE Workbook “Alerting on SLOs”.
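The arithmetic behind those thresholds is worth internalizing; a quick sketch (helper names are mine, the formulas are the standard SRE ones):

```python
def error_budget_minutes(slo_pct, period_days=30):
    """Allowable failure time in the period: (100% - SLO%) x period."""
    return (1 - slo_pct / 100) * period_days * 24 * 60

def burn_rate(observed_error_rate, slo_pct):
    """How many times faster than 'exactly on budget' you are failing.

    At burn rate 14.4 on a 30-day window, one hour consumes
    14.4 / 720 = 2% of the whole budget -- which is why it pages.
    """
    allowed_error_rate = 1 - slo_pct / 100
    return observed_error_rate / allowed_error_rate
```

For a 99.9% SLO over 30 days this gives roughly 43.2 minutes of budget; a sustained 1.44% error rate burns it at 14.4×.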
Prometheus/Grafana‑style pseudo‑config:

```
slo:
  name: agent_task_success
  objective: 99.9
  window: 30d
  sli:
    # ratio of good events (status="ok") over all
    numerator: sum(rate(agent_tasks_total{status="ok"}[5m]))
    denominator: sum(rate(agent_tasks_total[5m]))
alerts:
  - name: SLOBudgetBurnUrgent
    long_window: 1h
    short_window: 5m
    burn_rate: 14.4
    action: "page on-call; auto-throttle retries by 50%"
  - name: SLOBudgetBurnHigh
    long_window: 6h
    short_window: 30m
    burn_rate: 6
    action: "block non-critical deploys; route traffic to stable agents"
```
Grafana has a practical 2025 walkthrough of multi‑window alerts that pairs well with the Workbook’s math; if you operate in Grafana Cloud, start with their implementation guidance in the Grafana multi‑window burn‑rate alert guide (2025).

Policy moves that keep teams honest:

- When a high‑severity burn alert fires, pause risky prompt/model experiments automatically
- Auto‑reduce retry concurrency by 25–50% while in burn (prevents storming)
- Freeze non‑critical deploys when the remaining budget < 25% of the period
- Publish daily budget remaining on an always‑on dashboard

---
## 4) End‑to‑End Workflow Example (Manus‑like Orchestration)

Scenario: A planner agent decomposes a user goal; an executor agent calls tools (search, retrieval, calendar); a reviewer agent validates outputs. Telemetry and governance wire‑up:

1. Tracing: Create spans for llm.prompt, llm.tool_call, rag.retrieve, moderation.check. Propagate context across async steps so trace.id follows the task.
2. Logging: Emit JSON logs at plan/execution/review steps with trace.id/span.id, agent.name, tool.name, latency_ms, ai.* tokens. Redact sensitive fields.
3. Retries: Use backoff+jitter on tool and provider calls; cap attempts; only retry retryable classes.
4. Circuit breaker: Wrap flaky tools/providers to fail fast and provide a “degraded mode” path.
5. Error budgets: SLI tracks task success; alerts use 14.4×/6× patterns; on burn, throttle retries and freeze non‑critical changes.

In one recent rollout, we paired OTel traces with structured logs and budget‑driven deploy gates. For teams that prefer an integrated research‑to‑docs workflow, Skywork AI can serve as the documentation and research hub alongside your observability stack. Disclosure: Skywork AI is our product.
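Context propagation across async boundaries (step 1) is the piece teams most often get wrong. In Python, `contextvars` — the mechanism OpenTelemetry's own context API builds on — already flows into `asyncio` tasks automatically. The sketch below is a toy stand-in for OTel context propagation, not the OTel SDK itself:

```python
import contextvars
import uuid

# ContextVars are inherited by asyncio tasks spawned within the context,
# so every downstream step of a task sees the same trace.id.
trace_id_var = contextvars.ContextVar("trace_id", default=None)

def start_task():
    """Mint one trace.id at task entry; downstream steps inherit it."""
    tid = uuid.uuid4().hex
    trace_id_var.set(tid)
    return tid

def current_trace_id():
    """Read the inherited trace.id anywhere in the task, e.g. for log lines."""
    return trace_id_var.get()
```

In a real deployment you would let the OTel SDK manage this and read the active span's trace ID instead, but the inheritance behavior is the same.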
For span conventions specific to AI agents, the OpenTelemetry community’s 2025 update is a good north star as the semantic conventions evolve; see the OpenTelemetry blog on AI agent observability (2025).

---
## 5) Tooling That Works in 2025

Vendor‑neutral backends:

- ELK/OpenSearch: flexible JSON search, ECS alignment; great for log forensics
- Prometheus/Grafana: SLIs, burn‑rate alerts, cost‑effective metrics
- OpenTelemetry: standardize spans/metrics/logs; collector for routing and sampling control

AI agent observability platforms:

- Datadog: chain/agent tracing, token/cost metrics, dashboards, and security scanning. The 2024–2025 documentation covers LLM chain tracing and agent views; start with the Datadog LLM Observability docs.
- Langtrace/Raindrop: developer‑friendly tracing/monitoring for LLM apps; if you’re integrating LangChain/TypeScript, this deep dive helps you wire spans, attributes, and sampling: Langtrace developer’s guide (TypeScript).

Internal architecture context:

- If you need a refresher on multi‑agent patterns (planners, executors, reviewers) and how they differ from chatbots, this overview frames design decisions: Agentic AI explained vs. chatbots.

---
## 6) Incident Playbooks That Actually Help

Retry storms (provider 429 or partial outage)

Immediate actions:

- halve retry concurrency; increase backoff caps
- open the circuit on the offending provider/tool; route to a fallback or degraded mode
- reduce context sizes to lower per‑call latency and token cost

Agent loops or tool thrash

Immediate actions:

- set step budget caps; abort when exceeded
- log and alert on repeated identical tool calls within a trace
- introduce reviewer/critic checkpoints with stricter exit criteria
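Detecting repeated identical tool calls within a trace is cheap to implement once logs carry tool.name and canonicalized arguments. A hedged sketch (the helper name and threshold are illustrative):

```python
from collections import Counter

def detect_tool_thrash(calls, threshold=3):
    """Return tool calls repeated >= threshold times within one trace.

    `calls` is a list of (tool_name, canonicalized_args) tuples collected
    from the trace's tool_call log lines.
    """
    counts = Counter(calls)
    return [call for call, n in counts.items() if n >= threshold]
```

Run it per trace.id in a streaming job or alert rule; anything it returns is a candidate loop worth aborting against the step budget cap.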
Budget exhaustion near period end

Immediate actions:

- freeze non‑critical deploys; reduce experiment traffic
- enable a conservative retry policy (lower attempts, higher backoff)

---
## 7) Rollout Checklist (Copy/Paste)

Logging

- [ ] JSON logs with ECS‑style fields; include trace.id/span.id
- [ ] PII/prompt redaction policies; masked by default
- [ ] Log‑levels policy enforced via linter/CI

Tracing

- [ ] OTel spans for prompt/tool/retrieval/moderation; context propagation across async boundaries
- [ ] Tail‑based sampling for slow/expensive outliers; head sampling elsewhere

Retries/Circuit breakers

- [ ] Exponential backoff with jitter; cap attempts and per‑attempt timeouts
- [ ] Only retry on retryable classes (429/5xx/timeouts); honor Retry‑After
- [ ] Circuit breakers around flaky tools/providers with degraded modes

Error budgets

- [ ] SLIs defined; SLOs agreed; dashboards live
- [ ] Multi‑window burn‑rate alerts (14.4× urgent; 6× high) wired to actions
- [ ] Policy: throttle retries/freeze deploys during burn; publish daily budget remaining

Governance

- [ ] Canary releases and prompt/model change tracking
- [ ] Per‑tenant quotas and adaptive concurrency
- [ ] On‑call runbooks for storms, loops, and budget exhaustion

---
## 8) Trade‑offs and Limits

- Observability overhead: Tracing and verbose logs cost latency and storage. Use tail‑based sampling and environment‑specific log levels.
- Privacy and compliance: Treat prompts/outputs as sensitive; default‑redact with a controlled allowlist.
- Evolving standards: OpenTelemetry’s AI semantic conventions are still moving; pin versions and plan for updates.
- Platform opacity: Manus 1.5’s public materials do not specify exporter schemas or OTel support; instrument your orchestration tier regardless and integrate what the platform exposes over time.

For a deeper OTel‑first approach to LLM systems (including guardrails), this foundation is a practical companion: OTel logging, tracing, and guardrails for LLM apps.

---
## References and Further Reading

- OpenTelemetry’s 2025 guidance on AI agent telemetry design: AI agent observability (2025) on the OTel blog
- Google SRE’s multi‑window burn‑rate alerting (timeless, updated online): SRE Workbook: Alerting on SLOs
- Elastic’s structured logging field discipline: ECS logging best practices
- Rate‑limit‑aware retries for OpenAI APIs: OpenAI Cookbook: handle rate limits
- Vendor platform with chain/agent views (docs maintained 2024–2025): Datadog LLM Observability
- Background on Manus 1.5 capabilities (marketing/intro, 2025): Manus 1.5 introduction
- Burn‑rate alert implementation in Grafana Cloud (2025): Grafana multi‑window burn‑rate alerting

---

Ready to operationalize this? Start with logs and burn‑rate alerts, then layer in circuit breakers. If you need a research and documentation co‑pilot for your observability rollouts, Skywork AI can help consolidate sources, drafts, and runbooks alongside your dashboards.
# Context Engineering for AI Agents: Key Lessons from Manus - DEV Community
@@ -0,0 +1,324 @@

# OpenManus Technical Analysis: Architecture and Implementation of an Open-Source Agent Framework

**Source:** https://llmmultiagents.com/en/blogs/OpenManus_Technical_Analysis

---

Author: LLM Multi Agent Team

OpenManus is an open-source agent framework that provides functionality similar to Manus but without requiring an invite code, allowing developers and users to easily build and use powerful AI agents. This article provides a comprehensive analysis of OpenManus, from its architecture design to core components, usage methods, and technical features.
## 1. Project Overview

OpenManus was developed by a team of developers from the MetaGPT community, including @mannaandpoem, @XiangJinyu, @MoshiQAQ, and others. The project focuses on building a universal agent framework that enables users to complete complex tasks through simple instructions, including but not limited to programming, information retrieval, file processing, and web browsing.

## 2. Architecture Design

OpenManus adopts a modular architecture design, which consists of the following core modules:

### 2.1 Agent Layer

The Agent layer is the core of OpenManus, responsible for coordinating the various components and executing user instructions. It includes the following agent types:

- ToolCallAgent: Base agent class that handles the abstraction of tool/function calls
- PlanningAgent: Planning agent responsible for creating and managing task execution plans
- ReActAgent: Reactive agent that implements the think-act-observe loop pattern
- Manus: Main agent implementation, inheriting from ToolCallAgent, serving as the direct entry point for user interaction

The agent layer follows a hierarchical inheritance relationship, with each agent type having specific responsibilities and extension capabilities:

```
ReActAgent --> ToolCallAgent --> PlanningAgent --> Manus
```
### 2.2 Tool Layer

The tool layer provides a rich set of tools that enable the agent to interact with external systems. Major tools include:

- PythonExecute: Executes Python code
- BrowserUseTool: Browser operation tool, supporting webpage navigation, element interaction, etc.
- GoogleSearch: Google search tool
- FileSaver: File saving tool
- PlanningTool: Planning management tool
- CreateChatCompletion: Tool for creating chat completions
- Terminate: Tool for terminating execution

These tools are defined through a unified interface (BaseTool), ensuring that agents can consistently call and use them.

### 2.3 Prompt Layer

The prompt layer defines system prompts and instruction templates for interaction with the LLM, including:

- System Prompts: Define the agent's role and capability scope
- Step Prompts: Guide the agent on how to use available tools to perform tasks
- Planning Prompts: Used for task decomposition and plan creation

### 2.4 LLM Interaction Layer

This layer is responsible for communicating with large language models, supporting different LLM providers and model configurations. Core functionalities include:

- Model configuration management
- Tool/function calling interface
- Response parsing and processing

## 3. Core Process Analysis
The workflow of OpenManus can be summarized in the following key steps:

### 3.1 Task Planning Process

1. User inputs a task request
2. Initial plan creation (create_initial_plan)
3. Task decomposition into executable steps
4. Generation of a structured plan (including goals, steps, and status tracking)

### 3.2 Task Execution Process

1. Get the current active step (_get_current_step_index)
2. Thinking phase (think): evaluate current status and decide the next action
3. Action phase (act): execute the selected tools and record results
4. Update plan status (update_plan_status)
5. Repeat the above steps until the plan is completed or the maximum step count is reached
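The execution loop above can be sketched in a few lines. This is a toy illustration of the think-act-observe shape with a step cap, not OpenManus's actual implementation:

```python
def run_react_loop(pending_steps, execute, max_steps=20):
    """Toy think-act-observe loop.

    `pending_steps` plays the role of the plan's remaining steps;
    `execute(step)` plays the role of a tool call returning an observation.
    """
    observations = []
    for _ in range(max_steps):
        if not pending_steps:               # think: plan complete
            break
        step = pending_steps.pop(0)         # think: pick the current active step
        observations.append(execute(step))  # act, then record the observation
    return observations
```

The `max_steps` cap is the important guardrail: without it, a model that keeps emitting steps can loop indefinitely.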
### 3.3 Tool Calling Mechanism

1. Parse tool call parameters
2. Execute the tool and obtain results
3. Add results to the agent's memory
4. Handle status changes for special tools (such as termination)

## 4. Key Component Analysis
### 4.1 PlanningAgent

PlanningAgent is a core agent type in OpenManus, specifically responsible for task planning and execution. Its main functions include:

- Plan Creation and Management: creating structured plans based on user input
- Step Tracking: recording the execution status of each step
- Progress Management: automatically marking the current active step and updating completion status

Its key feature is the ability to maintain task execution context, ensuring that complex tasks can be completed in sequence while providing clear execution status feedback.

### 4.2 ToolCallAgent

ToolCallAgent implements the fundamental mechanism for tool calling. Its main responsibilities include:

- Tool Selection: deciding which tools to use to complete tasks
- Parameter Parsing: processing tool parameters in JSON format
- Result Processing: formatting tool execution results
- Error Handling: providing robust error capture and reporting mechanisms

The design of ToolCallAgent allows OpenManus to flexibly extend new tools while maintaining a unified calling interface.
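That unified calling interface can be illustrated with a small sketch. The class names mirror the article's description (BaseTool, ToolCollection, tool_map, JSON-encoded parameters), but the code is an illustrative reconstruction, not the project's source:

```python
import json

class BaseTool:
    """Unified tool interface: every tool exposes a name and execute()."""
    name = "base"

    def execute(self, **kwargs):
        raise NotImplementedError

class Echo(BaseTool):
    """Trivial example tool used only for this sketch."""
    name = "echo"

    def execute(self, text=""):
        return text

class ToolCollection:
    """Registry mapping tool names to instances, called with JSON args."""
    def __init__(self, *tools):
        self.tool_map = {t.name: t for t in tools}

    def execute(self, name, tool_input="{}"):
        args = json.loads(tool_input)  # parameters arrive as a JSON string
        return self.tool_map[name].execute(**args)
```

Adding a new tool is then just defining another BaseTool subclass and registering it, which is the extensibility property the article attributes to ToolCallAgent.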
### 4.3 BrowserUseTool

BrowserUseTool is a powerful tool that allows agents to interact with web browsers. Its main functions include:

- Web Navigation: visiting specified URLs
- Element Interaction: clicking, inputting text
- Content Extraction: getting HTML, text, and links
- JavaScript Execution: running JS code in pages
- Tab Management: creating, switching, and closing tabs

This tool is implemented on top of the browser-use library, providing agents with rich web interaction capabilities.

### 4.4 PythonExecute

The PythonExecute tool allows agents to execute Python code, enabling them to:

- Perform data processing tasks
- Conduct system operations
- Create and modify files
- Call other Python libraries and APIs

This makes OpenManus capable of executing programming tasks, which is a key component of its universal capabilities.

## 5. Usage Guide
### 5.1 Installation and Configuration

OpenManus provides two installation methods.

Using conda:

```bash
conda create -n open_manus python=3.12
conda activate open_manus
git clone https://github.com/mannaandpoem/OpenManus.git
cd OpenManus
pip install -r requirements.txt
```

Using uv (recommended):

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
git clone https://github.com/mannaandpoem/OpenManus.git
cd OpenManus
uv venv
source .venv/bin/activate
uv pip install -r requirements.txt
```
### 5.2 Configuring LLM

OpenManus requires LLM API configuration to work properly.

Create a configuration file:

```bash
cp config/config.example.toml config/config.toml
```

Edit `config.toml` to add API keys:

```toml
[llm]
model = "gpt-4o"
base_url = "https://api.openai.com/v1"
api_key = "sk-..."  # Replace with your actual API key
max_tokens = 4096
temperature = 0.0
```
### 5.3 Basic Usage

Starting OpenManus requires just one command:

```bash
python main.py
```

Then input your requests through the terminal, such as:

- "Create a simple website calculator"
- "Find the latest research on climate change"
- "Help me optimize this Python code"

OpenManus will automatically plan tasks and call the appropriate tools to complete the request.
## 6. Code Analysis

### 6.1 Main Entry (main.py)

```python
async def main():
    agent = Manus()
    while True:
        try:
            prompt = input("Enter your prompt (or 'exit'/'quit' to quit): ")
            prompt_lower = prompt.lower()
            if prompt_lower in ["exit", "quit"]:
                logger.info("Goodbye!")
                break
            if not prompt.strip():
                logger.warning("Skipping empty prompt.")
                continue
            logger.warning("Processing your request...")
            await agent.run(prompt)
        except KeyboardInterrupt:
            logger.warning("Goodbye!")
            break
```

The main entry is very concise: it creates a Manus agent instance, then loops to receive user input and pass it to the agent for processing.

### 6.2 Manus Agent Implementation

```python
class Manus(ToolCallAgent):
    name: str = "Manus"
    description: str = "A versatile agent that can solve various tasks using multiple tools"
    system_prompt: str = SYSTEM_PROMPT
    next_step_prompt: str = NEXT_STEP_PROMPT
    available_tools: ToolCollection = Field(
        default_factory=lambda: ToolCollection(
            PythonExecute(), GoogleSearch(), BrowserUseTool(), FileSaver(), Terminate()
        )
    )
```

The Manus agent inherits from ToolCallAgent and configures its system prompts and available tool set.
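
The name-to-instance mapping behind `ToolCollection` is easy to reproduce. A framework-free sketch — the class names mirror OpenManus for readability, but these minimal implementations are illustrative, not the library's actual code:

```python
import asyncio
from typing import Any

class BaseTool:
    """Minimal tool interface: a name plus an async execute method."""
    name: str = "base"

    async def execute(self, **kwargs: Any) -> str:
        raise NotImplementedError

class PythonExecute(BaseTool):
    name = "python_execute"

    async def execute(self, code: str = "") -> str:
        # The real tool sandboxes and runs the code; we just echo it here.
        return f"would execute: {code}"

class ToolCollection:
    """Maps tool names to instances so an agent can dispatch by name."""

    def __init__(self, *tools: BaseTool) -> None:
        self.tool_map = {tool.name: tool for tool in tools}

    async def execute(self, name: str, tool_input: dict) -> str:
        return await self.tool_map[name].execute(**tool_input)

# Dispatch a call by name, as the agent does with LLM tool calls.
result = asyncio.run(
    ToolCollection(PythonExecute()).execute(
        name="python_execute", tool_input={"code": "print(1)"}
    )
)
```

Keeping the registry as a plain dict keyed by tool name is what lets the LLM's string-valued function names map directly onto executable objects.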

### 6.3 Tool Execution Method (execute_tool)

```python
async def execute_tool(self, command: ToolCall) -> str:
    if not command or not command.function or not command.function.name:
        return "Error: Invalid command format"

    name = command.function.name
    if name not in self.available_tools.tool_map:
        return f"Error: Unknown tool '{name}'"

    try:
        # Parse arguments
        args = json.loads(command.function.arguments or "{}")

        # Execute tool
        logger.info(f"🔧 Activating tool: '{name}'...")
        result = await self.available_tools.execute(name=name, tool_input=args)

        # Format result
        observation = (
            f"Observed output of cmd `{name}` executed:\n{str(result)}"
            if result
            else f"Cmd `{name}` completed with no output"
        )

        # Handle special tools
        await self._handle_special_tool(name=name, result=result)

        return observation
    except Exception as e:
        # Error handling
        error_msg = f"⚠️ Tool '{name}' encountered a problem: {str(e)}"
        logger.error(error_msg)
        return f"Error: {error_msg}"
```

This method demonstrates how OpenManus executes tool calls: validating the command, parsing JSON arguments, dispatching to the tool, formatting the result as an observation, and converting failures into error messages.
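
The same parse-dispatch-format flow can be isolated into a self-contained sketch — synchronous and stripped of logging for brevity, with a plain dict standing in for OpenManus's ToolCollection:

```python
import json
from typing import Callable

def execute_tool(name: str, arguments: str,
                 tool_map: dict[str, Callable[..., str]]) -> str:
    """Parse JSON arguments, dispatch to a registered tool, format the result."""
    if name not in tool_map:
        return f"Error: Unknown tool '{name}'"
    try:
        args = json.loads(arguments or "{}")   # tolerate empty argument strings
        result = tool_map[name](**args)        # dispatch by name
        return (
            f"Observed output of cmd `{name}` executed:\n{result}"
            if result
            else f"Cmd `{name}` completed with no output"
        )
    except Exception as e:                     # surface failures as observations
        return f"Error: Tool '{name}' encountered a problem: {e}"
```

Note that every failure mode — unknown tool, malformed JSON, a raised exception — is returned as a string rather than propagated, so the error re-enters the LLM's context as an observation it can react to.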

## 7. Project Roadmap

According to the project roadmap, OpenManus plans to implement the following features:

1. Enhance planning capabilities, optimizing task decomposition and execution logic
2. Introduce standardized evaluation metrics (based on GAIA and TAU-Bench)
3. Expand model adaptation and optimize low-cost application scenarios
4. Implement containerized deployment to simplify installation and usage workflows
5. Enrich the example library with more practical cases
6. Develop a frontend/backend to improve the user experience

## 8. Conclusion

OpenManus is a well-designed open-source agent framework that achieves powerful task planning and execution through its modular architecture and flexible tool-integration mechanism. Its core advantages include:

1. Flexible agent system: implementations at multiple levels, from basic tool-calling agents to complex planning agents
2. Rich tool set: built-in Python execution, browser interaction, search, and other tools that cover diverse task requirements
3. Structured task planning: automatic decomposition of complex tasks and tracking of execution progress
4. Open source: open to community contributions and extensions, and continuously evolving

OpenManus gives developers a platform that makes building complex AI agents simple and intuitive, making it a strong choice for AI application development. Researchers, developers, and everyday users alike can build their own intelligent assistants on top of it.

Developers who want to learn more or contribute are encouraged to check the official GitHub repository and join the community discussion group.
|