machinaos 0.0.57 → 0.0.59

package/README.md CHANGED
@@ -70,6 +70,11 @@ Create AI agents that remember conversations, use tools, and work together. Choo
  - Update spreadsheets
  - Manage tasks and contacts
 
+ ### Universal Email (IMAP/SMTP)
+ - Send, read, search, and manage emails via the Himalaya CLI
+ - Works with Gmail, Outlook, Yahoo, iCloud, ProtonMail, Fastmail, or any custom IMAP/SMTP server
+ - Polling-based trigger for incoming email workflows
+
  ### Control Your Devices
  - Send WhatsApp messages automatically
  - Post to Twitter/X
@@ -77,10 +82,19 @@ Create AI agents that remember conversations, use tools, and work together. Choo
  - Control your Android phone (WiFi, Bluetooth, apps, camera)
  - Schedule tasks and reminders
 
- ### Process Documents
- - Scrape websites
+ ### Browse the Web
+ - Interactive browser automation with accessibility-tree navigation (agent-browser)
+ - Web scraping with BeautifulSoup or Playwright (crawlee)
  - Route requests through residential proxies with geo-targeting
  - Run Apify actors for social media and search engine scraping
+ - DuckDuckGo, Brave, Serper (Google), and Perplexity search
+
+ ### Plan Complex Tasks
+ - `writeTodos` tool lets any AI agent create and update structured task lists
+ - Real-time checklist rendering in the UI
+ - Plan-work-update loop with `pending` / `in_progress` / `completed` states
+
+ ### Process Documents
  - Parse PDFs and documents
  - Search your files with AI
 
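As an illustration of the plan-work-update loop described in the added "Plan Complex Tasks" lines, here is a minimal Python sketch. Only the tool name `writeTodos` and the three state names come from the README. The `Todo` dataclass, the `write_todos` function, its replace-the-whole-list behavior, and the checklist markers are assumptions made for illustration, not the package's actual implementation.

```python
from dataclasses import dataclass

# The three states come from the README; everything else here is hypothetical.
VALID_STATES = ("pending", "in_progress", "completed")


@dataclass
class Todo:
    content: str
    status: str = "pending"


def write_todos(items):
    """Validate and return a fresh task list (hypothetical stand-in for `writeTodos`)."""
    todos = []
    for item in items:
        todo = Todo(**item)
        if todo.status not in VALID_STATES:
            raise ValueError(f"unknown status: {todo.status!r}")
        todos.append(todo)
    return todos


def render_checklist(todos):
    """Render the list as plain-text checklist lines, one per task."""
    marks = {"pending": "[ ]", "in_progress": "[~]", "completed": "[x]"}
    return [f"{marks[t.status]} {t.content}" for t in todos]


# Plan: the agent writes the initial list.
plan = write_todos([
    {"content": "Fetch page"},
    {"content": "Summarize"},
])
# Work and update: the agent rewrites the list as it makes progress.
plan = write_todos([
    {"content": "Fetch page", "status": "completed"},
    {"content": "Summarize", "status": "in_progress"},
])
print("\n".join(render_checklist(plan)))
```

Rewriting the whole list on each call keeps the sketch simple; a real tool might instead patch individual items.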
@@ -102,7 +116,7 @@ Create AI agents that remember conversations, use tools, and work together. Choo
  - **AI Employee / Orchestrator** - Team lead agents for coordinating multiple specialized agents
  - **Intelligent Delegation** - AI decides when to delegate based on task context
  - **Delegation Tools** - Connected agents become `delegate_to_*` tools automatically
- - **13 Specialized Agents** - Android, Coding, Web, Task, Social, Travel, Tool, Productivity, Payments, Consumer, Autonomous, Orchestrator
+ - **16 Specialized Agents** - Android, Coding, Web, Task, Social, Travel, Tool, Productivity, Payments, Consumer, Autonomous, Orchestrator, AI Employee, RLM, Claude Code, Deep Agent
  - **Team Monitor** - Real-time visualization of team operations
 
  ### Run Code
@@ -148,11 +162,11 @@ A contributor's map to the codebase. This section tells you *where things live*
 
  At a glance:
 
- - **96 workflow nodes** across 21 categories (AI, agents, social, Android, Google Workspace, documents, code, proxies, utilities)
+ - **106 workflow nodes** across 25 categories (AI, agents, social, Android, Google Workspace, email, browser, documents, code, proxies, utilities)
  - **10 LLM providers** via a hybrid native SDK + LangChain architecture
- - **15 specialized AI agents** with the Agent Teams delegation pattern
- - **89 WebSocket handlers** replacing most REST endpoints
- - **49 built-in skills** across 10 categories, editable in-UI with SKILL.md defaults on disk
+ - **16 specialized AI agents** with the Agent Teams delegation pattern
+ - **127 WebSocket handlers** replacing most REST endpoints
+ - **55 built-in skills** across 10 categories, editable in-UI with SKILL.md defaults on disk
  - **Three execution modes** with automatic fallback: Temporal distributed, Redis parallel, sequential
 
  ### How Workflows Execute
@@ -167,7 +181,7 @@ Deep dives: [DESIGN.md](docs-internal/DESIGN.md) - [TEMPORAL_ARCHITECTURE.md](do
 
  [![AI Agent Routing](docs/diagrams/ai-agent-routing.svg)](https://raw.githubusercontent.com/trohitg/MachinaOS/main/docs/diagrams/ai-agent-routing.svg)
 
- AI execution splits into two paths. `execute_chat()` for direct chat completions prefers the native SDK layer in [services/llm/](server/services/llm/) (10 providers, lazy imports, normalized `LLMResponse`), falling back to LangChain for Groq and Cerebras. `execute_agent()` and `execute_chat_agent()` always use LangChain + LangGraph because tool-calling, state graphs, and the checkpointer have no native equivalent today. Team leads (`orchestrator_agent`, `ai_employee`) auto-inject `delegate_to_<type>` tools for every agent connected to their `input-teammates` handle.
+ AI execution splits into two paths. `execute_chat()` for direct chat completions prefers the native SDK layer in [services/llm/](server/services/llm/) (10 providers, lazy imports, normalized `LLMResponse`), falling back to LangChain for Groq and Cerebras. `execute_agent()` and `execute_chat_agent()` always use LangChain + LangGraph because tool-calling, state graphs, and the checkpointer have no native equivalent today. Team leads (`orchestrator_agent`, `ai_employee`) auto-inject `delegate_to_<type>` tools for every agent connected to their `input-teammates` handle. The Deep Agent variant uses [LangChain DeepAgents](https://github.com/langchain-ai/deepagents) with built-in filesystem tools, sub-agent delegation, and todo planning; the RLM Agent uses a REPL-based recursive language model pattern. Long-running activities (DeepAgent, browser automation) stay alive across Temporal's 2-minute heartbeat window via per-message `activity.heartbeat()` calls in the WebSocket read loop.
 
  Deep dives: [agent_architecture.md](docs-internal/agent_architecture.md) - [native_llm_sdk.md](docs-internal/native_llm_sdk.md) - [agent_teams.md](docs-internal/agent_teams.md) - [memory_compaction.md](docs-internal/memory_compaction.md)
 
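The per-message heartbeat mentioned in the added paragraph above can be sketched roughly as follows. This is a minimal Python illustration, not the project's code: `read_until_done`, `fake_socket`, and the message shapes are hypothetical, and the heartbeat is passed in as a plain callable so the sketch runs without a Temporal worker. Inside a real Temporal activity the callable would presumably be `activity.heartbeat` from the `temporalio` SDK, as the README text suggests.

```python
import asyncio


async def read_until_done(messages, heartbeat):
    """Consume messages, calling `heartbeat` once per message received."""
    results = []
    async for message in messages:
        # Signal liveness on every message so a long-running activity
        # is not considered dead while it is still receiving output.
        heartbeat(message.get("type"))
        if message.get("type") == "done":
            break
        results.append(message)
    return results


async def fake_socket():
    """Hypothetical message stream standing in for a WebSocket connection."""
    for msg in (
        {"type": "progress", "step": 1},
        {"type": "progress", "step": 2},
        {"type": "done"},
    ):
        await asyncio.sleep(0)  # yield control, as a real socket read would
        yield msg


beats = []  # records each heartbeat instead of calling Temporal
received = asyncio.run(read_until_done(fake_socket(), beats.append))
print(len(received), "messages,", len(beats), "heartbeats")
```

Tying the heartbeat to message arrival means the activity only looks alive while the remote side is actually producing output; a silent connection would still time out.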
@@ -175,18 +189,18 @@ Deep dives: [agent_architecture.md](docs-internal/agent_architecture.md) - [nati
 
  | Directory | What lives here | Start reading |
  |---|---|---|
- | `client/src/nodeDefinitions/` | 96 workflow node definitions (TypeScript) | [node_creation.md](docs-internal/node_creation.md) |
+ | `client/src/nodeDefinitions/` | 106 workflow node definitions across 25 TypeScript files | [node_creation.md](docs-internal/node_creation.md) |
  | `client/src/components/` | React Flow canvas, parameter panel, modals | [CLAUDE.md](CLAUDE.md) |
  | `server/services/` | WorkflowService, NodeExecutor, AI service | [DESIGN.md](docs-internal/DESIGN.md) |
  | `server/services/handlers/` | One handler per node type (dispatch targets) | [node_creation.md](docs-internal/node_creation.md) |
  | `server/services/llm/` | Native LLM SDK layer (10 providers) | [native_llm_sdk.md](docs-internal/native_llm_sdk.md) |
  | `server/services/execution/` | Decide pattern, DLQ, recovery, conditions | [DESIGN.md](docs-internal/DESIGN.md) |
  | `server/services/temporal/` | Distributed execution via Temporal | [TEMPORAL_ARCHITECTURE.md](docs-internal/TEMPORAL_ARCHITECTURE.md) |
- | `server/routers/websocket.py` | 89 WebSocket handlers | [status_broadcaster.md](docs-internal/status_broadcaster.md) |
+ | `server/routers/websocket.py` | 127 WebSocket handlers | [status_broadcaster.md](docs-internal/status_broadcaster.md) |
  | `server/core/` | Cache, encryption, DI container, config | [credentials_encryption.md](docs-internal/credentials_encryption.md) |
- | `server/skills/` | 49 skill SKILL.md files in 10 folders | [GUIDE.md](server/skills/GUIDE.md) |
- | `server/config/` | llm_defaults.json, pricing.json, model_registry.json | [pricing_service.md](docs-internal/pricing_service.md) |
- | `docs-internal/` | In-repo architecture deep dives (28 files) | Index below |
+ | `server/skills/` | 55 skill SKILL.md files across 10 folders | [GUIDE.md](server/skills/GUIDE.md) |
+ | `server/config/` | llm_defaults.json, pricing.json, model_registry.json, email_providers.json, google_apis.json | [pricing_service.md](docs-internal/pricing_service.md) |
+ | `docs-internal/` | In-repo architecture deep dives (30 files) | Index below |
 
  ### How to Contribute
 
@@ -252,7 +266,8 @@ Full setup and scripts reference: [SETUP.md](docs-internal/SETUP.md) - [SCRIPTS.
  |---|---|
  | [DESIGN.md](docs-internal/DESIGN.md) | Execution engine architecture, design patterns, execution modes |
  | [TEMPORAL_ARCHITECTURE.md](docs-internal/TEMPORAL_ARCHITECTURE.md) | Distributed execution via Temporal activities |
- | [workflow-schema.md](docs-internal/workflow-schema.md) | Workflow JSON schema and full node catalog (96 nodes) |
+ | [workflow-schema.md](docs-internal/workflow-schema.md) | Workflow JSON schema and full node catalog (106 nodes) |
+ | [deep_agent.md](docs-internal/deep_agent.md) | LangChain DeepAgents integration with filesystem tools and sub-agents |
  | [ROADMAP.md](docs-internal/ROADMAP.md) | Implementation status and completed phases |
  | [SETUP.md](docs-internal/SETUP.md) | Development environment setup |
  | [SCRIPTS.md](docs-internal/SCRIPTS.md) | npm/shell scripts reference |