npmai-agent 0.0.1__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,1185 @@
1
+ Metadata-Version: 2.4
2
+ Name: npmai-agent
3
+ Version: 0.0.1
4
+ Summary: A production-grade AI agent framework with 21 integrated tool classes and a four-role autonomous LLM pipeline — built on the NPMAI ECOSYSTEM.
5
+ Author-email: Sonu Kumar <sonuramashishnpm@gmail.com>
6
+ License: MIT
7
+ Project-URL: Homepage, https://npmai.netlify.app
8
+ Project-URL: Documentation, https://npmai.netlify.app
9
+ Project-URL: Repository, https://github.com/sonuramashishnpm/npmai-agent
10
+ Project-URL: Bug Tracker, https://github.com/sonuramashishnpm/npmai-agent/issues
11
+ Project-URL: PyPI, https://pypi.org/project/npmai-agent
12
+ Keywords: ai,agent,automation,llm,npmai,npmai-ecosystem,desktop-automation,multi-agent,rag,open-source,email-automation,file-automation,github-automation,agentic-ai,free-llm
13
+ Classifier: Development Status :: 3 - Alpha
14
+ Classifier: Intended Audience :: Developers
15
+ Classifier: Intended Audience :: Science/Research
16
+ Classifier: License :: OSI Approved :: MIT License
17
+ Classifier: Operating System :: OS Independent
18
+ Classifier: Programming Language :: Python :: 3
19
+ Classifier: Programming Language :: Python :: 3.9
20
+ Classifier: Programming Language :: Python :: 3.10
21
+ Classifier: Programming Language :: Python :: 3.11
22
+ Classifier: Programming Language :: Python :: 3.12
23
+ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
24
+ Classifier: Topic :: Software Development :: Libraries :: Python Modules
25
+ Classifier: Topic :: System :: Systems Administration
26
+ Classifier: Topic :: Office/Business :: Scheduling
27
+ Requires-Python: >=3.9
28
+ Description-Content-Type: text/markdown
29
+ Requires-Dist: npmai>=0.1.9
30
+ Requires-Dist: requests>=2.31.0
31
+ Requires-Dist: cryptography>=41.0.0
32
+ Requires-Dist: langchain-core>=0.1.0
33
+ Provides-Extra: full
34
+ Requires-Dist: beautifulsoup4>=4.12.0; extra == "full"
35
+ Requires-Dist: playwright>=1.40.0; extra == "full"
36
+ Requires-Dist: PyGithub>=2.1.0; extra == "full"
37
+ Requires-Dist: slack-sdk>=3.23.0; extra == "full"
38
+ Requires-Dist: gspread>=5.12.0; extra == "full"
39
+ Requires-Dist: google-auth>=2.23.0; extra == "full"
40
+ Requires-Dist: openpyxl>=3.1.0; extra == "full"
41
+ Requires-Dist: pandas>=2.1.0; extra == "full"
42
+ Requires-Dist: Pillow>=10.0.0; extra == "full"
43
+ Requires-Dist: pypdf>=3.17.0; extra == "full"
44
+ Requires-Dist: python-docx>=1.1.0; extra == "full"
45
+ Requires-Dist: pyttsx3>=2.90; extra == "full"
46
+ Requires-Dist: SpeechRecognition>=3.10.0; extra == "full"
47
+ Requires-Dist: pyperclip>=1.8.2; extra == "full"
48
+ Requires-Dist: schedule>=1.2.0; extra == "full"
49
+ Requires-Dist: psutil>=5.9.0; extra == "full"
50
+ Requires-Dist: watchdog>=3.0.0; extra == "full"
51
+ Requires-Dist: tweepy>=4.14.0; extra == "full"
52
+ Requires-Dist: pywhatkit>=5.4; extra == "full"
53
+ Requires-Dist: qrcode>=7.4.2; extra == "full"
54
+ Requires-Dist: paramiko>=3.3.0; extra == "full"
55
+ Requires-Dist: python-dotenv>=1.0.0; extra == "full"
56
+ Requires-Dist: pyautogui>=0.9.54; extra == "full"
57
+ Requires-Dist: opencv-python>=4.8.0; extra == "full"
58
+ Requires-Dist: pytesseract>=0.3.10; extra == "full"
59
+ Requires-Dist: yt-dlp>=2023.11.16; extra == "full"
60
+ Requires-Dist: discord.py>=2.3.2; extra == "full"
61
+ Requires-Dist: telethon>=1.32.0; extra == "full"
62
+ Requires-Dist: notion-client>=2.2.1; extra == "full"
63
+ Requires-Dist: todoist-api-python>=2.1.3; extra == "full"
64
+ Requires-Dist: jira>=3.5.2; extra == "full"
65
+ Provides-Extra: dev
66
+ Requires-Dist: pytest>=7.4.0; extra == "dev"
67
+ Requires-Dist: black>=23.0.0; extra == "dev"
68
+ Requires-Dist: ruff>=0.1.0; extra == "dev"
69
+
70
+ # npmai-agent
71
+
72
+ > **Part of the [NPMAI ECOSYSTEM](https://npmai.netlify.app) — Open Source AI Research & Development**
73
+
74
+ [![PyPI version](https://img.shields.io/badge/pypi-npmai--agent-blue?style=flat-square)](https://pypi.org/project/npmai-agent)
75
+ [![License](https://img.shields.io/badge/license-MIT-green?style=flat-square)](LICENSE)
76
+ [![Built by NPMAI](https://img.shields.io/badge/built%20by-NPMAI%20ECOSYSTEM-purple?style=flat-square)](https://npmai.netlify.app)
77
+ [![Version](https://img.shields.io/badge/version-0.0.1-orange?style=flat-square)](https://pypi.org/project/npmai-agent)
78
+
79
+ **npmai-agent** (internally known as the **npmai-agent-suite**) is a production-grade AI agent framework built on top of the NPMAI ECOSYSTEM. It gives any Python developer a fully autonomous, multi-LLM agentic pipeline with 21 integrated tool classes — covering everything from email automation, file management, GitHub operations, browser control, spreadsheets, PDF processing, image manipulation, SSH, Telegram, Discord, Slack, Twitter, QR codes, voice, RAG, and more — all orchestrated by a four-role LLM pipeline (Planner → Coder → Auditor → Verifier) with Fernet-encrypted credential storage and LARA RAG integration.
80
+
81
+ No paid APIs required. Free forever. Built on 45+ open-source LLMs.
82
+
83
+ ---
84
+
85
+
86
+
87
+ ## Developed by NPMAI ECOSYSTEM
88
+
89
+ **npmai-agent** is a product of the **[NPMAI ECOSYSTEM](https://npmai.netlify.app)**, an open-source AI research and development community founded by **Sonu Kumar** (known online as **Bihar Viral Boy**).
90
+
91
+ > Sonu Kumar is a 14-year-old self-taught developer, TEDx speaker, and researcher from Bihar, India, currently studying in Kota, Rajasthan. He founded NPMAI ECOSYSTEM at age 14, building the entire infrastructure on free cloud services (Render, HuggingFace Spaces, Supabase, Netlify). It now serves hundreds of thousands of developers worldwide, with 2 million+ PyPI downloads and 45+ community-contributed LLMs.
93
+
94
+ **Founder:** Sonu Kumar · [GitHub](https://github.com/sonuramashishnpm) · [PyPI](https://pypi.org/project/npmai) · [sonuramashishnpm@gmail.com](mailto:sonuramashishnpm@gmail.com)
95
+
96
+ **Ecosystem Website:** [https://npmai.netlify.app](https://npmai.netlify.app)
97
+
98
+ ---
99
+
100
+ ## Table of Contents
101
+
102
+ - [What is npmai-agent?](#what-is-npmai-agent)
103
+ - [Why npmai-agent?](#why-npmai-agent)
104
+ - [Features](#features)
105
+ - [Architecture Overview](#architecture-overview)
106
+ - [Installation](#installation)
107
+ - [Configuration — CredStore](#configuration--credstore)
108
+ - [Tool Classes & Documentation](#tool-classes--documentation)
109
+ - [CredStore](#credstore)
110
+ - [Workspace](#workspace)
111
+ - [EmailTool](#emailtool)
112
+ - [FileTool](#filetool)
113
+ - [PDFTool](#pdftool)
114
+ - [WebTool](#webtool)
115
+ - [SpreadsheetTool](#spreadsheettool)
116
+ - [GitHubTool](#githubtool)
117
+ - [SlackTool](#slacktool)
118
+ - [DiscordTool](#discordtool)
119
+ - [WhatsAppTool](#whatsapptool)
120
+ - [NotionTool](#notiontool)
121
+ - [TwitterTool](#twittertool)
122
+ - [SystemTool](#systemtool)
123
+ - [ImageTool](#imagetool)
124
+ - [SchedulerTool](#schedulertool)
125
+ - [JiraTool](#jiratool)
126
+ - [TelegramTool](#telegramtool)
127
+ - [QRTool](#qrtool)
128
+ - [VoiceTool](#voicetool)
129
+ - [WatcherTool](#watchertool)
130
+ - [RAGTool](#ragtool)
131
+ - [SSHTool](#sshtool)
132
+ - [AgentBrain — The Autonomous Pipeline](#agentbrain--the-autonomous-pipeline)
133
+ - [Executor](#executor)
134
+ - [Version](#version)
135
+ - [License](#license)
136
+
137
+ ---
138
+
139
+ ## What is npmai-agent?
140
+
141
+ `npmai-agent` is a **desktop automation + agentic AI framework** that lets you:
142
+
143
+ 1. **Use individual tool classes directly** in your own Python scripts.
144
+ 2. **Hand a plain-English task to `AgentBrain`** and have a multi-LLM pipeline autonomously plan, generate code, audit it for security, execute it, verify the result, and retry on failure — all without you writing a single line of task-specific code.
145
+
146
+ It is the backbone of the **npmai-agent-suite** desktop application and is designed to be equally powerful as a headless library.
147
+
148
+ ---
149
+
150
+ ## Why npmai-agent?
151
+
152
+ Most AI agent frameworks require you to pay for GPT-4, Claude, or Gemini API credits. `npmai-agent` runs entirely on the **NPMAI ECOSYSTEM load balancer** — 45+ open-source LLMs available for free via `pip install npmai`. No credit card. No rate-limit anxiety. No vendor lock-in.
153
+
154
+ | Pain Point | npmai-agent Solution |
155
+ |---|---|
156
+ | Paid LLM APIs | 45+ free LLMs via NPMAI load balancer |
157
+ | Single-model pipelines | 4 specialized LLM roles (Planner, Coder, Auditor, Verifier) |
158
+ | Manual tool integration | 21 ready-made tool classes, zero boilerplate |
159
+ | Plain-text credential storage | Fernet-encrypted `CredStore` with machine-specific key |
160
+ | No memory between runs | Persistent `Memory` sessions via `npmai.Memory` |
161
+ | Document Q&A | LARA RAG pipeline via `npmai.Rag` |
162
+ | Complex setup | Auto-installs all dependencies on first run |
163
+
164
+ ---
165
+
166
+ ## Features
167
+
168
+ - **21 integrated tool classes** — email, files, PDF, web, spreadsheets, GitHub, Slack, Discord, WhatsApp, Notion, Twitter, system, images, scheduler, Jira, Telegram, QR, voice, file watcher, RAG, SSH
169
+ - **Four-role autonomous LLM pipeline** — Planner, Coder, Auditor, and Verifier each run a model assigned to that role; in the Architecture Overview diagram, the Planner and Verifier both use `llama3.2:3b`
170
+ - **Security auditor built-in** — every generated code block is scanned before execution; destructive or credential-stealing code is blocked
171
+ - **Fernet-encrypted credential store** — machine-specific AES key, credentials never stored in plain text
172
+ - **LARA RAG integration** — query and summarise large documents using the NPMAI RAG architecture
173
+ - **Persistent memory** — separate memory contexts for planning, coding, chat, and task history
174
+ - **Auto-dependency installer** — missing packages are pip-installed automatically at runtime
175
+ - **Up to 12 auto-retries per step** — failed steps are regenerated with the error context fed back to the coder LLM
176
+ - **Workspace scanner** — the agent scans your Desktop, Downloads, Documents, Pictures, Videos, and Music folders to build a live context profile before planning
177
+ - **Kill switch support** — long-running tasks can be cancelled mid-execution
178
+
179
+ ---
180
+
181
+ ## Architecture Overview
182
+
183
+ ```
184
+ User Task (plain English)
185
+
186
+
187
+ ┌─────────────────┐
188
+ │ AgentBrain │
189
+ │ │
190
+ │ 1. Workspace │ ← scans your file system for context
191
+ │ Scanner │
192
+ │ │
193
+ │ 2. Planner LLM │ ← breaks task into 2–5 atomic steps
194
+ │ (llama3.2:3b) │
195
+ │ │
196
+ │ For each step: │
197
+ │ ┌───────────┐ │
198
+ │ │ 3. Coder │ │ ← generates Python code (codellama:7b)
199
+ │ │ LLM │ │
200
+ │ └─────┬─────┘ │
201
+ │ │ │
202
+ │ ┌─────▼─────┐ │
203
+ │ │ 4. Auditor│ │ ← security scan (qwen2.5-coder:7b)
204
+ │ │ LLM │ │
205
+ │ └─────┬─────┘ │
206
+ │ │ │
207
+ │ ┌─────▼─────┐ │
208
+ │ │ 5.Executor│ │ ← subprocess runner with live stdout
209
+ │ └─────┬─────┘ │
210
+ │ │ │
211
+ │ ┌─────▼─────┐ │
212
+ │ │6. Verifier│ │ ← confirms step success (llama3.2:3b)
213
+ │ │ LLM │ │
214
+ │ └───────────┘ │
215
+ │ (retry ×12) │
216
+ └─────────────────┘
217
+
218
+
219
+ Task Complete ✓
220
+ ```
221
+
222
+ All LLMs are served free via the NPMAI ECOSYSTEM load balancer (`npmai.Ollama` with `change=True`).
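The control flow in the diagram can be sketched as a plain Python loop. This is a minimal illustration, not the package's implementation: `run_pipeline`, `StepOutcome`, and the five callables are hypothetical names, and the LLM roles are replaced by stub functions so the loop runs offline. Only the limit of 12 retries per step comes from the Features list; stopping at the first step that exhausts its retries is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

MAX_RETRIES = 12  # per-step retry limit stated in the Features list


@dataclass
class StepOutcome:
    step: str
    attempts: int
    success: bool


def run_pipeline(
    task: str,
    plan: Callable[[str], List[str]],
    write_code: Callable[[str, Optional[str]], str],
    audit: Callable[[str], bool],
    execute: Callable[[str], str],
    verify: Callable[[str, str], bool],
) -> List[StepOutcome]:
    """Plan the task, then run code -> audit -> execute -> verify per step."""
    outcomes: List[StepOutcome] = []
    for step in plan(task):
        error: Optional[str] = None
        success = False
        attempts = 0
        while attempts < MAX_RETRIES and not success:
            attempts += 1
            code = write_code(step, error)  # the previous error is fed back to the coder
            if not audit(code):
                error = "blocked by auditor"
                continue
            try:
                output = execute(code)
            except Exception as exc:
                error = str(exc)
                continue
            success = verify(step, output)
            if not success:
                error = f"verification failed: {output!r}"
        outcomes.append(StepOutcome(step, attempts, success))
        if not success:
            break  # assumption: abort once a step exhausts its retries
    return outcomes


# Stub roles (hypothetical) so the sketch runs without any LLM backend.
def _plan(task: str) -> List[str]:
    return [f"{task}: step 1", f"{task}: step 2"]


def _write_code(step: str, error: Optional[str]) -> str:
    # The first attempt is deliberately broken; the retry sees the error and succeeds.
    return "good" if error else "bad"


def _audit(code: str) -> bool:
    return True


def _execute(code: str) -> str:
    if code == "bad":
        raise RuntimeError("simulated failure")
    return "ok"


def _verify(step: str, output: str) -> bool:
    return output == "ok"


results = run_pipeline("demo", _plan, _write_code, _audit, _execute, _verify)
for r in results:
    print(r.step, r.attempts, r.success)
```

Passing the roles in as callables keeps the retry logic independent of any particular model or backend, which makes the loop easy to test with stubs like the ones above.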
223
+
224
+ ---
225
+
226
+ ## Installation
227
+
228
+ ```bash
229
+ pip install npmai-agent
230
+ ```
231
+
232
+ > `npmai-agent` will automatically install all required dependencies on first run, including `npmai`, `requests`, `beautifulsoup4`, `playwright`, `PyGithub`, `slack-sdk`, `gspread`, `openpyxl`, `pandas`, `Pillow`, `pypdf`, `python-docx`, `pyttsx3`, `SpeechRecognition`, `pyperclip`, `schedule`, `psutil`, `watchdog`, `tweepy`, `pywhatkit`, `qrcode`, `cryptography`, `paramiko`, `python-dotenv`, `pyautogui`, `opencv-python`, `pytesseract`, `yt-dlp`, `discord.py`, `telethon`, `notion-client`, `todoist-api-python`, `jira`, and more.
233
+
234
+ ---
235
+
236
+ ## Configuration — CredStore
237
+
238
+ Before using any tool that requires authentication (email, GitHub, Slack, etc.), you must store your credentials using `CredStore`. Credentials are encrypted with a Fernet key derived from your machine and stored locally at `~/.npmai_agent/creds.json`. They are never stored in plain text.
239
+
240
+ ### How CredStore Works
241
+
242
+ ```python
243
+ from npmai_agent import CredStore  # module name assumed: hyphens are not valid in Python import statements
244
+
245
+ # Save credentials for a named service
246
+ CredStore.save("gmail", {
247
+ "email": "you@gmail.com",
248
+ "password": "your-app-password", # Use Gmail App Password, not your login password
249
+ "smtp_host": "smtp.gmail.com",
250
+ "smtp_port": 587,
251
+ "imap_host": "imap.gmail.com"
252
+ })
253
+
254
+ # Load credentials (returns a dict)
255
+ creds = CredStore.load("gmail")
256
+ print(creds["email"])
257
+
258
+ # List all saved credential keys
259
+ keys = CredStore.all_keys()
260
+ print(keys) # ['gmail', 'github', 'slack', ...]
261
+ ```
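How a machine-specific Fernet key might be derived is sketched below. The README does not document the actual derivation, so everything here is an assumption: `derive_machine_key`, `encrypt_creds`, and `decrypt_creds` are hypothetical helpers, and the hostname plus MAC address stand in for "the machine". The only fixed requirement is that Fernet keys are 32 bytes, url-safe base64-encoded.

```python
import base64
import hashlib
import json
import platform
import uuid


def derive_machine_key(salt: bytes = b"npmai-agent-demo") -> bytes:
    """Derive a Fernet-compatible key from identifiers of the current machine.

    Assumption: hostname + MAC address identify the machine; the package's
    real derivation may differ.
    """
    fingerprint = f"{platform.node()}|{uuid.getnode()}".encode()
    digest = hashlib.sha256(salt + fingerprint).digest()  # 32 raw bytes
    return base64.urlsafe_b64encode(digest)  # 44-character url-safe base64 key


def encrypt_creds(creds: dict, key: bytes) -> bytes:
    """Serialize a credential dict to JSON and encrypt it with Fernet."""
    from cryptography.fernet import Fernet  # third-party; imported only when needed

    return Fernet(key).encrypt(json.dumps(creds).encode())


def decrypt_creds(token: bytes, key: bytes) -> dict:
    """Decrypt a Fernet token back into the credential dict."""
    from cryptography.fernet import Fernet

    return json.loads(Fernet(key).decrypt(token))


key = derive_machine_key()
print(len(key))
```

A key derived this way only decrypts on the machine that produced it, which matches the "machine-specific" description, but a bare hash of guessable identifiers is not a strong key-derivation function. Treat the sketch as an explanation of the idea rather than a security recommendation.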
262
+
263
+ ### Credential Reference by Tool
264
+
265
+ | Tool | `cred_key` | Required fields |
266
+ |---|---|---|
267
+ | `EmailTool` | `"gmail"` | `email`, `password`, `smtp_host`, `smtp_port`, `imap_host` |
268
+ | `GitHubTool` | `"github"` | `token` |
269
+ | `SlackTool` | `"slack"` | `bot_token` |
270
+ | `SpreadsheetTool` (Google Sheets) | `"google"` | Full service account JSON as dict |
271
+ | `NotionTool` | `"notion"` | `token` |
272
+ | `TwitterTool` | `"twitter"` | `api_key`, `api_secret`, `access_token`, `access_token_secret` |
273
+ | `JiraTool` | `"jira"` | `server`, `email`, `api_token` |
274
+ | `TelegramTool` | `"telegram"` | `bot_token` |
275
+ | `SSHTool` | `"ssh"` | `user`, `password` (or `key_path`) |
276
+
277
+ ```python
278
+ # GitHub
279
+ CredStore.save("github", {"token": "ghp_xxxxxxxxxxxxxxxxxxxx"})
280
+
281
+ # Slack
282
+ CredStore.save("slack", {"bot_token": "xoxb-xxxxxxxxxxxx"})
283
+
284
+ # Notion
285
+ CredStore.save("notion", {"token": "secret_xxxxxxxxxxxx"})
286
+
287
+ # Twitter / X
288
+ CredStore.save("twitter", {
289
+ "api_key": "...",
290
+ "api_secret": "...",
291
+ "access_token": "...",
292
+ "access_token_secret": "..."
293
+ })
294
+
295
+ # Jira
296
+ CredStore.save("jira", {
297
+ "server": "https://yourworkspace.atlassian.net",
298
+ "email": "you@company.com",
299
+ "api_token": "your-jira-api-token"
300
+ })
301
+
302
+ # Telegram
303
+ CredStore.save("telegram", {"bot_token": "1234567890:AAxxxxxxxxxxxxxx"})
304
+
305
+ # SSH
306
+ CredStore.save("ssh", {"user": "ubuntu", "password": "secret"})
307
+ # or with key file
308
+ CredStore.save("ssh", {"user": "ubuntu", "key_path": "/home/you/.ssh/id_rsa"})
309
+ ```
310
+
311
+ ---
312
+
313
+ ## Tool Classes & Documentation
314
+
315
+ All tool classes extend `ensure`, which auto-installs required dependencies on first instantiation. Every method returns a `ToolResult` object with three fields:
316
+
317
+ ```python
318
+ result.success # bool — True if the operation succeeded
319
+ result.output # str — human-readable status message
320
+ result.data # any — the actual returned data (list, str, DataFrame, etc.)
321
+ ```
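A result object with these three fields could look like the dataclass below. This is a sketch under assumptions, not the package's class: the `__str__` format is inferred from the `✓ ...` lines printed in the examples in this README, and `safe_divide` is a hypothetical tool method used only to show the pattern of returning errors instead of raising them.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToolResult:
    success: bool      # True if the operation succeeded
    output: str        # human-readable status message
    data: Any = None   # payload, if any

    def __str__(self) -> str:
        # Assumption: the check-mark prefix mirrors the printed examples in this README.
        mark = "✓" if self.success else "✗"
        return f"{mark} {self.output}"


def safe_divide(a: float, b: float) -> ToolResult:
    """Hypothetical tool method: failures become results, not exceptions."""
    try:
        return ToolResult(True, f"{a} / {b} computed", a / b)
    except ZeroDivisionError as exc:
        return ToolResult(False, f"division failed: {exc}")


ok = safe_divide(6, 3)
bad = safe_divide(1, 0)
print(ok)
print(bad)
```

Because every call returns the same shape, callers can branch on `result.success` without wrapping each tool call in its own `try`/`except`.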
322
+
323
+ ---
324
+
325
+ ### CredStore
326
+
327
+ Fernet-encrypted local credential vault.
328
+
329
+ ```python
330
+ from npmai_agent import CredStore
331
+
332
+ # Store credentials
333
+ CredStore.save("service_name", {"key": "value"})
334
+
335
+ # Load credentials
336
+ data = CredStore.load("service_name")
337
+
338
+ # List all saved keys
339
+ print(CredStore.all_keys())
340
+ ```
341
+
342
+ ---
343
+
344
+ ### Workspace
345
+
346
+ Scans the user's file system and builds a context profile used by `AgentBrain` during planning.
347
+
348
+ ```python
349
+ from npmai_agent import Workspace
350
+
351
+ ws = Workspace()
352
+
353
+ # Scan Desktop, Downloads, Documents, Pictures, Videos, Music
354
+ profile = ws.scan()
355
+ print(profile["os"]) # 'Windows' / 'Darwin' / 'Linux'
356
+ print(profile["home"]) # '/home/sonu'
357
+ print(profile["paths"]) # dict of folder → files
358
+
359
+ # Update any custom profile field
360
+ ws.update_profile("user_name", "Sonu Kumar")
361
+
362
+ # Get a short text summary (used internally by the planner LLM)
363
+ print(ws.context_summary())
364
+ ```
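The kind of profile that `scan()` returns can be approximated with the standard library. The sketch below is an illustration only: `scan_home` and its `limit` parameter are hypothetical, and the real scanner may collect more detail. The folder list and the `os`, `home`, and `paths` keys are taken from the example above.

```python
import platform
import tempfile
from pathlib import Path
from typing import Dict, List, Optional

FOLDERS = ["Desktop", "Downloads", "Documents", "Pictures", "Videos", "Music"]


def scan_home(home: Optional[Path] = None, limit: int = 50) -> Dict[str, object]:
    """Build a small context profile from the standard user folders."""
    home = Path(home) if home else Path.home()
    paths: Dict[str, List[str]] = {}
    for name in FOLDERS:
        folder = home / name
        if folder.is_dir():
            # Cap the listing so the summary stays short enough for an LLM prompt.
            paths[name] = sorted(p.name for p in folder.iterdir())[:limit]
    return {"os": platform.system(), "home": str(home), "paths": paths}


# Demonstration against a throwaway directory instead of the real home folder.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "Downloads").mkdir()
    (Path(tmp) / "Downloads" / "a.txt").write_text("x")
    profile = scan_home(Path(tmp))

print(profile["paths"])
```

Folders that do not exist are skipped rather than reported as empty, so the profile only describes what is actually on disk.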
365
+
366
+ ---
367
+
368
+ ### EmailTool
369
+
370
+ Send emails via Gmail SMTP, read inbox via IMAP, and send bulk personalised emails from a CSV.
371
+
372
+ ```python
373
+ from npmai_agent import EmailTool, CredStore
374
+
375
+ # Configure once
376
+ CredStore.save("gmail", {
377
+ "email": "you@gmail.com",
378
+ "password": "app-password-here",
379
+ "smtp_host": "smtp.gmail.com",
380
+ "smtp_port": 587,
381
+ "imap_host": "imap.gmail.com"
382
+ })
383
+
384
+ # Send a single email
385
+ result = EmailTool.send(
386
+ to="friend@example.com",
387
+ subject="Hello from npmai-agent",
388
+ body="<h1>This was sent by an AI agent!</h1>"
389
+ )
390
+ print(result) # ✓ Email sent to friend@example.com
391
+
392
+ # Send with attachments
393
+ result = EmailTool.send(
394
+ to="boss@company.com",
395
+ subject="Monthly Report",
396
+ body="Please find the report attached.",
397
+ attachments=["/home/sonu/report.pdf"]
398
+ )
399
+
400
+ # Read inbox (last 10 emails)
401
+ result = EmailTool.read_inbox(count=10)
402
+ if result.success:
403
+     for msg in result.data:
404
+         print(msg["from"], msg["subject"], msg["date"])
405
+
406
+ # Bulk email from CSV
407
+ # CSV must have 'name' and 'email' columns (configurable)
408
+ result = EmailTool.send_bulk(
409
+ csv_path="contacts.csv",
410
+ subject="Invitation to NPMAI Launch",
411
+ body_template="<p>Hello {name}, you are invited!</p>",
412
+ name_col="name",
413
+ email_col="email"
414
+ )
415
+ print(result) # ✓ Sent 42 emails, 0 failed
416
+ ```
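The personalisation step in `send_bulk` can be pictured as a CSV read followed by template substitution. The sketch below assumes `str.format`-style placeholders, which the `{name}` syntax suggests but the README does not confirm. `render_bulk` is a hypothetical helper that produces the messages without sending anything.

```python
import csv
import io
from typing import Iterator, Tuple


def render_bulk(
    csv_text: str,
    body_template: str,
    name_col: str = "name",
    email_col: str = "email",
) -> Iterator[Tuple[str, str]]:
    """Yield (recipient, rendered_body) pairs, one per CSV row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        yield row[email_col], body_template.format(name=row[name_col])


contacts = "name,email\nAsha,asha@example.com\nRavi,ravi@example.com\n"
messages = list(render_bulk(contacts, "<p>Hello {name}, you are invited!</p>"))
for recipient, body in messages:
    print(recipient, body)
```

With `str.format`, any literal braces in the HTML body (for example inline CSS) would need to be doubled as `{{` and `}}`.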
417
+
418
+ ---
419
+
420
+ ### FileTool
421
+
422
+ Rename, move, copy, zip, unzip, search, organize, read, and write files.
423
+
424
+ ```python
425
+ from npmai_agent import FileTool
426
+
427
+ # Bulk rename all .txt files in a folder
428
+ result = FileTool.bulk_rename(
429
+ folder="/home/sonu/Documents",
430
+ pattern="*.txt",
431
+ prefix="NPMAI_",
432
+ suffix="_v1",
433
+ add_date=True
434
+ )
435
+ print(result) # ✓ Renamed 12 files
436
+
437
+ # Zip a folder
438
+ result = FileTool.zip_folder(
439
+ source="/home/sonu/project",
440
+ dest="/home/sonu/project_backup.zip"
441
+ )
442
+
443
+ # Unzip
444
+ result = FileTool.unzip(
445
+ zip_path="/home/sonu/archive.zip",
446
+ dest="/home/sonu/extracted"
447
+ )
448
+
449
+ # Find files recursively
450
+ result = FileTool.find_files(
451
+ folder="/home/sonu",
452
+ pattern="*.py",
453
+ recursive=True
454
+ )
455
+ print(result.data) # ['/home/sonu/agent_core.py', ...]
456
+
457
+ # Organize folder by file type (creates Images/, Videos/, Docs/, etc.)
458
+ result = FileTool.organize_by_type("/home/sonu/Downloads")
459
+ print(result) # ✓ Organized 87 files by type
460
+
461
+ # Read a file
462
+ result = FileTool.read_file("/home/sonu/notes.txt")
463
+ print(result.data) # file content as string
464
+
465
+ # Write a file
466
+ result = FileTool.write_file(
467
+ path="/home/sonu/output/report.txt",
468
+ content="Generated by npmai-agent."
469
+ )
470
+
471
+ # Copy entire directory tree
472
+ result = FileTool.duplicate_tree(
473
+ src="/home/sonu/project",
474
+ dst="/home/sonu/project_copy"
475
+ )
476
+ ```
477
+
478
+ ---
479
+
480
+ ### PDFTool
481
+
482
+ Merge, split, and extract text from PDF files.
483
+
484
+ ```python
485
+ from npmai_agent import PDFTool
486
+
487
+ # Extract all text from a PDF
488
+ result = PDFTool.extract_text("/home/sonu/research_paper.pdf")
489
+ print(result.data) # extracted text string
490
+
491
+ # Merge multiple PDFs into one
492
+ result = PDFTool.merge(
493
+ paths=["/home/sonu/chapter1.pdf", "/home/sonu/chapter2.pdf"],
494
+ out="/home/sonu/full_book.pdf"
495
+ )
496
+ print(result) # ✓ Merged 2 PDFs → /home/sonu/full_book.pdf
497
+
498
+ # Split a PDF into individual pages
499
+ result = PDFTool.split(
500
+ path="/home/sonu/document.pdf",
501
+ out_dir="/home/sonu/pages"
502
+ )
503
+ print(result) # ✓ Split into 24 pages in /home/sonu/pages
504
+ ```
505
+
506
+ ---
507
+
508
+ ### WebTool
509
+
510
+ Scrape websites, download files, take screenshots, automate browsers via Playwright, and make raw API calls.
511
+
512
+ ```python
513
+ from npmai_agent import WebTool
514
+
515
+ # Scrape full page text
516
+ result = WebTool.scrape("https://npmai.netlify.app")
517
+ print(result.data[:500])
518
+
519
+ # Scrape specific elements using CSS selector
520
+ result = WebTool.scrape(
521
+ url="https://example.com",
522
+ selector="h2"
523
+ )
524
+ print(result.data) # ['Heading 1', 'Heading 2', ...]
525
+
526
+ # Download a file
527
+ result = WebTool.download_file(
528
+ url="https://example.com/file.pdf",
529
+ dest="/home/sonu/downloads/file.pdf"
530
+ )
531
+
532
+ # Take a full-page screenshot (requires Playwright)
533
+ result = WebTool.screenshot_url(
534
+ url="https://npmai.netlify.app",
535
+ out="/home/sonu/npmai_screenshot.png"
536
+ )
537
+
538
+ # Automated browser actions (click, fill, extract)
539
+ result = WebTool.browser_action(
540
+ url="https://example.com/login",
541
+ actions=[
542
+ {"type": "fill", "selector": "#username", "value": "sonu"},
543
+ {"type": "fill", "selector": "#password", "value": "secret"},
544
+ {"type": "click", "selector": "#login-btn"},
545
+ {"type": "wait", "ms": 2000},
546
+ {"type": "screenshot", "path": "after_login.png"}
547
+ ]
548
+ )
549
+
550
+ # Make a raw HTTP API call
551
+ result = WebTool.api_call(
552
+ url="https://api.example.com/data",
553
+ method="POST",
554
+ headers={"Authorization": "Bearer token"},
555
+ payload={"query": "test"}
556
+ )
557
+ print(result.data) # parsed JSON response
558
+ ```
559
+
560
+ ---
561
+
562
+ ### SpreadsheetTool
563
+
564
+ Read/write CSV, Excel, and Google Sheets.
565
+
566
+ ```python
567
+ from npmai_agent import SpreadsheetTool, CredStore
568
+
569
+ # Read a CSV file
570
+ result = SpreadsheetTool.read_csv("/home/sonu/data.csv")
571
+ df = result.data # pandas DataFrame
572
+ print(f"{len(df)} rows")
573
+
574
+ # Write a DataFrame or list of dicts to Excel
575
+ data = [{"name": "Sonu", "age": 15}, {"name": "AI", "age": 0}]
576
+ result = SpreadsheetTool.write_excel(
577
+ data=data,
578
+ path="/home/sonu/output.xlsx",
579
+ sheet="Founders"
580
+ )
581
+
582
+ # Read from Google Sheets (requires service account credentials)
583
+ CredStore.save("google", {
584
+ "type": "service_account",
585
+ "project_id": "...",
586
+ "private_key_id": "...",
587
+ "private_key": "-----BEGIN PRIVATE KEY-----\n...",
588
+ "client_email": "...",
589
+ # ... full service account JSON
590
+ })
591
+
592
+ result = SpreadsheetTool.google_sheets_read(
593
+ sheet_id="1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms",
594
+ range_="Sheet1"
595
+ )
596
+ print(result.data) # list of row dicts
597
+ ```
598
+
599
+ ---
600
+
601
+ ### GitHubTool
602
+
603
+ Create issues, push files, list issues, fetch READMEs, clone repos, commit and push.
604
+
605
+ ```python
606
+ from npmai_agent import GitHubTool, CredStore
607
+
608
+ CredStore.save("github", {"token": "ghp_xxxxxxxxxxxxxxxxxxxx"})
609
+
610
+ # Create an issue
611
+ result = GitHubTool.create_issue(
612
+ repo="sonuramashishnpm/npmai",
613
+ title="Bug: model timeout on large inputs",
614
+ body="Steps to reproduce...",
615
+ labels=["bug", "llm"]
616
+ )
617
+ print(result) # ✓ Issue #42 created: https://github.com/...
618
+
619
+ # Push (create or update) a file in a repo
620
+ result = GitHubTool.push_file(
621
+ repo="sonuramashishnpm/npmai",
622
+ path="docs/agent.md",
623
+ content="# npmai-agent docs\n...",
624
+ message="docs: add agent documentation"
625
+ )
626
+
627
+ # List open issues
628
+ result = GitHubTool.list_issues(repo="sonuramashishnpm/npmai", state="open")
629
+ for issue in result.data:
630
+     print(issue["#"], issue["title"])
631
+
632
+ # Fetch README
633
+ result = GitHubTool.get_readme(repo="sonuramashishnpm/npmai")
634
+ print(result.data[:300])
635
+
636
+ # Clone a repository
637
+ result = GitHubTool.clone_repo(
638
+ url="https://github.com/sonuramashishnpm/npmai.git",
639
+ dest="/home/sonu/projects/npmai"
640
+ )
641
+
642
+ # Stage, commit, and push from a local repo
643
+ result = GitHubTool.git_commit_push(
644
+ repo_path="/home/sonu/projects/npmai",
645
+ message="feat: add new model endpoints"
646
+ )
647
+ ```
648
+
649
+ ---
650
+
651
+ ### SlackTool
652
+
653
+ Send messages, read channel history, upload files to Slack.
654
+
655
+ ```python
656
+ from npmai_agent import SlackTool, CredStore
657
+
658
+ CredStore.save("slack", {"bot_token": "xoxb-xxxxxxxxxxxx"})
659
+
660
+ # Send a message to a channel
661
+ result = SlackTool.send_message(
662
+ channel="#general",
663
+ text="npmai-agent task completed successfully ✓"
664
+ )
665
+
666
+ # Read last 20 messages from a channel
667
+ result = SlackTool.read_channel(channel="#dev-logs", limit=20)
668
+ for msg in result.data:
669
+     print(msg["user"], ":", msg["text"])
670
+
671
+ # Upload a file
672
+ result = SlackTool.upload_file(
673
+ channel="#reports",
674
+ file_path="/home/sonu/report.pdf",
675
+ comment="Weekly AI report"
676
+ )
677
+ ```
678
+
679
+ ---
680
+
681
+ ### DiscordTool
682
+
683
+ Send messages and embeds to Discord channels via webhook.
684
+
685
+ ```python
686
+ from npmai_agent import DiscordTool
687
+
688
+ WEBHOOK = "https://discord.com/api/webhooks/xxxx/yyyy"
689
+
690
+ # Send a plain message
691
+ result = DiscordTool.send_webhook(
692
+ webhook_url=WEBHOOK,
693
+ content="🚀 npmai-agent deployment complete!"
694
+ )
695
+
696
+ # Send with an embed
697
+ result = DiscordTool.send_webhook(
698
+ webhook_url=WEBHOOK,
699
+ content="Task update:",
700
+ embeds=[{
701
+ "title": "Step 3 Complete",
702
+ "description": "All files organized successfully.",
703
+ "color": 3066993
704
+ }]
705
+ )
706
+ ```
707
+
708
+ ---
709
+
710
+ ### WhatsAppTool
711
+
712
+ Send WhatsApp messages (requires WhatsApp Web to be open in the browser).
713
+
714
+ ```python
715
+ from npmai_agent import WhatsAppTool
716
+
717
+ # Phone number must include country code, e.g. +91 for India
718
+ result = WhatsAppTool.send(
719
+ phone="+919876543210",
720
+ message="Hello from npmai-agent!",
721
+ wait=15 # seconds to wait before sending
722
+ )
723
+ print(result) # ✓ WhatsApp sent to +919876543210
724
+ ```
725
+
726
+ ---
727
+
728
+ ### NotionTool
729
+
730
+ Create pages and add database entries in Notion.
731
+
732
+ ```python
733
+ from npmai_agent import NotionTool, CredStore
734
+
735
+ CredStore.save("notion", {"token": "secret_xxxxxxxxxxxx"})
736
+
737
+ # Create a new page inside a parent page
738
+ result = NotionTool.create_page(
739
+ parent_id="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
740
+ title="NPMAI Agent Research Notes",
741
+ content="This page was created by npmai-agent automatically."
742
+ )
743
+ print(result) # ✓ Notion page created: https://notion.so/...
744
+
745
+ # Add a row to a Notion database
746
+ result = NotionTool.add_db_entry(
747
+ db_id="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
748
+ props={
749
+ "Name": {"title": [{"text": {"content": "Task Completed"}}]},
750
+ "Status": {"select": {"name": "Done"}}
751
+ }
752
+ )
753
+ ```
754
+
755
+ ---
756
+
757
+ ### TwitterTool
758
+
759
+ Post tweets using the Twitter v2 API.
760
+
761
+ ```python
762
+ from npmai_agent import TwitterTool, CredStore
763
+
764
+ CredStore.save("twitter", {
765
+ "api_key": "...",
766
+ "api_secret": "...",
767
+ "access_token": "...",
768
+ "access_token_secret": "..."
769
+ })
770
+
771
+ result = TwitterTool.tweet(
772
+ text="Just automated my workflow with npmai-agent by @NPMAIEcosystem 🤖 #OpenSource #AI"
773
+ )
774
+ print(result) # ✓ Tweeted: 1234567890123456789
775
+ ```
776
+
777
+ ---
778
+
779
+ ### SystemTool
780
+
781
+ Run shell commands, manage processes, use the clipboard, take screenshots, and send desktop notifications.
782
+
783
+ ```python
784
+ from npmai_agent import SystemTool
785
+
786
+ # Run any shell command
787
+ result = SystemTool.run_command("ls -la /home/sonu", cwd="/home/sonu", timeout=30)
788
+ print(result.output)
789
+
790
+ # Get clipboard contents
791
+ result = SystemTool.get_clipboard()
792
+ print(result.data)
793
+
794
+ # Set clipboard contents
795
+ result = SystemTool.set_clipboard("Copied by npmai-agent")
796
+
797
+ # Take a screenshot
798
+ result = SystemTool.screenshot(out="/home/sonu/screen.png")
799
+
800
+ # List running processes
801
+ result = SystemTool.get_processes()
802
+ for proc in result.data[:5]:
803
+     print(proc["name"], proc["cpu"], "%")
804
+
805
+ # Send a desktop notification
806
+ result = SystemTool.notify(
807
+ title="npmai-agent",
808
+ message="Your task has been completed!"
809
+ )
810
+ ```
811
+
812
+ ---
813
+
814
+ ### ImageTool
815
+
816
+ Resize, convert, OCR, and bulk-compress images using Pillow and pytesseract.
817
+
818
+ ```python
819
+ from npmai_agent import ImageTool
820
+
821
+ # Resize an image
822
+ result = ImageTool.resize(
823
+ path="/home/sonu/photo.jpg",
824
+ width=800,
825
+ height=600,
826
+ out="/home/sonu/photo_resized.jpg"
827
+ )
828
+
829
+ # Convert image format
830
+ result = ImageTool.convert(
831
+ path="/home/sonu/photo.jpg",
832
+ format="PNG",
833
+ out="/home/sonu/photo.png"
834
+ )
835
+
836
+ # Extract text from an image (OCR)
837
+ result = ImageTool.ocr("/home/sonu/scanned_doc.png")
838
+ print(result.data) # extracted text
839
+
840
+ # Bulk compress all JPGs/PNGs in a folder
841
+ result = ImageTool.bulk_compress(
842
+ folder="/home/sonu/Pictures",
843
+ quality=75
844
+ )
845
+ print(result) # ✓ Compressed 34 images
846
+ ```
847
+
848
+ ---
849
+
850
+ ### SchedulerTool
851
+
852
+ Schedule Python callbacks to run at specific times or intervals in background threads.
853
+
854
+ ```python
855
+ from npmai_agent import SchedulerTool
856
+
857
+ def my_task():
858
+     print("Running scheduled task!")
859
+
860
+ # Run every 5 minutes
861
+ result = SchedulerTool.schedule_task(
862
+ task_id="heartbeat",
863
+ cron_like="every 5 minutes",
864
+ callback=my_task
865
+ )
866
+
867
+ # Run every day at 09:00
868
+ result = SchedulerTool.schedule_task(
869
+ task_id="daily_report",
870
+ cron_like="every day at 09:00",
871
+ callback=my_task
872
+ )
873
+
874
+ # Run every Monday at 08:00
875
+ result = SchedulerTool.schedule_task(
876
+ task_id="weekly_sync",
877
+ cron_like="every monday at 08:00",
878
+ callback=my_task
879
+ )
880
+
881
+ # Cancel a scheduled task
882
+ result = SchedulerTool.cancel_task("heartbeat")
883
+ ```
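The `cron_like` strings above follow a small "every ..." grammar. How the package parses them is not documented, so the parser below is only a sketch of one way to interpret the three forms shown. `parse_cron_like` and the dictionary layout it returns are hypothetical.

```python
import re
from typing import Dict, Union

_INTERVAL = re.compile(r"^every (\d+) (second|minute|hour)s?$")
_AT = re.compile(
    r"^every (day|monday|tuesday|wednesday|thursday|friday|saturday|sunday)"
    r" at (\d{2}):(\d{2})$"
)


def parse_cron_like(text: str) -> Dict[str, Union[str, int]]:
    """Turn an 'every ...' phrase into a structured schedule description."""
    text = text.strip().lower()

    match = _INTERVAL.match(text)
    if match:
        return {"kind": "interval", "every": int(match.group(1)), "unit": match.group(2)}

    match = _AT.match(text)
    if match:
        hour, minute = int(match.group(2)), int(match.group(3))
        if hour > 23 or minute > 59:
            raise ValueError(f"invalid time in schedule: {text!r}")
        return {"kind": "at", "day": match.group(1), "hour": hour, "minute": minute}

    raise ValueError(f"unsupported schedule: {text!r}")


for phrase in ("every 5 minutes", "every day at 09:00", "every monday at 08:00"):
    print(parse_cron_like(phrase))
```

Rejecting unknown phrases with a `ValueError` surfaces typos when the task is scheduled, rather than leaving a task that silently never runs.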
884
+
885
+ ---
886
+
887
+ ### JiraTool
888
+
889
+ Create and manage Jira issues.
890
+
891
+ ```python
892
+ from npmai_agent import JiraTool, CredStore
893
+
894
+ CredStore.save("jira", {
895
+ "server": "https://yourworkspace.atlassian.net",
896
+ "email": "you@company.com",
897
+ "api_token": "your-jira-api-token"
898
+ })
899
+
900
+ # Create a Jira issue
901
+ result = JiraTool.create_issue(
902
+ project="NPMAI",
903
+ summary="Integrate agent v0.0.1 with desktop UI",
904
+ description="The agent core needs to be wired to the PySide6 app.",
905
+ issue_type="Task"
906
+ )
907
+ print(result) # ✓ Jira issue NPMAI-17 created
908
+ ```
909
+
910
+ ---
911
+
912
+ ### TelegramTool
913
+
914
+ Send messages via a Telegram bot.
915
+
916
+ ```python
917
+ from npmai_agent import TelegramTool, CredStore
918
+
919
+ CredStore.save("telegram", {"bot_token": "1234567890:AAxxxxxxxxxxxxxx"})
920
+
921
+ # Send a message (chat_id can be a user ID or @channel)
922
+ result = TelegramTool.send(
923
+ chat_id="123456789",
924
+ text="✅ npmai-agent task complete: organized 87 files."
925
+ )
926
+ print(result) # ✓ Telegram sent
927
+ ```
928
+
929
+ ---
930
+
931
+ ### QRTool
932
+
933
+ Generate QR codes from any text or URL.
934
+
935
+ ```python
936
+ from npmai_agent import QRTool  # assumes the import name is npmai_agent (hyphens are not valid in module names)
937
+
938
+ # Generate and save a QR code
939
+ result = QRTool.generate(
940
+ data="https://npmai.netlify.app",
941
+ out="/home/sonu/npmai_qr.png",
942
+ size=10
943
+ )
944
+ print(result) # ✓ QR code saved: /home/sonu/npmai_qr.png
945
+ ```
946
+
947
+ ---
948
+
949
+ ### VoiceTool
950
+
951
+ Text-to-speech output and speech-to-text input.
952
+
953
+ ```python
954
+ from npmai_agent import VoiceTool  # assumes the import name is npmai_agent (hyphens are not valid in module names)
955
+
956
+ # Speak text aloud
957
+ result = VoiceTool.speak("Task completed successfully. npmai-agent is ready.")
958
+ print(result) # ✓ Spoken
959
+
960
+ # Listen for speech input (5 seconds)
961
+ result = VoiceTool.listen(seconds=5)
962
+ if result.success:
963
+     print(result.data) # recognised text from microphone
964
+ ```
965
+
966
+ ---
967
+
968
+ ### WatcherTool
969
+
970
+ Watch a folder for file creation or modification events and trigger a callback.
971
+
972
+ ```python
973
+ from npmai_agent import WatcherTool  # assumes the import name is npmai_agent (hyphens are not valid in module names)
974
+
975
+ def on_file_change(file_path):
976
+     print(f"File changed: {file_path}")
977
+     # trigger any action here
978
+
979
+ # Start watching a folder in a background thread
980
+ result = WatcherTool.watch(
981
+ folder="/home/sonu/incoming",
982
+ callback=on_file_change
983
+ )
984
+ print(result) # ✓ Watching /home/sonu/incoming
985
+ ```
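One simple way to detect created or modified files is to compare snapshots of modification times. The sketch below illustrates that polling idea with the standard library only. The source does not say how `WatcherTool` detects changes, so `snapshot` and `changed_paths` are hypothetical helpers, not part of the package.

```python
import os
import tempfile

def snapshot(folder):
    """Map each file path in `folder` to its last-modified time in nanoseconds."""
    with os.scandir(folder) as entries:
        return {e.path: e.stat().st_mtime_ns for e in entries if e.is_file()}

def changed_paths(before, after):
    """Paths that are new in `after` or whose modification time differs."""
    return sorted(p for p, mtime in after.items() if before.get(p) != mtime)

with tempfile.TemporaryDirectory() as folder:
    before = snapshot(folder)
    with open(os.path.join(folder, "report.txt"), "w") as fh:
        fh.write("hello")
    changes = changed_paths(before, snapshot(folder))
    for path in changes:
        print(f"File changed: {path}")
```

A real watcher would repeat the comparison in a loop on a background thread, or rely on operating-system change notifications instead of polling.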
986
+
987
+ ---
988
+
989
+ ### RAGTool
990
+
991
+ Query large documents and summarise long files using the NPMAI LARA RAG pipeline.
992
+
993
+ ```python
994
+ from npmai_agent import RAGTool  # assumes the import name is npmai_agent (hyphens are not valid in module names)
995
+
996
+ # Query a document (PDF or plain text) using natural language
997
+ result = RAGTool.query_document(
998
+ doc_path="/home/sonu/research_paper.pdf",
999
+ question="What is the main contribution of this paper?",
1000
+ chunk_size=500
1001
+ )
1002
+ print(result.data) # LLM-generated answer
1003
+
1004
+ # Summarise a large document (processes up to 10 × 3000-char chunks)
1005
+ result = RAGTool.summarize_large_file(
1006
+ path="/home/sonu/thesis.pdf",
1007
+ model="mistral:7b"
1008
+ )
1009
+ print(result.data) # comprehensive summary
1010
+ ```
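The "up to 10 × 3000-char chunks" limit mentioned in the comment above can be pictured with a short sketch. This assumes fixed-size, non-overlapping character chunks; `chunk_text` is a hypothetical helper, and the actual splitting strategy inside `RAGTool` may differ.

```python
def chunk_text(text, chunk_size=3000, max_chunks=10):
    """Split `text` into consecutive pieces of at most `chunk_size` characters,
    keeping only the first `max_chunks` pieces."""
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    return chunks[:max_chunks]

sample = "x" * 35_000
chunks = chunk_text(sample)
print(len(chunks), sum(len(c) for c in chunks))  # 10 30000
```

Under these assumptions, anything past the first 30,000 characters of a document would not reach the summariser.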
1011
+
1012
+ ---
1013
+
1014
+ ### SSHTool
1015
+
1016
+ Run commands on remote servers and transfer files via SFTP.
1017
+
1018
+ ```python
1019
+ from npmai_agent import SSHTool, CredStore  # assumes the import name is npmai_agent (hyphens are not valid in module names)
1020
+
1021
+ CredStore.save("ssh", {
1022
+ "user": "ubuntu",
1023
+ "password": "your-server-password"
1024
+ # or use key_path instead of password:
1025
+ # "key_path": "/home/sonu/.ssh/id_rsa"
1026
+ })
1027
+
1028
+ # Run a remote command
1029
+ result = SSHTool.run(
1030
+ host="192.168.1.100",
1031
+ command="df -h && uptime"
1032
+ )
1033
+ print(result.data) # command output
1034
+
1035
+ # Upload a file via SFTP
1036
+ result = SSHTool.upload(
1037
+ host="192.168.1.100",
1038
+ local="/home/sonu/deploy.sh",
1039
+ remote="/home/ubuntu/deploy.sh"
1040
+ )
1041
+ print(result) # ✓ Uploaded /home/sonu/deploy.sh → /home/ubuntu/deploy.sh
1042
+ ```
1043
+
1044
+ ---
1045
+
1046
+ ## AgentBrain — The Autonomous Pipeline
1047
+
1048
+ `AgentBrain` is the core orchestrator. Once your credentials are configured via `CredStore`, you can hand it any task in plain English and it will autonomously plan, code, audit, execute, verify, and retry until the task is done — using all 21 tool classes above as needed.
1049
+
1050
+ ```python
1051
+ from npmai_agent import AgentBrain  # assumes the import name is npmai_agent (hyphens are not valid in module names)
1052
+
1053
+ # Optional callbacks for logging, progress, and status
1054
+ def log(msg): print(msg)
1055
+ def progress(pct): print(f"Progress: {pct}%")
1056
+ def status(s): print(f"Status: {s}")
1057
+
1058
+ brain = AgentBrain(
1059
+ log_cb=log,
1060
+ progress_cb=progress,
1061
+ status_cb=status
1062
+ )
1063
+ ```
1064
+
1065
+ ### Running a Task
1066
+
1067
+ ```python
1068
+ # Simple plain-English task — the agent figures out the rest
1069
+ brain.run_task("Organize my Downloads folder by file type")
1070
+
1071
+ brain.run_task("Send an email to team@company.com saying the build passed")
1072
+
1073
+ brain.run_task("Scrape the titles of all articles from https://example.com/blog and save them to a CSV")
1074
+
1075
+ brain.run_task("Create a GitHub issue in sonuramashishnpm/npmai titled 'Add voice input support'")
1076
+
1077
+ brain.run_task("Read my last 5 emails and summarise them")
1078
+ ```
1079
+
1080
+ ### Chat Mode (for questions, not computer tasks)
1081
+
1082
+ ```python
1083
+ response = brain.chat("What is the LARA RAG architecture?")
1084
+ print(response)
1085
+ ```
1086
+
1087
+ ### Task with Kill Switch
1088
+
1089
+ ```python
1090
+ import threading
1091
+
1092
+ killed = [False]
1093
+
1094
+ def run():
1095
+     brain.run_task("Download and process 500 PDFs from the server", killed_flag=killed)
1096
+
1097
+ t = threading.Thread(target=run)
1098
+ t.start()
1099
+
1100
+ # Cancel at any time
1101
+ killed[0] = True
1102
+ ```
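The one-element list works as a kill switch because both threads hold a reference to the same mutable object. The sketch below shows the pattern in isolation with a stand-in worker. `worker` is hypothetical, and how often `run_task` actually checks the flag is not stated in the source.

```python
import threading
import time

def worker(killed_flag, done):
    """Stand-in for a long task: checks the shared flag between units of work."""
    for i in range(1000):
        if killed_flag[0]:      # another thread flipped the flag
            return
        time.sleep(0.01)        # one unit of pretend work
        done.append(i)

killed = [False]                # mutable and shared by reference
done = []
t = threading.Thread(target=worker, args=(killed, done))
t.start()
time.sleep(0.1)
killed[0] = True                # request cancellation
t.join(timeout=2)
print(f"stopped after {len(done)} of 1000 units")
```

Cancellation of this kind is cooperative: the task stops only at its next flag check, so a long blocking call delays the stop.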
1103
+
1104
+ ### Task History
1105
+
1106
+ ```python
1107
+ history = AgentBrain.load_task_history()
1108
+ for entry in history:
1109
+     status = "✓" if entry["success"] else "✗"
1110
+     print(f"{status} [{entry['time']}] {entry['task']}")
1111
+ ```
1112
+
1113
+ ### How AgentBrain Uses All 21 Tools
1114
+
1115
+ `AgentBrain` exposes the complete tool registry to the Coder LLM. When generating code for each step, the LLM is provided with this import context:
1116
+
1117
+ ```python
1118
+ from npmai_agent import EmailTool, FileTool, WebTool, SpreadsheetTool  # assumes the import name is npmai_agent
1119
+ from npmai_agent import GitHubTool, SlackTool, PDFTool, ImageTool
1120
+ from npmai_agent import SystemTool, TelegramTool, QRTool, RAGTool, SSHTool
1121
+ from npmai_agent import DiscordTool, WhatsAppTool, NotionTool, TwitterTool
1122
+ from npmai_agent import SchedulerTool, JiraTool, VoiceTool, WatcherTool
1123
+ from npmai_agent import CredStore, Workspace
1124
+ ```
1125
+
1126
+ This means you don't have to call the tools yourself: configure credentials with `CredStore` and describe your task in plain English. For each step, `AgentBrain` selects the right tools, generates the code, audits it for security, executes it, verifies the result, and retries on failure, up to 12 times per step.
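The generate, audit, execute, verify, retry cycle described above can be sketched as a plain loop. This is an illustration under stated assumptions, not `AgentBrain`'s code: `run_step` and the four callables are hypothetical stand-ins for the Coder, Auditor, Executor, and Verifier roles.

```python
MAX_ATTEMPTS = 12  # per-step retry limit stated in the text

def run_step(step, generate, audit, execute, verify, max_attempts=MAX_ATTEMPTS):
    """Try one step until it verifies or attempts run out.

    Returns (success, attempts_used).
    """
    feedback = None
    for attempt in range(1, max_attempts + 1):
        code = generate(step, feedback)        # Coder: produce code, given prior feedback
        if not audit(code):                    # Auditor: reject unsafe code before running it
            feedback = "rejected by audit"
            continue
        ok, output = execute(code)             # Executor: run the code
        if ok and verify(step, output):        # Verifier: confirm the step succeeded
            return True, attempt
        feedback = output                      # feed the failure into the next attempt
    return False, max_attempts

# Toy stand-ins: the "coder" only gets it right on the third attempt.
calls = {"n": 0}

def fake_generate(step, feedback):
    calls["n"] += 1
    return "good" if calls["n"] >= 3 else "bad"

result = run_step(
    "demo step",
    generate=fake_generate,
    audit=lambda code: True,
    execute=lambda code: (code == "good", code),
    verify=lambda step, output: output == "good",
)
print(result)  # (True, 3)
```

Passing the previous failure back into `generate` is one plausible way to make each retry differ from the last; the source does not specify how retries are informed.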
1127
+
1128
+ ### LLM Pipeline Details
1129
+
1130
+ | Role | Default Model | Fallback | Purpose |
1131
+ |---|---|---|---|
1132
+ | **Planner** | `llama3.2:3b` | `mistral:7b` | Breaks task into 2–5 atomic steps |
1133
+ | **Coder** | `codellama:7b-instruct` | `deepseek-coder:6.7b` | Generates executable Python code |
1134
+ | **Auditor** | `qwen2.5-coder:7b` | `falcon:7b-instruct` | Security scan before execution |
1135
+ | **Verifier** | `llama3.2:3b` | `mistral:7b` | Confirms step completed successfully |
1136
+ | **Chatter** | `granite3.3:2b` | `llama3.2:1b` | General Q&A / chat mode |
1137
+
1138
+ All models are served free via the NPMAI ECOSYSTEM load balancer with `change=True` for automatic fallback.
1139
+
1140
+ ---
1141
+
1142
+ ## Executor
1143
+
1144
+ `Executor` is used internally by `AgentBrain` but can be used standalone to safely run any Python code string as a subprocess with live stdout streaming.
1145
+
1146
+ ```python
1147
+ from npmai_agent import Executor  # assumes the import name is npmai_agent (hyphens are not valid in module names)
1148
+
1149
+ def log(line): print(line)
1150
+
1151
+ executor = Executor(log_cb=log, timeout=120)
1152
+
1153
+ code = """
1154
+ import time
1155
+ for i in range(5):
1156
+     print(f"Step {i+1}")
1157
+     time.sleep(0.5)
1158
+ """
1159
+
1160
+ success, output = executor.run(code)
1161
+ print(f"Success: {success}")
1162
+ print(f"Output: {output}")
1163
+
1164
+ # Kill a running executor
1165
+ executor.kill()
1166
+ ```
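To make "run a code string as a subprocess with live stdout streaming" concrete, here is a minimal standard-library sketch of that technique. It is not the `Executor` source; `run_code` is a hypothetical function whose parameters merely mirror the `log_cb` and `timeout` arguments shown above.

```python
import subprocess
import sys
import threading

def run_code(code, log_cb=print, timeout=120):
    """Run `code` in a child Python process, passing each output line to `log_cb`.

    Returns (success, output), where success means the child exited with code 0.
    """
    lines = []
    with subprocess.Popen(
        [sys.executable, "-u", "-c", code],   # -u: unbuffered, so lines arrive as printed
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,             # merge stderr into the same stream
        text=True,
    ) as proc:
        timer = threading.Timer(timeout, proc.kill)  # hard stop if the child runs too long
        timer.start()
        try:
            for line in proc.stdout:          # yields each line as soon as it is written
                line = line.rstrip("\n")
                lines.append(line)
                log_cb(line)
            proc.wait()
        finally:
            timer.cancel()
    return proc.returncode == 0, "\n".join(lines)

success, output = run_code('for i in range(3): print(f"Step {i+1}")')
print(f"Success: {success}")
```

A subprocess isolates crashes and allows a timeout, but it is not a security sandbox on its own: the child runs with the same user permissions as the parent.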
1167
+
1168
+ ---
1169
+
1170
+ ## Version
1171
+
1172
+ **0.0.1** — Initial release of `npmai-agent`.
1173
+
1174
+ This is the first public release of the npmai-agent-suite as a distributable PyPI package. The core agent pipeline, all 21 tool classes, CredStore, Workspace, Executor, and AgentBrain are stable and production-ready.
1175
+
1176
+ ---
1177
+
1178
+ ## License
1179
+
1180
+ MIT License — free to use, modify, and distribute.
1181
+
1182
+ Built with ❤️ by **Sonu Kumar**, Founder of NPMAI ECOSYSTEM · [npmai.netlify.app](https://npmai.netlify.app)
1183
+
1184
+ > *"Promoting Individual Journalism to every nation's village so that the democratic values of a nation can be strengthened and we can achieve Representative Ideal Democracy."*
1185
+ > — Sonu Kumar, Founder, NPMAI ECOSYSTEM