PyPI - researchloop - Versions diffs - 0.2.0__tar.gz → 0.3.1__tar.gz - Mend

researchloop 0.2.0 (tar.gz) → 0.3.1 (tar.gz)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (136)

{researchloop-0.2.0 → researchloop-0.3.1}/CLAUDE.md RENAMED

@@ -8,18 +8,19 @@ ResearchLoop is an automated research sprint platform for HPC clusters. It orche
 Two processes:
-1. **Orchestrator** (`researchloop serve`) — FastAPI server that manages studies/sprints in SQLite, submits jobs via SSH, receives webhooks from runners, stores artifacts. Also serves the web dashboard and handles Slack events.
+1. **Orchestrator** (`researchloop serve`) — FastAPI server that manages studies/sprints in SQLite, submits jobs via SSH, receives webhooks from runners, stores artifacts. Also serves the web dashboard and handles Slack events. Has no Claude CLI dependency.
 2. **Sprint Runner** — runs inside each SLURM/SGE job on HPC. Self-contained bash scripts chain `claude -p` calls through a pipeline (research → red-team → fix → report → summarize), then upload artifacts and send a completion webhook.
 Key design decisions:
-- All AI work runs on HPC, never on the orchestrator (except Slack conversations and auto-loop idea generation, which use `claude -p` locally with restricted tools)
-- `claude -p --output-format stream-json` for sprint steps (enables live progress), `--output-format json` for conversations
+- All AI work runs on HPC; the orchestrator never invokes `claude`
+- `claude -p --output-format stream-json` for sprint steps (enables live progress)
 - SSH to HPC login nodes for sbatch/squeue/scancel/qsub/qdel
 - Job completion via per-sprint webhook tokens (runner → orchestrator), SSH polling as fallback
 - SQLite (aiosqlite, WAL mode) for metadata, with a `settings` table for persistent config (signing key, password hash)
 - Jinja2 templates for all prompts and job scripts — prompts are pre-rendered by the orchestrator and embedded as base64 in the job script
 - Auto-loop sprints generate their own ideas on the cluster (where Claude is authenticated) rather than on the orchestrator
+- Slack integration is notification + structured slash-style commands only (`sprint run`, `sprint list`, `loop start`, `help`); free-form Q&A was removed
 - Context hierarchy: global → cluster → study (inline text + file paths at each level)
 ## Tech stack
@@ -51,11 +52,10 @@ researchloop/
     models.py           — SprintStatus enum, Sprint/Study/AutoLoop dataclasses, generate_sprint_id(), format_sprint_dirname()
     orchestrator.py     — Orchestrator class + create_app() FastAPI factory (API + Slack + dashboard)
     credentials.py      — CLI credential storage (~/.config/researchloop/credentials.json) for remote orchestrator auth
-    auth.py             — check_claude_auth_async() helper for verifying Claude CLI auth status
   db/
     __init__.py
     database.py         — async SQLite wrapper (WAL mode, auto-migrations, fetch_one/fetch_all/execute)
-    migrations.py       — CREATE TABLE statements (7 tables: studies, sprints, auto_loops, artifacts, slack_sessions, events, settings) + indexes + incremental column migrations
+    migrations.py       — CREATE TABLE statements (7 tables: studies, sprints, tweaks, auto_loops, artifacts, events, settings) + indexes + incremental column migrations
     queries.py          — async CRUD functions (all take Database as first arg, return dicts)
   clusters/
     __init__.py
@@ -95,7 +95,6 @@ researchloop/
     base.py             — BaseNotifier ABC (notify_sprint_started/completed/failed)
     ntfy.py             — NtfyNotifier (ntfy.sh push notifications)
     slack.py            — SlackNotifier (chat:write + files:write) + verify_slack_signature()
-    conversation.py     — ConversationManager (Slack threads → Claude sessions via --resume, action execution, markdown→Slack conversion)
     router.py           — NotificationRouter (fan-out to all configured notifiers)
   dashboard/
     __init__.py
@@ -116,7 +115,7 @@ researchloop/
 ## Database
-SQLite with 8 tables: `studies`, `sprints`, `tweaks`, `auto_loops`, `artifacts`, `slack_sessions`, `events`, `settings`. Schema in `db/migrations.py`. All queries in `db/queries.py` use parameterized SQL and return plain dicts.
+SQLite with 7 tables: `studies`, `sprints`, `tweaks`, `auto_loops`, `artifacts`, `events`, `settings`. Schema in `db/migrations.py`. All queries in `db/queries.py` use parameterized SQL and return plain dicts. (An older `slack_sessions` table is dropped by the migration if present.)
 Key columns:
 - `sprints.webhook_token` — per-sprint token for webhook auth (generated at creation)
@@ -148,7 +147,7 @@ Key columns:
 - CSRF protection: HMAC-based tokens derived from session token + signing secret, checked on all mutating dashboard POST routes
 - Dashboard refresh: pulls live status from cluster via SSH (reads logs, progress.md, output.log, report.md, findings.md, summary.txt, idea.txt, checks for PDF)
 - Slack events: deduplication via event_id set, signature verification, background task processing (return 200 immediately), bot message filtering
-- Slack conversation: thread → session mapping in DB, context building with study/sprint info, action execution via [ACTION: ...] tags
+- Slack commands: `sprint run`, `sprint list`, `loop start`, `help` (no free-form chat — orchestrator does not run Claude locally)
 - Auto-loop: sprint idea=None → job script generates idea on cluster → idea.txt read back via SSH/webhook
 - CLI auth: `researchloop connect` gets a bearer token via /api/auth, stored in ~/.config/researchloop/credentials.json with 600 permissions
 - CLI auto-reauth: on 401, prompts for password, gets new token, saves it
@@ -156,7 +155,7 @@ Key columns:
 ## Testing
-339 unit tests covering: models, config parsing, database operations, all query functions, SLURM scheduler (mock SSH), SGE scheduler (mock SSH), local scheduler (real subprocesses), study/sprint managers, auto-loop controller (with mock claude), notification router, Slack notifier + signature verification + conversation manager + Slack events API, FastAPI API endpoints (TestClient), dashboard routes + auth + setup + CSRF, CLI commands (CliRunner), runner output parsing, and template rendering.
+Unit tests cover: models, config parsing, database operations, all query functions, SLURM scheduler (mock SSH), SGE scheduler (mock SSH), local scheduler (real subprocesses), study/sprint managers, auto-loop controller, notification router, Slack notifier + signature verification + Slack events API, FastAPI API endpoints (TestClient), dashboard routes + auth + setup + CSRF, CLI commands (CliRunner), runner output parsing, and template rendering.
 Integration tests (in tests/integration/) use a Docker SLURM container to test real job submission.
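
The updated Database section notes that an older `slack_sessions` table is dropped by the migration if present. The actual statement in `db/migrations.py` is not part of this diff, so the following is only a minimal sketch of how such an idempotent cleanup could look with the standard `sqlite3` module. The helper names and the `slack_sessions` column layout are assumptions made for illustration.

```python
import sqlite3


def drop_legacy_tables(conn: sqlite3.Connection) -> None:
    """Hypothetical cleanup step: remove tables that 0.3.1 no longer uses."""
    # IF EXISTS makes this a no-op on fresh databases and on repeated runs.
    conn.execute("DROP TABLE IF EXISTS slack_sessions")
    conn.commit()


def table_names(conn: sqlite3.Connection) -> set[str]:
    """Return the names of all tables currently in the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    return {name for (name,) in rows}


# Simulate a database created by an older release (columns are invented).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE settings (key TEXT PRIMARY KEY, value TEXT)")
conn.execute(
    "CREATE TABLE slack_sessions (thread_ts TEXT PRIMARY KEY, session_id TEXT)"
)

drop_legacy_tables(conn)
drop_legacy_tables(conn)  # second run must not raise
print(sorted(table_names(conn)))
```

Because the drop is guarded by `IF EXISTS`, the same migration can run against both upgraded and newly created databases without branching on the installed version.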

{researchloop-0.2.0 → researchloop-0.3.1}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: researchloop
-Version: 0.2.0
+Version: 0.3.1
 Summary: Automated research sprint platform for HPC clusters
 License: MIT
 License-File: LICENSE
@@ -38,6 +38,8 @@ Description-Content-Type: text/markdown
 [![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue.svg)](https://www.python.org/downloads/)
 [![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
+<img width="720" height="456" alt="mmlu-combined" src="https://github.com/user-attachments/assets/6d1d495f-1078-4f81-9f8a-bb1792ea3905" />
 ---
 ResearchLoop submits AI-powered research experiments to your SLURM or SGE cluster, then reports back the results. You describe a research idea in natural language, it handles the rest: submitting the job, running a multi-step pipeline with [Claude Code](https://docs.anthropic.com/en/docs/claude-code), red-teaming the results, generating a report, and notifying you when it's done.
@@ -120,11 +122,13 @@ Browse to `/dashboard/` to see all your studies, sprints, and loops. Submit new
 ### Slack bot
-Chat with the bot to start sprints, check status, or discuss research ideas. The bot maintains conversation context across a thread, so you can have a back-and-forth about what to try next.
+Get sprint notifications in your Slack channel and run commands from a thread:
 ```
-You: What should I investigate next based on the results from sp-a3f7b2?
-Bot: Based on the findings, I'd suggest... [ACTION: sprint_run {"study": "my-project", "idea": "..."}]
+sprint run my-project "investigate feature X under condition Y"
+sprint list
+loop start my-project 5
+help
 ```
 See the [Slack setup guide](https://researchloop.github.io/researchloop/slack/) for configuration.
@@ -187,7 +191,7 @@ Full docs at **[researchloop.github.io/researchloop](https://researchloop.github
 - [Configuration reference](https://researchloop.github.io/researchloop/configuration/) -- all TOML options and environment variables
 - [Deployment guide](https://researchloop.github.io/researchloop/deployment/) -- Docker, Fly.io, SSH key setup
 - [Dashboard guide](https://researchloop.github.io/researchloop/dashboard/) -- web UI features and authentication
-- [Slack integration](https://researchloop.github.io/researchloop/slack/) -- setup, commands, conversational mode
+- [Slack integration](https://researchloop.github.io/researchloop/slack/) -- setup, commands, notifications
 - [CLI reference](https://researchloop.github.io/researchloop/cli/) -- all commands with examples
 - [Security](https://researchloop.github.io/researchloop/security/) -- authentication, CSRF, webhook tokens
 - [Development](https://researchloop.github.io/researchloop/development/) -- contributing, testing, architecture

{researchloop-0.2.0 → researchloop-0.3.1}/README.md RENAMED

@@ -7,6 +7,8 @@
 [![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue.svg)](https://www.python.org/downloads/)
 [![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
+<img width="720" height="456" alt="mmlu-combined" src="https://github.com/user-attachments/assets/6d1d495f-1078-4f81-9f8a-bb1792ea3905" />
 ---
 ResearchLoop submits AI-powered research experiments to your SLURM or SGE cluster, then reports back the results. You describe a research idea in natural language, it handles the rest: submitting the job, running a multi-step pipeline with [Claude Code](https://docs.anthropic.com/en/docs/claude-code), red-teaming the results, generating a report, and notifying you when it's done.
@@ -89,11 +91,13 @@ Browse to `/dashboard/` to see all your studies, sprints, and loops. Submit new
 ### Slack bot
-Chat with the bot to start sprints, check status, or discuss research ideas. The bot maintains conversation context across a thread, so you can have a back-and-forth about what to try next.
+Get sprint notifications in your Slack channel and run commands from a thread:
 ```
-You: What should I investigate next based on the results from sp-a3f7b2?
-Bot: Based on the findings, I'd suggest... [ACTION: sprint_run {"study": "my-project", "idea": "..."}]
+sprint run my-project "investigate feature X under condition Y"
+sprint list
+loop start my-project 5
+help
 ```
 See the [Slack setup guide](https://researchloop.github.io/researchloop/slack/) for configuration.
@@ -156,7 +160,7 @@ Full docs at **[researchloop.github.io/researchloop](https://researchloop.github
 - [Configuration reference](https://researchloop.github.io/researchloop/configuration/) -- all TOML options and environment variables
 - [Deployment guide](https://researchloop.github.io/researchloop/deployment/) -- Docker, Fly.io, SSH key setup
 - [Dashboard guide](https://researchloop.github.io/researchloop/dashboard/) -- web UI features and authentication
-- [Slack integration](https://researchloop.github.io/researchloop/slack/) -- setup, commands, conversational mode
+- [Slack integration](https://researchloop.github.io/researchloop/slack/) -- setup, commands, notifications
 - [CLI reference](https://researchloop.github.io/researchloop/cli/) -- all commands with examples
 - [Security](https://researchloop.github.io/researchloop/security/) -- authentication, CSRF, webhook tokens
 - [Development](https://researchloop.github.io/researchloop/development/) -- contributing, testing, architecture

researchloop-0.3.1/docs/assets/mmlu-combined.gif ADDED

Binary file

researchloop-0.3.1/docs/assets/mmlu-combined.mp4 ADDED

Binary file

{researchloop-0.2.0 → researchloop-0.3.1}/docs/index.md RENAMED

@@ -2,6 +2,10 @@
 **Automated AI research sprints on HPC clusters.**
+<video autoplay muted loop playsinline width="720" style="max-width:100%;height:auto;border-radius:6px">
+  <source src="assets/mmlu-combined.mp4" type="video/mp4">
+</video>
 ---
 ResearchLoop automates multi-step AI research pipelines on SLURM and SGE clusters. You describe a research idea, and ResearchLoop submits it to your HPC cluster where [Claude Code](https://docs.anthropic.com/en/docs/claude-code) executes a full research pipeline -- coding, red-teaming, fixing, reporting -- inside a single job. Results are reported back via webhooks, Slack, or push notifications, and you can monitor everything from a web dashboard or the CLI.

{researchloop-0.2.0 → researchloop-0.3.1}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [project]
 name = "researchloop"
-version = "0.2.0"
+version = "0.3.1"
 description = "Automated research sprint platform for HPC clusters"
 readme = "README.md"
 license = {text = "MIT"}

researchloop-0.3.1/researchloop/__init__.py ADDED

@@ -0,0 +1 @@
+__version__ = "0.3.1"

{researchloop-0.2.0 → researchloop-0.3.1}/researchloop/clusters/monitor.py RENAMED

@@ -6,12 +6,15 @@ import asyncio
 import json
 import logging
 from datetime import datetime, timezone
-from typing import Any
+from typing import TYPE_CHECKING, Any
 from researchloop.clusters.ssh import SSHManager
 from researchloop.db import queries
 from researchloop.schedulers.base import BaseScheduler
+if TYPE_CHECKING:
+    from researchloop.sprints.manager import SprintManager
 logger = logging.getLogger(__name__)
 # If a job's heartbeat is older than this many seconds AND the job is not
@@ -28,11 +31,17 @@ class JobMonitor:
         db: Any,
         schedulers: dict[str, BaseScheduler],
         config: Any = None,
+        sprint_manager: SprintManager | None = None,
     ) -> None:
         self.ssh_manager = ssh_manager
         self.db = db
         self.schedulers = schedulers
         self.config = config
+        # Optional: when set, terminal-state transitions go through
+        # sprint_manager.mark_sprint_terminal so the parent auto-loop
+        # advances. None falls back to a direct DB update (used by
+        # minimal test fixtures that don't construct a SprintManager).
+        self.sprint_manager = sprint_manager
         self._polling_task: asyncio.Task[None] | None = None
         self._stop_event = asyncio.Event()
@@ -143,12 +152,17 @@ class JobMonitor:
             # Persist the updated status if it changed.
             if status in ("completed", "failed"):
                 try:
-                    await queries.update_sprint(
-                        self.db,
-                        sprint_id,
-                        status=status,
-                        completed_at=datetime.now(timezone.utc).isoformat(),
-                    )
+                    if self.sprint_manager is not None:
+                        await self.sprint_manager.mark_sprint_terminal(
+                            sprint_id, status
+                        )
+                    else:
+                        await queries.update_sprint(
+                            self.db,
+                            sprint_id,
+                            status=status,
+                            completed_at=datetime.now(timezone.utc).isoformat(),
+                        )
                 except Exception:
                     logger.exception(
                         "Failed to update DB status for sprint %s", sprint_id
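
The `JobMonitor` change above routes terminal-state transitions through `sprint_manager.mark_sprint_terminal` when a manager is supplied, and falls back to a direct DB update otherwise. The sketch below isolates that branching so it can be run on its own. `RecordingSprintManager`, `persist_terminal_status`, and the `direct_update` callback are hypothetical stand-ins, not the package's classes; only the `mark_sprint_terminal(sprint_id, status)` call shape is taken from the diff.

```python
import asyncio
from datetime import datetime, timezone


class RecordingSprintManager:
    """Stand-in for SprintManager that only records the calls it receives."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, str]] = []

    async def mark_sprint_terminal(self, sprint_id: str, status: str) -> None:
        self.calls.append((sprint_id, status))


async def persist_terminal_status(sprint_id, status, sprint_manager, direct_update):
    """Persist a terminal status, preferring the manager's single chokepoint."""
    if status not in ("completed", "failed"):
        return "skipped"  # non-terminal statuses are not persisted here
    if sprint_manager is not None:
        # Going through the manager lets it advance a parent auto-loop.
        await sprint_manager.mark_sprint_terminal(sprint_id, status)
        return "manager"
    # Fallback for minimal setups that construct no manager.
    await direct_update(
        sprint_id,
        status=status,
        completed_at=datetime.now(timezone.utc).isoformat(),
    )
    return "direct"


async def demo():
    manager = RecordingSprintManager()
    writes = []

    async def direct_update(sprint_id, **fields):
        writes.append((sprint_id, fields["status"]))

    results = (
        await persist_terminal_status("sp-1", "completed", manager, direct_update),
        await persist_terminal_status("sp-2", "failed", None, direct_update),
        await persist_terminal_status("sp-3", "running", manager, direct_update),
    )
    return results, manager.calls, writes


routes, manager_calls, direct_writes = asyncio.run(demo())
print(routes, manager_calls, direct_writes)
```

Keeping the fallback branch means callers that never construct a manager, such as the minimal test fixtures mentioned in the diff's comment, continue to work unchanged.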

{researchloop-0.2.0 → researchloop-0.3.1}/researchloop/comms/slack.py RENAMED

@@ -26,12 +26,10 @@ class SlackNotifier(BaseNotifier):
         bot_token: str,
         channel_id: str | None = None,
         dashboard_url: str | None = None,
-        conversation_manager: Any = None,
     ) -> None:
         self.bot_token = bot_token
         self.channel_id = channel_id
         self.dashboard_url = dashboard_url
-        self._cm = conversation_manager
     async def _post_message(
         self,
@@ -121,10 +119,7 @@ class SlackNotifier(BaseNotifier):
             f"*Study:* {study_name}\n"
             f"*Idea:* {idea_trunc}"
         )
-        resp = await self._post_message(msg)
-        ts = resp.get("ts", "")
-        if ts and self._cm:
-            await self._cm.store_bot_message(ts, msg)
+        await self._post_message(msg)
     async def notify_sprint_completed(
         self,
@@ -140,11 +135,7 @@ class SlackNotifier(BaseNotifier):
             f"*Study:* {study_name}\n"
             f"*Summary:* {summary_trunc}"
         )
-        resp = await self._post_message(msg)
-        # Store the notification for thread context.
-        ts = resp.get("ts", "")
-        if ts and self._cm:
-            await self._cm.store_bot_message(ts, msg)
+        await self._post_message(msg)
         if pdf_path:
             await self._upload_file(
                 pdf_path,
@@ -162,10 +153,7 @@ class SlackNotifier(BaseNotifier):
         msg = (
             f":x: Sprint *{link}* failed\n*Study:* {study_name}\n*Error:* {error[:500]}"
         )
-        resp = await self._post_message(msg)
-        ts = resp.get("ts", "")
-        if ts and self._cm:
-            await self._cm.store_bot_message(ts, msg)
+        await self._post_message(msg)
 def verify_slack_signature(

{researchloop-0.2.0 → researchloop-0.3.1}/researchloop/core/orchestrator.py RENAMED

@@ -15,7 +15,6 @@ from fastapi.responses import JSONResponse
 from researchloop.clusters.monitor import JobMonitor
 from researchloop.clusters.ssh import SSHManager
-from researchloop.comms.conversation import ConversationManager
 from researchloop.comms.ntfy import NtfyNotifier
 from researchloop.comms.router import NotificationRouter
 from researchloop.comms.slack import (
@@ -51,7 +50,6 @@ class Orchestrator:
         self.auto_loop: AutoLoopController | None = None
         self.notification_router: NotificationRouter | None = None
         self.job_monitor: JobMonitor | None = None
-        self.conversation_manager: ConversationManager | None = None
     # ------------------------------------------------------------------
     # Lifecycle
@@ -108,24 +106,15 @@ class Orchestrator:
             notification_router=self.notification_router,
         )
-        # 6b. Conversation manager
-        self.conversation_manager = ConversationManager(
-            self.db, sprint_manager=self.sprint_manager
-        )
-        # Wire conversation manager to Slack notifier
-        # so notifications store thread context.
-        if self.config.slack and self.config.slack.bot_token:
-            for n in self.notification_router._notifiers:
-                if isinstance(n, SlackNotifier):
-                    n._cm = self.conversation_manager
         # 7. Auto-loop controller
         self.auto_loop = AutoLoopController(
             db=self.db,
             sprint_manager=self.sprint_manager,
             config=self.config,
         )
+        # Late-bind the back-reference so SprintManager.mark_sprint_terminal
+        # can advance the parent loop on every terminal transition.
+        self.sprint_manager.auto_loop = self.auto_loop
         # 8. Job monitor
         self.job_monitor = JobMonitor(
@@ -133,6 +122,7 @@ class Orchestrator:
             db=self.db,
             schedulers=self.schedulers,
             config=self.config,
+            sprint_manager=self.sprint_manager,
         )
         await self.job_monitor.start_polling()
@@ -419,6 +409,10 @@ def create_app(orchestrator: Orchestrator) -> FastAPI:
                 {"ok": True, "sprint_id": sprint_id, "tweak_id": tweak_id}
             )
+        # handle_completion fires auto_loop.on_sprint_complete internally
+        # via mark_sprint_terminal — single chokepoint for terminal-state
+        # transitions, so the loop also advances when the JobMonitor or a
+        # dashboard refresh is the one that detects the terminal status.
         await orchestrator.sprint_manager.handle_completion(
             sprint_id=sprint_id,
             status=status,
@@ -427,10 +421,6 @@ def create_app(orchestrator: Orchestrator) -> FastAPI:
             idea=idea,
         )
-        # Trigger auto-loop advancement if applicable.
-        if orchestrator.auto_loop is not None:
-            await orchestrator.auto_loop.on_sprint_complete(sprint_id)
         logger.info(
             "Webhook: sprint %s completion processed (status=%s)",
             sprint_id,
@@ -790,33 +780,6 @@ def create_app(orchestrator: Orchestrator) -> FastAPI:
             return
         text_lower = text.lower().strip()
-        # Handle "auth status" / "login" commands
-        if any(kw in text_lower for kw in ("auth status", "auth check", "login")):
-            if slack_cfg and slack_cfg.bot_token:
-                from researchloop.core.auth import (
-                    check_claude_auth_async,
-                )
-                ok, detail = await check_claude_auth_async()
-                notifier = SlackNotifier(
-                    bot_token=slack_cfg.bot_token,
-                    channel_id=channel,
-                )
-                if ok:
-                    msg = (
-                        ":white_check_mark: Claude is"
-                        f" authenticated on this server ({detail})."
-                    )
-                else:
-                    msg = (
-                        ":information_source: Claude is not"
-                        " authenticated on this server"
-                        " (not required — AI runs on the"
-                        " HPC cluster)."
-                    )
-                await notifier._post_message(msg, thread_ts=thread_ts)
-            return
         # Handle "help" command.
         if text_lower == "help":
             if slack_cfg and slack_cfg.bot_token:
@@ -831,7 +794,6 @@ def create_app(orchestrator: Orchestrator) -> FastAPI:
                     "• `sprint list` — list recent sprints\n"
                     "• `loop start <study> <count>`"
                     " — start an auto-loop\n"
-                    "• `auth status` — check Claude auth\n"
                     "• `help` — show this message",
                     thread_ts=thread_ts,
                 )
@@ -894,29 +856,16 @@ def create_app(orchestrator: Orchestrator) -> FastAPI:
                     )
                 return
-        # Free-form chat — pass to Claude via ConversationManager.
-        cm = orchestrator.conversation_manager
-        if cm is not None and slack_cfg and slack_cfg.bot_token:
+        # Unrecognized message — point user at the help command.
+        if slack_cfg and slack_cfg.bot_token:
             notifier = SlackNotifier(
                 bot_token=slack_cfg.bot_token,
                 channel_id=channel,
             )
-            try:
-                response_text = await cm.handle_message(
-                    thread_ts=thread_ts,
-                    user_text=text,
-                    channel=channel,
-                    bot_token=slack_cfg.bot_token if slack_cfg else None,
-                )
-                await notifier._post_message(response_text, thread_ts=thread_ts)
-            except Exception as exc:
-                logger.exception("Chat handler failed: %s", exc)
-                await notifier._post_message(
-                    "Sorry, something went wrong. Try `help` for available commands.",
-                    thread_ts=thread_ts,
-                )
+            await notifier._post_message(
+                "Sorry, I didn't understand that. Try `help` for available commands.",
+                thread_ts=thread_ts,
+            )
         return
     # -- Dashboard HTML routes -----------------------------------------
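
With free-form chat removed, the Slack handler in 0.3.1 recognizes only `sprint run`, `sprint list`, `loop start`, and `help`, and answers anything else with a pointer to `help`. The parsing code for those commands is not shown in this diff, so the following is a rough sketch of one way to dispatch them. `Command` and `parse_command` are hypothetical names, and the use of `shlex` for quoted ideas is an assumption rather than the package's approach.

```python
import shlex
from dataclasses import dataclass, field


@dataclass
class Command:
    """Hypothetical parsed form of a Slack message."""

    name: str
    args: dict = field(default_factory=dict)


def parse_command(text: str) -> Command:
    """Map a message onto one of the four structured commands, or 'unknown'."""
    try:
        tokens = shlex.split(text.strip())
    except ValueError:
        # Unbalanced quotes, e.g. an apostrophe in free-form text.
        return Command("unknown")
    if not tokens:
        return Command("unknown")

    lowered = [t.lower() for t in tokens]
    if lowered == ["help"]:
        return Command("help")
    if lowered == ["sprint", "list"]:
        return Command("sprint_list")
    if lowered[:2] == ["sprint", "run"] and len(tokens) >= 4:
        # Accept both a quoted idea and an unquoted multi-word idea.
        return Command(
            "sprint_run", {"study": tokens[2], "idea": " ".join(tokens[3:])}
        )
    if lowered[:2] == ["loop", "start"] and len(tokens) == 4 and tokens[3].isdigit():
        return Command("loop_start", {"study": tokens[2], "count": int(tokens[3])})
    # Anything else gets the "Try `help`" reply in the handler.
    return Command("unknown")


print(parse_command('sprint run my-project "investigate feature X under condition Y"'))
print(parse_command("loop start my-project 5"))
print(parse_command("what's next?"))
```

Returning an explicit `unknown` command, rather than raising, mirrors the new fallback branch: the handler can always reply in the thread, even for messages it cannot parse.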
