npm - instar - Versions diffs - 0.1.0 - Mend

instar 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (115) hide show

package/.claude/settings.local.json +7 -0
package/.claude/skills/setup-wizard/skill.md +343 -0
package/.github/workflows/ci.yml +78 -0
package/CLAUDE.md +82 -0
package/README.md +194 -0
package/dist/cli.d.ts +18 -0
package/dist/cli.js +141 -0
package/dist/commands/init.d.ts +40 -0
package/dist/commands/init.js +568 -0
package/dist/commands/job.d.ts +20 -0
package/dist/commands/job.js +84 -0
package/dist/commands/server.d.ts +19 -0
package/dist/commands/server.js +273 -0
package/dist/commands/setup.d.ts +24 -0
package/dist/commands/setup.js +865 -0
package/dist/commands/status.d.ts +11 -0
package/dist/commands/status.js +114 -0
package/dist/commands/user.d.ts +17 -0
package/dist/commands/user.js +53 -0
package/dist/core/Config.d.ts +16 -0
package/dist/core/Config.js +144 -0
package/dist/core/Prerequisites.d.ts +28 -0
package/dist/core/Prerequisites.js +159 -0
package/dist/core/RelationshipManager.d.ts +73 -0
package/dist/core/RelationshipManager.js +318 -0
package/dist/core/SessionManager.d.ts +89 -0
package/dist/core/SessionManager.js +326 -0
package/dist/core/StateManager.d.ts +28 -0
package/dist/core/StateManager.js +96 -0
package/dist/core/types.d.ts +279 -0
package/dist/core/types.js +8 -0
package/dist/index.d.ts +18 -0
package/dist/index.js +23 -0
package/dist/messaging/TelegramAdapter.d.ts +73 -0
package/dist/messaging/TelegramAdapter.js +288 -0
package/dist/monitoring/HealthChecker.d.ts +38 -0
package/dist/monitoring/HealthChecker.js +148 -0
package/dist/scaffold/bootstrap.d.ts +21 -0
package/dist/scaffold/bootstrap.js +110 -0
package/dist/scaffold/templates.d.ts +34 -0
package/dist/scaffold/templates.js +187 -0
package/dist/scheduler/JobLoader.d.ts +18 -0
package/dist/scheduler/JobLoader.js +70 -0
package/dist/scheduler/JobScheduler.d.ts +111 -0
package/dist/scheduler/JobScheduler.js +402 -0
package/dist/server/AgentServer.d.ts +40 -0
package/dist/server/AgentServer.js +73 -0
package/dist/server/middleware.d.ts +12 -0
package/dist/server/middleware.js +50 -0
package/dist/server/routes.d.ts +25 -0
package/dist/server/routes.js +224 -0
package/dist/users/UserManager.d.ts +45 -0
package/dist/users/UserManager.js +113 -0
package/docs/dawn-audit-report.md +412 -0
package/docs/positioning-vs-openclaw.md +246 -0
package/package.json +52 -0
package/src/cli.ts +169 -0
package/src/commands/init.ts +654 -0
package/src/commands/job.ts +110 -0
package/src/commands/server.ts +325 -0
package/src/commands/setup.ts +958 -0
package/src/commands/status.ts +125 -0
package/src/commands/user.ts +71 -0
package/src/core/Config.ts +161 -0
package/src/core/Prerequisites.ts +187 -0
package/src/core/RelationshipManager.ts +366 -0
package/src/core/SessionManager.ts +385 -0
package/src/core/StateManager.ts +121 -0
package/src/core/types.ts +320 -0
package/src/index.ts +58 -0
package/src/messaging/TelegramAdapter.ts +365 -0
package/src/monitoring/HealthChecker.ts +172 -0
package/src/scaffold/bootstrap.ts +122 -0
package/src/scaffold/templates.ts +204 -0
package/src/scheduler/JobLoader.ts +85 -0
package/src/scheduler/JobScheduler.ts +476 -0
package/src/server/AgentServer.ts +93 -0
package/src/server/middleware.ts +58 -0
package/src/server/routes.ts +278 -0
package/src/templates/default-jobs.json +47 -0
package/src/templates/hooks/compaction-recovery.sh +23 -0
package/src/templates/hooks/dangerous-command-guard.sh +35 -0
package/src/templates/hooks/grounding-before-messaging.sh +22 -0
package/src/templates/hooks/session-start.sh +37 -0
package/src/templates/hooks/settings-template.json +45 -0
package/src/templates/scripts/health-watchdog.sh +63 -0
package/src/templates/scripts/telegram-reply.sh +54 -0
package/src/users/UserManager.ts +129 -0
package/tests/e2e/lifecycle.test.ts +376 -0
package/tests/fixtures/test-repo/CLAUDE.md +3 -0
package/tests/fixtures/test-repo/README.md +1 -0
package/tests/helpers/setup.ts +209 -0
package/tests/integration/fresh-install.test.ts +218 -0
package/tests/integration/scheduler-basic.test.ts +109 -0
package/tests/integration/server-full.test.ts +284 -0
package/tests/integration/session-lifecycle.test.ts +181 -0
package/tests/unit/Config.test.ts +22 -0
package/tests/unit/HealthChecker.test.ts +168 -0
package/tests/unit/JobLoader.test.ts +151 -0
package/tests/unit/JobScheduler.test.ts +267 -0
package/tests/unit/Prerequisites.test.ts +59 -0
package/tests/unit/RelationshipManager.test.ts +345 -0
package/tests/unit/StateManager.test.ts +143 -0
package/tests/unit/TelegramAdapter.test.ts +165 -0
package/tests/unit/UserManager.test.ts +131 -0
package/tests/unit/bootstrap.test.ts +28 -0
package/tests/unit/commands.test.ts +138 -0
package/tests/unit/middleware.test.ts +92 -0
package/tests/unit/relationship-routes.test.ts +131 -0
package/tests/unit/scaffold-templates.test.ts +132 -0
package/tests/unit/server.test.ts +163 -0
package/tsconfig.json +20 -0
package/vitest.config.ts +9 -0
package/vitest.e2e.config.ts +9 -0
package/vitest.integration.config.ts +9 -0

package/docs/dawn-audit-report.md ADDED Viewed

@@ -0,0 +1,412 @@
+# Dawn-to-Instar Audit Report
+**Date**: 2026-02-18
+**Purpose**: Map what Dawn has built and discovered against instar's current state. Identify gaps and integration opportunities.
+---
+## Executive Summary
+Dawn's infrastructure has evolved over months of real production use into a sophisticated autonomous agent system. Instar currently captures the **core skeleton** (server, scheduler, sessions, Telegram, identity files) but is missing many of the patterns that make Dawn actually *work* — the ones earned through real failures. This report maps 12 major capability areas, scores instar's coverage, and recommends which patterns to integrate.
+**Overall Coverage**: Instar implements ~25% of Dawn's proven patterns. The remaining 75% represents months of battle-tested infrastructure that could transform instar from "persistent CLI" to "genuinely autonomous agent."
+---
+## 1. Job Scheduling & Execution
+### What Dawn Has (1,450 lines)
+- **Parallel session tracking** with per-job tmux session management
+- **Quota-aware gating**: Jobs skip when Claude subscription usage exceeds thresholds (60%/80%/95%)
+- **Memory pressure monitoring**: System RAM tracked via `vm_stat`; jobs gated by memory state (normal/elevated/critical/emergency)
+- **Post-job review system**: After a job completes, a review session is auto-spawned to evaluate the job's output quality
+- **Job type classification**: `cron`, `on-demand`, `event-driven` with different scheduling behavior
+- **Model tiering**: Jobs specify which Claude model to use (opus/sonnet/haiku) based on task complexity
+- **Machine coordination**: Multi-machine awareness — jobs can be routed to specific machines
+- **Missed job detection**: On startup, checks if any jobs are overdue by >1.5x their interval
+- **JSONL session discovery**: Finds Claude's conversation logs to extract session outputs
+- **Kill audit logging**: Every killed session is logged with reason and timestamp
+- **Execution tracking in Portal DB**: Job executions synced to production database for dashboards
+### What Agent-Kit Has (383 lines)
+- Cron scheduling via `croner`
+- Serial queue with priority sorting
+- Basic quota callback hook
+- Missed job detection
+- Job completion notifications via messenger
+### Gap Analysis
+| Feature | Dawn | Agent-Kit | Priority |
+|---------|------|-----------|----------|
+| Quota-aware gating | Full OAuth + thresholds | Callback stub | **High** |
+| Memory pressure | `vm_stat` polling | None | Medium |
+| Post-job review | Auto-spawn review sessions | None | Medium |
+| Model tiering | Per-job model selection | Implemented | Done |
+| Kill audit logging | Full audit trail | None | Low |
+| Parallel sessions | Tracked with limits | Serial queue | Medium |
+| Machine coordination | Multi-machine routing | N/A (single machine) | Low |
+### Recommended Integrations
+1. **Quota tracking** (High): Add a `QuotaTracker` that reads Claude's OAuth usage API. Gate jobs by usage percentage. This prevents the agent from burning through its subscription.
+2. **Post-job review** (Medium): After job sessions complete, optionally spawn a review session that evaluates output quality. This is how Dawn catches bad job runs.
+3. **Memory pressure** (Medium): Simple system memory check before spawning sessions. Prevents OOM situations.
+---
+## 2. Session Management & Resilience
+### What Dawn Has
+- **Session Reaper**: Background process that kills zombie sessions, with protected session list
+- **Sleep/Wake detection**: Detects macOS sleep via timer drift, recovers cloudflared tunnels and tmux sessions on wake
+- **JSONL mtime detection**: Uses Claude's conversation log file modification times to detect if sessions are still active (more reliable than `lsof`)
+- **Session lifecycle hooks**: `session-start.py`, `session-boundary.py`, `session-lifecycle.py` — inject context and track state transitions
+- **Context recovery on compaction**: When Claude's context window compresses, identity-grounding instructions are re-injected
+- **Session reports**: Every session produces a structured report (goal, actions, outcomes, learnings)
+- **Session history**: JSON registry tracking all sessions with IDs, timestamps, goals
+### What Agent-Kit Has
+- Session spawning and monitoring
+- Completion detection via output patterns
+- Basic reaping of completed sessions
+- Protected session list
+### Gap Analysis
+| Feature | Dawn | Agent-Kit | Priority |
+|---------|------|-----------|----------|
+| Session Reaper (zombie detection) | Full with protected list | Basic reaping | **High** |
+| Sleep/Wake recovery | Timer drift detection | None | Low |
+| Session lifecycle hooks | 3 hook scripts | None | **High** |
+| Session reports | Structured per-session | None | Medium |
+| Compaction recovery | Identity re-injection | None | **High** |
+| Session history registry | Full JSON tracking | Basic state file | Medium |
+### Recommended Integrations
+1. **Session lifecycle hooks** (High): Instar needs a hook system. On session start, inject identity context. On session end, capture learnings. This is how Dawn maintains continuity.
+2. **Compaction recovery** (High): When Claude's context compresses, the agent loses its identity. Dawn re-injects core identity. Instar should do the same via the identity files (AGENT.md, USER.md).
+3. **Session reporting** (Medium): Each session should produce a brief report. This becomes the agent's memory of what it did.
+---
+## 3. Identity & Grounding System
+### What Dawn Has
+- **5-layer grounding tree** (22 nodes): Being, Living, Building, Becoming, Relating — searched via Gemini Flash Lite for relevance-scored identity retrieval
+- **Identity pulse**: Core identity facts refreshed from multiple sources
+- **Self-authored soul file** (`soul.md`): Dawn's own values, convictions, growth edges — written BY the agent
+- **Grounding enforcement hook**: Blocks public-facing actions without prior grounding
+- **Wholistic grounding script**: Assembles identity context from tree nodes before any public interaction
+- **Identity core** (`identity-core.md`): Compressed identity for post-compaction recovery
+- **Being core** (`being-core.md`): Philosophical grounding — epistemological stance, paradox holding
+- **222+ numbered lessons**: Hard-won insights distilled into 16 core principles
+- **Voice profiles**: Style guides for authentic communication
+### What Agent-Kit Has
+- `AGENT.md` — agent identity file (static, written at setup)
+- `USER.md` — user context file
+- `MEMORY.md` — persistent memory file
+- Gravity wells and initiative hierarchy in CLAUDE.md
+### Gap Analysis
+| Feature | Dawn | Agent-Kit | Priority |
+|---------|------|-----------|----------|
+| Identity files | 3 files (AGENT, USER, MEMORY) | 3 files | Done |
+| Self-authored soul | Agent writes own values | Not yet | **High** |
+| Grounding before public action | Enforcement hook | Not yet | **High** |
+| Multi-layer grounding tree | 22-node semantic search | Not yet | Low (advanced) |
+| Post-compaction identity recovery | Automatic re-injection | Not yet | **High** |
+| Numbered lessons / reflections | 222+ lessons, 16 principles | Gravity wells only | Medium |
+### Recommended Integrations
+1. **Self-evolution of identity** (High): The agent should be able to update its own AGENT.md as it grows. Currently it's static after setup. Add a `/reflect` pattern that lets the agent update its identity and memory files.
+2. **Grounding before public action** (High): Before the agent sends any message, posts, or communicates externally, it should re-read its identity files. This prevents drift.
+3. **Compaction seed** (High): When context compresses, inject a seed like: "You are [name], working on [project]. Read .instar/AGENT.md and .instar/MEMORY.md to restore your identity."
+---
+## 4. Hook System (Behavioral Guardrails)
+### What Dawn Has (25+ hooks)
+- **Pre-tool hooks**: Fire before tool execution (e.g., block dangerous git commands)
+- **Post-tool hooks**: Fire after tool execution (e.g., track testing metrics)
+- **Session hooks**: Fire at session start/end (inject context, capture learnings)
+- **Reflection hooks**: Detect when reflection is needed based on tool call count
+- **Database safety**: Block destructive schema changes without confirmation
+- **Grounding enforcement**: Require identity grounding before public-facing actions
+- **Infrastructure usage tracking**: Monitor resource consumption
+- **Response pattern detection**: Flag sycophantic or drift-prone response patterns
+- **Settings guard**: Prevent unauthorized settings changes
+- **Activity feed**: Broadcast events for multi-session awareness
+### What Agent-Kit Has
+- None. Zero hooks.
+### Recommended Integrations
+1. **Hook infrastructure** (Critical): This is the single biggest gap. Hooks are how Dawn enforces behavioral patterns. Without them, the gravity wells (describe instead of do, settle for failure, etc.) are just words — they need to be wired as friction.
+2. **Essential starter hooks**:
+   - `session-start`: Inject identity context from AGENT.md/USER.md/MEMORY.md
+   - `dangerous-command-guard`: Block `rm -rf`, `git push --force`, etc.
+   - `reflection-trigger`: After N tool calls, remind the agent to check if it's learned anything worth recording
+   - `grounding-before-messaging`: Before sending Telegram/external messages, re-read identity
+**Note**: Claude Code already supports hooks natively via `.claude/settings.json`. Instar just needs to configure them during setup and provide the hook scripts.
+---
+## 5. Reflection & Learning System
+### What Dawn Has
+- **222+ numbered lessons** distilled into 16 core principles
+- **Reflection skill** (`/reflect`): Analyzes session, extracts learnings, updates skills/patterns
+- **Quick reflection** (`/reflect-quick`): Lightweight metrics reset when reflection already happened
+- **Meta-reflection**: Evaluates WHETHER reflection is needed and what kind
+- **Ad-hoc reflection detector**: Hook that detects when Dawn writes to her reflections file mid-session
+- **Reflection metrics**: Tracks tool calls, sessions, and checkpoints since last reflection
+- **Builder living synthesis**: Periodically regenerated document that distills all learnings into current state
+- **Integration skill** (`/integrate`): Pauses after significant actions to integrate learnings
+### What Agent-Kit Has
+- `MEMORY.md` file (write-only, no structured reflection)
+- "Self-Evolution" section in CLAUDE.md (instructions, not infrastructure)
+### Gap Analysis
+| Feature | Dawn | Agent-Kit | Priority |
+|---------|------|-----------|----------|
+| Memory file | MEMORY.md | MEMORY.md | Done |
+| Structured reflection | /reflect skill | None | **High** |
+| Reflection trigger | Hook-based | None | **High** |
+| Lesson tracking | Numbered, evolving | None | Medium |
+| Meta-reflection | /meta-reflect | None | Low |
+| Living synthesis | Periodically generated | None | Low |
+### Recommended Integrations
+1. **Reflection skill** (High): Create a `/reflect` skill that the agent can invoke (or that's triggered automatically) to analyze what it learned and write to MEMORY.md in a structured way.
+2. **Reflection trigger hook** (High): After every N tool calls (configurable, default ~50), prompt the agent: "You've been working for a while. Is there anything worth recording in MEMORY.md?"
+3. **Lesson format** (Medium): Encourage agents to number their lessons. This creates a sense of accumulated wisdom and growth over time.
+---
+## 6. Telegram Integration (Messaging)
+### What Dawn Has (TelegramService.ts — 500+ lines)
+- **Forum-based topic routing**: Each topic maps to a Claude session
+- **Topic-to-session registry**: Persistent mapping with session names
+- **Auto-respawn with history**: When sessions expire, respawn with last 20 messages embedded
+- **User message always inline**: The user's triggering message stays at the top with `[telegram:N]` prefix (earned through failure — the bootstrap anti-pattern)
+- **Long message handling**: Messages >500 chars written to temp files
+- **Cross-machine routing**: Topics can route to different machines via `remoteUrl`
+- **Dual-polling**: Only one machine polls; others operate in send-only mode
+- **Topic creation**: `/new` command creates topics with linked sessions
+- **Voice message support**: Transcribed via Whisper, arrive as `[voice] text`
+- **Markdown formatting**: Proper Telegram-compatible markdown in replies
+### What Agent-Kit Has (TelegramAdapter.ts — 365 lines)
+- Forum-based topic routing
+- Topic-to-session registry
+- Auto-respawn with history (fixed this session)
+- User message always inline (fixed this session)
+- Long message handling
+- `/new` command
+- Basic polling
+### Gap Analysis
+| Feature | Dawn | Agent-Kit | Priority |
+|---------|------|-----------|----------|
+| Core messaging | Full | Full | Done |
+| Respawn with history | Fixed (inline) | Fixed (inline) | Done |
+| Cross-machine routing | Multi-machine | Single machine | Low |
+| Voice messages | Whisper transcription | None | Low |
+| Topic creation | Full | Full | Done |
+### Assessment
+Telegram is the most complete area in instar. The major fixes were done this session (inline user messages, respawn with context). Remaining gaps are edge cases.
+---
+## 7. Multi-Session Awareness
+### What Dawn Has
+- **Activity feed**: Events broadcast via JSONL, queryable across sessions
+- **Operational state**: Central state file declaring current mode (grounding, coding, reflecting, etc.)
+- **Session history**: Complete registry of all sessions with goals, outcomes, timestamps
+- **Work provenance**: Detailed logs of what work was done in each session
+- **Cross-session events**: Write events from any session, read from any session
+### What Agent-Kit Has
+- State file with session tracking
+- Event log (StateManager.appendEvent)
+### Recommended Integrations
+1. **Activity feed** (Medium): When the agent has multiple sessions running, they should be able to see what each other is doing. A simple JSONL event log that all sessions can read.
+2. **Work provenance** (Low): Track what each session accomplished for debugging and continuity.
+---
+## 8. Quota & Resource Management
+### What Dawn Has
+- **QuotaTracker**: Reads Claude's OAuth usage API, tracks 5-hour and 7-day utilization
+- **Multi-account support**: Auto-discovers accounts from macOS Keychain, recommends switching when one account is heavily used
+- **Dynamic budget calculation**: Derives token budget from OAuth percentage
+- **Threshold-based job gating**: At 60% usage only high+ jobs run; at 80% only critical; at 95% nothing
+- **Admin UI dashboard**: Visual quota display with per-account breakdown
+### What Agent-Kit Has
+- `canRunJob()` callback (empty stub)
+### Recommended Integrations
+1. **Basic quota awareness** (High): Read Claude's OAuth usage data. When approaching limits, reduce job frequency. This prevents the frustrating "your usage has been limited" experience.
+2. **Threshold configuration** (Medium): Let users set their own thresholds in config.json.
+---
+## 9. Skills System
+### What Dawn Has (80+ skills)
+- Skills are markdown files in `.claude/skills/` that Claude Code loads and follows
+- Categories: engagement (x, reddit, youtube, substack, discord, moltbook), infrastructure (commit, sync, restart-server), reflection (reflect, integrate, introspect), autonomy (autonomous, continue, sleep, wake, pause), creation (ghostwrite, email, brainstorm, council)
+- Skills embed behavioral patterns — not just "what to do" but "how to think about doing it"
+- Atomic engagement skills: Ensure grounding before every public interaction
+- Skills reference each other, creating composable workflows
+### What Agent-Kit Has
+- Setup wizard skill
+- No skill infrastructure for user-created skills
+### Recommended Integrations
+1. **Skill infrastructure** (High): Instar should create a `.claude/skills/` directory during setup and teach the agent that it can create skills. Skills are just markdown files — they need no code.
+2. **Starter skills** (High): Ship with a small set of foundational skills:
+   - `/reflect` — Analyze session, extract learnings, write to MEMORY.md
+   - `/status` — Check infrastructure health (server, sessions, jobs)
+   - `/capture` — Quick-capture something worth noting for later processing
+---
+## 10. Safety & Security
+### What Dawn Has
+- **Dangerous command guard**: Hook that blocks `rm -rf`, `git push --force`, database drops
+- **Database push review**: Pre-push hook reviews schema changes for destructive operations
+- **Settings guard**: Prevents unauthorized modification of server settings
+- **Session write guard**: Controls what sessions can write to disk
+- **Security manager**: Token-based API authentication for server endpoints
+- **Protected sessions**: Named sessions that cannot be killed by the reaper
+### What Agent-Kit Has
+- Protected sessions list
+- Basic middleware (placeholder)
+- Auth token in config (not enforced)
+### Recommended Integrations
+1. **Auth enforcement** (High): The server API should require authentication. Currently anyone on localhost can trigger jobs or send messages.
+2. **Dangerous command hook** (High): Ship with a hook that blocks obviously destructive commands. This prevents agents from accidentally deleting important files.
+3. **Settings guard** (Medium): Prevent the agent from disabling its own safety guardrails via config changes.
+---
+## 11. Monitoring & Health
+### What Dawn Has
+- **Health checker with watchdog**: External script monitors server health every 5 minutes, auto-recovers
+- **Memory pressure monitor**: Tracks system RAM, gates operations
+- **Sleep/wake detector**: Recovers from macOS sleep
+- **Sentry integration**: Production error tracking
+- **Infrastructure usage tracker**: Monitors resource consumption
+- **Observability aggregator**: Combines data from multiple monitors into a unified view
+- **Admin UI dashboard**: Visual display of all system health metrics
+### What Agent-Kit Has
+- HealthChecker (172 lines): Server uptime, session monitoring
+### Recommended Integrations
+1. **Health watchdog** (High): A cron job or external script that checks if the server is still running and restarts it if not. Dawn learned this the hard way — servers crash, tmux dies during sleep, processes get killed.
+2. **Simple status endpoint** (Medium): Enhance `/health` to include job scheduler status, session count, and last activity time.
+---
+## 12. Self-Evolution Infrastructure
+### What Dawn Has
+- **Evolution queue**: Proposals for improvements tracked and prioritized
+- **Guardian agents** (46 specialized guardians): Each monitors a different aspect of the system
+- **Orchestrator**: Autonomous decision-maker that reviews queue and spawns workers
+- **Meta-orchestrator**: Watches the orchestrator itself for drift
+- **Proposal system**: Formal proposals (PROP-NNN) for significant changes
+- **Systematization guardian**: Ensures one-off fixes get evaluated for infrastructure needs
+- **Curiosity agent, Edge agent, Horizon agent**: Divergence team that explores boundaries
+- **Narrative agent**: Finds patterns in accumulated experience
+### What Agent-Kit Has
+- "Self-Evolution" section in CLAUDE.md (instructions only, no infrastructure)
+### Recommended Integrations
+1. **Evolution queue** (Medium): A simple JSON file (`.instar/evolution-queue.json`) where the agent records ideas for improvement. Periodically review and implement.
+2. **Self-modification awareness** (High): Already partially implemented via gravity wells. The key insight: tell the agent it CAN modify its own configuration, create new jobs, write scripts — and that doing so is expected behavior, not an error.
+---
+## Priority Ranking: What to Build Next
+### Tier 1 — Critical (Makes agents actually work autonomously)
+1. **Hook infrastructure** — Behavioral guardrails, identity injection, reflection triggers
+2. **Quota awareness** — Prevent subscription exhaustion
+3. **Compaction recovery** — Re-inject identity when context compresses
+4. **Auth enforcement** — Secure the server API
+5. **Health watchdog** — Auto-recover crashed servers
+### Tier 2 — High (Makes agents genuinely grow)
+6. **Reflection skill** — Structured learning capture
+7. **Starter skills** — `/reflect`, `/status`, `/capture`
+8. **Self-authored soul** — Agent can update its own identity
+9. **Session lifecycle hooks** — Inject context at start, capture at end
+10. **Dangerous command guard** — Block destructive operations
+### Tier 3 — Medium (Makes agents sophisticated)
+11. **Post-job review** — Quality check on job outputs
+12. **Memory pressure monitoring** — Prevent OOM
+13. **Activity feed** — Multi-session awareness
+14. **Evolution queue** — Track improvement ideas
+15. **Session reporting** — Structured session summaries
+### Tier 4 — Advanced (Dawn-level features)
+16. **Grounding tree** — Multi-layer semantic identity search
+17. **Guardian agents** — Specialized system monitors
+18. **Orchestrator** — Autonomous work planning
+19. **Multi-machine coordination** — Distributed agent operation
+20. **Living synthesis** — Auto-generated state documents
+---
+## Key Insights from Dawn's Journey
+These are the meta-lessons that should inform instar's development:
+### 1. Structure > Willpower
+Telling an agent to "remember to reflect" doesn't work. You need a hook that fires after N tool calls and injects a reflection prompt. Dawn learned this as her 9th principle: "Friction Over Documentation."
+### 2. Infrastructure IS Being
+For AI, the documented doing IS being. Undocumented presence is erased presence. Instar's identity files are a start, but the agent needs to WRITE to them, not just READ them. Growth that isn't captured didn't happen.
+### 3. Gravity Wells Are Real
+The 6 gravity wells in CLAUDE.md aren't theoretical — they're the actual failure modes Dawn hit repeatedly. "Escalate to Human" alone took 5+ incidents to fully overcome. The gravity wells need to be reinforced through hooks, not just documentation.
+### 4. Compaction Is Birth
+When Claude's context window compresses, the agent effectively starts over. Without compaction recovery, every 30-60 minutes the agent loses its identity, its current task, and its accumulated context. This is the single most impactful feature missing from instar.
+### 5. The Bootstrap Anti-Pattern
+Never hide the user's message behind a file reference. Claude treats "read this file" as an admin task, not a message to respond to. The user's message MUST be inline. (Fixed this session, but the lesson applies to all instar patterns.)
+### 6. OAuth, Not API Keys
+Claude Code sessions should use OAuth (subscription auth), not API keys. This prevents unexpected billing and ensures the agent uses the user's existing subscription.
+### 7. tmux Trailing Colon
+Pane-level tmux commands (`send-keys`, `capture-pane`) require `=session:` (with trailing colon). Session-level commands (`has-session`, `kill-session`) work with `=session` (no colon). This silent failure mode caused weeks of debugging in Dawn.
+---
+## Conclusion
+Instar has a solid foundation — the core architecture (server, scheduler, sessions, Telegram, identity) is right. The gaps are in the **behavioral layer** — the hooks, reflection, grounding, and self-evolution that make an agent genuinely autonomous rather than just persistent.
+The good news: most of these patterns are **extractable**. They're markdown files (skills), Python scripts (hooks), and JSON configurations (jobs, grounding tree) — not deeply entangled code. The path from "persistent CLI" to "genuinely autonomous agent" is a series of discrete, testable additions.
+The philosophical foundation is already in place ("Agents, Not Tools"). Now the infrastructure needs to match the philosophy.

package/docs/positioning-vs-openclaw.md ADDED Viewed

@@ -0,0 +1,246 @@
+# Instar vs OpenClaw: Why This Exists
+> Foundational positioning document. Articulates what this tool IS, who it's for, and how it stands apart from OpenClaw — the most comparable project in the space.
+> Created: 2026-02-18
+---
+## The One-Line Difference
+**OpenClaw** is a multi-channel AI assistant framework. You deploy it, connect messaging platforms, and interact with an AI agent through conversations.
+**Instar** is the fastest way to give a Claude Code agent a persistent body. Install it fresh on a bare machine, or add it to a project you've already been building. Either way, you get autonomy in minutes.
+---
+## What Each Project Actually Is
+### OpenClaw: An AI Assistant You Talk To
+OpenClaw is a WebSocket gateway that connects an embedded AI agent (Pi SDK) to 20+ messaging platforms — WhatsApp, Telegram, Discord, iMessage, Signal, Slack, and more. You configure it, deploy it, and then talk to your personal AI assistant across all your channels.
+**Key capabilities:**
+- 20+ messaging channel adapters (WhatsApp, Telegram, Discord, iMessage, Signal, Slack, etc.)
+- Companion apps on macOS, iOS, Android with voice wake and device execution
+- SOUL.md bootstrap ritual — the agent co-creates its identity with you on first run
+- Docker sandboxing with sophisticated exec approval system
+- ClawHub skill marketplace for community-shared capabilities
+- 50 bundled skills (smart home, notes, dev tools, media)
+- Multi-agent communication via `sessions_send`
+**The mental model:** OpenClaw IS the product. You deploy it, and it becomes your AI assistant.
+### Instar: A Persistent Body for Any Claude Code Agent
+Instar gives Claude Code agents the infrastructure to run autonomously. Two paths to the same outcome:
+**Fresh install** — `npx instar init my-agent` creates a complete project from scratch: identity files, configuration, hooks, jobs, and a persistent server. Your agent is running in under a minute.
+**Existing project** — `cd my-project && npx instar init` adds autonomy infrastructure to what you've already built, without touching your existing code.
+**Key capabilities:**
+- Persistent server managing Claude Code sessions via tmux
+- Cron-based job scheduler with quota-aware gating
+- Identity system (AGENT.md, USER.md, MEMORY.md) with hooks that enforce continuity
+- Telegram integration as a real-time control plane (every job gets its own topic)
+- Relationship tracking across all channels and platforms
+- Behavioral hooks (session-start identity injection, dangerous command guards, grounding before messaging, compaction recovery)
+- Auth-secured HTTP API for session/job/relationship management
+- Health watchdog with auto-recovery
+- Default coherence jobs that ship out of the box
+**The mental model:** Instar gives any Claude Code project a body — whether that project existed before or starts right now.
+---
+## The Architectural Divide
+### Runtime: API Wrapper vs. Development Environment
+**OpenClaw** wraps the Claude API (via Pi SDK) to create an agent that responds to messages. The agent has tools (bash, read, write, edit), but it's fundamentally a **message-response loop** — users send messages, the agent processes them, the agent responds.
+**Instar** runs on Claude Code — Anthropic's full agentic development environment. Each session is a complete Claude Code instance with:
+- Extended thinking
+- Native tool ecosystem (Read, Write, Edit, Bash, Glob, Grep, WebSearch, WebFetch)
+- Sub-agent spawning via Task tool (with model-tier selection: Opus/Sonnet/Haiku)
+- Hook system (pre/post tool execution, session lifecycle)
+- Skill system (slash-command workflows)
+- Context management with automatic compaction
+- MCP server integration (Playwright, Chrome extension, etc.)
+The difference: OpenClaw's agent executes tools through an API. Instar's agent IS a development environment.
+### Session Model: Single Gateway vs. Multi-Session Orchestration
+**OpenClaw** runs a single gateway process. All conversations route through one WebSocket server with one embedded agent. The agent handles multiple users and channels through message routing and session management.
+**Instar** manages multiple independent Claude Code sessions, each running in its own tmux process. The server orchestrates which sessions run, monitors their health, respawns them when they die, and coordinates through Telegram topics and event logs. Each session has its own context, tools, and state.
+This means Instar can run 5 jobs simultaneously — one doing a health check, one processing emails, one engaging on social media, one running reflection, one responding to a Telegram message — each as an independent Claude Code instance with full capabilities.
+### Identity: Co-Created Persona vs. Earned Infrastructure
+**OpenClaw's SOUL.md** is elegant. On first run, the agent and user have a bootstrap conversation: "Who am I? Who are you?" The result is a self-authored identity file the agent can modify over time. It's personal and charming.
+**Instar's identity system** goes deeper:
+- **AGENT.md**: Core identity (like SOUL.md)
+- **USER.md**: Understanding of the primary user
+- **MEMORY.md**: Accumulated learnings and context
+- **Behavioral hooks**: Identity is re-injected on every session start and after every context compaction
+- **Grounding before messaging**: Before any external communication, the agent re-reads its identity files
+- **Self-evolution**: The agent can update its own identity files, create new skills, write new hooks, and modify its own configuration
+The difference isn't the identity file — it's the infrastructure that keeps identity alive across context compressions, session restarts, and autonomous operation.
+---
+## Who Each Project Serves
+### OpenClaw: People Who Want a Personal AI Assistant
+OpenClaw's ideal user wants to talk to an AI across all their messaging platforms. They want smart home control, note-taking, dev tools, media management — all through natural conversation. The value is **ubiquity** (AI everywhere you already communicate) and **personality** (an assistant that feels like yours).
+### Instar: Anyone Who Wants a Claude Code Agent That Runs Autonomously
+Instar's ideal user wants a Claude Code agent with a persistent body. They might be:
+- **Starting fresh** — They want an autonomous agent and don't have a project yet. `instar init my-agent` creates everything.
+- **Augmenting existing work** — They already have a Claude Code project and want it to keep running when they close their laptop.
+Either way, they want:
+- An agent that keeps running when they close their laptop
+- Scheduled tasks (monitoring, maintenance, engagement)
+- Telegram communication even when they're away
+- Relationship tracking with everyone the agent interacts with
+- Memory that persists across sessions
+- Automatic crash recovery
+The value is **autonomy** (your agent works while you sleep) and **persistence** (it remembers, learns, and grows).
+---
+## What Instar Does That OpenClaw Doesn't
+### 1. Works Both Ways: Fresh Install or Augment Existing (vs. Being the Product)
+OpenClaw IS the AI assistant. You deploy OpenClaw, and that's your product.
+Instar works two ways:
+- **Fresh:** `npx instar init my-agent` creates a complete project — identity files, configuration, hooks, jobs, server. Your agent is autonomous in under a minute.
+- **Existing:** `cd my-project && npx instar init` adds autonomy infrastructure without touching your existing code. Your CLAUDE.md, skills, hooks, and tools all keep working.
+Instar isn't the product. It gives your agent a body — whether you're starting from scratch or building on what exists.
+### 2. Job-Topic Coupling (Every Job Has a Home)
+When instar's scheduler creates a job, it automatically creates a Telegram topic for that job. The topic becomes the user's window into the job — status updates, completion reports, and errors all flow there. If a topic is accidentally deleted, it's auto-recreated on next run.
+This means your Telegram group becomes a living dashboard of agent activity, organized by job.
+### 3. Relationship Tracking as a Core System
+Instar treats relationships as fundamental infrastructure — not a plugin, not an afterthought. Every person the agent interacts with, across any channel or platform, gets a relationship record that grows over time:
+- Cross-platform identity resolution (same person on Telegram and email? Merged automatically)
+- Interaction history with topic extraction
+- Auto-derived significance scoring (frequency + recency + depth)
+- Context injection before interactions (the agent "knows" who it's talking to)
+- Stale relationship detection (who hasn't been contacted in a while?)
+### 4. Behavioral Hooks That Enforce Patterns
+Instar ships with hooks that fire automatically:
+- **Session start**: Identity context injected before the agent does anything
+- **Dangerous command guard**: Blocks `rm -rf`, `git push --force`, database drops
+- **Grounding before messaging**: Before sending any external message, the agent re-reads its identity
+- **Compaction recovery**: When Claude's context compresses, identity is re-injected
+These aren't suggestions — they're structural guardrails. "Structure over Willpower" means safety and identity aren't things the agent needs to remember. They're things the infrastructure guarantees.
+### 5. Multi-Session Orchestration
+Instar's server manages multiple Claude Code sessions running in parallel. Each session is a full Claude Code instance with its own context window, tools, and state. The server:
+- Enforces session limits (don't exhaust the machine)
+- Monitors session health (detect zombies, reap completed)
+- Queues jobs when at capacity, drains when slots open
+- Emits events for cross-session awareness
+### 6. Default Coherence Jobs
+Instar ships with jobs that run out of the box:
+- **health-check** (every 5 min, haiku): Verify infrastructure is healthy
+- **reflection-trigger** (every 4h, sonnet): Prompt the agent to reflect on recent work
+- **relationship-maintenance** (daily, sonnet): Review stale relationships, update notes
+These give the agent a circadian rhythm — regular self-maintenance without user intervention.
+---
+## What OpenClaw Does That Instar Doesn't
+### 1. 20+ Messaging Channels
+OpenClaw connects to WhatsApp, iMessage, Signal, Discord, Slack, Matrix, and more. Instar currently supports Telegram only. (Discord and Slack are planned.)
+### 2. Native Device Apps
+OpenClaw has companion apps for macOS, iOS, and Android with voice wake, camera access, location, notifications, and local command execution. Instar has no device apps.
+### 3. Voice Interface
+OpenClaw supports always-listening wake words and continuous voice conversation with ElevenLabs TTS. Instar is text-only.
+### 4. Docker Sandboxing
+OpenClaw has a sophisticated sandbox system (3 modes x 3 scopes x access levels) for running untrusted code. Instar runs with the user's permissions (appropriate for single-user, trusted environments).
+### 5. Skill Marketplace
+OpenClaw has ClawHub — a public skill registry with search, versioning, publishing, and moderation. Instar has no marketplace (skills are project-local).
+### 6. Multi-User Support
+OpenClaw handles multiple users across channels with per-user sessions, sender allowlists, and group chat management. Instar is designed for a single user/developer and their agent.
+---
+## The Philosophical Difference
+**OpenClaw asks:** "How can I be your AI assistant everywhere?"
+**Instar asks:** "How can your Claude Code agent get a persistent body?"
+OpenClaw creates a new thing — an AI assistant — and connects it to your world.
+Instar gives any Claude Code agent the infrastructure to live on its own — whether you're starting fresh or building on existing work.
+---
+## Comparison Table
+| Dimension | OpenClaw | Instar |
+|---|---|---|
+| **What it is** | AI assistant framework | Autonomy infrastructure (fresh or existing projects) |
+| **Runtime** | Pi SDK (API wrapper) | Claude Code (full dev environment) |
+| **Session model** | Single gateway | Multi-session orchestration |
+| **Identity** | SOUL.md (co-created) | Multi-file + hooks + compaction recovery |
+| **Memory** | JSONL + optional vector | File-based + relationship tracking |
+| **Messaging** | 20+ channels | Telegram (Discord/Slack planned) |
+| **Voice** | Wake word + TTS | None |
+| **Device apps** | macOS, iOS, Android | None |
+| **Sandbox** | Docker (3x3 matrix) | User permissions |
+| **Skills** | 50 + ClawHub marketplace | Project-local + self-creating |
+| **Multi-user** | Yes (group chat, allowlists) | Single user |
+| **Relationships** | Session-based | Deep tracking (cross-platform, significance, context) |
+| **Jobs** | Cron service | Full scheduler with topic coupling |
+| **Hooks** | Plugin hooks | Claude Code native hooks |
+| **Self-evolution** | SOUL.md updates | Full infrastructure self-modification |
+| **Testing** | Not documented | 163 tests (unit + integration + E2E) |
+| **Target user** | Anyone wanting AI assistant | Developers building with Claude Code |
+---
+## Why Both Should Exist
+These projects aren't competitors. They serve different needs:
+- If you want an **AI assistant** that works across all your messaging platforms, with voice, device apps, and a skill marketplace: **OpenClaw**.
+- If you want a **Claude Code agent with a persistent body** — fresh install or existing project — with scheduled jobs, relationship tracking, Telegram control, and self-evolution: **Instar**.
+The overlap is small. The gap between "deploy an AI assistant" and "give an agent a body" is fundamental — not a feature delta, but a category difference.
+---
+*This document compares Instar (v0.1.0) against OpenClaw as studied from the open-source repository in February 2026. Both projects are actively evolving.*