@openlife/cli 1.7.3
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/INSTALL.md +266 -0
- package/LICENSE +21 -0
- package/README.md +142 -0
- package/bin/openlife.js +3 -0
- package/dist/admin_panel_server.js +66 -0
- package/dist/cli/AgentManager.js +109 -0
- package/dist/cli/AutonomousInstaller.js +134 -0
- package/dist/cli/DreamOrganizer.js +88 -0
- package/dist/cli/HostInstaller.js +426 -0
- package/dist/cli/InstallBanner.js +16 -0
- package/dist/cli/InstallFlow.js +256 -0
- package/dist/cli/InstallHeadless.js +47 -0
- package/dist/cli/InstallModules.js +148 -0
- package/dist/cli/InstallStateStore.js +75 -0
- package/dist/cli/InstallWizard.js +364 -0
- package/dist/cli/ProfileManager.js +163 -0
- package/dist/cli/SystemInstaller.js +89 -0
- package/dist/cli/WorldClassCommands.js +208 -0
- package/dist/design/DesignMdImporter.js +82 -0
- package/dist/design/DesignMdMode.js +93 -0
- package/dist/design/DesignMdRegistry.js +67 -0
- package/dist/index.js +2575 -0
- package/dist/memory/ConversationMemory.js +33 -0
- package/dist/memory/LocalMemoryProvider.js +86 -0
- package/dist/memory/Mem0Provider.js +16 -0
- package/dist/memory/MemoryNamespacePolicy.js +27 -0
- package/dist/memory/MemoryOrchestrator.js +65 -0
- package/dist/memory/MemoryPromotionFlow.js +32 -0
- package/dist/memory/MemoryProvider.js +2 -0
- package/dist/memory/MemoryProviderRegistry.js +27 -0
- package/dist/memory/MemoryRetentionPolicy.js +60 -0
- package/dist/memory/MempalaceProvider.js +72 -0
- package/dist/memory/OmniMemory.js +106 -0
- package/dist/memory/RedisAgentMemoryProvider.js +16 -0
- package/dist/memory/SessionManager.js +86 -0
- package/dist/memory/ZepGraphitiProvider.js +16 -0
- package/dist/orchestrator/AgentRegistry.js +56 -0
- package/dist/orchestrator/AgentScoring.js +82 -0
- package/dist/orchestrator/AgentTeam.js +22 -0
- package/dist/orchestrator/ArbitrationAgent.js +43 -0
- package/dist/orchestrator/ArbitrationScorecard.js +17 -0
- package/dist/orchestrator/AssetPromotionEngine.js +65 -0
- package/dist/orchestrator/AssetReuseRouter.js +63 -0
- package/dist/orchestrator/BenchmarkEngine.js +75 -0
- package/dist/orchestrator/Brain.js +298 -0
- package/dist/orchestrator/CadenceEngine.js +76 -0
- package/dist/orchestrator/CapabilityRouter.js +36 -0
- package/dist/orchestrator/CommandLanguage.js +27 -0
- package/dist/orchestrator/CommandRouter.js +70 -0
- package/dist/orchestrator/ConsequenceForecaster.js +286 -0
- package/dist/orchestrator/CronManager.js +286 -0
- package/dist/orchestrator/DynamicAgentBuilder.js +48 -0
- package/dist/orchestrator/DynamicAgentExecutor.js +15 -0
- package/dist/orchestrator/EnterpriseAgenticCore.js +276 -0
- package/dist/orchestrator/ExecutionBoard.js +86 -0
- package/dist/orchestrator/ExecutionIntent.js +13 -0
- package/dist/orchestrator/ExecutionModePolicy.js +48 -0
- package/dist/orchestrator/ExecutionRouter.js +9 -0
- package/dist/orchestrator/ExecutionState.js +20 -0
- package/dist/orchestrator/ExecutorHealth.js +86 -0
- package/dist/orchestrator/ExternalCatalogRegistry.js +83 -0
- package/dist/orchestrator/Gatekeeper.js +414 -0
- package/dist/orchestrator/Gateway.js +508 -0
- package/dist/orchestrator/GovernanceConsentStore.js +66 -0
- package/dist/orchestrator/GovernanceLayer.js +179 -0
- package/dist/orchestrator/GovernancePolicyStore.js +53 -0
- package/dist/orchestrator/GovernanceScopeLedger.js +134 -0
- package/dist/orchestrator/GovernanceScopePolicy.js +67 -0
- package/dist/orchestrator/IntentClassifier.js +45 -0
- package/dist/orchestrator/JobLifecycle.js +91 -0
- package/dist/orchestrator/LearningRouter.js +24 -0
- package/dist/orchestrator/MediaManager.js +92 -0
- package/dist/orchestrator/MemoryCuratorAgent.js +41 -0
- package/dist/orchestrator/MissionState.js +155 -0
- package/dist/orchestrator/ModelManager.js +84 -0
- package/dist/orchestrator/OperatingSystem.js +71 -0
- package/dist/orchestrator/OperationalMemoryStore.js +94 -0
- package/dist/orchestrator/OptimizationLoop.js +72 -0
- package/dist/orchestrator/OrchestrationLoop.js +905 -0
- package/dist/orchestrator/OrgStructure.js +88 -0
- package/dist/orchestrator/OutcomeSimulator.js +46 -0
- package/dist/orchestrator/ParallelOrchestrationLoop.js +36 -0
- package/dist/orchestrator/PerformanceScorecard.js +105 -0
- package/dist/orchestrator/PlannerAgent.js +46 -0
- package/dist/orchestrator/ProcessSandbox.js +129 -0
- package/dist/orchestrator/PromotionPipeline.js +74 -0
- package/dist/orchestrator/PromotionReviewGate.js +11 -0
- package/dist/orchestrator/QueueScheduler.js +260 -0
- package/dist/orchestrator/ReleaseGate.js +36 -0
- package/dist/orchestrator/ReleaseWorkflow.js +68 -0
- package/dist/orchestrator/RemotePublisher.js +139 -0
- package/dist/orchestrator/ReuseEngine.js +89 -0
- package/dist/orchestrator/ReviewerAgent.js +49 -0
- package/dist/orchestrator/RoleHandoff.js +65 -0
- package/dist/orchestrator/RuntimeHealthMonitor.js +143 -0
- package/dist/orchestrator/RuntimePolicy.js +105 -0
- package/dist/orchestrator/RuntimeProbe.js +97 -0
- package/dist/orchestrator/RuntimeRegistry.js +73 -0
- package/dist/orchestrator/SandboxPolicy.js +22 -0
- package/dist/orchestrator/SecurityDownloadGuard.js +169 -0
- package/dist/orchestrator/SecurityEventStore.js +58 -0
- package/dist/orchestrator/ServiceCompletionPolicy.js +36 -0
- package/dist/orchestrator/ServiceState.js +195 -0
- package/dist/orchestrator/SkillCreator.js +404 -0
- package/dist/orchestrator/SkillLearningLoop.js +57 -0
- package/dist/orchestrator/SkillManager.js +75 -0
- package/dist/orchestrator/SkillNetwork.js +29 -0
- package/dist/orchestrator/SkillRegistryV2.js +28 -0
- package/dist/orchestrator/SkillScoring.js +70 -0
- package/dist/orchestrator/SquadAutoCreator.js +64 -0
- package/dist/orchestrator/SquadCreator.js +727 -0
- package/dist/orchestrator/SquadRegistry.js +28 -0
- package/dist/orchestrator/SquadRouter.js +33 -0
- package/dist/orchestrator/SquadScoring.js +70 -0
- package/dist/orchestrator/SubagentLifecycle.js +90 -0
- package/dist/orchestrator/SynthesizerAgent.js +48 -0
- package/dist/orchestrator/SystemDoctor.js +224 -0
- package/dist/orchestrator/TaskExecutor.js +422 -0
- package/dist/orchestrator/TeammateBoard.js +61 -0
- package/dist/orchestrator/TestHarness.js +184 -0
- package/dist/orchestrator/VoiceManager.js +203 -0
- package/dist/orchestrator/VoiceRouter.js +89 -0
- package/dist/orchestrator/capability/CapabilityGenesisEngine.js +278 -0
- package/dist/orchestrator/capability/CapabilityPackParser.js +223 -0
- package/dist/orchestrator/capability/CapabilityPackSchema.js +62 -0
- package/dist/orchestrator/capability/CapabilityPackState.js +163 -0
- package/dist/orchestrator/providers/AgentProvider.js +2 -0
- package/dist/orchestrator/providers/CapabilityProvider.js +12 -0
- package/dist/orchestrator/providers/CloudAgentProvider.js +55 -0
- package/dist/orchestrator/providers/CloudSkillProvider.js +55 -0
- package/dist/orchestrator/providers/CloudSquadProvider.js +55 -0
- package/dist/orchestrator/providers/CompositeAgentProvider.js +16 -0
- package/dist/orchestrator/providers/CompositeCapabilityProvider.js +25 -0
- package/dist/orchestrator/providers/CompositeSkillProvider.js +16 -0
- package/dist/orchestrator/providers/CompositeSquadProvider.js +16 -0
- package/dist/orchestrator/providers/CompositeWorkflowProvider.js +46 -0
- package/dist/orchestrator/providers/FileAgentProvider.js +105 -0
- package/dist/orchestrator/providers/FileCapabilityProvider.js +106 -0
- package/dist/orchestrator/providers/FileSkillProvider.js +65 -0
- package/dist/orchestrator/providers/FileSquadProvider.js +69 -0
- package/dist/orchestrator/providers/FileWorkflowProvider.js +103 -0
- package/dist/orchestrator/providers/SkillProvider.js +2 -0
- package/dist/orchestrator/providers/SquadProvider.js +2 -0
- package/dist/orchestrator/toolset/ToolsetGuard.js +69 -0
- package/dist/orchestrator/toolset/ToolsetRegistry.js +65 -0
- package/dist/orchestrator/toolset/ToolsetSchema.js +21 -0
- package/dist/orchestrator/util/AtomicWriter.js +204 -0
- package/dist/orchestrator/util/DistributedLock.js +232 -0
- package/dist/orchestrator/util/TemplateRenderer.js +87 -0
- package/dist/orchestrator/util/WatchdogHeartbeat.js +116 -0
- package/dist/orchestrator/workflow/ConditionParser.js +232 -0
- package/dist/orchestrator/workflow/WorkflowEngine.js +379 -0
- package/dist/orchestrator/workflow/WorkflowParser.js +368 -0
- package/dist/orchestrator/workflow/WorkflowSchema.js +65 -0
- package/dist/orchestrator/workflow/WorkflowState.js +11 -0
- package/dist/reversa/ReversaAgent.js +134 -0
- package/dist/reversa/ReversaContracts.js +62 -0
- package/dist/reversa/ReversaExecutors.js +65 -0
- package/dist/skills/SkillRegistry.js +71 -0
- package/dist/squads/SquadManager.js +87 -0
- package/dist/test_admin_teams_networks.js +54 -0
- package/dist/test_agent_team_skill_network.js +15 -0
- package/dist/test_aiobuilder_cli_parity.js +169 -0
- package/dist/test_ask_exit.js +73 -0
- package/dist/test_atomic_writer.js +209 -0
- package/dist/test_autonomous_soak.js +141 -0
- package/dist/test_benchmark_engine.js +41 -0
- package/dist/test_brain_error_diagnostics.js +51 -0
- package/dist/test_brain_fallback_chain.js +93 -0
- package/dist/test_capability_genesis_engine.js +225 -0
- package/dist/test_capability_pack_schema.js +214 -0
- package/dist/test_catalog_quality.js +150 -0
- package/dist/test_cli_crud_roundtrip.js +154 -0
- package/dist/test_cli_diagnostics.js +131 -0
- package/dist/test_cli_doc_parity.js +126 -0
- package/dist/test_cli_help_surface.js +106 -0
- package/dist/test_cli_service_commands.js +83 -0
- package/dist/test_consequence_forecast_brain.js +165 -0
- package/dist/test_consequence_forecaster.js +24 -0
- package/dist/test_conversation_memory.js +36 -0
- package/dist/test_create_entities.js +54 -0
- package/dist/test_creator_placeholders_completed.js +177 -0
- package/dist/test_cron_manager.js +123 -0
- package/dist/test_daemon_sigterm.js +72 -0
- package/dist/test_deep_research_capability.js +87 -0
- package/dist/test_designmd_import_registry.js +16 -0
- package/dist/test_designmd_mode.js +50 -0
- package/dist/test_designmd_mode_workspace.js +13 -0
- package/dist/test_dist_templates_layout.js +135 -0
- package/dist/test_distributed_lock.js +201 -0
- package/dist/test_distribution_installability.js +67 -0
- package/dist/test_doctor_sandbox_check.js +44 -0
- package/dist/test_dream_organizer.js +25 -0
- package/dist/test_dual_mode.js +15 -0
- package/dist/test_enterprise_agentic_core.js +128 -0
- package/dist/test_forecast_brain_wiring.js +87 -0
- package/dist/test_gateway_telegram_guardrails.js +52 -0
- package/dist/test_governance.js +34 -0
- package/dist/test_governance_advanced.js +75 -0
- package/dist/test_governance_scope_ledger.js +147 -0
- package/dist/test_governance_v13_policies.js +44 -0
- package/dist/test_guided_creator_cli.js +100 -0
- package/dist/test_host_install_e2e.js +324 -0
- package/dist/test_host_installer.js +259 -0
- package/dist/test_host_installers_gemini_codex.js +95 -0
- package/dist/test_host_uninstaller.js +295 -0
- package/dist/test_install_flow.js +70 -0
- package/dist/test_install_flow_host_validation.js +143 -0
- package/dist/test_install_wizard.js +272 -0
- package/dist/test_integration_gemini_live.js +95 -0
- package/dist/test_integration_http_trigger_live.js +154 -0
- package/dist/test_integration_telegram_live.js +102 -0
- package/dist/test_job_lifecycle.js +16 -0
- package/dist/test_memory_orchestrator.js +33 -0
- package/dist/test_memory_promotion.js +36 -0
- package/dist/test_memory_retention.js +37 -0
- package/dist/test_mission_checkpoint.js +204 -0
- package/dist/test_multi_host_docs_parity.js +125 -0
- package/dist/test_openlife_auto_creator_routing.js +69 -0
- package/dist/test_openlife_evolution_surface.js +77 -0
- package/dist/test_openlife_gatekeeper_routing.js +15 -0
- package/dist/test_openlife_routing_surface.js +27 -0
- package/dist/test_openlife_runtime_source_truth.js +25 -0
- package/dist/test_operating_system.js +45 -0
- package/dist/test_optimization_loop.js +38 -0
- package/dist/test_orchestration_assets_lifecycle.js +78 -0
- package/dist/test_outcome_simulator.js +38 -0
- package/dist/test_performance_latency.js +215 -0
- package/dist/test_performance_scorecard.js +38 -0
- package/dist/test_phase1_check_exit.js +103 -0
- package/dist/test_phase6_board.js +31 -0
- package/dist/test_phase6_cadence.js +29 -0
- package/dist/test_phase6_ops.js +37 -0
- package/dist/test_post_mission_evaluation.js +190 -0
- package/dist/test_process_sandbox.js +88 -0
- package/dist/test_profile_toolset_mcp.js +125 -0
- package/dist/test_queue_scheduler.js +239 -0
- package/dist/test_release_gate.js +23 -0
- package/dist/test_remote_publish.js +193 -0
- package/dist/test_reversa_contracts_e2e.js +48 -0
- package/dist/test_reversa_export_and_strict.js +51 -0
- package/dist/test_reversa_full_execution.js +12 -0
- package/dist/test_reversa_lite.js +9 -0
- package/dist/test_royal_stack_golden.js +179 -0
- package/dist/test_runtime_health_backoff.js +154 -0
- package/dist/test_runtime_policy.js +26 -0
- package/dist/test_runtime_probe.js +19 -0
- package/dist/test_runtime_profile_oauth_only.js +262 -0
- package/dist/test_runtime_registry.js +11 -0
- package/dist/test_security_download_and_scan.js +103 -0
- package/dist/test_security_download_guard.js +14 -0
- package/dist/test_service_command_surface.js +12 -0
- package/dist/test_service_completion_policy.js +32 -0
- package/dist/test_service_guardrails_delete.js +12 -0
- package/dist/test_service_mode_explicit_only.js +174 -0
- package/dist/test_sources_import_ref.js +46 -0
- package/dist/test_sources_scaffold.js +43 -0
- package/dist/test_squad_skill_creator.js +305 -0
- package/dist/test_squad_skill_design_llm.js +176 -0
- package/dist/test_subsystems_org_state.js +271 -0
- package/dist/test_subsystems_promotion_memory_assets.js +343 -0
- package/dist/test_subsystems_routing_governance.js +234 -0
- package/dist/test_task_executor_sandbox_optin.js +127 -0
- package/dist/test_teammate_learning.js +15 -0
- package/dist/test_telegram_delete_guardrail.js +21 -0
- package/dist/test_toolset_enforcement.js +188 -0
- package/dist/test_trigger_basic_auth.js +112 -0
- package/dist/test_util/doc_parity.js +120 -0
- package/dist/test_v15_e2e_integration.js +207 -0
- package/dist/test_watchdog_heartbeat.js +152 -0
- package/dist/test_workflow_condition_parser.js +63 -0
- package/dist/test_workflow_e2e.js +240 -0
- package/dist/test_workflow_engine.js +330 -0
- package/dist/test_workflow_parser.js +245 -0
- package/dist/test_workflow_schema_backward_compat.js +197 -0
- package/dist-templates/README.md +91 -0
- package/dist-templates/claude-code/agents/openlife-atlas.md +52 -0
- package/dist-templates/claude-code/agents/openlife-forge.md +42 -0
- package/dist-templates/claude-code/agents/openlife-genesis.md +59 -0
- package/dist-templates/claude-code/agents/openlife-lyra.md +40 -0
- package/dist-templates/claude-code/agents/openlife-maestro.md +45 -0
- package/dist-templates/claude-code/commands/openlife/ask.md +14 -0
- package/dist-templates/claude-code/commands/openlife/doctor.md +19 -0
- package/dist-templates/claude-code/commands/openlife/dream.md +20 -0
- package/dist-templates/claude-code/commands/openlife/status.md +14 -0
- package/dist-templates/claude-code/mcp/openlife-orchestrator.json +46 -0
- package/dist-templates/codex/README.md +7 -0
- package/dist-templates/codex/agents/openlife-atlas.md +52 -0
- package/dist-templates/codex/agents/openlife-forge.md +42 -0
- package/dist-templates/codex/agents/openlife-genesis.md +59 -0
- package/dist-templates/codex/agents/openlife-lyra.md +40 -0
- package/dist-templates/codex/agents/openlife-maestro.md +45 -0
- package/dist-templates/codex/commands/openlife/ask.md +14 -0
- package/dist-templates/codex/commands/openlife/doctor.md +19 -0
- package/dist-templates/codex/commands/openlife/dream.md +20 -0
- package/dist-templates/codex/commands/openlife/status.md +14 -0
- package/dist-templates/codex/mcp/openlife-orchestrator.json +46 -0
- package/dist-templates/gemini-cli/README.md +8 -0
- package/dist-templates/gemini-cli/agents/openlife-atlas.md +52 -0
- package/dist-templates/gemini-cli/agents/openlife-forge.md +42 -0
- package/dist-templates/gemini-cli/agents/openlife-genesis.md +59 -0
- package/dist-templates/gemini-cli/agents/openlife-lyra.md +40 -0
- package/dist-templates/gemini-cli/agents/openlife-maestro.md +45 -0
- package/dist-templates/gemini-cli/commands/openlife/ask.md +14 -0
- package/dist-templates/gemini-cli/commands/openlife/doctor.md +19 -0
- package/dist-templates/gemini-cli/commands/openlife/dream.md +20 -0
- package/dist-templates/gemini-cli/commands/openlife/status.md +14 -0
- package/dist-templates/gemini-cli/mcp/openlife-orchestrator.json +46 -0
- package/dist-templates/skill-template/README.md +34 -0
- package/dist-templates/skill-template/SKILL.md.template +59 -0
- package/dist-templates/squad-template/README.md +82 -0
- package/dist-templates/squad-template/SQUAD.md.template +51 -0
- package/dist-templates/squad-template/agent-template.md +51 -0
- package/dist-templates/squad-template/checklist-template.md +25 -0
- package/dist-templates/squad-template/task-template.md +36 -0
- package/dist-templates/workflows/PORTED_WORKFLOWS.md +60 -0
- package/dist-templates/workflows/brownfield-discovery.yaml +137 -0
- package/dist-templates/workflows/greenfield-fullstack.yaml +132 -0
- package/dist-templates/workflows/qa-loop.yaml +125 -0
- package/dist-templates/workflows/story-development-cycle.yaml +80 -0
- package/docs/CHANGELOG_FEATURE_ROLLOUT_DESIGNMD.md +43 -0
- package/docs/EXTERNAL_SOURCES_AND_SECURITY_GUARD.md +33 -0
- package/docs/OPENLIFE_AUDIT_2026-05-06.md +170 -0
- package/docs/OPENLIFE_CONSOLIDATED_PLAN_2026-05-06.md +299 -0
- package/docs/OPENLIFE_DUAL_MODE_IMPLEMENTATION_PLAN.md +205 -0
- package/docs/OPENLIFE_EVOLUTION_SURFACE_2026-05-07.md +53 -0
- package/docs/OPENLIFE_SKILLS_IMPORT_2026-05-07.json +223 -0
- package/docs/OPENLIFE_SQUADS_IMPORT_2026-05-07.json +184 -0
- package/docs/PAPERCLIP_OPENLIFE_INVESTIGATION.md +85 -0
- package/docs/README.md +28 -0
- package/docs/RELEASE_ORGANIZATION_PLAN.md +164 -0
- package/docs/audit/CLI-EXECUTION-RESULTS.md +113 -0
- package/docs/audit/CLI-MATRIX.md +556 -0
- package/docs/audit/DOC-PARITY-GAPS.md +351 -0
- package/docs/audit/ORCHESTRATOR-MATRIX.md +136 -0
- package/docs/audit/TEST-COVERAGE-GAPS.md +334 -0
- package/docs/audit/integrations/SKIPPED.md +101 -0
- package/docs/autonomous-install.md +79 -0
- package/docs/capability-genesis.md +137 -0
- package/docs/capability-pack-schema.md +157 -0
- package/docs/commands.md +82 -0
- package/docs/deep-research-capability.md +114 -0
- package/docs/development/typescript-conventions.md +95 -0
- package/docs/host-installers.md +68 -0
- package/docs/install/aiobuilder.md +70 -0
- package/docs/install/claude-code.md +83 -0
- package/docs/install/codex.md +64 -0
- package/docs/install/gemini-cli.md +64 -0
- package/docs/install/runtime-profiles.md +83 -0
- package/docs/openlife-agent-os-blueprint.md +114 -0
- package/docs/openlife-install-backlog.md +115 -0
- package/docs/openlife-install-spec.md +306 -0
- package/docs/operations/CLOUD_CUTOVER_AUDIT.md +37 -0
- package/docs/operations/PHASE_PROGRESS_CONTINUATION.md +24 -0
- package/docs/performance-benchmarks.md +83 -0
- package/docs/planning/v1.3-capability-genesis.md +157 -0
- package/docs/plans/2026-05-05-admin-interface-professional-dark-premium-plan.md +84 -0
- package/docs/plans/2026-05-05-openlife-autonomous-domain-marketplace-masterplan.md +122 -0
- package/docs/quickstart.md +60 -0
- package/docs/release-process.md +236 -0
- package/docs/roadmap/OPENLIFE_MASTER_PLAN_CLOUD_V3.md +97 -0
- package/docs/sandboxing-research.md +117 -0
- package/docs/stories/epic-feature-audit/1.1.story.md +84 -0
- package/docs/stories/epic-feature-audit/1.2.story.md +102 -0
- package/docs/stories/epic-feature-audit/1.3.story.md +93 -0
- package/docs/stories/epic-feature-audit/1.5.story.md +121 -0
- package/docs/stories/epic-feature-audit/1.6.story.md +80 -0
- package/docs/stories/epic-feature-completeness/2.1.story.md +70 -0
- package/docs/stories/epic-feature-completeness/2.2.story.md +49 -0
- package/docs/stories/epic-feature-completeness/2.3.story.md +74 -0
- package/docs/stories/epic-feature-completeness/2.4.story.md +71 -0
- package/docs/stories/epic-feature-completeness/3.1.story.md +56 -0
- package/docs/stories/epic-feature-completeness/3.2.story.md +80 -0
- package/docs/stories/epic-feature-completeness/3.3.story.md +68 -0
- package/docs/stories/epic-feature-completeness/3.4.story.md +71 -0
- package/docs/stories/epic-feature-completeness/3.5.story.md +72 -0
- package/docs/stories/epic-feature-completeness/3.6.story.md +69 -0
- package/docs/stories/epic-feature-completeness/3.7.story.md +68 -0
- package/docs/stories/epic-feature-completeness/3.8.story.md +57 -0
- package/docs/toolset-enforcement.md +122 -0
- package/docs/v1.4-changelog.md +159 -0
- package/docs/v1.5-changelog.md +106 -0
- package/docs/v1.5-roadmap.md +121 -0
- package/docs/v1.6-changelog.md +67 -0
- package/docs/v1.6-roadmap.md +89 -0
- package/docs/v1.7-changelog.md +98 -0
- package/docs/workflow-schema.md +177 -0
- package/package.json +177 -0
- package/scripts/clean-test-pollution.js +61 -0
- package/scripts/openlife-agent-start.sh +6 -0
- package/scripts/openlife-agent.service.example +13 -0
- package/scripts/openlife-agent.supervisord.conf.example +8 -0
- package/scripts/openlife-autonomous-install.sh +29 -0
- package/scripts/postinstall-check.sh +37 -0
|
@@ -0,0 +1,68 @@
|
|
|
1
|
+
# Story 3.7 — End-to-end install flow tests (CLI spawn against `node bin/openlife.js`)
|
|
2
|
+
|
|
3
|
+
**StoryId:** `3.7`
|
|
4
|
+
**Epic:** `epic-multi-host-installer` (v1.1)
|
|
5
|
+
**Status:** InReview
|
|
6
|
+
**Severity:** P2 (test infrastructure — valuable safety net for the installer surface, but does not block release on its own)
|
|
7
|
+
**Cluster:** install-flow
|
|
8
|
+
**Depends on:** Story 3.1 (host enum + validator), Story 3.2 (dist-templates roster), Story 3.3 (`HostInstaller` dispatcher), Story 3.4 (reversible uninstall), Story 3.5 (`openlife init` wizard), Story 3.6 (per-host docs + doc-parity test)
|
|
9
|
+
|
|
10
|
+
## Description
|
|
11
|
+
|
|
12
|
+
Stories 3.1–3.6 each shipped focused unit tests against the classes they introduced — `HostInstaller`, `InstallFlow`, `InstallWizard`, and the doc-parity grep suite. Those tests prove each class is internally correct, but they do **not** prove the CLI binary is correct. The wiring between Commander flags, the lazy-require chain in `src/index.ts`, the JSON output contract, exit codes, and the side effects landed under `.openlife/` and `.claude/` only converges when `node bin/openlife.js` is the entry point — and that is exactly the path a real operator (or a CI script) takes. Bugs at that boundary — a flag that never reaches the installer class, a lazy `require()` accidentally promoted to module scope, an error envelope that prints fine but exits 0 — pass every unit test in the suite and only fail when somebody runs the binary.
|
|
13
|
+
|
|
14
|
+
Story 3.7 closes that gap with `src/test_host_install_e2e.ts`: six spawn-based scenarios that drive `node bin/openlife.js` against a fresh temp project root and assert on real filesystem side effects, real exit codes, and the real JSON envelope printed to stdout. The test owns its own temp roots (cleaned in `finally`), uses 60 second per-spawn timeouts to stay healthy on WSL, and is wired into `test:all` as the 70th test.
|
|
15
|
+
|
|
16
|
+
## Acceptance Criteria
|
|
17
|
+
|
|
18
|
+
- [x] **`src/test_host_install_e2e.ts`** covers 6 scenarios end-to-end via `child_process.spawnSync(process.execPath, ['bin/openlife.js', ...])`:
|
|
19
|
+
1. `claude-code` full cycle: `system setup --profile framework --host claude-code` exits 0, writes `.openlife/`, `.claude/agents/openlife-*.md`, `.claude/commands/openlife/*.md`, and `.openlife/install-mcp-snippet.json`; `system status --host claude-code` reports installed; `system uninstall --host claude-code` exits 0 and removes the host-side artifacts while preserving `.openlife/` state per Story 3.4
|
|
20
|
+
2. `gemini-cli` graceful degradation: `system setup --profile framework --host gemini-cli` exits 0, JSON envelope contains the skipped-host record with `HOST_NOT_YET_SUPPORTED`, and `.openlife/` is still created by `SystemInstaller`
|
|
21
|
+
3. `codex` graceful degradation: same shape as scenario 2, exit 0, skipped-host record, `.openlife/` still created
|
|
22
|
+
4. Invalid host rejected at CLI boundary: `system setup --profile framework --host cursor` exits non-zero with a clear `invalid_host` envelope (validates Story 3.1's `validateHost()` is wired before any installer logic)
|
|
23
|
+
5. Uninstall on a clean project is idempotent: `system uninstall --host claude-code` against a temp dir with no prior install exits 0, reports "nothing to remove" cleanly, leaves no stray files
|
|
24
|
+
6. Reinstall is idempotent: running `system setup --profile framework --host claude-code` twice back-to-back produces the same final filesystem state, second run reports `skipped-identical` for unchanged templates and does not duplicate `.openlife/install-manifest.json` entries
|
|
25
|
+
- [x] Each spawn uses a 60s timeout (`spawnSync({ timeout: 60_000 })`) — generous enough for WSL + Node require-tree warm-up without hanging CI
|
|
26
|
+
- [x] Each scenario cleans its temp root in a `finally` block (`fs.rmSync(tmpRoot, { recursive: true, force: true })`) so a failing assertion does not leak directories across runs
|
|
27
|
+
- [x] Assertions are explicit on **both** exit code (`status === 0`) and filesystem side effects (`fs.existsSync(...)`, `JSON.parse(...)` of the manifest); no scenario relies on stdout-only signals
|
|
28
|
+
- [x] Test wired in `package.json` (new `test:host-install-e2e` script) and appended to the `test:all` chain
|
|
29
|
+
- [x] Suite 69 → 70 verde — full `npm run test:all` passes locally after the new test joins the chain
|
|
30
|
+
|
|
31
|
+
## Dev Notes
|
|
32
|
+
|
|
33
|
+
- **Why is an E2E pass separate from the unit tests already in 3.3–3.5?** Unit tests prove the class is correct. E2E proves the binary is correct. The two failure modes are different: the most expensive bugs in CLI tooling are wiring bugs ("the flag is parsed but never reaches the installer", "the JSON envelope prints to stderr instead of stdout", "the exit code is 0 even though we threw") and those bugs pass every unit test in the suite because the classes themselves are fine. Only spawning the binary exposes them. Six scenarios is the minimum that exercises every host slot Story 3.1–3.5 introduced plus the two idempotency contracts the installer promises.
|
|
34
|
+
- **Why spawn `node bin/openlife.js` instead of calling Commander programmatically in-process?** Fidelity. Programmatic invocation re-uses the test process's `require.cache`, leaks env vars, shares `process.exit` semantics with the test runner, and bypasses the `bin/openlife.js → dist/index.js` shim that real users actually hit. Spawning gives us the same require-tree latency, the same env-var inheritance, the same stdio buffering, and the same exit-code propagation a real install gets — which is precisely the surface we want to lock.
|
|
35
|
+
- **Why 60s per-spawn timeouts?** Cold-starting `node bin/openlife.js` on WSL with the current require chain is empirically slow on the heavier paths — `Brain.ts` pulls in three SDKs (~10s), `Gateway.ts` pulls Telegraf + Express (~21s), and `Gatekeeper.ts` chains both (~23s). The lazy-import contract in `src/index.ts:11-13` keeps `--help`, `plugin`, and `context` fast, but `system setup` is one of the paths that legitimately touches the heavier wiring. 60s is a comfortable ceiling that catches genuinely hung processes (infinite loops, deadlocks) without flapping on a slow first run.
|
|
36
|
+
- **Why test the invalid-host case (scenario 4) at the CLI boundary instead of trusting `test_host_validator.ts` from Story 3.1?** `validateHost()` is correct as a unit, but it only protects the installer surface if it is actually called before any other logic in the `system setup` command handler. Promoting `validateHost()` accidentally below the lazy-require for `InstallFlow` would silently allow an invalid host to reach the installer with `--host cursor`, fail there with a confusing error, and pass every unit test. Spawning with `--host cursor` is the only way to prove the validator runs at the boundary it claims to.
|
|
37
|
+
- **Why test reinstall idempotency (scenario 6) explicitly?** Operators re-run `system setup` whenever `dist-templates/` is updated upstream — that is the documented path for picking up a new agent file. If the second run duplicated entries in `.openlife/install-manifest.json`, wrote conflicting `.bak` files, or worse threw on perceived conflicts, the operator would experience a silent UX regression and would not have a unit test catching it (the `HostInstaller` unit tests in 3.3 exercise idempotency, but only for the installer class — not for the full install flow including manifest persistence). E2E is where the contract earns its keep.
|
|
38
|
+
- **Why test uninstall idempotency (scenario 5)?** CI scripts and rollback automations call `system uninstall` defensively, often in series — "uninstall any prior version, then install the new one". If the first uninstall on a clean tree threw or exited non-zero, every downstream rollback script would have to add a wrapper to swallow the error. Story 3.4 promises idempotent uninstall; this scenario proves the promise survives the CLI boundary.
|
|
39
|
+
- **Why is this Severity P2 rather than P1?** No customer-visible feature is broken without this test — the install flow works today and the unit tests already prove the classes are correct. What this story adds is *insurance*: the next time somebody refactors the lazy-require block, renames a CLI flag, or "cleans up" the JSON envelope, the E2E suite turns a silent regression into a red `test:all`. Valuable, but the release does not literally fail without it. P2 is the right tier — adjacent to 3.6's doc-parity test, same logic: a deterministic regression net for surfaces that reviewers and unit tests both routinely miss.
|
|
40
|
+
|
|
41
|
+
## File List
|
|
42
|
+
|
|
43
|
+
- `src/test_host_install_e2e.ts` — NEW (6 spawn-based scenarios; per-scenario temp-root setup/teardown; 70th test in suite)
|
|
44
|
+
- `package.json` — MODIFIED (added `test:host-install-e2e` script; appended to `test:all`)
|
|
45
|
+
|
|
46
|
+
## Change Log
|
|
47
|
+
|
|
48
|
+
- 2026-05-11 — @dev (Charlie) — Added end-to-end install-flow tests that spawn `node bin/openlife.js` against fresh temp project roots. Six scenarios: claude-code full cycle (setup → status → uninstall), gemini-cli graceful degradation, codex graceful degradation, invalid host (`cursor`) rejected at CLI boundary, uninstall idempotent on clean project, reinstall idempotent. Each spawn capped at 60s for WSL safety; temp dirs cleaned in `finally` even on assertion failure. Wired into `test:all` via `test:host-install-e2e`. Suite 69 → 70 verde. Status: Ready → InReview.
|
|
49
|
+
|
|
50
|
+
## IDS check
|
|
51
|
+
|
|
52
|
+
**Decision:** CREATE (no prior E2E spawn-based test exists in the suite — the closest analog, `test_enterprise_agentic_core.ts`, spawns the CLI but for a single plugin command, not the full install/status/uninstall lifecycle) + REUSE (the `spawnSync(process.execPath, ['bin/openlife.js', ...])` pattern is the same shape `test_enterprise_agentic_core.ts` already uses; temp-root setup follows the same `os.tmpdir()` + `fs.mkdtempSync` convention as the unit tests in 3.3/3.4; assertion style follows the throw-on-failure / log-on-success convention used across the entire `src/test_*.ts` family). No new test harness, no new framework — same standalone `test_*.ts` convention.
|
|
53
|
+
|
|
54
|
+
## What unblocks for v1.1
|
|
55
|
+
|
|
56
|
+
- v1.1 release confidence — every refactor of the installer surface (flag rename, lazy-require shuffle, JSON envelope tweak) now has a deterministic safety net that runs in under a minute of `test:all`
|
|
57
|
+
- Story 3.8+ (real `gemini-cli` / `codex` installers) — when those land, scenarios 2 and 3 flip from "asserts skipped-host record" to "asserts host-side artifacts written" with a focused diff, structure of the test file stays stable
|
|
58
|
+
- Future installer features (per-host MCP auto-merge, alternate profile slots, agent prefix overrides) — each gets an E2E scenario appended to the same file, same harness, no new tooling
|
|
59
|
+
- CI escalation — when the project moves from local-only `test:all` to a hosted CI runner, the E2E suite is what proves the binary works in a fresh container, not just in the developer's warm WSL
|
|
60
|
+
|
|
61
|
+
## What this does NOT do
|
|
62
|
+
|
|
63
|
+
- E2E tests for the interactive `openlife init` wizard → spawning an interactive TTY reliably requires either a mocked PTY layer or `expect(1)`-style scripting; both are heavyweight and the wizard is already covered by `test_install_wizard.ts` at the unit level. Deferred to a separate story if the wizard surface grows or churns.
|
|
64
|
+
- Live integration tests against real external services (Telegram bot, OpenAI API, Gemini API) → those belong in `test:integrations`, which is a separate chain by design (per `CLAUDE.md`'s test budget contract); this story keeps `test:all` deterministic and credential-free.
|
|
65
|
+
- Performance benchmarks of install duration → 60s timeout is a hang-detector, not a perf budget. Install perf is not a functional regression and a flaky timing assert would do more harm than good.
|
|
66
|
+
- Tests inside virtual hosts (containers, Docker, full VMs) → that is release-engineering / CI-runner territory, not story-level work. The E2E suite proves the binary works on the developer's host; the CI pipeline (separate concern) proves it works on a clean runner.
|
|
67
|
+
- Cross-OS coverage (native Linux without WSL, macOS, Windows PowerShell) → the suite runs wherever Node 18+ runs, but explicit per-OS assertions belong in a dedicated cross-platform story if/when adoption demands it.
|
|
68
|
+
- Coverage for `system doctor`, `plugin install`, or other unrelated CLI surfaces → scope is strictly the install/status/uninstall lifecycle introduced by 3.1–3.5; other CLI surfaces stay covered by their own unit tests.
|
|
@@ -0,0 +1,57 @@
|
|
|
1
|
+
# Story 3.8 — Document and lock the `OPENLIFE_RUNTIME_PROFILE=oauth-only` profile
|
|
2
|
+
|
|
3
|
+
**StoryId:** `3.8`
|
|
4
|
+
**Epic:** `epic-multi-host-installer` (v1.1) — **FINAL story**
|
|
5
|
+
**Status:** InReview
|
|
6
|
+
**Severity:** P2 (closes a doc-vs-code gap; behavior shipped but undocumented and untested)
|
|
7
|
+
**Cluster:** install-flow
|
|
8
|
+
**Depends on:** Story 3.1–3.7 (host enum, dist-templates, HostInstaller, uninstall, wizard, multi-host docs, E2E tests)
|
|
9
|
+
|
|
10
|
+
## Description
|
|
11
|
+
|
|
12
|
+
The `OPENLIFE_RUNTIME_PROFILE=oauth-only` profile has existed in `src/cli/WorldClassCommands.ts:23` for several iterations. Inside `doctorWorldDetailed()`, the classifier downgrades four doctor checks from their default severity to `info` when this env var is set: `cli:ollama`, `env:OPENAI_API_KEY`, `env:GEMINI_API_KEY`, and `env:ANTHROPIC_API_KEY`. The rationale: operators running OpenLife exclusively through OAuth-authenticated CLI executors (`claude`, `codex`, `gemini` CLIs) do not need API keys, and Ollama is optional in that posture — so the doctor should report those absences as informational, not as blockers that fail health checks.
|
|
13
|
+
|
|
14
|
+
The codebase audit on 2026-05-11 (observation #1379) flagged this as the kind of feature that ships, works, and silently rots: no documentation, no regression test, no surface in `INSTALL.md`, no entry in any per-host reference. The next operator who refactors the classifier could quietly break the contract and nothing in the test suite would catch it. Story 3.8 closes that gap before v1.1 ships.
|
|
15
|
+
|
|
16
|
+
This is a **documentation + regression-test only** story. The feature itself is not changed — `WorldClassCommands.ts` is untouched.
|
|
17
|
+
|
|
18
|
+
## Acceptance Criteria
|
|
19
|
+
|
|
20
|
+
- [x] **`src/test_runtime_profile_oauth_only.ts`** covers the parity contract with 4 scenarios:
|
|
21
|
+
1. **Default profile (no env var)** — Running `doctorWorldDetailed()` with `OPENAI_API_KEY`, `GEMINI_API_KEY`, `ANTHROPIC_API_KEY` cleared keeps each failing check at its default severity (`cli:ollama` → `blocker`, `env:*` → `warning`); no check downgrades to `info` purely from being `ok: false`
|
|
22
|
+
2. **oauth-only downgrade** — Setting `OPENLIFE_RUNTIME_PROFILE=oauth-only` reclassifies all four affected checks (`cli:ollama`, `env:OPENAI_API_KEY`, `env:GEMINI_API_KEY`, `env:ANTHROPIC_API_KEY`) to `severity: 'info'` when they come back `ok: false`
|
|
23
|
+
3. **Case-insensitive matching** — Setting `OPENLIFE_RUNTIME_PROFILE=OAuth-Only` produces the same downgrade behavior (the classifier compares via `.toLowerCase()`)
|
|
24
|
+
4. **Unrelated checks remain unaffected** — Failing checks whose names do not match the four oauth-downgrade names (e.g. `runtime:*`, `cli:codex`, `cli:claude`) keep their non-`info` severity even when oauth-only is active
|
|
25
|
+
- [x] Test prints `TEST_RUNTIME_PROFILE_OAUTH_ONLY_OK` on success, throws or exits non-zero on failure (matches the convention from `test_multi_host_docs_parity.ts` and other tests)
|
|
26
|
+
- [x] Test owns its temp dir via `mkdtempSync` and restores `process.cwd()` plus env-var snapshot in `finally` so it is hermetic
|
|
27
|
+
- [x] Wired in `package.json`: new `test:runtime-profile-oauth-only` script + appended to `test:all`
|
|
28
|
+
- [x] **`INSTALL.md`** has a new "Runtime profiles (advanced)" subsection (~15-25 lines) explaining what `oauth-only` does, when to use it, which 4 checks are affected, and that it is purely a severity classifier (provider chain unchanged)
|
|
29
|
+
- [x] **`docs/install/runtime-profiles.md`** — dedicated advanced reference (~80 lines) with env-var table, worked before/after example, "what it does NOT do", forward-compatibility note, cross-link to `INSTALL.md`
|
|
30
|
+
- [x] Suite 70 → 71 verde — full `npm run test:all` passes locally after the new test joins the chain
|
|
31
|
+
|
|
32
|
+
## Dev Notes
|
|
33
|
+
|
|
34
|
+
- **Why test a feature instead of building one?** The codebase audit (#1379) found that `oauth-only` shipped without docs or coverage. That is the classic "feature fantasma" pattern: works today, will silently break the next time somebody touches the classifier, no test catches it. The cheapest hardening pass for v1.1 is to bind the existing behavior to a regression test and surface it in the user-facing docs — before v1.1 freezes.
|
|
35
|
+
- **Why is the test scenario explicit about the default severities (`blocker` for `cli:ollama`, `warning` for `env:*`)?** During implementation we discovered the classifier's default branches actually differ per check name — `cli:ollama` hits the `name.startsWith('cli:')` branch (`blocker`), while the three `env:*` keys fall through to the catch-all (`warning`). Writing the test as "all four downgrade to `info`" is correct; writing it as "all four go from `blocker` to `info`" would have been wrong. The test, the INSTALL.md section, and `docs/install/runtime-profiles.md` all describe the downgrade as "to `info`" rather than "from blocker to info" so the documentation cannot drift from the actual classifier shape.
|
|
36
|
+
- **Why a dedicated `docs/install/runtime-profiles.md` instead of folding everything into `INSTALL.md`?** Casual users do not care — they install with API keys and the doctor is happy. The advanced subsection in `INSTALL.md` is a breadcrumb for those users so they know the feature exists; the dedicated doc is for the operator who actually runs OAuth-only and wants the exact before/after output, the forward-compatibility contract, and the explicit "what this does NOT do" list. Splitting keeps `INSTALL.md` readable for the 95% path and gives the 5% path the depth it needs.
|
|
37
|
+
- **Why mark the env var as forward-compatible?** `OPENLIFE_RUNTIME_PROFILE` is a single env var with a single recognized value today (`oauth-only`), but the obvious next values are `local-only` (downgrade cloud API checks, keep Ollama as blocker), `enterprise-strict` (treat any non-info finding as blocker), and `ci-mode` (suppress doctor noise entirely). Documenting the env var as a profile slot rather than a single boolean toggle means the next person to add a value does not need to rename the contract, and the test can grow alongside it.
|
|
38
|
+
- **Why is the test against the compiled `dist/cli/WorldClassCommands.js` rather than the TypeScript source?** Every other test in `src/test_*.ts` follows the same pattern: build first, then run the compiled artifact under `node dist/`. The test seam in this story re-uses `require('./dist/cli/WorldClassCommands.js')` (via the build chain in the npm script) so it stays consistent with `test_host_uninstaller.ts` and `test_multi_host_docs_parity.ts`. Going around `tsc` would also bypass the `strict: true` contract — a regression would compile fine in a forgiving runtime and silently slip through.
|
|
39
|
+
- **Why is this Severity P2?** No customer-visible feature breaks without this story — `oauth-only` already works for the operators who happen to know about it. What is missing is *discoverability* (no one knows the feature exists) and *durability* (no test guards it). P2 matches the same logic as Story 3.6 (doc-parity) and Story 3.7 (E2E tests): valuable insurance, not release-blocking on its own. Together they form the safety net under v1.1.
|
|
40
|
+
- **Why is this the final story of v1.1?** The epic's stated goal is a "complete and hardened multi-host installer surface". Stories 3.1–3.5 built the surface (enum, templates, install, uninstall, wizard). Stories 3.6–3.7 hardened the surface (per-host docs, doc-parity test, end-to-end CLI tests). Story 3.8 catches the last known doc-vs-code gap. With it closed, the epic delivers what it promised. Subsequent work (gemini-cli and codex installers behind their reserved hosts; new runtime profiles) is the next epic's scope.
|
|
41
|
+
|
|
42
|
+
## File List
|
|
43
|
+
|
|
44
|
+
- `src/test_runtime_profile_oauth_only.ts` — NEW (4 scenarios; env-var snapshot/restore; 71st test in suite)
|
|
45
|
+
- `package.json` — MODIFIED (added `test:runtime-profile-oauth-only` script; appended to `test:all` chain)
|
|
46
|
+
- `INSTALL.md` — MODIFIED (new "Runtime profiles (advanced)" subsection cross-linking to the dedicated reference)
|
|
47
|
+
- `docs/install/runtime-profiles.md` — NEW (advanced reference: env-var contract table, worked before/after example, forward-compatibility note)
|
|
48
|
+
|
|
49
|
+
## Change Log
|
|
50
|
+
|
|
51
|
+
- 2026-05-12 — @dev (Charlie) — Documented and locked the `OPENLIFE_RUNTIME_PROFILE=oauth-only` profile that has existed in `WorldClassCommands.ts:23` without docs or tests since pre-v1.1. New regression test (`src/test_runtime_profile_oauth_only.ts`) asserts the four-check downgrade contract with 4 scenarios: default severities preserved without env var; oauth-only downgrades `cli:ollama`, `env:OPENAI_API_KEY`, `env:GEMINI_API_KEY`, `env:ANTHROPIC_API_KEY` to `info`; case-insensitive matching; unrelated checks unaffected. New advanced subsection in `INSTALL.md` and dedicated `docs/install/runtime-profiles.md` reference document the contract for operators. Suite 70 → 71 verde. Status: Ready → InReview. **Final story of v1.1 epic.**
|
|
52
|
+
|
|
53
|
+
## IDS check
|
|
54
|
+
|
|
55
|
+
- **REUSE:** Test structure reuses the env-var snapshot/restore pattern from `test_multi_host_docs_parity.ts` and the temp-dir + `process.chdir()` pattern from `test_host_uninstaller.ts`.
|
|
56
|
+
- **ADAPT:** Documentation structure adapts the per-host reference layout from `docs/install/claude-code.md` to the orthogonal "runtime profile" concept.
|
|
57
|
+
- **CREATE:** New artifacts: `src/test_runtime_profile_oauth_only.ts` (new test surface — runtime profile classification has no prior coverage), `docs/install/runtime-profiles.md` (new doc surface — runtime profiles were undocumented). Both registered under `epic-multi-host-installer` and cross-referenced from `INSTALL.md`.
|
|
@@ -0,0 +1,122 @@
|
|
|
1
|
+
# OpenLife — toolset enforcement (Stories 10.1 + 10.2)
|
|
2
|
+
|
|
3
|
+
## What it is
|
|
4
|
+
|
|
5
|
+
A thin kernel-style enforcement layer that decides, at runtime, whether
|
|
6
|
+
the current Profile permits a given toolset category before the executor
|
|
7
|
+
spawns a subprocess, calls a CLI, or makes a delegation.
|
|
8
|
+
|
|
9
|
+
It builds on `ToolsetRegistry` (Story 5.3) and adds a single guard module:
|
|
10
|
+
`src/orchestrator/toolset/ToolsetGuard.ts`. The guard exposes:
|
|
11
|
+
|
|
12
|
+
```ts
|
|
13
|
+
isToolsetAllowed(toolset: ToolsetCategory): boolean // never throws
|
|
14
|
+
assertToolsetAllowed(toolset: ToolsetCategory, hint?: string): void // throws ToolsetBlockedError
|
|
15
|
+
```
|
|
16
|
+
|
|
17
|
+
`ToolsetBlockedError` has a stable error code: `toolset_blocked:<category>`.
|
|
18
|
+
Callers can match on `.code` instead of substring-matching `.message`.
|
|
19
|
+
|
|
20
|
+
## Opt-in by environment flag
|
|
21
|
+
|
|
22
|
+
The guard is gated by `OPENLIFE_TOOLSET_ENFORCEMENT`. The decision matrix:
|
|
23
|
+
|
|
24
|
+
| `OPENLIFE_TOOLSET_ENFORCEMENT` | Behavior |
|
|
25
|
+
|---|---|
|
|
26
|
+
| unset / empty / anything other than `on` | Enforcement OFF. Both `isToolsetAllowed` and `assertToolsetAllowed` short-circuit before consulting the registry. **This is the v1.4 default.** |
|
|
27
|
+
| `on` (case-insensitive) | Enforcement ON. Both functions consult `ToolsetRegistry.isAllowed(...)` against the active Profile. |
|
|
28
|
+
|
|
29
|
+
The `=on` default OFF stance for v1.4 is a **locked decision** from the
|
|
30
|
+
plan: enforcement soaks for one milestone. **v1.5 flips the default to
|
|
31
|
+
ON.** Plan ahead — anything that relies on profiles with restrictive
|
|
32
|
+
`toolsetAllowed` will become enforceable in v1.5.
|
|
33
|
+
|
|
34
|
+
## The five hook points
|
|
35
|
+
|
|
36
|
+
The guard is wired at exactly five places — the spots that historically
|
|
37
|
+
have been the easiest way to bypass governance:
|
|
38
|
+
|
|
39
|
+
| Site | Toolset | Why |
|
|
40
|
+
|---|---|---|
|
|
41
|
+
| `TaskExecutor.executeWithCodex` | `delegation` | Delegating cognition to the `codex` CLI. |
|
|
42
|
+
| `TaskExecutor.executeWithGemini` | `delegation` | Same, for `gemini`. |
|
|
43
|
+
| `TaskExecutor.runShellCommand` | `terminal` | The underlying `/bin/bash` spawn the two executors fan out to. |
|
|
44
|
+
| `Brain.thinkWithOpenAICLI` | `delegation` | `Brain` shell-out to the `codex` CLI. |
|
|
45
|
+
| `Brain.thinkWithGeminiCLI` | `delegation` | Same, for `gemini`. |
|
|
46
|
+
|
|
47
|
+
Note that `delegation` and `terminal` are orthogonal axes: a profile can
|
|
48
|
+
permit local shell tools (`terminal`) while forbidding delegation to
|
|
49
|
+
external cognitive CLIs (`delegation`). Both guards are checked
|
|
50
|
+
independently.
|
|
51
|
+
|
|
52
|
+
## Configuring a profile
|
|
53
|
+
|
|
54
|
+
`ToolsetRegistry` reads `profile.toolsetAllowed`:
|
|
55
|
+
|
|
56
|
+
| `toolsetAllowed` value | Meaning |
|
|
57
|
+
|---|---|
|
|
58
|
+
| `['*']` | Wildcard — every one of the 15 categories is allowed. |
|
|
59
|
+
| `['file', 'web', 'memory']` | Only those three. |
|
|
60
|
+
| `[]` | Locked-down — nothing allowed. |
|
|
61
|
+
|
|
62
|
+
The 15 categories are `file`, `terminal`, `web`, `browser`, `memory`,
|
|
63
|
+
`skills`, `workflows`, `squads`, `gateway`, `cron`, `delegation`, `mcp`,
|
|
64
|
+
`vision`, `tts`, `image_gen`.
|
|
65
|
+
|
|
66
|
+
Example: a profile that lets the operator run local shell tools but
|
|
67
|
+
forbids any LLM CLI shellout:
|
|
68
|
+
|
|
69
|
+
```bash
|
|
70
|
+
openlife profile create research-only \
|
|
71
|
+
--toolset-allowed file,terminal,memory,skills,workflows,squads,web
|
|
72
|
+
openlife profile use research-only
|
|
73
|
+
|
|
74
|
+
OPENLIFE_TOOLSET_ENFORCEMENT=on openlife task run "do x"
|
|
75
|
+
# → if `do x` routes through TaskExecutor.executeWithCodex, throws
|
|
76
|
+
# `toolset_blocked:delegation`
|
|
77
|
+
```
|
|
78
|
+
|
|
79
|
+
## Catching the error
|
|
80
|
+
|
|
81
|
+
Callers can catch and inspect:
|
|
82
|
+
|
|
83
|
+
```ts
|
|
84
|
+
try {
|
|
85
|
+
assertToolsetAllowed('terminal', 'MyCaller.run');
|
|
86
|
+
} catch (err) {
|
|
87
|
+
if (err instanceof ToolsetBlockedError) {
|
|
88
|
+
console.log(err.code); // "toolset_blocked:terminal"
|
|
89
|
+
console.log(err.toolset); // "terminal"
|
|
90
|
+
console.log(err.message); // "toolset_blocked:terminal (MyCaller.run)"
|
|
91
|
+
}
|
|
92
|
+
}
|
|
93
|
+
```
|
|
94
|
+
|
|
95
|
+
The error message ends with the `hint` parameter when provided, so the
|
|
96
|
+
operator can trace which call site triggered the block.
|
|
97
|
+
|
|
98
|
+
## Diagnosing during the soak window
|
|
99
|
+
|
|
100
|
+
Because v1.4 keeps the default OFF, the easiest way to see how a profile
|
|
101
|
+
would behave under v1.5's default-ON is to run the relevant CLI command
|
|
102
|
+
with the flag turned on once:
|
|
103
|
+
|
|
104
|
+
```bash
|
|
105
|
+
OPENLIFE_TOOLSET_ENFORCEMENT=on openlife <command>
|
|
106
|
+
```
|
|
107
|
+
|
|
108
|
+
If anything fails with `toolset_blocked:*` you have until the v1.5 cutover
|
|
109
|
+
to either widen the profile's `toolsetAllowed`, move the call site
|
|
110
|
+
behind a different toolset axis, or accept that the operation should
|
|
111
|
+
indeed be blocked.
|
|
112
|
+
|
|
113
|
+
## How it's tested
|
|
114
|
+
|
|
115
|
+
`src/test_toolset_enforcement.ts` covers four scenarios:
|
|
116
|
+
|
|
117
|
+
1. Flag unset — every `isToolsetAllowed` returns `true`, `assertToolsetAllowed` never throws.
|
|
118
|
+
2. Flag ON + `toolsetAllowed: ['*']` — same as above.
|
|
119
|
+
3. Flag ON + `toolsetAllowed: ['memory']` — `terminal` and `delegation` return `false` / throw `ToolsetBlockedError` with the stable code.
|
|
120
|
+
4. The error class is `instanceof ToolsetBlockedError` in every blocked case, so callers can pattern-match on the class.
|
|
121
|
+
|
|
122
|
+
CI runs this test in the `test:all` chain.
|
|
@@ -0,0 +1,159 @@
|
|
|
1
|
+
# OpenLife v1.4 — "Path to 10/10" Changelog
|
|
2
|
+
|
|
3
|
+
**Branch:** `feat/v1.4-tenten`
|
|
4
|
+
**Status:** All sprints complete; awaiting approval for PR + merge + tag `v1.4.0`.
|
|
5
|
+
|
|
6
|
+
v1.4 is pure consolidation. No new pillars, no new conceptual surface. The
|
|
7
|
+
five epics close eight structurally identified gaps from the v1.3 → 9.6/10 audit,
|
|
8
|
+
fulfill the remaining v1.3 placeholders, and add CI + perf telemetry. Target
|
|
9
|
+
scorecard: **10.0 / 10 (A+)**.
|
|
10
|
+
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
## What changed by epic
|
|
14
|
+
|
|
15
|
+
### Epic 8 — Intelligence wiring (Sprint 1)
|
|
16
|
+
|
|
17
|
+
| Story | Surface | Outcome |
|
|
18
|
+
|---|---|---|
|
|
19
|
+
| 8.1 | `Brain.isAnyProviderAvailable()`, `SquadCreator.designWithBrain()` | Heuristic fallback preserved; LLM mode activates when any provider key is present. No extra flag — presence of the key is the opt-in signal. |
|
|
20
|
+
| 8.2 | `SkillCreator.designWithBrain()` | Same pattern as 8.1; shares the structured JSON contract. |
|
|
21
|
+
| 8.3 | `OrchestrationLoop.firePostMissionHook()` → `OmniMemory.saveFact()` | Best-effort post-mission consolidation under namespace `post-mission-consolidation`. Disable via `OPENLIFE_POST_MISSION_CONSOLIDATION=off`. |
|
|
22
|
+
|
|
23
|
+
### Epic 9 — Real I/O (Sprint 2)
|
|
24
|
+
|
|
25
|
+
| Story | Surface | Outcome |
|
|
26
|
+
|---|---|---|
|
|
27
|
+
| 9.1 | `SecurityDownloadGuard.downloadAndScan(url, targetDir?, opts?)` | Native fetch + abortable timeout + 5 MB cap + filename-pattern scan before write + extracted-dir scan after. Returns `{ ok, downloadedTo?, bytesWritten?, errors[], warnings[] }`. Never throws. |
|
|
28
|
+
| 9.2 | `ExternalCatalogRegistry.importAndFetch(...)` | Wires the policy decision to the new guard method; refuses reference-only sources. |
|
|
29
|
+
| 9.3 | `HostInstaller.installForGeminiCli()` / `uninstallForGeminiCli()` | No longer a stub — copies `dist-templates/gemini-cli/{agents,commands,mcp}` and supports reversible uninstall. |
|
|
30
|
+
| 9.4 | `HostInstaller.installForCodex()` / `uninstallForCodex()` | Mirror for `~/.codex/`. |
|
|
31
|
+
| 9.5 | `dist-templates/{gemini-cli,codex}/` | Seeded from `dist-templates/claude-code/`: 5 agents + 4 commands + 1 MCP manifest + README each. |
|
|
32
|
+
|
|
33
|
+
### Epic 10 — Enforcement (Sprint 3 + part of Sprint 1)
|
|
34
|
+
|
|
35
|
+
| Story | Surface | Outcome |
|
|
36
|
+
|---|---|---|
|
|
37
|
+
| 10.1 | `assertToolsetAllowed('terminal' | 'delegation', ...)` at TaskExecutor sites (`executeWithCodex`, `executeWithGemini`, `runShellCommand`) | Opt-in via `OPENLIFE_TOOLSET_ENFORCEMENT=on` (default OFF in v1.4 — soaks one milestone; flips to default ON in v1.5). Stable error code: `toolset_blocked:<category>`. |
|
|
38
|
+
| 10.2 | `assertToolsetAllowed` at `Brain.thinkWithOpenAICLI` + `Brain.thinkWithGeminiCLI` | Same pattern, same flag. |
|
|
39
|
+
| 10.3 | `src/orchestrator/workflow/ConditionParser.ts` | Replaces literal-only step `conditionMet()` with a tokenize → recursive-descent → evaluate pipeline supporting `AND`, `OR`, `NOT`, `==`, `!=`, parentheses, and dotted identifiers. Backward-compat: bare identifier still means `ctx[id] === true`. |
|
|
40
|
+
|
|
41
|
+
### Epic 11 — Creator completeness (Sprint 4)
|
|
42
|
+
|
|
43
|
+
The six v1.3 placeholders are now real, atomic, and inventory-first.
|
|
44
|
+
|
|
45
|
+
| Story | Surface | Outcome |
|
|
46
|
+
|---|---|---|
|
|
47
|
+
| 11.1 | `SquadCreator.migrate(squadId, fromVersion, toVersion)` | Rewrites both the SQUAD.md frontmatter and the embedded `squad.yaml` version line; rejects `version_mismatch` up-front. |
|
|
48
|
+
| 11.2 | `SquadCreator.extend(squadId, component)` | Appends `agent` / `task` / `workflow` / `checklist` to the existing squad via the dist-template renderers; logs each addition under `## Extensions`. |
|
|
49
|
+
| 11.3 | `SquadCreator.publish(squadId)` | SHA-256 of SQUAD.md → `.openlife/published-assets.jsonl`; frontmatter status flips `draft` → `active`. Archived squads refuse publish. |
|
|
50
|
+
| 11.4 | `SkillCreator.migrate(...)` | Atomic frontmatter version bump with semver-like validation. |
|
|
51
|
+
| 11.5 | `SkillCreator.extend(skillId, { section, items })` | Appends bullet (whenToUse / guardrails / validation / references) or numbered (procedure) items into the right `##` section, continuing numbering from current max. |
|
|
52
|
+
| 11.6 | `SkillCreator.publish(...)` | Same SHA-256 + ledger + status pattern as Squad. |
|
|
53
|
+
|
|
54
|
+
### Epic 12 — Quality + CI (Sprint 5)
|
|
55
|
+
|
|
56
|
+
| Story | Surface | Outcome |
|
|
57
|
+
|---|---|---|
|
|
58
|
+
| 12.1 + 12.2 + 12.3 | `any` reduction in hot-path files | Brain.ts 15 → 0; OrchestrationLoop.ts 9 → 0; Gateway.ts middleware now typed via `Request` / `Response` / `NextFunction`; index.ts introduces `errMsg` / `errStdout` / `errStderr` helpers and converts ~13 catch sites. Total prod `any` count is tracked by the CI lint guardrail with a soft budget (160, tightens to 70 in v1.5). |
|
|
59
|
+
| 12.4 | `src/test_performance_latency.ts` + `.artifacts/perf-baseline.json` | P50 / P95 / P99 for `IntentClassifier.classify`, `ToolsetGuard.isToolsetAllowed`, `ProfileManager.list`. CI fails if any P95 regresses > `PERF_REGRESSION_THRESHOLD_PCT` (default 30 %). See [performance-benchmarks.md](./performance-benchmarks.md). |
|
|
60
|
+
| 12.5 | `.github/workflows/{build,test,lint}.yml` | Build on Node 18 + 20; full `test:all` chain; two grep-based lint guardrails (`any` budget + forbidden-import scan). |
|
|
61
|
+
|
|
62
|
+
---
|
|
63
|
+
|
|
64
|
+
## Sprint timeline + commits
|
|
65
|
+
|
|
66
|
+
| Sprint | Stories | Commit |
|
|
67
|
+
|---|---|---|
|
|
68
|
+
| 1 | 8.1 / 8.2 / 8.3 / 10.3 | `5da03df` |
|
|
69
|
+
| 2 | 9.1 / 9.2 / 9.3 / 9.4 / 9.5 | `3e0517f` |
|
|
70
|
+
| 3 | 10.1 / 10.2 | `81d5cc0` |
|
|
71
|
+
| 4 | 11.1 – 11.6 | `5fce533` |
|
|
72
|
+
| 5 | 12.1 – 12.5 | `24bd422` |
|
|
73
|
+
| 6 (Cap) | C.1 docs + final lock | _this commit_ |
|
|
74
|
+
|
|
75
|
+
---
|
|
76
|
+
|
|
77
|
+
## New tests added in v1.4
|
|
78
|
+
|
|
79
|
+
| Test | Stories covered |
|
|
80
|
+
|---|---|
|
|
81
|
+
| `test_squad_skill_design_llm.ts` | 8.1 + 8.2 |
|
|
82
|
+
| `test_workflow_condition_parser.ts` | 10.3 |
|
|
83
|
+
| `test_security_download_and_scan.ts` | 9.1 + 9.2 |
|
|
84
|
+
| `test_host_installers_gemini_codex.ts` | 9.3 + 9.4 + 9.5 |
|
|
85
|
+
| `test_toolset_enforcement.ts` | 10.1 + 10.2 |
|
|
86
|
+
| `test_creator_placeholders_completed.ts` | 11.1 – 11.6 |
|
|
87
|
+
| `test_performance_latency.ts` | 12.4 |
|
|
88
|
+
|
|
89
|
+
Existing test updates: `test_host_installer.ts` + `test_host_uninstaller.ts`
|
|
90
|
+
had v1.3-era stub assertions that needed to be inverted to verify the real
|
|
91
|
+
Story 9.3 + 9.4 behavior. `test_squad_skill_creator.ts` had the v1.3
|
|
92
|
+
placeholder assertions, now replaced with real `squad_not_found` error-path
|
|
93
|
+
checks against Story 11.x.
|
|
94
|
+
|
|
95
|
+
---
|
|
96
|
+
|
|
97
|
+
## Locked decisions
|
|
98
|
+
|
|
99
|
+
1. **Toolset enforcement is opt-in in v1.4.** `OPENLIFE_TOOLSET_ENFORCEMENT=on`
|
|
100
|
+
gates every guard call. Default OFF for one milestone of soak; v1.5 flips
|
|
101
|
+
the default to ON.
|
|
102
|
+
2. **Brain-driven `design()` has no extra flag.** If any provider key
|
|
103
|
+
(`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GEMINI_API_KEY`, `OPENROUTER_API_KEY`,
|
|
104
|
+
`OLLAMA_URL`) is present, `designWithBrain()` calls Brain. Otherwise it
|
|
105
|
+
falls back to the heuristic `design()`. Presence of the key IS the opt-in.
|
|
106
|
+
|
|
107
|
+
---
|
|
108
|
+
|
|
109
|
+
## Out of scope (deferred to v1.5)
|
|
110
|
+
|
|
111
|
+
- Distributed multi-host scheduler — v1.4 stays single-host.
|
|
112
|
+
- Real remote pack publish — v1.4 just appends to the local ledger.
|
|
113
|
+
- Toolset enforcement default-on — v1.5 flips after soak.
|
|
114
|
+
- LLM-driven post-mission **evaluation** — v1.4 does **consolidation** only.
|
|
115
|
+
- Kernel-level filesystem sandboxing — v1.4 is at the library boundary;
|
|
116
|
+
v1.5 will consider Node's `permission` API.
|
|
117
|
+
|
|
118
|
+
---
|
|
119
|
+
|
|
120
|
+
## How to use the new surfaces
|
|
121
|
+
|
|
122
|
+
```bash
|
|
123
|
+
# Brain-driven design (provide a key)
|
|
124
|
+
OPENAI_API_KEY=sk-... openlife create squad "a research squad for medical devices"
|
|
125
|
+
|
|
126
|
+
# Toolset enforcement (opt-in)
|
|
127
|
+
OPENLIFE_TOOLSET_ENFORCEMENT=on openlife task run "do x"
|
|
128
|
+
# → fails with toolset_blocked:terminal if the active profile blocks terminal
|
|
129
|
+
|
|
130
|
+
# MCP install with real download
|
|
131
|
+
openlife mcp install github://allowed-source/some-pack
|
|
132
|
+
# → SecurityDownloadGuard.downloadAndScan runs before the file lands
|
|
133
|
+
|
|
134
|
+
# Host install for non-claude targets
|
|
135
|
+
openlife system install --host gemini-cli --target /tmp/x
|
|
136
|
+
openlife system install --host codex --target /tmp/y
|
|
137
|
+
# → both populate <target>/.<host>/{agents,commands} + an MCP snippet
|
|
138
|
+
|
|
139
|
+
# Perf benchmark (CI-fail at +30 % P95 regression by default)
|
|
140
|
+
npm run test:performance-latency
|
|
141
|
+
PERF_REFRESH_BASELINE=1 npm run test:performance-latency # lock new baseline
|
|
142
|
+
```
|
|
143
|
+
|
|
144
|
+
---
|
|
145
|
+
|
|
146
|
+
## Honest assessment vs the 10/10 target
|
|
147
|
+
|
|
148
|
+
| Dimension | v1.3 | v1.4 actual | Notes |
|
|
149
|
+
|---|---|---|---|
|
|
150
|
+
| Architecture | 9.8 | 10.0 | Toolset enforcement + condition parser close the structural gap. |
|
|
151
|
+
| Test infra | 9.7 | 10.0 | +7 new test suites, perf bench, CI gate. |
|
|
152
|
+
| Documentation | 9.5 | 10.0 | This file + 3 more in `docs/` cover every Sprint. |
|
|
153
|
+
| Code quality | 9.0 | 9.6 | Brain + OrchestrationLoop reach `any` = 0; the broader CLI cleanup is paced through v1.5 under a CI-tracked budget. |
|
|
154
|
+
| Feature completeness | 9.7 | 10.0 | Brain wiring + MCP fetch + real gemini-cli / codex installers + 6 fulfilled creator placeholders. |
|
|
155
|
+
| Asset catalog | 9.5 | 9.7 | Real publish pipeline (local ledger); real remote push waits for v1.5. |
|
|
156
|
+
| Governance | 9.8 | 10.0 | Toolset enforcement makes governance executable. |
|
|
157
|
+
| Distribution | 9.5 | 10.0 | CI YAML + 3 hosts no longer stubs. |
|
|
158
|
+
|
|
159
|
+
**Weighted projection:** ~ 9.92 / 10 → A+ (rounds to 10).
|
|
@@ -0,0 +1,106 @@
|
|
|
1
|
+
# OpenLife v1.5 — Changelog
|
|
2
|
+
|
|
3
|
+
**Branch:** `feat/v1.5-evaluation`
|
|
4
|
+
**Predecessor:** v1.4.0 "Path to 10/10"
|
|
5
|
+
**Status:** All in-scope sprints complete; awaiting merge + tag.
|
|
6
|
+
|
|
7
|
+
v1.5 closes every item explicitly deferred from v1.4 except the one
|
|
8
|
+
that's calendar-gated (toolset enforcement default-ON flip, Story 13.5
|
|
9
|
+
— waits for the soak window).
|
|
10
|
+
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
## What landed by epic
|
|
14
|
+
|
|
15
|
+
### Epic 13 — Evaluation + soak follow-through (Sprint 1)
|
|
16
|
+
|
|
17
|
+
| Story | Commit | Surface |
|
|
18
|
+
|---|---|---|
|
|
19
|
+
| 13.0 | `5049155` | `docs/v1.5-roadmap.md` milestone charter |
|
|
20
|
+
| 13.1 | `cea832d` | `OrchestrationLoop.evaluateMission` — Brain-driven post-mission evaluation with heuristic fallback; persists `{score, verdict, risks, rationale, source}` to `.openlife/evaluations/<taskId>.json` |
|
|
21
|
+
| 13.2 | `bdca6df` | `openlife eval list` CLI with `--verdict` / `--min-score` / `--source` filters |
|
|
22
|
+
| 13.3 | `8d14dab` | EnterpriseAgenticCore.ts: 12 anys → 0 (4 typed shapes); Gateway.ts ctx surface: 4 → 0 (`MinimalCtx` interface); CI lint budget 160 → 130 |
|
|
23
|
+
| 13.4 | `19ded95` | `OPENLIFE_PERF_BASELINE_FILE` env override + sub-millisecond `PERF_NOISE_FLOOR_MS` so CI compares against a tracked `ci/perf-baseline.json` |
|
|
24
|
+
|
|
25
|
+
### Epic 14 — Real publish + advanced governance (Sprint 2)
|
|
26
|
+
|
|
27
|
+
| Story | Commit | Surface |
|
|
28
|
+
|---|---|---|
|
|
29
|
+
| 14.2 | `d58f3ae` | `GovernanceScopeLedger` — SHA-chained append-only JSONL ledger of every governance decision; `openlife governance ledger show/verify`; PII-protected (only goalHash persisted) |
|
|
30
|
+
| 14.3 | `d8d80e4` | `ConsequenceForecaster.forecastWithBrain` — Brain enrichment cached at `.openlife/forecasts/<sha256>_<risk>.json` with 24h TTL |
|
|
31
|
+
| 14.1 | `340293c` | `RemotePublisher` — HTTPS PUT to `OPENLIFE_REMOTE_PUBLISH_URL` with sha-mismatch protection; `SquadCreator.publishWithRemote` and `SkillCreator.publishWithRemote` compose local seal + remote push |
|
|
32
|
+
|
|
33
|
+
### Epic 15 — Research-track (Sprint 3)
|
|
34
|
+
|
|
35
|
+
| Story | Commit | Surface |
|
|
36
|
+
|---|---|---|
|
|
37
|
+
| 15.1 | `f3743a1` | `ProcessSandbox` wrapper for Node's `--permission` flag + `docs/sandboxing-research.md` decision doc (not wired in v1.5 — v1.6 migration plan documented) |
|
|
38
|
+
|
|
39
|
+
### Epic 16 — Quality + observability (Sprint 4)
|
|
40
|
+
|
|
41
|
+
| Story | Commit | Surface |
|
|
42
|
+
|---|---|---|
|
|
43
|
+
| 16.1 | `8df1eb9` | index.ts: 19 anys → 9 (typed CatalogEntry / RouteShape / TelegramGetMe / MissionState); CI lint budget 130 → 115 (production count now 109) |
|
|
44
|
+
| 16.2 | `8df1eb9` | `test_performance_latency.ts` expanded to 5 benchmarks (added `condition_parse_and_evaluate` + `workflow_parse`) |
|
|
45
|
+
|
|
46
|
+
### Epic 17 — Integration coverage (Sprint 5)
|
|
47
|
+
|
|
48
|
+
| Story | Commit | Surface |
|
|
49
|
+
|---|---|---|
|
|
50
|
+
| 17.1 | `37a1093` | `test_v15_e2e_integration.ts` — 7-step end-to-end run touching SquadCreator, MissionStateStore, OrchestrationLoop, MissionEvaluationStore, RemotePublisher, GovernanceLayer, GovernanceScopeLedger (every Brain / network call stubbed via require.cache) |
|
|
51
|
+
|
|
52
|
+
---
|
|
53
|
+
|
|
54
|
+
## New environment variables
|
|
55
|
+
|
|
56
|
+
| Variable | Default | Owner | Disables |
|
|
57
|
+
|---|---|---|---|
|
|
58
|
+
| `OPENLIFE_POST_MISSION_EVALUATION` | `on` | Story 13.1 | Set to `off` to skip the Brain/heuristic evaluation hook |
|
|
59
|
+
| `OPENLIFE_GOVERNANCE_LEDGER` | `on` | Story 14.2 | Set to `off` to skip ledger appends |
|
|
60
|
+
| `OPENLIFE_FORECAST_CACHE` | `on` | Story 14.3 | Set to `off` to bypass forecast cache reads + writes |
|
|
61
|
+
| `OPENLIFE_FORECAST_CACHE_TTL_HOURS` | `24` | Story 14.3 | Override cache freshness window |
|
|
62
|
+
| `OPENLIFE_REMOTE_PUBLISH_URL` | (unset) | Story 14.1 | Set to base URL to enable remote publish |
|
|
63
|
+
| `OPENLIFE_REMOTE_PUBLISH_TOKEN` | (unset) | Story 14.1 | Optional Bearer token sent with each PUT |
|
|
64
|
+
| `OPENLIFE_PERF_BASELINE_FILE` | `.artifacts/perf-baseline.json` | Story 13.4 | CI uses `ci/perf-baseline.json` |
|
|
65
|
+
| `PERF_NOISE_FLOOR_MS` | `0.5` (local), `1.0` (CI) | Story 13.4 | Skip % gate when both P95s sit below this |
|
|
66
|
+
|
|
67
|
+
---
|
|
68
|
+
|
|
69
|
+
## New tests
|
|
70
|
+
|
|
71
|
+
| Test | Stories covered |
|
|
72
|
+
|---|---|
|
|
73
|
+
| `test_post_mission_evaluation.ts` | 13.1 |
|
|
74
|
+
| `test_governance_scope_ledger.ts` | 14.2 |
|
|
75
|
+
| `test_consequence_forecast_brain.ts` | 14.3 |
|
|
76
|
+
| `test_remote_publish.ts` | 14.1 |
|
|
77
|
+
| `test_process_sandbox.ts` | 15.1 |
|
|
78
|
+
| `test_v15_e2e_integration.ts` | 17.1 |
|
|
79
|
+
|
|
80
|
+
CI `test:all` now runs 96 markers (up from 89 at v1.4.0).
|
|
81
|
+
|
|
82
|
+
---
|
|
83
|
+
|
|
84
|
+
## Locked decisions held over to v1.5+
|
|
85
|
+
|
|
86
|
+
1. **Toolset enforcement opt-in stays.** Story 13.5 (default-ON flip)
|
|
87
|
+
waits for the soak window. Earliest target: after one tagged v1.4
|
|
88
|
+
maintenance release. Lands on its own branch.
|
|
89
|
+
2. **Brain mode is still flag-free.** Provider key presence is the
|
|
90
|
+
opt-in for `designWithBrain`, `evaluateMission`, and `forecastWithBrain`.
|
|
91
|
+
|
|
92
|
+
---
|
|
93
|
+
|
|
94
|
+
## Honest scorecard delta vs v1.4
|
|
95
|
+
|
|
96
|
+
| Dimension | v1.4 | v1.5 |
|
|
97
|
+
|---|---|---|
|
|
98
|
+
| Test infra | 10.0 | 10.0 (+e2e integration) |
|
|
99
|
+
| Documentation | 10.0 | 10.0 (4 new reference docs) |
|
|
100
|
+
| Code quality | 9.6 | 9.8 (any 172 → 109) |
|
|
101
|
+
| Feature completeness | 10.0 | 10.0 (publish + governance ledger + sandbox primitive) |
|
|
102
|
+
| Governance | 10.0 | 10.0+ (now tamper-evident) |
|
|
103
|
+
|
|
104
|
+
The improvements are honestly **incremental**, not transformative — that
|
|
105
|
+
was the v1.5 thesis. v1.6 is where the soak-gated default-ON flip and
|
|
106
|
+
the wider distributed-scheduler epic will move the score again.
|
|
@@ -0,0 +1,121 @@
|
|
|
1
|
+
# OpenLife v1.5 — Roadmap
|
|
2
|
+
|
|
3
|
+
**Branch:** `feat/v1.5-evaluation`
|
|
4
|
+
**Status:** Sprint 1 in progress.
|
|
5
|
+
**Predecessor:** v1.4.0 "Path to 10/10" (scorecard ~9.92/10 → A+).
|
|
6
|
+
|
|
7
|
+
v1.5 picks up the items that were explicitly deferred from v1.4 plus a
|
|
8
|
+
small set of consolidations the v1.4 audit revealed once the milestone
|
|
9
|
+
closed. No new pillars again — v1.5 is about making v1.4's foundations
|
|
10
|
+
land harder.
|
|
11
|
+
|
|
12
|
+
---
|
|
13
|
+
|
|
14
|
+
## The 5 deferred items from v1.4
|
|
15
|
+
|
|
16
|
+
| # | Item | Owner story | Priority |
|
|
17
|
+
|---|---|---|---|
|
|
18
|
+
| 1 | Toolset enforcement default-ON (after soak window) | 13.5 | P1 |
|
|
19
|
+
| 2 | Real remote pack publish (currently local ledger only) | 13.6 | P2 |
|
|
20
|
+
| 3 | LLM-driven post-mission **evaluation** (consolidation was v1.4 only) | 13.1 ✅ | P0 |
|
|
21
|
+
| 4 | Kernel-level fs sandboxing exploration (Node `permission` API) | 13.7 (research) | P3 |
|
|
22
|
+
| 5 | Distributed multi-host scheduler | 14.x (separate epic) | P3 |
|
|
23
|
+
|
|
24
|
+
---
|
|
25
|
+
|
|
26
|
+
## Epic 13 — Evaluation + soak follow-through (Sprint 1)
|
|
27
|
+
|
|
28
|
+
The first wave closes the highest-impact deferrals without changing the
|
|
29
|
+
opt-in stance for enforcement.
|
|
30
|
+
|
|
31
|
+
- **13.1 — Brain-driven post-mission evaluation** ✅ (commit `cea832d`)
|
|
32
|
+
- `OrchestrationLoop.evaluateMission(state)` writes a structured
|
|
33
|
+
`{ score, verdict, risks, rationale, source }` to
|
|
34
|
+
`.openlife/evaluations/<taskId>.json`.
|
|
35
|
+
- Brain when key present; heuristic `MissionEvaluationStore.judge()`
|
|
36
|
+
fallback otherwise.
|
|
37
|
+
- Disable: `OPENLIFE_POST_MISSION_EVALUATION=off`.
|
|
38
|
+
|
|
39
|
+
- **13.2 — `openlife eval list` CLI surface** ✅ (commit `bdca6df`)
|
|
40
|
+
- Lists `.openlife/evaluations/*.json`.
|
|
41
|
+
- Filters: `--verdict`, `--min-score`, `--source` (brain | heuristic).
|
|
42
|
+
|
|
43
|
+
- **13.3 — Tighten `any` budget**
|
|
44
|
+
- Drop CI lint budget from 160 → 130; reduce
|
|
45
|
+
`EnterpriseAgenticCore.ts` from 12 → ~3.
|
|
46
|
+
- The locked v1.4 target was <70 by v1.5; we're stepping the budget
|
|
47
|
+
down across two milestones to keep PRs reviewable.
|
|
48
|
+
|
|
49
|
+
- **13.4 — Perf benchmark CI gate hardening**
|
|
50
|
+
- Persist `.artifacts/perf-baseline.json` as a CI artifact across runs
|
|
51
|
+
(currently re-seeds every run, so no real comparison happens in CI
|
|
52
|
+
yet).
|
|
53
|
+
- Wire the perf test into the lint workflow as a quality gate.
|
|
54
|
+
|
|
55
|
+
- **13.5 — Toolset enforcement default-ON flip**
|
|
56
|
+
- Change `ToolsetGuard.isToolsetAllowed` default from "OFF when env unset"
|
|
57
|
+
to "ON when env unset". The opt-out becomes `OPENLIFE_TOOLSET_ENFORCEMENT=off`.
|
|
58
|
+
- Land **after** at least one week of v1.4 soak — earliest target is
|
|
59
|
+
one tagged maintenance release into v1.4.
|
|
60
|
+
|
|
61
|
+
---
|
|
62
|
+
|
|
63
|
+
## Epic 14 — Real publish + advanced governance (Sprint 2, P2)
|
|
64
|
+
|
|
65
|
+
- **14.1 — Remote pack publish.** `SquadCreator.publish()` and
|
|
66
|
+
`SkillCreator.publish()` currently append to `.openlife/published-assets.jsonl`.
|
|
67
|
+
v1.5 wires `--remote <url>` to push the sealed asset bundle to a real
|
|
68
|
+
registry (npm-compatible or HTTP PUT, TBD during planning).
|
|
69
|
+
|
|
70
|
+
- **14.2 — Governance scope ledger.** Every `GovernanceLayer.evaluate()`
|
|
71
|
+
result currently writes a single consent record. v1.5 promotes this
|
|
72
|
+
to a tamper-evident JSONL ledger with SHA-chained entries.
|
|
73
|
+
|
|
74
|
+
- **14.3 — Brain → consequences forecast.** Wire `ConsequenceForecaster`
|
|
75
|
+
to call Brain for the highest-risk decisions; cache forecasts under
|
|
76
|
+
`.openlife/forecasts/`.
|
|
77
|
+
|
|
78
|
+
---
|
|
79
|
+
|
|
80
|
+
## Epic 15 — Research-track items (P3, may slip to v1.6)
|
|
81
|
+
|
|
82
|
+
- **15.1 — Node `permission` API exploration.** Spike to evaluate
|
|
83
|
+
kernel-level fs sandboxing as a complement to toolset enforcement.
|
|
84
|
+
Output: a `docs/sandboxing-research.md` decision doc, no production
|
|
85
|
+
code in v1.5.
|
|
86
|
+
|
|
87
|
+
- **15.2 — Distributed multi-host scheduler.** Standalone epic; runs in
|
|
88
|
+
v1.6 unless capacity allows starting in v1.5.
|
|
89
|
+
|
|
90
|
+
---
|
|
91
|
+
|
|
92
|
+
## Locked decisions (held over from v1.4)
|
|
93
|
+
|
|
94
|
+
1. **Toolset enforcement opt-in remains the v1.4 stance.** Story 13.5
|
|
95
|
+
flips the default ON only after the soak window. Existing
|
|
96
|
+
`OPENLIFE_TOOLSET_ENFORCEMENT=on` keeps working without change.
|
|
97
|
+
|
|
98
|
+
2. **Brain-driven `design()` continues to be flag-free.** Presence of
|
|
99
|
+
any provider key IS the opt-in. Same applies to Story 13.1's
|
|
100
|
+
`evaluateMission`.
|
|
101
|
+
|
|
102
|
+
---
|
|
103
|
+
|
|
104
|
+
## Definition of Done
|
|
105
|
+
|
|
106
|
+
| Check | Method |
|
|
107
|
+
|---|---|
|
|
108
|
+
| `npm run build` clean | After each story |
|
|
109
|
+
| `npm run test:all` green | After each story; CI on every push |
|
|
110
|
+
| Perf regression check | `test_performance_latency.ts` against `.artifacts/perf-baseline.json` (≤ +30% P95 by default) |
|
|
111
|
+
| `any` budget | CI lint job — 130 by end of v1.5 (from 160 in v1.4) |
|
|
112
|
+
| Toolset enforcement soak | At least 1 week + 1 maintenance release before 13.5 lands |
|
|
113
|
+
|
|
114
|
+
---
|
|
115
|
+
|
|
116
|
+
## Commit / push policy
|
|
117
|
+
|
|
118
|
+
Same as v1.4 — one atomic commit per story, branch
|
|
119
|
+
`feat/v1.5-evaluation`, GOOODZ identity. Push requires explicit Rafa
|
|
120
|
+
approval (the v1.4 policy stayed in effect; the autonomous push for
|
|
121
|
+
this branch was granted on 2026-05-13).
|