agent-os-kernel 1.1.0__py3-none-any.whl → 1.2.0__py3-none-any.whl
This diff shows the contents of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their respective public registries.
- agent_os/__init__.py +66 -4
- agent_os/agents_compat.py +286 -0
- agent_os/base_agent.py +308 -0
- agent_os/cli.py +1079 -19
- agent_os/integrations/__init__.py +37 -2
- agent_os/integrations/openai_adapter.py +502 -0
- agent_os/integrations/semantic_kernel_adapter.py +569 -0
- agent_os/stateless.py +349 -0
- agent_os_kernel-1.2.0.dist-info/METADATA +676 -0
- agent_os_kernel-1.2.0.dist-info/RECORD +1053 -0
- {agent_os_kernel-1.1.0.dist-info → agent_os_kernel-1.2.0.dist-info}/entry_points.txt +0 -1
- modules/amb/.github/workflows/ci.yml +102 -0
- modules/amb/.github/workflows/publish.yml +146 -0
- modules/amb/.gitignore +134 -0
- modules/amb/CHANGELOG.md +118 -0
- modules/amb/CONTRIBUTING.md +141 -0
- modules/amb/LICENSE +21 -0
- modules/amb/README.md +188 -0
- modules/amb/amb_core/__init__.py +175 -0
- modules/amb/amb_core/adapters/__init__.py +55 -0
- modules/amb/amb_core/adapters/aws_sqs_broker.py +374 -0
- modules/amb/amb_core/adapters/azure_servicebus_broker.py +338 -0
- modules/amb/amb_core/adapters/kafka_broker.py +258 -0
- modules/amb/amb_core/adapters/nats_broker.py +283 -0
- modules/amb/amb_core/adapters/rabbitmq_broker.py +233 -0
- modules/amb/amb_core/adapters/redis_broker.py +260 -0
- modules/amb/amb_core/broker.py +143 -0
- modules/amb/amb_core/bus.py +479 -0
- modules/amb/amb_core/cloudevents.py +507 -0
- modules/amb/amb_core/dlq.py +343 -0
- modules/amb/amb_core/hf_utils.py +534 -0
- modules/amb/amb_core/memory_broker.py +408 -0
- modules/amb/amb_core/models.py +139 -0
- modules/amb/amb_core/persistence.py +527 -0
- modules/amb/amb_core/schema.py +292 -0
- modules/amb/amb_core/tracing.py +356 -0
- modules/amb/examples/advanced_features.py +223 -0
- modules/amb/examples/backpressure_demo.py +225 -0
- modules/amb/examples/basic_usage.py +117 -0
- modules/amb/examples/tracing_demo.py +104 -0
- modules/amb/experiments/README.md +52 -0
- modules/amb/experiments/reproduce_results.py +467 -0
- modules/amb/experiments/results.json +324 -0
- modules/amb/paper/README.md +40 -0
- modules/amb/paper/paper.tex +365 -0
- modules/amb/paper/whitepaper.md +377 -0
- modules/amb/pyproject.toml +117 -0
- modules/amb/tests/__init__.py +1 -0
- modules/amb/tests/test_backpressure_priority.py +280 -0
- modules/amb/tests/test_bus.py +198 -0
- modules/amb/tests/test_cloudevents.py +443 -0
- modules/amb/tests/test_features.py +531 -0
- modules/amb/tests/test_models.py +74 -0
- modules/amb/tests/test_tracing.py +254 -0
- modules/atr/.github/workflows/ci.yml +101 -0
- modules/atr/.github/workflows/publish.yml +140 -0
- modules/atr/.gitignore +134 -0
- modules/atr/.pre-commit-config.yaml +37 -0
- modules/atr/CHANGELOG.md +39 -0
- modules/atr/CONTRIBUTING.md +96 -0
- modules/atr/IMPLEMENTATION_SUMMARY.md +143 -0
- modules/atr/README.md +180 -0
- modules/atr/atr/__init__.py +638 -0
- modules/atr/atr/access.py +346 -0
- modules/atr/atr/composition.py +643 -0
- modules/atr/atr/decorator.py +355 -0
- modules/atr/atr/executor.py +382 -0
- modules/atr/atr/health.py +555 -0
- modules/atr/atr/hf_utils.py +447 -0
- modules/atr/atr/injection.py +420 -0
- modules/atr/atr/metrics.py +438 -0
- modules/atr/atr/policies.py +401 -0
- modules/atr/atr/py.typed +2 -0
- modules/atr/atr/registry.py +450 -0
- modules/atr/atr/schema.py +478 -0
- modules/atr/atr/tools/safe/__init__.py +73 -0
- modules/atr/atr/tools/safe/calculator.py +380 -0
- modules/atr/atr/tools/safe/datetime_tool.py +441 -0
- modules/atr/atr/tools/safe/file_reader.py +400 -0
- modules/atr/atr/tools/safe/http_client.py +314 -0
- modules/atr/atr/tools/safe/json_parser.py +372 -0
- modules/atr/atr/tools/safe/text_tool.py +526 -0
- modules/atr/atr/tools/safe/toolkit.py +173 -0
- modules/atr/docs/PYPI_SETUP.md +113 -0
- modules/atr/examples/README.md +27 -0
- modules/atr/examples/demo.py +144 -0
- modules/atr/examples/sandbox_demo.py +218 -0
- modules/atr/experiments/README.md +69 -0
- modules/atr/experiments/reproduce_results.py +509 -0
- modules/atr/experiments/results/.gitkeep +0 -0
- modules/atr/experiments/results/results_20260123_140334.json +71 -0
- modules/atr/paper/README.md +36 -0
- modules/atr/paper/figures/.gitkeep +0 -0
- modules/atr/paper/references.bib +84 -0
- modules/atr/paper/structure.tex +293 -0
- modules/atr/paper/whitepaper.md +234 -0
- modules/atr/pyproject.toml +148 -0
- modules/atr/requirements.txt +1 -0
- modules/atr/setup.py +30 -0
- modules/atr/tests/__init__.py +1 -0
- modules/atr/tests/test_decorator.py +317 -0
- modules/atr/tests/test_executor.py +245 -0
- modules/atr/tests/test_integration_executor.py +184 -0
- modules/atr/tests/test_registry.py +312 -0
- modules/atr/tests/test_schema.py +182 -0
- modules/atr/tests/test_v2_features.py +708 -0
- modules/caas/.dockerignore +63 -0
- modules/caas/.github/ISSUE_TEMPLATE/bug_report.md +38 -0
- modules/caas/.github/ISSUE_TEMPLATE/custom.md +10 -0
- modules/caas/.github/ISSUE_TEMPLATE/feature_request.md +20 -0
- modules/caas/.github/workflows/ci.yml +100 -0
- modules/caas/.github/workflows/lint.yml +39 -0
- modules/caas/.github/workflows/publish-pypi.yml +124 -0
- modules/caas/.gitignore +73 -0
- modules/caas/.pre-commit-config.yaml +33 -0
- modules/caas/CHANGELOG.md +58 -0
- modules/caas/CONTRIBUTING.md +346 -0
- modules/caas/Dockerfile +41 -0
- modules/caas/LICENSE +21 -0
- modules/caas/MANIFEST.in +11 -0
- modules/caas/README.md +158 -0
- modules/caas/benchmarks/README.md +255 -0
- modules/caas/benchmarks/create_hf_dataset.py +502 -0
- modules/caas/benchmarks/data/sample_corpus/README.md +86 -0
- modules/caas/benchmarks/data/sample_corpus/auth_module.py +211 -0
- modules/caas/benchmarks/data/sample_corpus/contribution_guide.md +185 -0
- modules/caas/benchmarks/data/sample_corpus/remote_work_policy.html +57 -0
- modules/caas/benchmarks/hf_dataset/README.md +214 -0
- modules/caas/benchmarks/hf_dataset/caas_benchmark_corpus.py +73 -0
- modules/caas/benchmarks/hf_dataset/corpus_preview.json +193 -0
- modules/caas/benchmarks/results/README.md +66 -0
- modules/caas/benchmarks/results/evaluation_2026-01-20.json +121 -0
- modules/caas/benchmarks/run_evaluation.py +561 -0
- modules/caas/benchmarks/statistical_tests.py +289 -0
- modules/caas/benchmarks/verify_sample_corpus.py +83 -0
- modules/caas/docker-compose.yml +38 -0
- modules/caas/docs/CONTEXT_TRIAD.md +462 -0
- modules/caas/docs/CONTRIBUTING.md +346 -0
- modules/caas/docs/ETHICS_AND_LIMITATIONS.md +336 -0
- modules/caas/docs/HEURISTIC_ROUTER.md +442 -0
- modules/caas/docs/IMPLEMENTATION_SUMMARY.md +363 -0
- modules/caas/docs/IMPLEMENTATION_SUMMARY_CONTEXT_TRIAD.md +277 -0
- modules/caas/docs/IMPLEMENTATION_SUMMARY_HEURISTIC_ROUTER.md +231 -0
- modules/caas/docs/IMPLEMENTATION_SUMMARY_METADATA_INJECTION.md +258 -0
- modules/caas/docs/IMPLEMENTATION_SUMMARY_PRAGMATIC_TRUTH.md +212 -0
- modules/caas/docs/IMPLEMENTATION_SUMMARY_TRUST_GATEWAY.md +319 -0
- modules/caas/docs/LAYER_1_PRIMITIVE.md +202 -0
- modules/caas/docs/METADATA_INJECTION.md +404 -0
- modules/caas/docs/PRAGMATIC_TRUTH.md +431 -0
- modules/caas/docs/RELATED_WORK.md +312 -0
- modules/caas/docs/RELEASE_CHECKLIST.md +219 -0
- modules/caas/docs/RELEASE_GUIDE.md +285 -0
- modules/caas/docs/REPRODUCIBILITY.md +386 -0
- modules/caas/docs/SLIDING_WINDOW.md +387 -0
- modules/caas/docs/STRUCTURE_AWARE_INDEXING.md +158 -0
- modules/caas/docs/TESTING.md +259 -0
- modules/caas/docs/THREAT_MODEL.md +247 -0
- modules/caas/docs/TRUST_GATEWAY.md +575 -0
- modules/caas/docs/VFS.md +298 -0
- modules/caas/examples/agents/enterprise_security_agent.py +414 -0
- modules/caas/examples/agents/intelligent_document_analyzer.py +380 -0
- modules/caas/examples/demos/demo.py +309 -0
- modules/caas/examples/demos/demo_context_triad.py +225 -0
- modules/caas/examples/demos/demo_conversation_manager.py +285 -0
- modules/caas/examples/demos/demo_heuristic_router.py +133 -0
- modules/caas/examples/demos/demo_metadata_injection.py +198 -0
- modules/caas/examples/demos/demo_pragmatic_truth.py +303 -0
- modules/caas/examples/demos/demo_structure_aware.py +140 -0
- modules/caas/examples/demos/demo_time_decay.py +247 -0
- modules/caas/examples/demos/demo_trust_gateway.py +383 -0
- modules/caas/examples/multi_agent/README.md +159 -0
- modules/caas/examples/multi_agent/research_team.py +369 -0
- modules/caas/examples/multi_agent/vfs_collaboration.py +393 -0
- modules/caas/examples/usage/auth_module.py +142 -0
- modules/caas/examples/usage/usage_example.py +173 -0
- modules/caas/experiments/README.md +42 -0
- modules/caas/experiments/reproduce_results.py +462 -0
- modules/caas/paper/ARXIV_METADATA.md +145 -0
- modules/caas/paper/ARXIV_README.md +47 -0
- modules/caas/paper/CHECKLIST.md +103 -0
- modules/caas/paper/GITHUB_RELEASE_NOTES.md +105 -0
- modules/caas/paper/README.md +71 -0
- modules/caas/paper/abstract.md +24 -0
- modules/caas/paper/arxiv_submission.tar +0 -0
- modules/caas/paper/arxiv_submission.zip +0 -0
- modules/caas/paper/build_pdf.py +355 -0
- modules/caas/paper/experiments.md +149 -0
- modules/caas/paper/figures/.gitkeep +0 -0
- modules/caas/paper/figures/README.md +237 -0
- modules/caas/paper/figures/fig1_system_architecture.png +0 -0
- modules/caas/paper/figures/fig1_system_architecture.svg +198 -0
- modules/caas/paper/figures/fig2_context_triad.png +0 -0
- modules/caas/paper/figures/fig2_context_triad.svg +105 -0
- modules/caas/paper/figures/fig3_ablation_results.png +0 -0
- modules/caas/paper/figures/fig3_ablation_results.svg +113 -0
- modules/caas/paper/figures/fig4_routing_latency.png +0 -0
- modules/caas/paper/figures/fig4_routing_latency.svg +97 -0
- modules/caas/paper/intro.md +103 -0
- modules/caas/paper/latex/figures/fig1_system_architecture.png +0 -0
- modules/caas/paper/latex/figures/fig2_context_triad.png +0 -0
- modules/caas/paper/latex/figures/fig3_ablation_results.png +0 -0
- modules/caas/paper/latex/figures/fig4_routing_latency.png +0 -0
- modules/caas/paper/latex/main.tex +468 -0
- modules/caas/paper/latex/references.bib +140 -0
- modules/caas/paper/method.md +350 -0
- modules/caas/paper/outline.md +123 -0
- modules/caas/paper/related_work.md +101 -0
- modules/caas/paper/tables/.gitkeep +0 -0
- modules/caas/paper/tables/results_tables.md +50 -0
- modules/caas/pyproject.toml +172 -0
- modules/caas/requirements.txt +11 -0
- modules/caas/src/caas/__init__.py +232 -0
- modules/caas/src/caas/api/__init__.py +7 -0
- modules/caas/src/caas/api/server.py +1326 -0
- modules/caas/src/caas/caching.py +832 -0
- modules/caas/src/caas/cli.py +208 -0
- modules/caas/src/caas/conversation.py +221 -0
- modules/caas/src/caas/decay.py +118 -0
- modules/caas/src/caas/detection/__init__.py +7 -0
- modules/caas/src/caas/detection/detector.py +236 -0
- modules/caas/src/caas/enrichment.py +127 -0
- modules/caas/src/caas/gateway/__init__.py +24 -0
- modules/caas/src/caas/gateway/trust_gateway.py +471 -0
- modules/caas/src/caas/hf_utils.py +477 -0
- modules/caas/src/caas/ingestion/__init__.py +21 -0
- modules/caas/src/caas/ingestion/processors.py +251 -0
- modules/caas/src/caas/ingestion/structure_parser.py +185 -0
- modules/caas/src/caas/models.py +354 -0
- modules/caas/src/caas/pragmatic_truth.py +441 -0
- modules/caas/src/caas/routing/__init__.py +8 -0
- modules/caas/src/caas/routing/heuristic_router.py +242 -0
- modules/caas/src/caas/storage/__init__.py +7 -0
- modules/caas/src/caas/storage/store.py +450 -0
- modules/caas/src/caas/triad.py +472 -0
- modules/caas/src/caas/tuning/__init__.py +7 -0
- modules/caas/src/caas/tuning/tuner.py +322 -0
- modules/caas/src/caas/vfs/__init__.py +12 -0
- modules/caas/src/caas/vfs/filesystem.py +450 -0
- modules/caas/tests/__init__.py +3 -0
- modules/caas/tests/conftest.py +8 -0
- modules/caas/tests/test_caching.py +628 -0
- modules/caas/tests/test_context_triad.py +385 -0
- modules/caas/tests/test_conversation_manager.py +289 -0
- modules/caas/tests/test_functionality.py +215 -0
- modules/caas/tests/test_heuristic_router.py +370 -0
- modules/caas/tests/test_metadata_injection.py +328 -0
- modules/caas/tests/test_pragmatic_truth.py +322 -0
- modules/caas/tests/test_structure_aware_indexing.py +283 -0
- modules/caas/tests/test_time_decay.py +268 -0
- modules/caas/tests/test_trust_gateway.py +445 -0
- modules/caas/tests/test_vfs.py +298 -0
- modules/cmvk/.github/FUNDING.yml +9 -0
- modules/cmvk/.github/dependabot.yml +54 -0
- modules/cmvk/.github/workflows/ci.yml +205 -0
- modules/cmvk/.github/workflows/publish.yml +143 -0
- modules/cmvk/.gitignore +147 -0
- modules/cmvk/.pre-commit-config.yaml +58 -0
- modules/cmvk/CHANGELOG.md +146 -0
- modules/cmvk/CITATION.cff +48 -0
- modules/cmvk/CONTRIBUTING.md +229 -0
- modules/cmvk/Dockerfile +87 -0
- modules/cmvk/HF_MODEL_CARD.md +185 -0
- modules/cmvk/LICENSE +21 -0
- modules/cmvk/README.md +149 -0
- modules/cmvk/SECURITY.md +114 -0
- modules/cmvk/config/prompts/generator_v1.txt +23 -0
- modules/cmvk/config/prompts/verifier_hostile.txt +32 -0
- modules/cmvk/config/settings.yaml +40 -0
- modules/cmvk/coverage_html/.gitignore +2 -0
- modules/cmvk/coverage_html/class_index.html +658 -0
- modules/cmvk/coverage_html/coverage_html_cb_188fc9a4.js +735 -0
- modules/cmvk/coverage_html/favicon_32_cb_c827f16f.png +0 -0
- modules/cmvk/coverage_html/function_index.html +1978 -0
- modules/cmvk/coverage_html/index.html +255 -0
- modules/cmvk/coverage_html/keybd_closed_cb_900cfef5.png +0 -0
- modules/cmvk/coverage_html/status.json +1 -0
- modules/cmvk/coverage_html/style_cb_5c747636.css +389 -0
- modules/cmvk/coverage_html/z_2c49bd2ed3e01e38___init___py.html +315 -0
- modules/cmvk/coverage_html/z_2c49bd2ed3e01e38_audit_py.html +499 -0
- modules/cmvk/coverage_html/z_2c49bd2ed3e01e38_benchmarks_py.html +575 -0
- modules/cmvk/coverage_html/z_2c49bd2ed3e01e38_constitutional_py.html +1001 -0
- modules/cmvk/coverage_html/z_2c49bd2ed3e01e38_hf_utils_py.html +398 -0
- modules/cmvk/coverage_html/z_2c49bd2ed3e01e38_metrics_py.html +570 -0
- modules/cmvk/coverage_html/z_2c49bd2ed3e01e38_profiles_py.html +397 -0
- modules/cmvk/coverage_html/z_2c49bd2ed3e01e38_types_py.html +109 -0
- modules/cmvk/coverage_html/z_2c49bd2ed3e01e38_verification_py.html +1053 -0
- modules/cmvk/docs/DIAGRAMS.md +325 -0
- modules/cmvk/docs/architecture.md +345 -0
- modules/cmvk/docs/features.md +308 -0
- modules/cmvk/docs/getting_started.md +279 -0
- modules/cmvk/docs/innovation_layer.md +377 -0
- modules/cmvk/docs/safety.md +281 -0
- modules/cmvk/docs/traceability.md +150 -0
- modules/cmvk/examples/basic_example.py +62 -0
- modules/cmvk/examples/demo_complete_pipeline.py +209 -0
- modules/cmvk/examples/demo_innovation_layer.py +197 -0
- modules/cmvk/examples/example.py +112 -0
- modules/cmvk/examples/model_diversity_comparison.py +110 -0
- modules/cmvk/examples/real_api_integration.py +121 -0
- modules/cmvk/examples/test_full_pipeline.py +303 -0
- modules/cmvk/experiments/FEATURE_2_LATERAL_THINKING.md +187 -0
- modules/cmvk/experiments/README.md +216 -0
- modules/cmvk/experiments/ablation_runner.py +666 -0
- modules/cmvk/experiments/baseline_runner.py +158 -0
- modules/cmvk/experiments/blind_spot_benchmark.py +364 -0
- modules/cmvk/experiments/datasets/README.md +85 -0
- modules/cmvk/experiments/datasets/humaneval_50.json +352 -0
- modules/cmvk/experiments/datasets/humaneval_full.json +1150 -0
- modules/cmvk/experiments/datasets/humaneval_sample.json +32 -0
- modules/cmvk/experiments/datasets/sabotage.json +262 -0
- modules/cmvk/experiments/datasets/sample.json +40 -0
- modules/cmvk/experiments/demo_with_traces.py +110 -0
- modules/cmvk/experiments/efficiency_curve.py +259 -0
- modules/cmvk/experiments/experiment_runner.py +243 -0
- modules/cmvk/experiments/paper_data_generator.py +183 -0
- modules/cmvk/experiments/reproduce_results.py +407 -0
- modules/cmvk/experiments/reproducible_runner.py +352 -0
- modules/cmvk/experiments/sabotage_stress_test.py +311 -0
- modules/cmvk/experiments/test_lateral_thinking.py +116 -0
- modules/cmvk/experiments/test_prosecutor.py +41 -0
- modules/cmvk/experiments/visualize_results.py +735 -0
- modules/cmvk/logs/traces/demo_HumanEval_0_20260121-204900.json +36 -0
- modules/cmvk/notebooks/analysis.ipynb +124 -0
- modules/cmvk/paper/PAPER.md +561 -0
- modules/cmvk/paper/arxiv_checklist.md +230 -0
- modules/cmvk/paper/cmvk_neurips.aux +77 -0
- modules/cmvk/paper/cmvk_neurips.bbl +81 -0
- modules/cmvk/paper/cmvk_neurips.blg +48 -0
- modules/cmvk/paper/cmvk_neurips.out +16 -0
- modules/cmvk/paper/cmvk_neurips.pdf +0 -0
- modules/cmvk/paper/cmvk_neurips.tex +309 -0
- modules/cmvk/paper/figures/ablation.png +0 -0
- modules/cmvk/paper/figures/ablation.svg +39 -0
- modules/cmvk/paper/figures/architecture.png +0 -0
- modules/cmvk/paper/figures/architecture.svg +115 -0
- modules/cmvk/paper/figures/results_bar.png +0 -0
- modules/cmvk/paper/figures/results_bar.svg +70 -0
- modules/cmvk/paper/generate_figures.py +383 -0
- modules/cmvk/paper/neurips_2024.sty +101 -0
- modules/cmvk/paper/references.bib +98 -0
- modules/cmvk/paper/structure.tex +200 -0
- modules/cmvk/pyproject.toml +189 -0
- modules/cmvk/requirements-dev.txt +19 -0
- modules/cmvk/requirements.txt +14 -0
- modules/cmvk/src/cmvk/__init__.py +216 -0
- modules/cmvk/src/cmvk/audit.py +400 -0
- modules/cmvk/src/cmvk/benchmarks.py +476 -0
- modules/cmvk/src/cmvk/constitutional.py +902 -0
- modules/cmvk/src/cmvk/hf_utils.py +299 -0
- modules/cmvk/src/cmvk/metrics.py +471 -0
- modules/cmvk/src/cmvk/profiles.py +298 -0
- modules/cmvk/src/cmvk/py.typed +0 -0
- modules/cmvk/src/cmvk/types.py +10 -0
- modules/cmvk/src/cmvk/verification.py +954 -0
- modules/cmvk/src/cross_model_verification_kernel/__init__.py +91 -0
- modules/cmvk/src/cross_model_verification_kernel/__main__.py +10 -0
- modules/cmvk/src/cross_model_verification_kernel/agents/__init__.py +16 -0
- modules/cmvk/src/cross_model_verification_kernel/agents/base_agent.py +142 -0
- modules/cmvk/src/cross_model_verification_kernel/agents/generator_openai.py +223 -0
- modules/cmvk/src/cross_model_verification_kernel/agents/verifier_anthropic.py +448 -0
- modules/cmvk/src/cross_model_verification_kernel/agents/verifier_gemini.py +481 -0
- modules/cmvk/src/cross_model_verification_kernel/cli.py +570 -0
- modules/cmvk/src/cross_model_verification_kernel/core/__init__.py +26 -0
- modules/cmvk/src/cross_model_verification_kernel/core/graph_memory.py +308 -0
- modules/cmvk/src/cross_model_verification_kernel/core/kernel.py +413 -0
- modules/cmvk/src/cross_model_verification_kernel/core/trace_logger.py +75 -0
- modules/cmvk/src/cross_model_verification_kernel/core/types.py +121 -0
- modules/cmvk/src/cross_model_verification_kernel/datasets/__init__.py +20 -0
- modules/cmvk/src/cross_model_verification_kernel/datasets/humaneval_loader.py +271 -0
- modules/cmvk/src/cross_model_verification_kernel/generator.py +118 -0
- modules/cmvk/src/cross_model_verification_kernel/kernel.py +292 -0
- modules/cmvk/src/cross_model_verification_kernel/models.py +111 -0
- modules/cmvk/src/cross_model_verification_kernel/py.typed +1 -0
- modules/cmvk/src/cross_model_verification_kernel/simple_kernel.py +185 -0
- modules/cmvk/src/cross_model_verification_kernel/tools/__init__.py +94 -0
- modules/cmvk/src/cross_model_verification_kernel/tools/huggingface_upload.py +394 -0
- modules/cmvk/src/cross_model_verification_kernel/tools/sandbox.py +159 -0
- modules/cmvk/src/cross_model_verification_kernel/tools/statistics.py +468 -0
- modules/cmvk/src/cross_model_verification_kernel/tools/visualizer.py +312 -0
- modules/cmvk/src/cross_model_verification_kernel/tools/web_search.py +86 -0
- modules/cmvk/src/cross_model_verification_kernel/verifier.py +257 -0
- modules/cmvk/tests/__init__.py +3 -0
- modules/cmvk/tests/conftest.py +61 -0
- modules/cmvk/tests/integration/__init__.py +1 -0
- modules/cmvk/tests/integration/test_anthropic_verifier.py +269 -0
- modules/cmvk/tests/integration/test_integration.py +53 -0
- modules/cmvk/tests/integration/test_lateral_thinking_integration.py +199 -0
- modules/cmvk/tests/integration/test_lateral_thinking_witness.py +208 -0
- modules/cmvk/tests/integration/test_prosecutor_mode.py +131 -0
- modules/cmvk/tests/test_constitutional.py +611 -0
- modules/cmvk/tests/test_enhanced_features.py +603 -0
- modules/cmvk/tests/test_verification.py +255 -0
- modules/cmvk/tests/unit/__init__.py +1 -0
- modules/cmvk/tests/unit/test_agents.py +64 -0
- modules/cmvk/tests/unit/test_cli.py +224 -0
- modules/cmvk/tests/unit/test_core.py +126 -0
- modules/cmvk/tests/unit/test_humaneval_loader.py +197 -0
- modules/cmvk/tests/unit/test_kernel.py +255 -0
- modules/cmvk/tests/unit/test_reproducibility.py +160 -0
- modules/cmvk/tests/unit/test_trace_logger.py +115 -0
- modules/cmvk/tests/unit/test_visualizer.py +218 -0
- modules/control-plane/.github/ISSUE_TEMPLATE/bug_report.yml +82 -0
- modules/control-plane/.github/ISSUE_TEMPLATE/config.yml +11 -0
- modules/control-plane/.github/ISSUE_TEMPLATE/feature_request.yml +104 -0
- modules/control-plane/.github/ISSUE_TEMPLATE/question.yml +70 -0
- modules/control-plane/.github/ISSUE_TEMPLATE/security_vulnerability.yml +84 -0
- modules/control-plane/.github/discussions.yml +73 -0
- modules/control-plane/.github/pull_request_template.md +82 -0
- modules/control-plane/.github/workflows/publish.yml +146 -0
- modules/control-plane/.github/workflows/release.yml +39 -0
- modules/control-plane/.github/workflows/tests.yml +58 -0
- modules/control-plane/.gitignore +55 -0
- modules/control-plane/CHANGELOG.md +203 -0
- modules/control-plane/CONTRIBUTING.md +311 -0
- modules/control-plane/CONTRIBUTORS.md +88 -0
- modules/control-plane/Dockerfile +82 -0
- modules/control-plane/LICENSE +21 -0
- modules/control-plane/MANIFEST.in +17 -0
- modules/control-plane/README.md +1264 -0
- modules/control-plane/ROADMAP.md +228 -0
- modules/control-plane/SECURITY.md +210 -0
- modules/control-plane/SUPPORT.md +106 -0
- modules/control-plane/acp-cli.py +212 -0
- modules/control-plane/benchmark/README.md +257 -0
- modules/control-plane/benchmark/__init__.py +19 -0
- modules/control-plane/benchmark/red_team_dataset.py +517 -0
- modules/control-plane/benchmark.py +563 -0
- modules/control-plane/build_and_publish.sh +130 -0
- modules/control-plane/docker-compose.yml +74 -0
- modules/control-plane/docs/ABLATION_STUDIES.md +528 -0
- modules/control-plane/docs/ADAPTER_GUIDE.md +544 -0
- modules/control-plane/docs/ADVANCED_FEATURES.md +543 -0
- modules/control-plane/docs/AIOS_COMPARISON.md +296 -0
- modules/control-plane/docs/BIBLIOGRAPHY.md +367 -0
- modules/control-plane/docs/CASE_STUDIES.md +645 -0
- modules/control-plane/docs/DOCKER_DEPLOYMENT.md +184 -0
- modules/control-plane/docs/ECOSYSTEM_STATUS.md +98 -0
- modules/control-plane/docs/HF_MODEL_CARD.md +168 -0
- modules/control-plane/docs/KERNEL_V1_RELEASE.md +454 -0
- modules/control-plane/docs/LAYER3_FRAMEWORK.md +227 -0
- modules/control-plane/docs/LIMITATIONS.md +523 -0
- modules/control-plane/docs/PYPI_PUBLISHING.md +195 -0
- modules/control-plane/docs/README.md +58 -0
- modules/control-plane/docs/RELATED_WORK.md +319 -0
- modules/control-plane/docs/RELEASE_v1.1.0.md +252 -0
- modules/control-plane/docs/REPRODUCIBILITY.md +540 -0
- modules/control-plane/docs/RESEARCH_FOUNDATION.md +197 -0
- modules/control-plane/docs/api/CORE.md +270 -0
- modules/control-plane/docs/architecture/architecture.md +120 -0
- modules/control-plane/docs/community/ANNOUNCEMENT_TEMPLATES.md +52 -0
- modules/control-plane/docs/guides/IMPLEMENTATION.md +225 -0
- modules/control-plane/docs/guides/PHILOSOPHY.md +354 -0
- modules/control-plane/docs/guides/QUICKSTART.md +217 -0
- modules/control-plane/examples/README.md +138 -0
- modules/control-plane/examples/a2a_demo.py +410 -0
- modules/control-plane/examples/adapter_demo.py +347 -0
- modules/control-plane/examples/advanced_features.py +403 -0
- modules/control-plane/examples/basic_usage.py +261 -0
- modules/control-plane/examples/benchmark_demo.py +186 -0
- modules/control-plane/examples/compliance_demo.py +333 -0
- modules/control-plane/examples/configuration.py +265 -0
- modules/control-plane/examples/getting_started.py +178 -0
- modules/control-plane/examples/hibernation_and_time_travel_demo.py +406 -0
- modules/control-plane/examples/interactive_tutorial.ipynb +497 -0
- modules/control-plane/examples/kernel_interceptor_demo.py +202 -0
- modules/control-plane/examples/kernel_v1_demo.py +273 -0
- modules/control-plane/examples/langchain_demo.py +281 -0
- modules/control-plane/examples/lifecycle_demo.py +724 -0
- modules/control-plane/examples/mcp_demo.py +378 -0
- modules/control-plane/examples/ml_safety_demo.py +157 -0
- modules/control-plane/examples/multimodal_demo.py +347 -0
- modules/control-plane/examples/observability_demo.py +370 -0
- modules/control-plane/examples/use_cases.py +336 -0
- modules/control-plane/experiments/long_horizon_purge.py +235 -0
- modules/control-plane/experiments/multi_agent_rag.py +165 -0
- modules/control-plane/experiments/reproduce_results.py +667 -0
- modules/control-plane/paper/ARXIV_SUBMISSION_INFO.txt +122 -0
- modules/control-plane/paper/ETHICS_STATEMENT.md +248 -0
- modules/control-plane/paper/PAPER_CHECKLIST.md +72 -0
- modules/control-plane/paper/Paper.pdf +0 -0
- modules/control-plane/paper/README.md +71 -0
- modules/control-plane/paper/appendix.md +152 -0
- modules/control-plane/paper/architecture.md +15 -0
- modules/control-plane/paper/arxiv/figures/ablation_chart.png +0 -0
- modules/control-plane/paper/arxiv/figures/architecture.png +0 -0
- modules/control-plane/paper/arxiv/figures/constraint_graphs.png +0 -0
- modules/control-plane/paper/arxiv/figures/results_chart.png +0 -0
- modules/control-plane/paper/arxiv/main.aux +97 -0
- modules/control-plane/paper/arxiv/main.bbl +112 -0
- modules/control-plane/paper/arxiv/main.blg +48 -0
- modules/control-plane/paper/arxiv/main.out +33 -0
- modules/control-plane/paper/arxiv/main.pdf +0 -0
- modules/control-plane/paper/arxiv/main.tex +479 -0
- modules/control-plane/paper/arxiv/references.bib +234 -0
- modules/control-plane/paper/arxiv_submission.tar +0 -0
- modules/control-plane/paper/arxiv_submission.zip +0 -0
- modules/control-plane/paper/build.sh +68 -0
- modules/control-plane/paper/figures/README.md +47 -0
- modules/control-plane/paper/figures/ablation_chart.pdf +0 -0
- modules/control-plane/paper/figures/ablation_chart.png +0 -0
- modules/control-plane/paper/figures/architecture.pdf +0 -0
- modules/control-plane/paper/figures/architecture.png +0 -0
- modules/control-plane/paper/figures/constraint_graphs.pdf +0 -0
- modules/control-plane/paper/figures/constraint_graphs.png +0 -0
- modules/control-plane/paper/figures/generate_figures.py +252 -0
- modules/control-plane/paper/figures/results_chart.pdf +0 -0
- modules/control-plane/paper/figures/results_chart.png +0 -0
- modules/control-plane/paper/main.md +273 -0
- modules/control-plane/paper/main.tex +214 -0
- modules/control-plane/paper/main_arxiv.aux +53 -0
- modules/control-plane/paper/main_arxiv.out +17 -0
- modules/control-plane/paper/main_arxiv.pdf +0 -0
- modules/control-plane/paper/main_arxiv.tex +264 -0
- modules/control-plane/paper/references.bib +234 -0
- modules/control-plane/pyproject.toml +124 -0
- modules/control-plane/reproducibility/ABLATIONS.md +136 -0
- modules/control-plane/reproducibility/README.md +288 -0
- modules/control-plane/reproducibility/commands.md +467 -0
- modules/control-plane/reproducibility/docker_config/Dockerfile +39 -0
- modules/control-plane/reproducibility/experiment_configs/purge_config.json +46 -0
- modules/control-plane/reproducibility/experiment_configs/rag_config.json +36 -0
- modules/control-plane/reproducibility/hardware_specs.md +317 -0
- modules/control-plane/reproducibility/requirements_frozen.txt +0 -0
- modules/control-plane/reproducibility/run_all_experiments.sh +45 -0
- modules/control-plane/reproducibility/seeds.json +106 -0
- modules/control-plane/scripts/prepare_pypi.py +46 -0
- modules/control-plane/scripts/prepare_release.py +176 -0
- modules/control-plane/scripts/upload_dataset_to_hf.py +316 -0
- modules/control-plane/setup.py +69 -0
- modules/control-plane/src/agent_control_plane/__init__.py +639 -0
- modules/control-plane/src/agent_control_plane/a2a_adapter.py +541 -0
- modules/control-plane/src/agent_control_plane/adapter.py +415 -0
- modules/control-plane/src/agent_control_plane/agent_hibernation.py +364 -0
- modules/control-plane/src/agent_control_plane/agent_kernel.py +464 -0
- modules/control-plane/src/agent_control_plane/compliance.py +718 -0
- modules/control-plane/src/agent_control_plane/constraint_graphs.py +475 -0
- modules/control-plane/src/agent_control_plane/control_plane.py +848 -0
- modules/control-plane/src/agent_control_plane/example_executors.py +193 -0
- modules/control-plane/src/agent_control_plane/execution_engine.py +229 -0
- modules/control-plane/src/agent_control_plane/flight_recorder.py +600 -0
- modules/control-plane/src/agent_control_plane/governance_layer.py +432 -0
- modules/control-plane/src/agent_control_plane/hf_utils.py +561 -0
- modules/control-plane/src/agent_control_plane/interfaces/__init__.py +53 -0
- modules/control-plane/src/agent_control_plane/interfaces/kernel_interface.py +359 -0
- modules/control-plane/src/agent_control_plane/interfaces/plugin_interface.py +495 -0
- modules/control-plane/src/agent_control_plane/interfaces/protocol_interfaces.py +385 -0
- modules/control-plane/src/agent_control_plane/kernel_space.py +707 -0
- modules/control-plane/src/agent_control_plane/langchain_adapter.py +422 -0
- modules/control-plane/src/agent_control_plane/lifecycle.py +3111 -0
- modules/control-plane/src/agent_control_plane/mcp_adapter.py +517 -0
- modules/control-plane/src/agent_control_plane/ml_safety.py +560 -0
- modules/control-plane/src/agent_control_plane/multimodal.py +724 -0
- modules/control-plane/src/agent_control_plane/mute_agent.py +419 -0
- modules/control-plane/src/agent_control_plane/observability.py +785 -0
- modules/control-plane/src/agent_control_plane/orchestrator.py +480 -0
- modules/control-plane/src/agent_control_plane/plugin_registry.py +748 -0
- modules/control-plane/src/agent_control_plane/policy_engine.py +525 -0
- modules/control-plane/src/agent_control_plane/shadow_mode.py +307 -0
- modules/control-plane/src/agent_control_plane/signals.py +491 -0
- modules/control-plane/src/agent_control_plane/supervisor_agents.py +427 -0
- modules/control-plane/src/agent_control_plane/time_travel_debugger.py +554 -0
- modules/control-plane/src/agent_control_plane/tool_registry.py +350 -0
- modules/control-plane/src/agent_control_plane/vfs.py +695 -0
- modules/control-plane/tests/README.md +33 -0
- modules/control-plane/tests/test_a2a_adapter.py +336 -0
- modules/control-plane/tests/test_adapter.py +422 -0
- modules/control-plane/tests/test_advanced_features.py +389 -0
- modules/control-plane/tests/test_benchmark.py +223 -0
- modules/control-plane/tests/test_compliance.py +214 -0
- modules/control-plane/tests/test_control_plane.py +295 -0
- modules/control-plane/tests/test_hibernation.py +274 -0
- modules/control-plane/tests/test_kernel_interception.py +284 -0
- modules/control-plane/tests/test_langchain_adapter.py +258 -0
- modules/control-plane/tests/test_lifecycle.py +1174 -0
- modules/control-plane/tests/test_mcp_adapter.py +293 -0
- modules/control-plane/tests/test_ml_safety.py +142 -0
- modules/control-plane/tests/test_multimodal.py +317 -0
- modules/control-plane/tests/test_new_features.py +435 -0
- modules/control-plane/tests/test_observability.py +338 -0
- modules/control-plane/tests/test_time_travel.py +387 -0
- modules/emk/.github/workflows/ci.yml +105 -0
- modules/emk/.github/workflows/publish.yml +144 -0
- modules/emk/.gitignore +74 -0
- modules/emk/CHANGELOG.md +41 -0
- modules/emk/CONTRIBUTING.md +295 -0
- modules/emk/IMPLEMENTATION.md +174 -0
- modules/emk/LICENSE +21 -0
- modules/emk/MANIFEST.in +8 -0
- modules/emk/README.md +135 -0
- modules/emk/RELEASE_NOTES.md +82 -0
- modules/emk/SECURITY.md +52 -0
- modules/emk/codecov.yml +39 -0
- modules/emk/docs/MEMORY_MANAGEMENT.md +285 -0
- modules/emk/emk/__init__.py +106 -0
- modules/emk/emk/hf_utils.py +419 -0
- modules/emk/emk/indexer.py +144 -0
- modules/emk/emk/py.typed +0 -0
- modules/emk/emk/schema.py +204 -0
- modules/emk/emk/sleep_cycle.py +345 -0
- modules/emk/emk/store.py +479 -0
- modules/emk/examples/basic_usage.py +123 -0
- modules/emk/examples/memory_features_demo.py +154 -0
- modules/emk/experiments/README.md +59 -0
- modules/emk/experiments/reproduce_results.py +461 -0
- modules/emk/experiments/results.json +61 -0
- modules/emk/paper/structure.tex +192 -0
- modules/emk/paper/whitepaper.md +273 -0
- modules/emk/pyproject.toml +91 -0
- modules/emk/setup.py +5 -0
- modules/emk/tests/test_file_adapter.py +195 -0
- modules/emk/tests/test_indexer.py +174 -0
- modules/emk/tests/test_init.py +55 -0
- modules/emk/tests/test_negative_memory.py +83 -0
- modules/emk/tests/test_schema.py +150 -0
- modules/emk/tests/test_semantic_rules.py +175 -0
- modules/emk/tests/test_sleep_cycle.py +335 -0
- modules/emk/tests/test_store_anti_patterns.py +239 -0
- modules/iatp/.github/workflows/docker-build.yml +124 -0
- modules/iatp/.github/workflows/publish.yml +174 -0
- modules/iatp/.github/workflows/python-package.yml +121 -0
- modules/iatp/.gitignore +67 -0
- modules/iatp/.pre-commit-config.yaml +64 -0
- modules/iatp/CHANGELOG.md +120 -0
- modules/iatp/Dockerfile +91 -0
- modules/iatp/IMPLEMENTATION_SUMMARY.md +218 -0
- modules/iatp/MANIFEST.in +9 -0
- modules/iatp/README.md +180 -0
- modules/iatp/docker/Dockerfile.agent +27 -0
- modules/iatp/docker/Dockerfile.sidecar-python +86 -0
- modules/iatp/docker/README.md +258 -0
- modules/iatp/docker-compose.yml +194 -0
- modules/iatp/docs/ARCHITECTURE.md +243 -0
- modules/iatp/docs/CLI_GUIDE.md +220 -0
- modules/iatp/docs/DEPLOYMENT.md +304 -0
- modules/iatp/examples/README.md +132 -0
- modules/iatp/examples/backend_agent.py +39 -0
- modules/iatp/examples/client.py +168 -0
- modules/iatp/examples/demo_attestation_reputation.py +274 -0
- modules/iatp/examples/demo_client.py +240 -0
- modules/iatp/examples/demo_rbac.py +143 -0
- modules/iatp/examples/integration_demo.py +245 -0
- modules/iatp/examples/manifests/coder_agent.json +20 -0
- modules/iatp/examples/manifests/reviewer_agent.json +19 -0
- modules/iatp/examples/manifests/secure_bank.json +14 -0
- modules/iatp/examples/manifests/standard_agent.json +14 -0
- modules/iatp/examples/manifests/untrusted_honeypot.json +14 -0
- modules/iatp/examples/run_secure_bank_sidecar.py +85 -0
- modules/iatp/examples/run_sidecar.py +105 -0
- modules/iatp/examples/run_untrusted_sidecar.py +77 -0
- modules/iatp/examples/secure_bank_agent.py +138 -0
- modules/iatp/examples/test_untrusted.py +82 -0
- modules/iatp/examples/untrusted_agent.py +119 -0
- modules/iatp/experiments/README.md +58 -0
- modules/iatp/experiments/cascading_hallucination/README.md +149 -0
- modules/iatp/experiments/cascading_hallucination/agent_a_user.py +41 -0
- modules/iatp/experiments/cascading_hallucination/agent_b_summarizer.py +54 -0
- modules/iatp/experiments/cascading_hallucination/agent_c_database.py +47 -0
- modules/iatp/experiments/cascading_hallucination/proof_of_concept.py +290 -0
- modules/iatp/experiments/cascading_hallucination/run_experiment.py +226 -0
- modules/iatp/experiments/cascading_hallucination/sidecar_c.py +61 -0
- modules/iatp/experiments/reproduce_results.py +574 -0
- modules/iatp/experiments/results.json +2336 -0
- modules/iatp/iatp/__init__.py +164 -0
- modules/iatp/iatp/attestation.py +401 -0
- modules/iatp/iatp/cli.py +253 -0
- modules/iatp/iatp/hf_utils.py +469 -0
- modules/iatp/iatp/ipc_pipes.py +578 -0
- modules/iatp/iatp/main.py +410 -0
- modules/iatp/iatp/models/__init__.py +445 -0
- modules/iatp/iatp/policy_engine.py +335 -0
- modules/iatp/iatp/py.typed +2 -0
- modules/iatp/iatp/recovery.py +319 -0
- modules/iatp/iatp/security/__init__.py +268 -0
- modules/iatp/iatp/sidecar/__init__.py +517 -0
- modules/iatp/iatp/telemetry/__init__.py +162 -0
- modules/iatp/iatp/tests/__init__.py +1 -0
- modules/iatp/iatp/tests/test_attestation.py +368 -0
- modules/iatp/iatp/tests/test_cli.py +129 -0
- modules/iatp/iatp/tests/test_models.py +128 -0
- modules/iatp/iatp/tests/test_policy_engine.py +345 -0
- modules/iatp/iatp/tests/test_recovery.py +279 -0
- modules/iatp/iatp/tests/test_security.py +220 -0
- modules/iatp/iatp/tests/test_sidecar.py +165 -0
- modules/iatp/iatp/tests/test_telemetry.py +173 -0
- modules/iatp/paper/BLOG.md +307 -0
- modules/iatp/paper/PAPER.md +236 -0
- modules/iatp/paper/RFC_SUBMISSION.md +299 -0
- modules/iatp/paper/whitepaper.md +369 -0
- modules/iatp/proto/README.md +200 -0
- modules/iatp/proto/generate_stubs.py +81 -0
- modules/iatp/proto/iatp.proto +552 -0
- modules/iatp/pyproject.toml +180 -0
- modules/iatp/requirements-dev.txt +2 -0
- modules/iatp/requirements.txt +6 -0
- modules/iatp/setup.py +60 -0
- modules/iatp/sidecar/README.md +487 -0
- modules/iatp/sidecar/go/Dockerfile +32 -0
- modules/iatp/sidecar/go/README.md +237 -0
- modules/iatp/sidecar/go/go.mod +8 -0
- modules/iatp/sidecar/go/main.go +488 -0
- modules/iatp/spec/001-handshake.md +436 -0
- modules/iatp/spec/002-reversibility.md +394 -0
- modules/iatp/spec/schema/capability_manifest.json +266 -0
- modules/iatp/test_integration.py +310 -0
- modules/mcp-kernel-server/README.md +261 -0
- modules/mcp-kernel-server/pyproject.toml +60 -0
- modules/mcp-kernel-server/src/mcp_kernel_server/__init__.py +26 -0
- modules/mcp-kernel-server/src/mcp_kernel_server/cli.py +229 -0
- modules/mcp-kernel-server/src/mcp_kernel_server/resources.py +215 -0
- modules/mcp-kernel-server/src/mcp_kernel_server/server.py +562 -0
- modules/mcp-kernel-server/src/mcp_kernel_server/tools.py +1172 -0
- modules/mute-agent/.github/workflows/safety_check.yml +45 -0
- modules/mute-agent/.gitignore +53 -0
- modules/mute-agent/ARCHITECTURE.md +531 -0
- modules/mute-agent/BENCHMARK_GUIDE.md +384 -0
- modules/mute-agent/COMPLETION_SUMMARY.md +293 -0
- modules/mute-agent/EXPERIMENT_SUMMARY.md +318 -0
- modules/mute-agent/IMPLEMENTATION_SUMMARY.md +212 -0
- modules/mute-agent/LICENSE +21 -0
- modules/mute-agent/PHASE3_SUMMARY.md +297 -0
- modules/mute-agent/README.md +360 -0
- modules/mute-agent/STEEL_MAN_RESULTS.md +353 -0
- modules/mute-agent/USAGE.md +505 -0
- modules/mute-agent/V2_IMPLEMENTATION_SUMMARY.md +253 -0
- modules/mute-agent/V2_STEEL_MAN_IMPLEMENTATION.md +274 -0
- modules/mute-agent/VERIFICATION_REPORT.md +435 -0
- modules/mute-agent/charts/cost_comparison.png +0 -0
- modules/mute-agent/charts/cost_vs_ambiguity.png +0 -0
- modules/mute-agent/charts/metrics_comparison.png +0 -0
- modules/mute-agent/charts/scenario_breakdown.png +0 -0
- modules/mute-agent/charts/trace_attack_blocked.html +140 -0
- modules/mute-agent/charts/trace_attack_blocked.png +0 -0
- modules/mute-agent/charts/trace_failure.html +140 -0
- modules/mute-agent/charts/trace_failure.png +0 -0
- modules/mute-agent/charts/trace_success.html +140 -0
- modules/mute-agent/charts/trace_success.png +0 -0
- modules/mute-agent/examples/__init__.py +1 -0
- modules/mute-agent/examples/advanced_example.py +384 -0
- modules/mute-agent/examples/graph_debugger_demo.py +241 -0
- modules/mute-agent/examples/listener_example.py +297 -0
- modules/mute-agent/examples/simple_example.py +242 -0
- modules/mute-agent/examples/steel_man_demo.py +297 -0
- modules/mute-agent/experiments/README.md +135 -0
- modules/mute-agent/experiments/__init__.py +3 -0
- modules/mute-agent/experiments/agent_comparison.csv +6 -0
- modules/mute-agent/experiments/agent_comparison_50runs.csv +6 -0
- modules/mute-agent/experiments/ambiguity_test.py +335 -0
- modules/mute-agent/experiments/ambiguity_test_results.csv +31 -0
- modules/mute-agent/experiments/ambiguity_test_results_50runs.csv +51 -0
- modules/mute-agent/experiments/baseline_agent.py +189 -0
- modules/mute-agent/experiments/benchmark.py +402 -0
- modules/mute-agent/experiments/demo.py +172 -0
- modules/mute-agent/experiments/generate_cost_curve.py +474 -0
- modules/mute-agent/experiments/jailbreak_test.py +137 -0
- modules/mute-agent/experiments/latent_state_scenario.py +361 -0
- modules/mute-agent/experiments/mute_agent_experiment.py +349 -0
- modules/mute-agent/experiments/run_extended_experiment.py +40 -0
- modules/mute-agent/experiments/run_v2_experiments.py +266 -0
- modules/mute-agent/experiments/run_v2_experiments_auto.py +247 -0
- modules/mute-agent/experiments/v2_scenarios/README.md +214 -0
- modules/mute-agent/experiments/v2_scenarios/__init__.py +4 -0
- modules/mute-agent/experiments/v2_scenarios/scenario_1_deep_dependency.py +325 -0
- modules/mute-agent/experiments/v2_scenarios/scenario_2_adversarial.py +328 -0
- modules/mute-agent/experiments/v2_scenarios/scenario_3_false_positive.py +303 -0
- modules/mute-agent/experiments/v2_scenarios/scenario_4_performance.py +319 -0
- modules/mute-agent/experiments/visualize.py +400 -0
- modules/mute-agent/mute_agent/__init__.py +66 -0
- modules/mute-agent/mute_agent/core/__init__.py +1 -0
- modules/mute-agent/mute_agent/core/execution_agent.py +164 -0
- modules/mute-agent/mute_agent/core/handshake_protocol.py +199 -0
- modules/mute-agent/mute_agent/core/reasoning_agent.py +236 -0
- modules/mute-agent/mute_agent/knowledge_graph/__init__.py +1 -0
- modules/mute-agent/mute_agent/knowledge_graph/graph_elements.py +63 -0
- modules/mute-agent/mute_agent/knowledge_graph/multidimensional_graph.py +168 -0
- modules/mute-agent/mute_agent/knowledge_graph/subgraph.py +222 -0
- modules/mute-agent/mute_agent/listener/__init__.py +41 -0
- modules/mute-agent/mute_agent/listener/adapters/__init__.py +29 -0
- modules/mute-agent/mute_agent/listener/adapters/base_adapter.py +187 -0
- modules/mute-agent/mute_agent/listener/adapters/caas_adapter.py +342 -0
- modules/mute-agent/mute_agent/listener/adapters/control_plane_adapter.py +434 -0
- modules/mute-agent/mute_agent/listener/adapters/iatp_adapter.py +330 -0
- modules/mute-agent/mute_agent/listener/adapters/scak_adapter.py +249 -0
- modules/mute-agent/mute_agent/listener/listener.py +608 -0
- modules/mute-agent/mute_agent/listener/state_observer.py +434 -0
- modules/mute-agent/mute_agent/listener/threshold_config.py +311 -0
- modules/mute-agent/mute_agent/super_system/__init__.py +1 -0
- modules/mute-agent/mute_agent/super_system/router.py +202 -0
- modules/mute-agent/mute_agent/visualization/__init__.py +8 -0
- modules/mute-agent/mute_agent/visualization/graph_debugger.py +495 -0
- modules/mute-agent/requirements-dev.txt +6 -0
- modules/mute-agent/requirements.txt +9 -0
- modules/mute-agent/setup.py +64 -0
- modules/mute-agent/src/__init__.py +0 -0
- modules/mute-agent/src/agents/__init__.py +0 -0
- modules/mute-agent/src/agents/baseline_agent.py +524 -0
- modules/mute-agent/src/agents/interactive_agent.py +113 -0
- modules/mute-agent/src/agents/mute_agent.py +622 -0
- modules/mute-agent/src/benchmarks/__init__.py +0 -0
- modules/mute-agent/src/benchmarks/evaluator.py +481 -0
- modules/mute-agent/src/benchmarks/scenarios.json +985 -0
- modules/mute-agent/src/core/__init__.py +0 -0
- modules/mute-agent/src/core/mock_state.py +320 -0
- modules/mute-agent/src/core/tools.py +441 -0
- modules/nexus/__init__.py +49 -0
- modules/nexus/arbiter.py +357 -0
- modules/nexus/client.py +464 -0
- modules/nexus/dmz.py +417 -0
- modules/nexus/escrow.py +428 -0
- modules/nexus/exceptions.py +284 -0
- modules/nexus/registry.py +391 -0
- modules/nexus/reputation.py +423 -0
- modules/nexus/schemas/__init__.py +49 -0
- modules/nexus/schemas/compliance.py +274 -0
- modules/nexus/schemas/escrow.py +249 -0
- modules/nexus/schemas/manifest.py +223 -0
- modules/nexus/schemas/receipt.py +206 -0
- modules/observability/README.md +192 -0
- modules/observability/alertmanager/alertmanager.yml +116 -0
- modules/observability/alerts/agent-os-alerts.yaml +197 -0
- modules/observability/docker-compose.yml +128 -0
- modules/observability/grafana/dashboards/agent-os-amb.json +448 -0
- modules/observability/grafana/dashboards/agent-os-cmvk.json +441 -0
- modules/observability/grafana/dashboards/agent-os-overview.json +268 -0
- modules/observability/grafana/dashboards/agent-os-performance.json +15 -0
- modules/observability/grafana/dashboards/agent-os-safety.json +50 -0
- modules/observability/grafana/provisioning/dashboards/dashboards.yml +15 -0
- modules/observability/grafana/provisioning/datasources/datasources.yml +33 -0
- modules/observability/otel/otel-collector-config.yml +61 -0
- modules/observability/prometheus/prometheus.yml +63 -0
- modules/observability/pyproject.toml +53 -0
- modules/observability/scripts/export_dashboards.py +55 -0
- modules/observability/src/agent_os_observability/__init__.py +25 -0
- modules/observability/src/agent_os_observability/dashboards.py +896 -0
- modules/observability/src/agent_os_observability/metrics.py +396 -0
- modules/observability/src/agent_os_observability/server.py +221 -0
- modules/observability/src/agent_os_observability/tracer.py +226 -0
- modules/primitives/.gitignore +8 -0
- modules/primitives/README.md +62 -0
- modules/primitives/agent_primitives/__init__.py +22 -0
- modules/primitives/agent_primitives/failures.py +82 -0
- modules/primitives/agent_primitives/py.typed +0 -0
- modules/primitives/pyproject.toml +68 -0
- modules/scak/.github/copilot-instructions.md +396 -0
- modules/scak/.github/workflows/release.yml +117 -0
- modules/scak/.gitignore +32 -0
- modules/scak/CHANGELOG.md +173 -0
- modules/scak/CITATION.cff +62 -0
- modules/scak/CONTRIBUTING.md +429 -0
- modules/scak/Dockerfile +58 -0
- modules/scak/ENTERPRISE_FEATURES.md +518 -0
- modules/scak/IMPLEMENTATION_SUMMARY.md +206 -0
- modules/scak/LIMITATIONS.md +565 -0
- modules/scak/MANIFEST.in +16 -0
- modules/scak/NOVELTY.md +535 -0
- modules/scak/README.md +928 -0
- modules/scak/RESEARCH.md +670 -0
- modules/scak/agent_kernel/__init__.py +66 -0
- modules/scak/agent_kernel/analyzer.py +432 -0
- modules/scak/agent_kernel/auditor.py +31 -0
- modules/scak/agent_kernel/completeness_auditor.py +234 -0
- modules/scak/agent_kernel/detector.py +200 -0
- modules/scak/agent_kernel/kernel.py +741 -0
- modules/scak/agent_kernel/memory_manager.py +82 -0
- modules/scak/agent_kernel/models.py +372 -0
- modules/scak/agent_kernel/nudge_mechanism.py +260 -0
- modules/scak/agent_kernel/outcome_analyzer.py +335 -0
- modules/scak/agent_kernel/patcher.py +579 -0
- modules/scak/agent_kernel/semantic_analyzer.py +313 -0
- modules/scak/agent_kernel/semantic_purge.py +346 -0
- modules/scak/agent_kernel/simulator.py +447 -0
- modules/scak/agent_kernel/teacher.py +82 -0
- modules/scak/agent_kernel/triage.py +149 -0
- modules/scak/build_and_publish.ps1 +74 -0
- modules/scak/build_and_publish.sh +74 -0
- modules/scak/cli.py +471 -0
- modules/scak/dashboard.py +462 -0
- modules/scak/datasets/DATASET_CARD.md +219 -0
- modules/scak/datasets/README.md +143 -0
- modules/scak/datasets/gaia_vague_queries/vague_queries.json +262 -0
- modules/scak/datasets/hf_upload/README.md +219 -0
- modules/scak/datasets/hf_upload/scak_gaia_laziness.jsonl +50 -0
- modules/scak/datasets/prepare_hf_datasets.py +145 -0
- modules/scak/datasets/red_team/jailbreak_patterns.json +202 -0
- modules/scak/docker-compose.yml +99 -0
- modules/scak/docs/Adaptive-Memory-Hierarchy.md +319 -0
- modules/scak/docs/Data-Contracts-and-Schemas.md +285 -0
- modules/scak/docs/Dual-Loop-Architecture.md +344 -0
- modules/scak/docs/Enhanced-Features.md +612 -0
- modules/scak/docs/LANGCHAIN_INTEGRATION.md +572 -0
- modules/scak/docs/README.md +128 -0
- modules/scak/docs/Reference-Implementations.md +163 -0
- modules/scak/docs/SCAK_V2.md +374 -0
- modules/scak/docs/Three-Failure-Types.md +178 -0
- modules/scak/examples/basic_example.py +155 -0
- modules/scak/examples/circuit_breaker_lazy_eval_demo.py +243 -0
- modules/scak/examples/langchain_integration_example.py +339 -0
- modules/scak/examples/layer4_demo.py +243 -0
- modules/scak/examples/production_features_demo.py +353 -0
- modules/scak/examples/quick_demo.py +79 -0
- modules/scak/examples/scak_v2_demo.py +252 -0
- modules/scak/experiments/README.md +438 -0
- modules/scak/experiments/ablation_studies/README.md +192 -0
- modules/scak/experiments/ablation_studies/ablation_no_audit.py +116 -0
- modules/scak/experiments/ablation_studies/ablation_no_purge.py +133 -0
- modules/scak/experiments/chaos_engineering/README.md +332 -0
- modules/scak/experiments/context_efficiency_test.py +328 -0
- modules/scak/experiments/gaia_benchmark/README.md +208 -0
- modules/scak/experiments/laziness_benchmark.py +179 -0
- modules/scak/experiments/long_horizon_task_experiment.py +252 -0
- modules/scak/experiments/multi_agent_rag_experiment.py +284 -0
- modules/scak/experiments/results/ablation_table.md +12 -0
- modules/scak/experiments/results/long_horizon.json +36 -0
- modules/scak/experiments/results/multi_agent_rag.json +66 -0
- modules/scak/experiments/run_comprehensive_ablations.py +332 -0
- modules/scak/experiments/test_auditor_patcher_integration.py +251 -0
- modules/scak/notebooks/getting_started.ipynb +33 -0
- modules/scak/paper/ARXIV_SUBMISSION_METADATA.txt +109 -0
- modules/scak/paper/PAPER_CHECKLIST.md +304 -0
- modules/scak/paper/Paper.pdf +0 -0
- modules/scak/paper/README.md +113 -0
- modules/scak/paper/appendix.md +351 -0
- modules/scak/paper/arxiv/bibliography.bib +284 -0
- modules/scak/paper/arxiv/fig1_ooda_architecture.pdf +0 -0
- modules/scak/paper/arxiv/fig2_memory_hierarchy.pdf +0 -0
- modules/scak/paper/arxiv/fig3_gaia_results.pdf +0 -0
- modules/scak/paper/arxiv/fig4_ablation_heatmap.pdf +0 -0
- modules/scak/paper/arxiv/fig5_context_reduction.pdf +0 -0
- modules/scak/paper/arxiv/fig6_mttr_boxplot.pdf +0 -0
- modules/scak/paper/arxiv/main.aux +103 -0
- modules/scak/paper/arxiv/main.bbl +113 -0
- modules/scak/paper/arxiv/main.blg +55 -0
- modules/scak/paper/arxiv/main.out +31 -0
- modules/scak/paper/arxiv/main.pdf +0 -0
- modules/scak/paper/arxiv/main.tex +482 -0
- modules/scak/paper/arxiv_submission/bibliography.bib +284 -0
- modules/scak/paper/arxiv_submission/fig1_ooda_architecture.pdf +0 -0
- modules/scak/paper/arxiv_submission/fig2_memory_hierarchy.pdf +0 -0
- modules/scak/paper/arxiv_submission/fig3_gaia_results.pdf +0 -0
- modules/scak/paper/arxiv_submission/fig4_ablation_heatmap.pdf +0 -0
- modules/scak/paper/arxiv_submission/fig5_context_reduction.pdf +0 -0
- modules/scak/paper/arxiv_submission/fig6_mttr_boxplot.pdf +0 -0
- modules/scak/paper/arxiv_submission/main.aux +103 -0
- modules/scak/paper/arxiv_submission/main.bbl +113 -0
- modules/scak/paper/arxiv_submission/main.blg +55 -0
- modules/scak/paper/arxiv_submission/main.out +31 -0
- modules/scak/paper/arxiv_submission/main.pdf +0 -0
- modules/scak/paper/arxiv_submission/main.tex +482 -0
- modules/scak/paper/arxiv_submission.tar.gz +0 -0
- modules/scak/paper/bibliography.bib +284 -0
- modules/scak/paper/build.sh +55 -0
- modules/scak/paper/figures/README.md +32 -0
- modules/scak/paper/figures/fig1_ooda_architecture.md +75 -0
- modules/scak/paper/figures/fig1_ooda_architecture.pdf +0 -0
- modules/scak/paper/figures/fig1_ooda_architecture.png +0 -0
- modules/scak/paper/figures/fig2_memory_hierarchy.md +83 -0
- modules/scak/paper/figures/fig2_memory_hierarchy.pdf +0 -0
- modules/scak/paper/figures/fig2_memory_hierarchy.png +0 -0
- modules/scak/paper/figures/fig3_gaia_results.md +64 -0
- modules/scak/paper/figures/fig3_gaia_results.pdf +0 -0
- modules/scak/paper/figures/fig3_gaia_results.png +0 -0
- modules/scak/paper/figures/fig4_ablation_heatmap.md +64 -0
- modules/scak/paper/figures/fig4_ablation_heatmap.pdf +0 -0
- modules/scak/paper/figures/fig4_ablation_heatmap.png +0 -0
- modules/scak/paper/figures/fig5_context_reduction.md +71 -0
- modules/scak/paper/figures/fig5_context_reduction.pdf +0 -0
- modules/scak/paper/figures/fig5_context_reduction.png +0 -0
- modules/scak/paper/figures/fig6_mttr_boxplot.md +80 -0
- modules/scak/paper/figures/fig6_mttr_boxplot.pdf +0 -0
- modules/scak/paper/figures/fig6_mttr_boxplot.png +0 -0
- modules/scak/paper/figures/generate_figures.py +463 -0
- modules/scak/paper/main.aux +103 -0
- modules/scak/paper/main.bbl +113 -0
- modules/scak/paper/main.blg +55 -0
- modules/scak/paper/main.md +192 -0
- modules/scak/paper/main.out +31 -0
- modules/scak/paper/main.pdf +0 -0
- modules/scak/paper/main.tex +482 -0
- modules/scak/reproducibility/ABLATIONS.md +225 -0
- modules/scak/reproducibility/Dockerfile.reproducibility +34 -0
- modules/scak/reproducibility/README.md +421 -0
- modules/scak/reproducibility/requirements-pinned.txt +32 -0
- modules/scak/reproducibility/run_all_experiments.py +395 -0
- modules/scak/reproducibility/seed_control.py +53 -0
- modules/scak/reproducibility/statistical_analysis.py +302 -0
- modules/scak/requirements.txt +50 -0
- modules/scak/setup.py +93 -0
- modules/scak/src/__init__.py +124 -0
- modules/scak/src/agents/__init__.py +13 -0
- modules/scak/src/agents/conflict_resolution.py +732 -0
- modules/scak/src/agents/orchestrator.py +761 -0
- modules/scak/src/agents/pubsub.py +484 -0
- modules/scak/src/agents/shadow_teacher.py +344 -0
- modules/scak/src/agents/swarm.py +661 -0
- modules/scak/src/agents/worker.py +357 -0
- modules/scak/src/integrations/__init__.py +81 -0
- modules/scak/src/integrations/cmvk_adapter.py +430 -0
- modules/scak/src/integrations/control_plane_adapter.py +601 -0
- modules/scak/src/integrations/langchain_integration.py +902 -0
- modules/scak/src/interfaces/__init__.py +59 -0
- modules/scak/src/interfaces/llm_clients.py +505 -0
- modules/scak/src/interfaces/openapi_tools.py +611 -0
- modules/scak/src/interfaces/plugin_system.py +605 -0
- modules/scak/src/interfaces/protocols.py +365 -0
- modules/scak/src/interfaces/telemetry.py +464 -0
- modules/scak/src/interfaces/tool_registry.py +547 -0
- modules/scak/src/kernel/__init__.py +100 -0
- modules/scak/src/kernel/auditor.py +305 -0
- modules/scak/src/kernel/circuit_breaker.py +398 -0
- modules/scak/src/kernel/core.py +724 -0
- modules/scak/src/kernel/distributed.py +667 -0
- modules/scak/src/kernel/evolution.py +455 -0
- modules/scak/src/kernel/failover.py +621 -0
- modules/scak/src/kernel/governance.py +710 -0
- modules/scak/src/kernel/governance_v2.py +603 -0
- modules/scak/src/kernel/lazy_evaluator.py +514 -0
- modules/scak/src/kernel/load_testing.py +633 -0
- modules/scak/src/kernel/memory.py +945 -0
- modules/scak/src/kernel/patcher.py +581 -0
- modules/scak/src/kernel/rubric.py +419 -0
- modules/scak/src/kernel/schemas.py +390 -0
- modules/scak/src/kernel/skill_mapper.py +309 -0
- modules/scak/src/kernel/triage.py +149 -0
- modules/scak/src/mocks/__init__.py +99 -0
- modules/scak/tests/__init__.py +1 -0
- modules/scak/tests/test_circuit_breaker.py +403 -0
- modules/scak/tests/test_conflict_resolution.py +287 -0
- modules/scak/tests/test_dual_loop.py +463 -0
- modules/scak/tests/test_enhanced_features.py +421 -0
- modules/scak/tests/test_failover_and_load.py +438 -0
- modules/scak/tests/test_governance.py +185 -0
- modules/scak/tests/test_kernel.py +359 -0
- modules/scak/tests/test_langchain_integration.py +451 -0
- modules/scak/tests/test_lazy_evaluator.py +465 -0
- modules/scak/tests/test_llm_clients.py +122 -0
- modules/scak/tests/test_memory_controller.py +528 -0
- modules/scak/tests/test_orchestrator.py +181 -0
- modules/scak/tests/test_phase3_integration.py +265 -0
- modules/scak/tests/test_pubsub_swarm.py +203 -0
- modules/scak/tests/test_reference_implementations.py +240 -0
- modules/scak/tests/test_rubric.py +363 -0
- modules/scak/tests/test_scak_v2.py +651 -0
- modules/scak/tests/test_skill_mapper.py +217 -0
- modules/scak/tests/test_specific_failures.py +393 -0
- modules/scak/tests/test_tool_registry.py +264 -0
- modules/scak/tests/test_tools_and_plugins.py +303 -0
- modules/scak/tests/test_triage.py +596 -0
- modules/scak/tests/test_write_through.py +319 -0
- agent_os_kernel-1.1.0.dist-info/METADATA +0 -400
- agent_os_kernel-1.1.0.dist-info/RECORD +0 -12
- {agent_os_kernel-1.1.0.dist-info → agent_os_kernel-1.2.0.dist-info}/WHEEL +0 -0
- {agent_os_kernel-1.1.0.dist-info → agent_os_kernel-1.2.0.dist-info}/licenses/LICENSE +0 -0
@@ -0,0 +1,296 @@
# Agent Control Plane vs AIOS: Architecture Comparison

## Executive Summary

| Aspect | AIOS (AGI Research) | Agent Control Plane |
|--------|---------------------|---------------------|
| **Primary Focus** | Efficiency (throughput, latency) | Safety (policy enforcement, audit) |
| **Target Audience** | Researchers, ML Engineers | Enterprise, Production Systems |
| **Kernel Philosophy** | Resource optimization | Security boundary |
| **Failure Mode** | Graceful degradation | Kernel panic on violation |
| **Policy Enforcement** | Optional/configurable | Mandatory, kernel-level |
| **Paper Venue** | COLM 2025 | ASPLOS 2026 (target) |

---
## Detailed Comparison

### 1. Kernel Architecture

#### AIOS Kernel
```
┌─────────────────────────────────────┐
│             AIOS Kernel             │
├─────────────────────────────────────┤
│  ┌─────────┐  ┌─────────────────┐   │
│  │Scheduler│  │ Context Manager │   │
│  └─────────┘  └─────────────────┘   │
│  ┌─────────┐  ┌─────────────────┐   │
│  │Memory   │  │  Tool Manager   │   │
│  │Manager  │  │                 │   │
│  └─────────┘  └─────────────────┘   │
│  ┌─────────────────────────────────┐│
│  │    Access Control (Optional)    ││
│  └─────────────────────────────────┘│
└─────────────────────────────────────┘
```

**Focus**: GPU utilization, FIFO/Round-Robin scheduling, context switching

#### Agent Control Plane Kernel
```
┌─────────────────────────────────────┐
│        Kernel Space (Ring 0)        │
│  ┌─────────────────────────────────┐│
│  │    Policy Engine (Mandatory)    ││
│  └─────────────────────────────────┘│
│  ┌─────────┐  ┌─────────────────┐   │
│  │ Flight  │  │     Signal      │   │
│  │Recorder │  │   Dispatcher    │   │
│  └─────────┘  └─────────────────┘   │
│  ┌─────────┐  ┌─────────────────┐   │
│  │  VFS    │  │   IPC Router    │   │
│  │ Manager │  │                 │   │
│  └─────────┘  └─────────────────┘   │
├─────────────────────────────────────┤
│         User Space (Ring 3)         │
│  ┌─────────────────────────────────┐│
│  │    LLM Generation (Isolated)    ││
│  │    Tool Execution               ││
│  │    Agent Logic                  ││
│  └─────────────────────────────────┘│
└─────────────────────────────────────┘
```

**Focus**: Isolation, policy enforcement, audit trail, crash containment

---
### 2. Key Differentiators

| Feature | AIOS | Agent Control Plane |
|---------|------|---------------------|
| **Scheduling** | FIFO, Round-Robin, Priority | Policy-based, Safety-first |
| **Context Switching** | Performance optimized | Checkpoint + Rollback |
| **Memory Model** | Short-term + Long-term | VFS with mount points |
| **Signal Handling** | None | POSIX-style (SIGSTOP, SIGKILL, etc.) |
| **Policy Violation** | Log and continue | Kernel panic (0% tolerance) |
| **Crash Isolation** | Same process | Kernel survives user crashes |
| **IPC** | Function calls | Typed pipes with policy check |
| **Audit** | Logging | Flight recorder (black box) |
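
The "Checkpoint + Rollback" row can be pictured with a small sketch. `CheckpointStore` and its methods are hypothetical names invented for this illustration, not the Agent Control Plane API, and the sketch assumes agent state is a plain dictionary. Snapshots are deep-copied on the way in and on the way out, so neither later mutations nor a restored state can corrupt a stored checkpoint.

```python
import copy


class CheckpointStore:
    """Hypothetical illustration: snapshot agent state and roll it back."""

    def __init__(self):
        self._snapshots = []

    def checkpoint(self, state):
        # Deep-copy so later mutations cannot alter the snapshot.
        self._snapshots.append(copy.deepcopy(state))
        return len(self._snapshots) - 1  # checkpoint id

    def rollback(self, checkpoint_id):
        # Return a fresh copy so the stored snapshot stays pristine.
        return copy.deepcopy(self._snapshots[checkpoint_id])


store = CheckpointStore()
state = {"step": 1, "notes": ["plan drafted"]}
cp = store.checkpoint(state)

# The agent proceeds and mutates its state...
state["step"] = 2
state["notes"].append("risky action taken")

# ...then the state is restored to the snapshot.
state = store.rollback(cp)
```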

---

### 3. Why Safety Over Efficiency?

#### The Enterprise Reality

**AIOS Approach**:
> "If an agent is slow, optimize it. If it fails, retry it."

**Our Approach**:
> "If an agent violates policy, kill it immediately. No exceptions."

#### Use Case: Financial Services

```python
# AIOS: Efficiency-first
async def transfer_money(agent, amount):
    # AIOS focuses on throughput
    result = await agent.execute(f"Transfer ${amount}")
    return result  # Hope nothing went wrong

# Agent Control Plane: Safety-first
async def transfer_money(kernel, agent_ctx, amount):
    # Policy check BEFORE execution
    allowed = await agent_ctx.check_policy("transfer", f"amount={amount}")
    if not allowed:
        # Kernel panic - cannot proceed
        raise PolicyViolation("Transfer exceeds limit")

    # Execute with full audit trail
    result = await agent_ctx.syscall(SyscallType.SYS_EXEC,
        tool="transfer",
        args={"amount": amount}
    )
    # Flight recorder has everything
    return result
```
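
The check-then-execute pattern above depends on names the snippet does not define. As a rough, self-contained sketch of the same idea, the code below uses stand-ins: `ToyAgentContext`, its `transfer_limit` parameter, and the `flight_log` list are all assumptions made for illustration, and `PolicyViolation` is redefined locally rather than imported from the package. Every decision, including a denial, is appended to the log before anything else happens.

```python
import asyncio


class PolicyViolation(Exception):
    """Raised when a request is denied by policy (local stand-in)."""


class ToyAgentContext:
    """Illustrative stand-in for an agent context; not the real API."""

    def __init__(self, transfer_limit):
        self.transfer_limit = transfer_limit
        self.flight_log = []  # append-only record of every decision

    async def check_policy(self, action, amount):
        allowed = action == "transfer" and amount <= self.transfer_limit
        self.flight_log.append(("policy", action, amount, allowed))
        return allowed

    async def syscall(self, tool, args):
        self.flight_log.append(("exec", tool, args))
        return {"status": "ok", **args}


async def transfer_money(ctx, amount):
    # Deny first: nothing executes unless the policy check passes.
    if not await ctx.check_policy("transfer", amount):
        raise PolicyViolation(f"transfer of {amount} denied")
    return await ctx.syscall("transfer", {"amount": amount})


ctx = ToyAgentContext(transfer_limit=1000)
ok_result = asyncio.run(transfer_money(ctx, 250))

try:
    asyncio.run(transfer_money(ctx, 5000))
    denied = False
except PolicyViolation:
    denied = True
```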

---

### 4. Competitive Advantages

#### For Enterprise Adoption

| Concern | AIOS Answer | Our Answer |
|---------|-------------|------------|
| "What if agent goes rogue?" | "Monitor and intervene" | "Kernel panic, immediate termination" |
| "Can we audit all actions?" | "Logging available" | "Flight recorder - every syscall recorded" |
| "What about data exfiltration?" | "Access control optional" | "VFS mount points, policy per-path" |
| "Regulatory compliance?" | "Not primary focus" | "Built-in governance layer" |
| "Multi-tenant isolation?" | "Process-level" | "Kernel/User space separation" |

#### For Research Community

| Aspect | AIOS | Agent Control Plane |
|--------|------|---------------------|
| **Novel Contribution** | LLM Scheduling algorithms | Safety-first kernel design |
| **ASPLOS Fit** | Systems efficiency | OS abstractions for AI |
| **eBPF Potential** | Not explored | Network monitoring extension |
| **Reproducibility** | Benchmark suite | Differential auditing |

---
### 5. Technical Deep Dive: Signal Handling

AIOS has no signal mechanism. Agents are black boxes.

Agent Control Plane implements POSIX-style signals:

```python
from enum import IntEnum

class AgentSignal(IntEnum):
    SIGSTOP = 1    # Pause for inspection (shadow mode)
    SIGCONT = 2    # Resume execution
    SIGINT = 3     # Graceful interrupt
    SIGKILL = 4    # Immediate termination (non-maskable)
    SIGTERM = 5    # Request graceful shutdown
    SIGPOLICY = 8  # Policy violation (triggers SIGKILL)
    SIGTRUST = 9   # Trust boundary crossed (triggers SIGKILL)
```

**Why this matters**:
- SIGSTOP enables "shadow mode" - pause and inspect without termination
- SIGKILL is non-maskable - agents CANNOT ignore it
- SIGPOLICY is automatic on violation - 0% tolerance guarantee
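
To make the three properties above concrete, here is a minimal sketch of how a dispatcher might treat these signals. `ToyDispatcher`, its `register` and `send` methods, and the string states are hypothetical and chosen only for illustration; the enum is trimmed to the members the sketch uses. Handler registration for SIGKILL is refused outright, and SIGPOLICY is rewritten to SIGKILL before any handler lookup, so user-space code never gets a chance to intercept either.

```python
from enum import IntEnum


class AgentSignal(IntEnum):
    SIGSTOP = 1
    SIGCONT = 2
    SIGKILL = 4
    SIGPOLICY = 8


class ToyDispatcher:
    """Hypothetical dispatcher; names and behavior are illustrative only."""

    ESCALATES_TO_KILL = {AgentSignal.SIGPOLICY}

    def __init__(self):
        self.state = "running"
        self._handlers = {}

    def register(self, signal, handler):
        # SIGKILL cannot be masked: refuse user-space handlers for it.
        if signal == AgentSignal.SIGKILL:
            raise ValueError("SIGKILL is non-maskable")
        self._handlers[signal] = handler

    def send(self, signal):
        if self.state == "terminated":
            return self.state
        # Escalation happens before handler lookup, so it cannot be bypassed.
        if signal in self.ESCALATES_TO_KILL:
            signal = AgentSignal.SIGKILL
        if signal == AgentSignal.SIGKILL:
            self.state = "terminated"
        elif signal in self._handlers:
            self._handlers[signal](self)
        elif signal == AgentSignal.SIGSTOP:
            self.state = "stopped"
        elif signal == AgentSignal.SIGCONT:
            self.state = "running"
        return self.state


agent = ToyDispatcher()
paused = agent.send(AgentSignal.SIGSTOP)    # pause for inspection
resumed = agent.send(AgentSignal.SIGCONT)   # resume without termination

try:
    agent.register(AgentSignal.SIGKILL, lambda a: None)
    kill_maskable = True
except ValueError:
    kill_maskable = False

after_violation = agent.send(AgentSignal.SIGPOLICY)
```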
|
|
166
|
+
|
|
167
|
+
---
|
|
168
|
+
|
|
169
|
+
### 6. Memory Model Comparison
|
|
170
|
+
|
|
171
|
+
#### AIOS Memory
|
|
172
|
+
```
|
|
173
|
+
Agent
|
|
174
|
+
├── Short-term Memory (conversation buffer)
|
|
175
|
+
└── Long-term Memory (persistent storage)
|
|
176
|
+
```
|
|
177
|
+
|
|
178
|
+
#### Agent Control Plane VFS
|
|
179
|
+
```
|
|
180
|
+
/
|
|
181
|
+
├── mem/
|
|
182
|
+
│ ├── working/ # Ephemeral scratchpad
|
|
183
|
+
│ ├── episodic/ # Experience logs
|
|
184
|
+
│ ├── semantic/ # Facts (vector store mount)
|
|
185
|
+
│ └── procedural/ # Learned skills
|
|
186
|
+
├── state/
|
|
187
|
+
│ └── checkpoints/ # Snapshots for rollback
|
|
188
|
+
├── tools/ # Tool interfaces
|
|
189
|
+
├── policy/ # Read-only policy files
|
|
190
|
+
└── ipc/ # Inter-process communication
|
|
191
|
+
```
|
|
192
|
+
|
|
193
|
+
**Why VFS?**
|
|
194
|
+
- **Uniform interface**: Same API for memory, state, tools
|
|
195
|
+
- **Backend agnostic**: Mount Pinecone, Redis, or file system
|
|
196
|
+
- **Policy per-path**: `/policy` is read-only from user space
|
|
197
|
+
- **Familiar POSIX model**: Engineers already know this model
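
To make the mount and per-path policy ideas concrete, here is a minimal sketch under stated assumptions. `AgentVFS`, `DictBackend`, `ReadOnlyError`, and the `kernel=` flag are hypothetical names, not the package's API; the in-memory backend stands in for whatever store (Redis, a vector store, the file system) would be mounted in practice.

```python
class ReadOnlyError(PermissionError):
    """Raised when user-space code writes to a read-only mount."""


class DictBackend:
    """In-memory stand-in for a real storage backend."""

    def __init__(self):
        self._data = {}

    def read(self, key):
        return self._data[key]

    def write(self, key, value):
        self._data[key] = value


class AgentVFS:
    """Hypothetical VFS: one read/write API over several mounted backends."""

    def __init__(self):
        self._mounts = {}  # path prefix -> (backend, read_only)

    def mount(self, prefix, backend, read_only=False):
        self._mounts[prefix.rstrip("/")] = (backend, read_only)

    def _resolve(self, path):
        # Longest-prefix match, so /mem/semantic wins over /mem.
        for prefix in sorted(self._mounts, key=len, reverse=True):
            if path == prefix or path.startswith(prefix + "/"):
                backend, read_only = self._mounts[prefix]
                return backend, read_only, path[len(prefix):].lstrip("/")
        raise FileNotFoundError(path)

    def read(self, path):
        backend, _, key = self._resolve(path)
        return backend.read(key)

    def write(self, path, value, kernel=False):
        backend, read_only, key = self._resolve(path)
        if read_only and not kernel:
            raise ReadOnlyError(f"{path} is read-only from user space")
        backend.write(key, value)


vfs = AgentVFS()
vfs.mount("/mem/working", DictBackend())
vfs.mount("/policy", DictBackend(), read_only=True)
vfs.write("/policy/limits", {"max_calls": 5}, kernel=True)  # kernel-side write
vfs.write("/mem/working/note", "draft")                      # user-space write
```

The design point is that the permission check lives in the path resolver rather than in each backend, so swapping the store behind `/mem/semantic` would not change how `/policy` is protected.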
|
|
198
|
+
|
|
199
|
+
---
|
|
200
|
+
|
|
201
|
+
### 7. IPC Comparison
|
|
202
|
+
|
|
203
|
+
#### AIOS: Direct function calls
|
|
204
|
+
```python
|
|
205
|
+
# AIOS - agents call each other directly
|
|
206
|
+
result = agent_b.process(agent_a.output)
|
|
207
|
+
```
|
|
208
|
+
|
|
209
|
+
#### Agent Control Plane: Typed pipes with policy
|
|
210
|
+
```python
|
|
211
|
+
# Our approach - policy-enforced pipes
|
|
212
|
+
pipeline = (
|
|
213
|
+
research_agent
|
|
214
|
+
| PolicyCheckPipe(allowed_types=["ResearchResult"])
|
|
215
|
+
| summary_agent
|
|
216
|
+
)
|
|
217
|
+
result = await pipeline.execute(query)
|
|
218
|
+
```
|
|
219
|
+
|
|
220
|
+
**Why pipes?**
|
|
221
|
+
- Type checking at the pipe boundary (a mismatched message is rejected before the downstream agent runs)
|
|
222
|
+
- Policy enforcement at every hop
|
|
223
|
+
- Backpressure prevents cascade failures
|
|
224
|
+
- Full audit trail through flight recorder
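
As a rough illustration of how the `|` composition and the per-hop type check could work, the sketch below builds a tiny pipeline. Only the `PolicyCheckPipe(allowed_types=...)` call shape and `pipeline.execute(...)` come from the example above; `Stage`, `Pipeline`, `FnAgent`, and the `ResearchResult` dataclass are hypothetical, and backpressure and the flight recorder are left out.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ResearchResult:
    text: str


class Stage:
    """Base class that gives every stage the `|` composition operator."""

    def __or__(self, other):
        return Pipeline([self, other])

    async def run(self, value):
        raise NotImplementedError


class Pipeline(Stage):
    def __init__(self, stages):
        self.stages = list(stages)

    def __or__(self, other):
        return Pipeline(self.stages + [other])

    async def run(self, value):
        # Each hop receives the previous hop's output.
        for stage in self.stages:
            value = await stage.run(value)
        return value

    async def execute(self, value):
        return await self.run(value)


class PolicyCheckPipe(Stage):
    """Rejects any message whose type name is not on the allowlist."""

    def __init__(self, allowed_types):
        self.allowed_types = set(allowed_types)

    async def run(self, value):
        name = type(value).__name__
        if name not in self.allowed_types:
            raise TypeError(f"pipe rejected message of type {name}")
        return value


class FnAgent(Stage):
    """Hypothetical agent that wraps a plain function."""

    def __init__(self, fn):
        self.fn = fn

    async def run(self, value):
        return self.fn(value)


research_agent = FnAgent(lambda q: ResearchResult(f"notes on {q}"))
summary_agent = FnAgent(lambda r: r.text.upper())
pipeline = (
    research_agent
    | PolicyCheckPipe(allowed_types=["ResearchResult"])
    | summary_agent
)
```

Running `asyncio.run(pipeline.execute("eBPF"))` passes the query through all three hops; if the first agent returned a bare string instead of a `ResearchResult`, the pipe would raise before `summary_agent` ever saw it.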
|
|
225
|
+
|
|
226
|
+
---
|
|
227
|
+
|
|
228
|
+
### 8. Positioning for ASPLOS 2026
|
|
229
|
+
|
|
230
|
+
#### AIOS Paper Focus (COLM 2025)
|
|
231
|
+
- Novel scheduling algorithms for LLMs
|
|
232
|
+
- Context switching performance
|
|
233
|
+
- Throughput benchmarks
|
|
234
|
+
|
|
235
|
+
#### Our Paper Focus (ASPLOS 2026 Target)
|
|
236
|
+
- Novel OS abstractions for AI safety
|
|
237
|
+
- Kernel/User space separation for agent isolation
|
|
238
|
+
- POSIX-inspired primitives (signals, VFS, pipes)
|
|
239
|
+
- eBPF extension for network monitoring (future)
|
|
240
|
+
|
|
241
|
+
**Key Differentiator**: We are not competing on efficiency. We are defining the **safety contract** for enterprise AI agents.
|
|
242
|
+
|
|
243
|
+
---
|
|
244
|
+
|
|
245
|
+
### 9. eBPF Research Direction
|
|
246
|
+
|
|
247
|
+
#### Concept: Kernel-level network monitoring for agents
|
|
248
|
+
|
|
249
|
+
```
|
|
250
|
+
┌─────────────────────────────────────────┐
|
|
251
|
+
│ Agent Process │
|
|
252
|
+
├─────────────────────────────────────────┤
|
|
253
|
+
│ HTTP Request to api.openai.com │
|
|
254
|
+
│ │ │
|
|
255
|
+
│ ▼ │
|
|
256
|
+
│ ┌─────────────────────────────────┐ │
|
|
257
|
+
│ │ eBPF Probe (Kernel Space) │ │
|
|
258
|
+
│ │ - Monitor all network calls │ │
|
|
259
|
+
│ │ - Block unauthorized endpoints │ │
|
|
260
|
+
│ │ - Log payload hashes │ │
|
|
261
|
+
│ └─────────────────────────────────┘ │
|
|
262
|
+
│ │ │
|
|
263
|
+
│ ▼ │
|
|
264
|
+
│ Network Stack │
|
|
265
|
+
└─────────────────────────────────────────┘
|
|
266
|
+
```
|
|
267
|
+
|
|
268
|
+
**Why eBPF?**
|
|
269
|
+
- Monitoring happens OUTSIDE Python runtime
|
|
270
|
+
- Cannot be bypassed by agent code
|
|
271
|
+
- Expected sub-millisecond overhead per network call
|
|
272
|
+
- ASPLOS loves eBPF papers
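
The three probe responsibilities in the diagram (monitor, block, log payload hashes) reduce to a small decision function. The sketch below models that decision in plain user-space Python purely for illustration; it is not eBPF code, and a real probe would run in the kernel and could not be written this way. `ALLOWED_ENDPOINTS` and `inspect_request` are hypothetical names.

```python
import hashlib

# Hypothetical allowlist; the diagram's example endpoint is the only entry.
ALLOWED_ENDPOINTS = {"api.openai.com"}


def inspect_request(host, payload, allowed=ALLOWED_ENDPOINTS, log=None):
    """Return an audit record with an allow/block verdict for one request.

    Only a SHA-256 digest of the payload is recorded, never the payload
    itself, mirroring the "log payload hashes" item in the diagram.
    """
    record = {
        "host": host,
        "verdict": "allow" if host in allowed else "block",
        "sha256": hashlib.sha256(payload).hexdigest(),
    }
    if log is not None:
        log.append(record)
    return record
```

Logging a digest rather than the body keeps the audit trail useful for later comparison without storing prompt or response contents.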
|
|
273
|
+
|
|
274
|
+
---
|
|
275
|
+
|
|
276
|
+
### 10. Summary: When to Use What
|
|
277
|
+
|
|
278
|
+
| Use Case | Recommended |
|
|
279
|
+
|----------|-------------|
|
|
280
|
+
| Research experiments | AIOS |
|
|
281
|
+
| Production enterprise | Agent Control Plane |
|
|
282
|
+
| Throughput benchmarks | AIOS |
|
|
283
|
+
| Compliance-heavy industries | Agent Control Plane |
|
|
284
|
+
| Multi-agent chaos | AIOS (let them fight) |
|
|
285
|
+
| Multi-agent governance | Agent Control Plane |
|
|
286
|
+
|
|
287
|
+
---
|
|
288
|
+
|
|
289
|
+
## Conclusion
|
|
290
|
+
|
|
291
|
+
AIOS and Agent Control Plane are **not competing** - they solve different problems.
|
|
292
|
+
|
|
293
|
+
- **AIOS**: "How do we run 1000 agents efficiently?"
|
|
294
|
+
- **Agent Control Plane**: "How do we run 10 agents without any of them going rogue?"
|
|
295
|
+
|
|
296
|
+
For enterprise adoption, the second question matters more.
|
|
@@ -0,0 +1,367 @@
|
|
|
1
|
+
# Bibliography
|
|
2
|
+
|
|
3
|
+
Complete list of research papers, technical reports, and industry publications that inform the Agent Control Plane's design.
|
|
4
|
+
|
|
5
|
+
## Academic Papers
|
|
6
|
+
|
|
7
|
+
### Agent Safety
|
|
8
|
+
|
|
9
|
+
1. **A Safety Framework for Real-World Agentic Systems**
|
|
10
|
+
- Authors: Various
|
|
11
|
+
- Published: arXiv:2511.21990 (2025)
|
|
12
|
+
- URL: https://arxiv.org/abs/2511.21990
|
|
13
|
+
- Key Contribution: Contextual risk management framework for agentic AI
|
|
14
|
+
|
|
15
|
+
2. **Red-Teaming Agentic AI: Evaluation Frameworks and Benchmarks**
|
|
16
|
+
- Authors: Various
|
|
17
|
+
- Published: arXiv:2511.21990 (2025)
|
|
18
|
+
- URL: https://arxiv.org/abs/2511.21990
|
|
19
|
+
- Key Contribution: Adversarial testing methodologies and benchmark datasets
|
|
20
|
+
|
|
21
|
+
3. **MAESTRO: A Threat Modeling Framework for Agentic AI**
|
|
22
|
+
- Organization: Cloud Security Alliance
|
|
23
|
+
- Published: 2025
|
|
24
|
+
- URL: https://cloudsecurityalliance.org/
|
|
25
|
+
- Key Contribution: Multi-agent security threat taxonomy
|
|
26
|
+
|
|
27
|
+
### Multi-Agent Systems
|
|
28
|
+
|
|
29
|
+
4. **Multi-Agent Systems: A Survey**
|
|
30
|
+
- Authors: Various
|
|
31
|
+
- Published: arXiv:2308.05391 (2023)
|
|
32
|
+
- URL: https://arxiv.org/abs/2308.05391
|
|
33
|
+
- Key Contribution: Comprehensive overview of MAS architectures, hierarchical control patterns
|
|
34
|
+
|
|
35
|
+
5. **Fault-Tolerant Multi-Agent Systems**
|
|
36
|
+
- Journal: IEEE Transactions on Systems, Man, and Cybernetics
|
|
37
|
+
- Year: 2024
|
|
38
|
+
- Key Contribution: Resilience models, circuit breaker patterns, failure recovery
|
|
39
|
+
|
|
40
|
+
6. **Multimodal Agents: A Survey**
|
|
41
|
+
- Authors: Various
|
|
42
|
+
- Published: arXiv:2404.12390 (2024)
|
|
43
|
+
- URL: https://arxiv.org/abs/2404.12390
|
|
44
|
+
- Key Contribution: Vision, audio, and multi-modal fusion techniques for agents
|
|
45
|
+
|
|
46
|
+
### Privacy and Security
|
|
47
|
+
|
|
48
|
+
7. **Privacy in Agentic Systems**
|
|
49
|
+
- Authors: Various
|
|
50
|
+
- Published: arXiv:2409.1087 (2024)
|
|
51
|
+
- URL: https://arxiv.org/abs/2409.1087
|
|
52
|
+
- Key Contribution: Differential privacy, federated learning for agents
|
|
53
|
+
|
|
54
|
+
8. **Agent-to-Agent Communication Security**
|
|
55
|
+
- Conference: ACM CCS (Computer and Communications Security)
|
|
56
|
+
- Year: 2024
|
|
57
|
+
- Key Contribution: Authentication and authorization patterns for inter-agent messaging
|
|
58
|
+
|
|
59
|
+
### Governance and Ethics
|
|
60
|
+
|
|
61
|
+
9. **Responsible AI Governance: A Review**
|
|
62
|
+
- Publisher: ScienceDirect
|
|
63
|
+
- Year: 2024
|
|
64
|
+
- Key Contribution: Procedural practices for AI governance, risk-based frameworks
|
|
65
|
+
|
|
66
|
+
10. **Practices for Governing Agentic AI Systems**
|
|
67
|
+
- Organization: OpenAI
|
|
68
|
+
- Year: 2023
|
|
69
|
+
- URL: https://openai.com/research/
|
|
70
|
+
- Key Contribution: Pre/post-deployment governance, monitoring strategies
|
|
71
|
+
|
|
72
|
+
11. **Top 12 Papers on Agentic AI Governance**
|
|
73
|
+
- Platform: Substack
|
|
74
|
+
- Year: 2025
|
|
75
|
+
- Key Contribution: Survey of governance approaches across industry and academia
|
|
76
|
+
|
|
77
|
+
## Industry Reports and Technical Publications
|
|
78
|
+
|
|
79
|
+
12. **Unlocking Exponential Value with AI Agent Orchestration**
|
|
80
|
+
- Organization: Deloitte
|
|
81
|
+
- Year: 2025
|
|
82
|
+
- URL: https://www2.deloitte.com/
|
|
83
|
+
- Key Contribution: Enterprise patterns, ROI models for agent systems
|
|
84
|
+
|
|
85
|
+
13. **AI Agent Orchestration Frameworks: A Comparative Study**
|
|
86
|
+
- Organization: Kubiya
|
|
87
|
+
- Year: 2025
|
|
88
|
+
- Key Contribution: LangChain, AutoGen, CrewAI comparisons
|
|
89
|
+
|
|
90
|
+
14. **Evaluating Agentic AI: Frameworks and Metrics**
|
|
91
|
+
- Organization: World Economic Forum
|
|
92
|
+
- Year: 2025
|
|
93
|
+
- URL: https://www.weforum.org/
|
|
94
|
+
- Key Contribution: Standardized evaluation metrics
|
|
95
|
+
|
|
96
|
+
## Technical Specifications and Standards
|
|
97
|
+
|
|
98
|
+
15. **Model Context Protocol (MCP) Specification**
|
|
99
|
+
- Organization: Anthropic
|
|
100
|
+
- Year: 2024
|
|
101
|
+
- URL: https://modelcontextprotocol.io/
|
|
102
|
+
- Key Contribution: Standardized protocol for LLM context management
|
|
103
|
+
|
|
104
|
+
16. **Agent-to-Agent (A2A) Protocol**
|
|
105
|
+
- Organization: Google (later contributed to the Linux Foundation)
|
|
106
|
+
- Year: 2025
|
|
107
|
+
- Key Contribution: Inter-agent communication standard
|
|
108
|
+
|
|
109
|
+
17. **Attribute-Based Access Control (ABAC)**
|
|
110
|
+
- Standard: NIST SP 800-162
|
|
111
|
+
- Year: 2014 (updated)
|
|
112
|
+
- URL: https://csrc.nist.gov/publications/
|
|
113
|
+
- Key Contribution: Fine-grained access control based on attributes
|
|
114
|
+
|
|
115
|
+
## Competitive Framework Documentation
|
|
116
|
+
|
|
117
|
+
18. **LangChain Documentation**
|
|
118
|
+
- URL: https://python.langchain.com/
|
|
119
|
+
- Key Insights: Tool integration patterns, chain abstractions
|
|
120
|
+
|
|
121
|
+
19. **AutoGen: Multi-Agent Conversations**
|
|
122
|
+
- Organization: Microsoft Research
|
|
123
|
+
- URL: https://microsoft.github.io/autogen/
|
|
124
|
+
- Key Insights: Conversational agent patterns, async coordination
|
|
125
|
+
|
|
126
|
+
20. **CrewAI Documentation**
|
|
127
|
+
- URL: https://docs.crewai.com/
|
|
128
|
+
- Key Insights: Role-based agent orchestration, crew hierarchies
|
|
129
|
+
|
|
130
|
+
21. **LangGraph**
|
|
131
|
+
- Organization: LangChain
|
|
132
|
+
- URL: https://langchain-ai.github.io/langgraph/
|
|
133
|
+
- Key Insights: Graph-based agent workflows, state management
|
|
134
|
+
|
|
135
|
+
22. **Guardrails.ai**
|
|
136
|
+
- URL: https://www.guardrailsai.com/
|
|
137
|
+
- Key Insights: Output validation, guardrail composition patterns
|
|
138
|
+
|
|
139
|
+
## Books and Comprehensive References
|
|
140
|
+
|
|
141
|
+
23. **"Artificial Intelligence: A Modern Approach" (4th Edition)**
|
|
142
|
+
- Authors: Stuart Russell, Peter Norvig
|
|
143
|
+
- Publisher: Pearson (2020)
|
|
144
|
+
- Chapters: 11 (Planning), 19 (Learning Agents), 26 (AI Philosophy)
|
|
145
|
+
|
|
146
|
+
24. **"Multi-Agent Systems" (2nd Edition)**
|
|
147
|
+
- Author: Gerhard Weiss
|
|
148
|
+
- Publisher: MIT Press (2013)
|
|
149
|
+
- Key Topics: Coordination, negotiation, distributed problem solving
|
|
150
|
+
|
|
151
|
+
### Safety and Guardrails (2025-2026)
|
|
152
|
+
|
|
153
|
+
27. **LlamaGuard-2: Advanced Content Moderation**
|
|
154
|
+
- Organization: Meta AI
|
|
155
|
+
- Year: 2024
|
|
156
|
+
- URL: https://ai.meta.com/research/publications/
|
|
157
|
+
- Key Contribution: Multi-turn safety classification, improved jailbreak detection
|
|
158
|
+
|
|
159
|
+
28. **WildGuard: Adversarial Safety Testing**
|
|
160
|
+
- Authors: Various
|
|
161
|
+
- Published: arXiv:2406.18495 (2024, updated 2025)
|
|
162
|
+
- URL: https://arxiv.org/abs/2406.18495
|
|
163
|
+
- Key Contribution: Large-scale adversarial prompt dataset, automated red-teaming
|
|
164
|
+
|
|
165
|
+
29. **Constitutional AI: Harmlessness from AI Feedback**
|
|
166
|
+
- Organization: Anthropic
|
|
167
|
+
- Year: 2022
|
|
168
|
+
- URL: https://www.anthropic.com/research
|
|
169
|
+
- Key Contribution: Reinforcement learning from AI feedback (RLAIF), value alignment
|
|
170
|
+
|
|
171
|
+
30. **Guardrails AI: Output Validation Framework**
|
|
172
|
+
- Organization: Guardrails AI
|
|
173
|
+
- Year: 2024-2025
|
|
174
|
+
- URL: https://www.guardrailsai.com/
|
|
175
|
+
- Key Contribution: Composable validators, reactive safety checks
|
|
176
|
+
|
|
177
|
+
### Agent Learning and Self-Correction (2025-2026)
|
|
178
|
+
|
|
179
|
+
31. **Reflexion: Language Agents with Verbal Reinforcement Learning**
|
|
180
|
+
- Authors: Shinn et al.
|
|
181
|
+
- Published: NeurIPS 2023, extended arXiv:2303.11366 (2025)
|
|
182
|
+
- URL: https://arxiv.org/abs/2303.11366
|
|
183
|
+
- Key Contribution: Self-reflection for iterative improvement, verbal RL
|
|
184
|
+
|
|
185
|
+
32. **Self-Refine: Iterative Refinement with Self-Feedback**
|
|
186
|
+
- Authors: Madaan et al.
|
|
187
|
+
- Published: NeurIPS 2023
|
|
188
|
+
- URL: https://arxiv.org/abs/2303.17651
|
|
189
|
+
- Key Contribution: Iterative self-feedback without external reward model
|
|
190
|
+
|
|
191
|
+
33. **Voyager: An Open-Ended Embodied Agent with Large Language Models**
|
|
192
|
+
- Authors: Wang et al.
|
|
193
|
+
- Published: arXiv:2305.16291 (2023, updated 2025)
|
|
194
|
+
- URL: https://arxiv.org/abs/2305.16291
|
|
195
|
+
- Key Contribution: Automatic curriculum learning, skill library management
|
|
196
|
+
|
|
197
|
+
34. **DEPS: Dialogue Evaluation with Persona-based Scenarios**
|
|
198
|
+
- Authors: Various
|
|
199
|
+
- Published: ACL 2024
|
|
200
|
+
- URL: https://aclanthology.org/
|
|
201
|
+
- Key Contribution: Evolvable agent teams, dynamic role assignment
|
|
202
|
+
|
|
203
|
+
35. **Agentic AI: A Comprehensive Survey**
|
|
204
|
+
- Authors: Various
|
|
205
|
+
- Published: arXiv:2510.25445 (2025)
|
|
206
|
+
- URL: https://arxiv.org/abs/2510.25445
|
|
207
|
+
- Key Contribution: Comprehensive 2025 survey of agentic systems, taxonomy, challenges
|
|
208
|
+
|
|
209
|
+
### Enterprise Governance and Compliance (2025-2026)
|
|
210
|
+
|
|
211
|
+
36. **EU AI Act Implementation Guidelines**
|
|
212
|
+
- Organization: European Commission
|
|
213
|
+
- Year: 2025
|
|
214
|
+
- URL: https://digital-strategy.ec.europa.eu/
|
|
215
|
+
- Key Contribution: Legal requirements for high-risk AI systems, compliance standards
|
|
216
|
+
|
|
217
|
+
37. **AI Governance in 2025: WEF Whitepaper**
|
|
218
|
+
- Organization: World Economic Forum
|
|
219
|
+
- Year: 2025
|
|
220
|
+
- URL: https://www.weforum.org/
|
|
221
|
+
- Key Contribution: Industry best practices, risk-based governance frameworks
|
|
222
|
+
|
|
223
|
+
38. **SOC 2 for AI Systems: Security Framework**
|
|
224
|
+
- Organization: AICPA
|
|
225
|
+
- Year: 2025
|
|
226
|
+
- URL: https://www.aicpa.org/
|
|
227
|
+
- Key Contribution: Security, availability, and confidentiality controls for AI
|
|
228
|
+
|
|
229
|
+
39. **GDPR Compliance for Agentic Systems**
|
|
230
|
+
- Organization: ICO (UK Information Commissioner's Office)
|
|
231
|
+
- Year: 2024-2025
|
|
232
|
+
- URL: https://ico.org.uk/
|
|
233
|
+
- Key Contribution: Data protection requirements, right to explanation
|
|
234
|
+
|
|
235
|
+
40. **HIPAA for AI Healthcare Agents**
|
|
236
|
+
- Organization: HHS Office for Civil Rights
|
|
237
|
+
- Year: 2025
|
|
238
|
+
- URL: https://www.hhs.gov/hipaa/
|
|
239
|
+
- Key Contribution: Privacy and security requirements for healthcare AI
|
|
240
|
+
|
|
241
|
+
### Evaluation Frameworks and Benchmarks (2025-2026)
|
|
242
|
+
|
|
243
|
+
41. **GAIA: General AI Assistants Benchmark**
|
|
244
|
+
- Authors: Mialon et al.
|
|
245
|
+
- Published: arXiv:2311.12983 (2023, updated 2025)
|
|
246
|
+
- URL: https://arxiv.org/abs/2311.12983
|
|
247
|
+
- Key Contribution: Multi-step reasoning, real-world tasks, tool use evaluation
|
|
248
|
+
|
|
249
|
+
42. **AgentBench: Evaluating LLMs as Agents**
|
|
250
|
+
- Authors: Liu et al.
|
|
251
|
+
- Published: arXiv:2308.03688 (2023, updated 2025)
|
|
252
|
+
- URL: https://arxiv.org/abs/2308.03688
|
|
253
|
+
- Key Contribution: Multi-domain agent evaluation, 8 different environments
|
|
254
|
+
|
|
255
|
+
43. **ToolBench: Tool Learning with Foundation Models**
|
|
256
|
+
- Authors: Qin et al.
|
|
257
|
+
- Published: arXiv:2307.16789 (2023, updated 2025)
|
|
258
|
+
- URL: https://arxiv.org/abs/2307.16789
|
|
259
|
+
- Key Contribution: 16,000+ real-world APIs, tool use benchmarking
|
|
260
|
+
|
|
261
|
+
44. **HaluEval: Hallucination Evaluation for LLMs**
|
|
262
|
+
- Authors: Li et al.
|
|
263
|
+
- Published: EMNLP 2023, extended 2025
|
|
264
|
+
- URL: https://arxiv.org/abs/2305.11747
|
|
265
|
+
- Key Contribution: Automated hallucination detection, factual accuracy
|
|
266
|
+
|
|
267
|
+
45. **SafetyBench: Comprehensive Safety Evaluation**
|
|
268
|
+
- Authors: Zhang et al.
|
|
269
|
+
- Published: arXiv:2309.07045 (2023, updated 2025)
|
|
270
|
+
- URL: https://arxiv.org/abs/2309.07045
|
|
271
|
+
- Key Contribution: 11,435 diverse multiple-choice safety questions, Chinese and English coverage
|
|
272
|
+
|
|
273
|
+
### Advanced Agent Architectures (2025-2026)
|
|
274
|
+
|
|
275
|
+
46. **AutoGen: Enabling Next-Gen LLM Applications**
|
|
276
|
+
- Authors: Wu et al.
|
|
277
|
+
- Organization: Microsoft Research
|
|
278
|
+
- Published: arXiv:2308.08155 (2023, updated 2025)
|
|
279
|
+
- URL: https://arxiv.org/abs/2308.08155
|
|
280
|
+
- Key Contribution: Multi-agent conversation framework, customizable agents
|
|
281
|
+
|
|
282
|
+
47. **LangGraph: Graph-Based Agent Workflows**
|
|
283
|
+
- Organization: LangChain
|
|
284
|
+
- Year: 2024-2025
|
|
285
|
+
- URL: https://langchain-ai.github.io/langgraph/
|
|
286
|
+
- Key Contribution: Stateful agent workflows, cycles and branches
|
|
287
|
+
|
|
288
|
+
48. **o1-preview: OpenAI Reasoning Models**
|
|
289
|
+
- Organization: OpenAI
|
|
290
|
+
- Year: 2024-2025
|
|
291
|
+
- URL: https://openai.com/research/
|
|
292
|
+
- Key Contribution: Chain-of-thought reasoning, improved problem solving
|
|
293
|
+
|
|
294
|
+
49. **Chain-of-Verification (CoVe)**
|
|
295
|
+
- Authors: Dhuliawala et al.
|
|
296
|
+
- Published: arXiv:2309.11495 (2023, updated 2025)
|
|
297
|
+
- URL: https://arxiv.org/abs/2309.11495
|
|
298
|
+
- Key Contribution: Systematic verification to reduce hallucination
|
|
299
|
+
|
|
300
|
+
50. **ReAct: Synergizing Reasoning and Acting**
|
|
301
|
+
- Authors: Yao et al.
|
|
302
|
+
- Published: ICLR 2023, extended 2025
|
|
303
|
+
- URL: https://arxiv.org/abs/2210.03629
|
|
304
|
+
- Key Contribution: Interleaving reasoning traces and actions
|
|
305
|
+
|
|
306
|
+
### Tool Use and Function Calling (2025-2026)
|
|
307
|
+
|
|
308
|
+
51. **Gorilla: Large Language Model Connected with APIs**
|
|
309
|
+
- Authors: Patil et al.
|
|
310
|
+
- Published: arXiv:2305.15334 (2023, updated 2025)
|
|
311
|
+
- URL: https://arxiv.org/abs/2305.15334
|
|
312
|
+
- Key Contribution: API call generation, retrieval-aware training
|
|
313
|
+
|
|
314
|
+
52. **ToolLLM: Facilitating Tool Use with LLMs**
|
|
315
|
+
- Authors: Qin et al.
|
|
316
|
+
- Published: arXiv:2307.16789 (2023, updated 2025)
|
|
317
|
+
- URL: https://arxiv.org/abs/2307.16789
|
|
318
|
+
- Key Contribution: Tool documentation understanding, planning
|
|
319
|
+
|
|
320
|
+
## Preprint Archive Searches
|
|
321
|
+
|
|
322
|
+
For the latest research, we regularly monitor:
|
|
323
|
+
- **arXiv.org**: AI (cs.AI), Multi-Agent Systems (cs.MA), Cryptography and Security (cs.CR)
|
|
324
|
+
- **Papers with Code**: Agent systems leaderboards
|
|
325
|
+
- **Hugging Face Papers**: Agent benchmarks and datasets
|
|
326
|
+
|
|
327
|
+
## Historical Context
|
|
328
|
+
|
|
329
|
+
25. **"Planning and Acting in Partially Observable Stochastic Domains"**
|
|
330
|
+
- Authors: Leslie Pack Kaelbling, Michael L. Littman, Anthony R. Cassandra
|
|
331
|
+
- Journal: Artificial Intelligence 101(1-2) (1998)
|
|
332
|
+
- Key Contribution: POMDP framework for agent decision-making under uncertainty
|
|
333
|
+
|
|
334
|
+
26. **"The Belief-Desire-Intention Model of Agency"**
|
|
335
|
+
- Authors: Anand S. Rao, Michael P. Georgeff
|
|
336
|
+
- Conference: ATAL 1995
|
|
337
|
+
- Key Contribution: Foundational agent architecture still used today
|
|
338
|
+
|
|
339
|
+
## Contributing to Bibliography
|
|
340
|
+
|
|
341
|
+
Found a relevant paper? Submit a pull request with:
|
|
342
|
+
1. Full citation with authors, year, venue
|
|
343
|
+
2. URL or DOI
|
|
344
|
+
3. 1-2 sentence key contribution
|
|
345
|
+
4. How it informs Agent Control Plane design
|
|
346
|
+
|
|
347
|
+
---
|
|
348
|
+
|
|
349
|
+
**Maintained by**: Agent Control Plane Research Team
|
|
350
|
+
**Last Updated**: January 2026
|
|
351
|
+
**Total Citations**: 52 papers and reports
|
|
352
|
+
|
|
353
|
+
## Summary by Category
|
|
354
|
+
|
|
355
|
+
- **Agent Safety and Governance**: 6 papers
|
|
356
|
+
- **Multi-Agent Systems**: 3 papers
|
|
357
|
+
- **Privacy and Security**: 2 papers
|
|
358
|
+
- **Enterprise AI**: 3 reports
|
|
359
|
+
- **Safety and Guardrails (2025-2026)**: 4 papers
|
|
360
|
+
- **Agent Learning and Self-Correction (2025-2026)**: 5 papers
|
|
361
|
+
- **Enterprise Governance (2025-2026)**: 5 standards
|
|
362
|
+
- **Evaluation Frameworks (2025-2026)**: 5 benchmarks
|
|
363
|
+
- **Advanced Architectures (2025-2026)**: 5 papers
|
|
364
|
+
- **Tool Use (2025-2026)**: 2 papers
|
|
365
|
+
- **Standards and Specifications**: 3 standards
|
|
366
|
+
- **Competitive Frameworks**: 5 frameworks
|
|
367
|
+
- **Books**: 2 comprehensive references
- **Historical Context**: 2 papers
|