PyPI - isagellm - Versions diffs - 0.1.0.6__cp311-none-any.whl → 0.2.2.0__cp311-none-any.whl - Mend

isagellm 0.1.0.6__cp311-none-any.whl → 0.2.2.0__cp311-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9)

{isagellm-0.1.0.6.dist-info → isagellm-0.2.2.0.dist-info}/METADATA +52 -74
isagellm-0.2.2.0.dist-info/RECORD +11 -0
sagellm/__init__.py +9 -4
sagellm/__init__.pyc +0 -0
sagellm/cli.pyc +0 -0
isagellm-0.1.0.6.dist-info/RECORD +0 -11
{isagellm-0.1.0.6.dist-info → isagellm-0.2.2.0.dist-info}/WHEEL +0 -0
{isagellm-0.1.0.6.dist-info → isagellm-0.2.2.0.dist-info}/entry_points.txt +0 -0
{isagellm-0.1.0.6.dist-info → isagellm-0.2.2.0.dist-info}/top_level.txt +0 -0

{isagellm-0.1.0.6.dist-info → isagellm-0.2.2.0.dist-info}/METADATA RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: isagellm
-Version: 0.1.0.6
+Version: 0.2.2.0
 Summary: sageLLM: Modular LLM inference engine for domestic computing power (Huawei Ascend, NVIDIA)
 Author: IntelliStream Team
 License: Proprietary - IntelliStream
@@ -17,10 +17,10 @@ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
 Requires-Python: ==3.11.*
 Description-Content-Type: text/markdown
 Requires-Dist: isagellm-protocol<0.2.0,>=0.1.0
-Requires-Dist: isagellm-backend<0.2.0,>=0.1.1.3
-Requires-Dist: isagellm-core<0.2.0,>=0.1.0.2
-Requires-Dist: isagellm-control-plane<0.2.0,>=0.1.0.2
-Requires-Dist: isagellm-gateway<0.2.0,>=0.1.0.2
+Requires-Dist: isagellm-backend<0.3.0,>=0.2.1.0
+Requires-Dist: isagellm-core<0.3.0,>=0.2.2.0
+Requires-Dist: isagellm-control-plane<0.2.0,>=0.1.0.4
+Requires-Dist: isagellm-gateway<0.2.0,>=0.1.0.4
 Requires-Dist: isagellm-kv-cache<0.2.0,>=0.1.0
 Requires-Dist: isagellm-comm<0.2.0,>=0.1.0
 Requires-Dist: isagellm-compression<0.2.0,>=0.1.0
@@ -51,7 +51,7 @@ Requires-Dist: mypy>=1.0.0; extra == "dev"
   Ollama-like experience for Chinese hardware ecosystems (Huawei Ascend, NVIDIA)
 </p>
----
+______________________________________________________________________
 ## ✨ Features
@@ -85,45 +85,37 @@ pip install 'isagellm[all]'
 ## 🚀 Quick Start
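-### Choosing a Startup Mode
-sageLLM supports two startup modes for different scenarios:
-| Mode | Use case | Dependencies | Example command |
-|------|---------|------|---------|
-| **Mock** | CI / testing / local development | No GPU required | `sage-llm serve --mock` |
-| **Production** | Real inference serving | GPU/CPU backend | `sage-llm serve --control-plane` |
-**⚠️ Fail-fast guarantee**: In non-mock mode, if a dependency is missing or the configuration is invalid, the system **exits immediately with an error** and does not silently fall back to mock mode.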
 ### CLI (like ollama)
 ```bash
 # Show system info
 sage-llm info
-# Mock mode (no GPU dependency)
+# Default mode (CPU engine, no GPU required)
+sage-llm serve
+sage-llm run -p "What is LLM inference?"
+# Mock mode (CI / fast smoke tests)
 sage-llm serve --mock
 sage-llm run -p "What is LLM inference?" --mock
-sage-llm demo --workload year1 --mock
-# Production mode (requires control-plane to be installed)
+# Production mode (requires control-plane)
 # pip install 'isagellm[server]'
 sage-llm serve --control-plane
-sage-llm gateway --control-plane --port 8080
-# If dependencies are missing, you will see:
-# ❌ Error: Control Plane required but not installed
-#    Install: pip install 'isagellm[control-plane]'
+sage-llm gateway --port 8080
 ```
 ### Python API
 ```python
-from sagellm import Request, MockEngine
+from sagellm import BackendConfig, EngineConfig, Request, create_backend, create_engine
-# Create mock engine (no GPU needed)
-engine = MockEngine()
+# Create CPU backend + engine (no GPU needed)
+backend = create_backend(BackendConfig(kind="cpu", device="cpu"))
+engine = create_engine(
+  EngineConfig(kind="cpu", model="sshleifer/tiny-gpt2", device="cpu"),
+  backend,
+)
 # Run inference
 request = Request(
@@ -143,39 +135,20 @@ print(f"Throughput: {response.metrics.throughput_tps:.2f} tokens/s")
 ```yaml
 # ~/.sage-llm/config.yaml
 backend:
-  kind: mock  # Options: mock, cpu, cuda, ascend
-  # Fail-fast setting: if a non-mock backend is specified but unavailable, exit with an error
-  strict_mode: true  # Defaults to true, as required by the project proposal
+  kind: cpu  # Options: cpu, mock, cuda, ascend
+  device: cpu
 engine:
-  kind: mock
-  model: Qwen/Qwen2-7B
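-# Mock mode settings
-mock:
-  enabled: false      # when true, force mock regardless of other settings
-  deterministic: true # whether mock output is fixed (for regression tests)
-# Minimum requirements for production mode
-production:
-  control_plane:
-    required: true    # when true, a missing control-plane raises an error (non-mock mode)
-    endpoint: "localhost:8080"
-  backend:
-    required: true    # when true, a missing real backend raises an error
-    fallback_to_mock: false  # never downgrade to mock automatically (fail-fast)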
-workload:
-  segments:
-    - short   # 128 in → 128 out
-    - long    # 2048 in → 512 out
-    - stress  # concurrent requests
+  kind: cpu
+  model: sshleifer/tiny-gpt2
+control_plane:
+  endpoint: "localhost:8080"
 ```
-## 📊 Year 1 Demo Contract
+## 📊 Metrics & Validation
-sageLLM must produce these metrics for validation:
+sageLLM provides comprehensive performance metrics:
 ```json
 {
@@ -184,12 +157,12 @@ sageLLM must produce these metrics for validation:
   "throughput_tps": 80.0,
   "peak_mem_mb": 24576,
   "kv_used_tokens": 4096,
-  "prefix_hit_rate": 0.85,
-  "evict_count": 3
+  "prefix_hit_rate": 0.85
 }
 ```
-Run validation:
+Run benchmarks:
 ```bash
 sage-llm demo --workload year1 --output metrics.json
 ```
@@ -259,59 +232,64 @@ python scripts/verify_dependencies.py
 - **[INFERENCE_FLOW.md](docs/INFERENCE_FLOW.md)** - Inference flow in detail
 - **[PR_CHECKLIST.md](docs/PR_CHECKLIST.md)** - Pull request checklist
----
+______________________________________________________________________
 ## 📚 Documentation Index
 ### User Documentation
 - [Quick start](README.md#-quick-start) - Up and running in 5 minutes
 - [Deployment guide](docs/DEPLOYMENT_GUIDE.md) - Production deployment
-- [Configuration reference](docs/DEPLOYMENT_GUIDE.md#配置文件说明) - Full configuration options
+- [Configuration reference](docs/DEPLOYMENT_GUIDE.md#%E9%85%8D%E7%BD%AE%E6%96%87%E4%BB%B6%E8%AF%B4%E6%98%8E) - Full configuration options
 - [Environment variables](docs/ENVIRONMENT_VARIABLES.md) - Environment variable reference
 - [Troubleshooting](docs/TROUBLESHOOTING.md) - Solutions to common problems
 ### Developer Documentation
 - [Developer guide](docs/DEVELOPER_GUIDE.md) - Contributing code
 - [Architecture](README.md#-architecture) - System architecture
 - [Using the workspace](docs/WORKSPACE_GUIDE.md) - Multi-root workspace
 - [PR checklist](docs/PR_CHECKLIST.md) - Pre-submission checks
 ### API Documentation
 - OpenAI-compatible API - see [sagellm-gateway](https://github.com/intellistream/sagellm-gateway)
 - Python API - see [API_REFERENCE.md](docs/API_REFERENCE.md) (to be added)
 ### Subpackage Documentation
 - [sagellm-protocol](https://github.com/intellistream/sagellm-protocol) - Protocol definitions
 - [sagellm-backend](https://github.com/intellistream/sagellm-backend) - Backend abstraction
 - [sagellm-core](https://github.com/intellistream/sagellm-core) - Engine core
 - [sagellm-control-plane](https://github.com/intellistream/sagellm-control-plane) - Control plane
 - [sagellm-gateway](https://github.com/intellistream/sagellm-gateway) - API gateway
 - [sagellm-benchmark](https://github.com/intellistream/sagellm-benchmark) - Benchmarking
 - [**DEVELOPER_GUIDE.md**](DEVELOPER_GUIDE.md) - Architecture conventions and development guide
 - [**PR_CHECKLIST.md**](PR_CHECKLIST.md) - Pull request review checklist
 - [**scripts/verify_dependencies.py**](scripts/verify_dependencies.py) - Dependency layering verification
 ## 📚 Package Details
-| Package | PyPI Name | Import Name | Description |
-|---------|-----------|-------------|-------------|
-| sagellm | `isagellm` | `sagellm` | Umbrella package (install this) |
-| sagellm-protocol | `isagellm-protocol` | `sagellm_protocol` | Protocol v0.1 types |
-| sagellm-core | `isagellm-core` | `sagellm_core` | Runtime & config |
-| sagellm-backend | `isagellm-backend` | `sagellm_backend` | Hardware abstraction |
-## 🎯 Roadmap
-- **Year 1**: Core inference with KV cache, prefix sharing, basic eviction
-- **Year 2**: Multi-node inference, advanced scheduling
-- **Year 3**: Full production-ready deployment
+| Package          | PyPI Name           | Import Name        | Description                     |
+| ---------------- | ------------------- | ------------------ | ------------------------------- |
+| sagellm          | `isagellm`          | `sagellm`          | Umbrella package (install this) |
+| sagellm-protocol | `isagellm-protocol` | `sagellm_protocol` | Protocol v0.1 types             |
+| sagellm-core     | `isagellm-core`     | `sagellm_core`     | Runtime & config                |
+| sagellm-backend  | `isagellm-backend`  | `sagellm_backend`  | Hardware abstraction            |
 ## 📄 License
 Proprietary - IntelliStream. Internal use only.
----
+______________________________________________________________________
 <p align="center">
   <sub>Built with ❤️ by IntelliStream Team for domestic AI infrastructure</sub>
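
The `Requires-Dist` changes in the `@@ -17,10 +17,10 @@` hunk of this METADATA diff can also be summarized programmatically. The following is a minimal sketch using only the standard library; the helper names `split_requirement` and `changed_requirements` are hypothetical, and the parsing is deliberately naive (it ignores extras and environment markers). The requirement strings are copied from the hunk.

```python
import re

# Requires-Dist lines copied from the METADATA hunk (old = 0.1.0.6, new = 0.2.2.0).
OLD = [
    "isagellm-protocol<0.2.0,>=0.1.0",
    "isagellm-backend<0.2.0,>=0.1.1.3",
    "isagellm-core<0.2.0,>=0.1.0.2",
    "isagellm-control-plane<0.2.0,>=0.1.0.2",
    "isagellm-gateway<0.2.0,>=0.1.0.2",
]
NEW = [
    "isagellm-protocol<0.2.0,>=0.1.0",
    "isagellm-backend<0.3.0,>=0.2.1.0",
    "isagellm-core<0.3.0,>=0.2.2.0",
    "isagellm-control-plane<0.2.0,>=0.1.0.4",
    "isagellm-gateway<0.2.0,>=0.1.0.4",
]


def split_requirement(line: str) -> tuple[str, str]:
    """Split 'name<spec' into (name, spec). Naive: no extras, no markers."""
    match = re.match(r"^([A-Za-z0-9._-]+)(.*)$", line.strip())
    if match is None:
        raise ValueError(f"unrecognized requirement: {line!r}")
    return match.group(1), match.group(2).strip()


def changed_requirements(old: list[str], new: list[str]) -> dict:
    """Map each package whose specifier differs to (old_spec, new_spec)."""
    old_specs = dict(split_requirement(line) for line in old)
    new_specs = dict(split_requirement(line) for line in new)
    return {
        name: (old_specs.get(name), new_specs.get(name))
        for name in sorted(old_specs.keys() | new_specs.keys())
        if old_specs.get(name) != new_specs.get(name)
    }


for name, (before, after) in changed_requirements(OLD, NEW).items():
    print(f"{name}: {before} -> {after}")
```

Run against the lines in this diff, the sketch reports four changed packages: `isagellm-backend` and `isagellm-core` move to the `<0.3.0` range, while `isagellm-control-plane` and `isagellm-gateway` only raise their lower bound and stay under `<0.2.0`.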

isagellm-0.2.2.0.dist-info/RECORD ADDED

@@ -0,0 +1,11 @@
+sagellm/__init__.py,sha256=UCOHeECsFoHYcSqdcsY0eObBDS18HPzj9GW-fIgCNaM,5647
+sagellm/__init__.pyc,sha256=-JtSOMiNExRaYR1aBwkXZiDsvScL1EFDOX2vs_iU8JU,4860
+sagellm/cli.pyc,sha256=rgkJu6Npz3Kf1WppnYt1K3JvO8IBfJpLlry_idWI1OU,53692
+sagellm/py.typed,sha256=ixa8YukDZ3kLo0WsFJRGohLMyHzbMur1ALmmASML2cs,64
+sagellm/__pycache__/__init__.cpython-311.pyc,sha256=xGiRA2rVPAyoIUzq5phb-y3bV3VRpjYU3eIJbmGhID0,4616
+sagellm/__pycache__/cli.cpython-311.pyc,sha256=n-HafVk9XrfnPpl56eonKAeIF2XCoAaUvdkJ0unLJBY,52767
+isagellm-0.2.2.0.dist-info/METADATA,sha256=X7lblvqCgr6U8XPcGAE18DQJ8RGukmbtpIPtYVw_Yqo,9128
+isagellm-0.2.2.0.dist-info/WHEEL,sha256=ZJeWpR6hcCRGwxVKXlDk-HsGwijNyTq4fszaDj4Ycyo,93
+isagellm-0.2.2.0.dist-info/entry_points.txt,sha256=NqSiD9EEbziWs94BYtKFUzrnKTyCFG2MuZcvrRryhtg,73
+isagellm-0.2.2.0.dist-info/top_level.txt,sha256=q-O8RUHV2YT7pQv12AYgFiK7PNvB9cHVg_7s5Tp08xI,8
+isagellm-0.2.2.0.dist-info/RECORD,,
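
Each RECORD entry has the form `path,sha256=<digest>,<size>`, where the digest is the URL-safe base64 encoding of the SHA-256 hash with the trailing `=` padding removed, as defined by the wheel binary distribution format. The following is a minimal sketch of how such an entry could be recomputed to check an unpacked file; the function names `record_hash` and `record_line` are hypothetical, and the sketch is not taken from the package's own tooling.

```python
import base64
import hashlib


def record_hash(data: bytes) -> str:
    """Return the 'sha256=...' field as it appears in a wheel RECORD file."""
    digest = hashlib.sha256(data).digest()
    # URL-safe base64, with the trailing '=' padding stripped.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"sha256={encoded}"


def record_line(path: str, data: bytes) -> str:
    """Build a full RECORD entry: path, hash, and size in bytes."""
    return f"{path},{record_hash(data)},{len(data)}"


# Illustrative input only; this is not the content of any file in the wheel.
print(record_line("example/empty.txt", b""))
```

To check a real entry, read the file from the unpacked wheel in binary mode and compare the result of `record_line` with the corresponding line in RECORD.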

sagellm/__init__.py CHANGED

@@ -7,15 +7,20 @@ Quick Start:
     pip install isagellm
     # CLI usage (like ollama)
-    sage-llm serve --mock           # Start mock server
+    sage-llm serve                  # Start CPU engine server
     sage-llm run -p "Hello world"   # Single inference
+    sage-llm serve --mock           # Mock server for CI
     sage-llm demo --workload year1  # Run Year1 demo validation
     sage-llm info                   # Show system info
     # Python API
-    from sagellm import Request, MockEngine, create_engine
+    from sagellm import BackendConfig, EngineConfig, Request, create_backend, create_engine
-    engine = MockEngine()
+    backend = create_backend(BackendConfig(kind="cpu", device="cpu"))
+    engine = create_engine(
+        EngineConfig(kind="cpu", model="sshleifer/tiny-gpt2", device="cpu"),
+        backend,
+    )
     response = engine.generate(Request(prompt="Hello", max_tokens=128))
     print(response.text)
@@ -28,7 +33,7 @@ Architecture:
 from __future__ import annotations
-__version__ = "0.1.0.6"
+__version__ = "0.2.2.0"
 # Lazy imports to handle installation order
 _LAZY_IMPORTS: dict[str, tuple[str, str]] = {

sagellm/__init__.pyc CHANGED

Binary file

sagellm/cli.pyc CHANGED

Binary file

isagellm-0.1.0.6.dist-info/RECORD DELETED

@@ -1,11 +0,0 @@
-sagellm/__init__.py,sha256=fvvYT8rqWc77Q6mHyEZp5o4BiTwa1fpmxHjzXypz5xo,5379
-sagellm/__init__.pyc,sha256=GwhdZnZYMqAatjKR1T-2j1l7gDvAmRgGLg8QFseZcgw,4591
-sagellm/cli.pyc,sha256=pUhwr6R1ChduZB-Z8x_-kgRXy0OtDFqnM_r3XSSlp_s,52742
-sagellm/py.typed,sha256=ixa8YukDZ3kLo0WsFJRGohLMyHzbMur1ALmmASML2cs,64
-sagellm/__pycache__/__init__.cpython-311.pyc,sha256=xGiRA2rVPAyoIUzq5phb-y3bV3VRpjYU3eIJbmGhID0,4616
-sagellm/__pycache__/cli.cpython-311.pyc,sha256=n-HafVk9XrfnPpl56eonKAeIF2XCoAaUvdkJ0unLJBY,52767
-isagellm-0.1.0.6.dist-info/METADATA,sha256=9VO23GmlyIdekx0baW9SS3H5nhXHMbhWQIdpkjSzgP8,10008
-isagellm-0.1.0.6.dist-info/WHEEL,sha256=ZJeWpR6hcCRGwxVKXlDk-HsGwijNyTq4fszaDj4Ycyo,93
-isagellm-0.1.0.6.dist-info/entry_points.txt,sha256=NqSiD9EEbziWs94BYtKFUzrnKTyCFG2MuZcvrRryhtg,73
-isagellm-0.1.0.6.dist-info/top_level.txt,sha256=q-O8RUHV2YT7pQv12AYgFiK7PNvB9cHVg_7s5Tp08xI,8
-isagellm-0.1.0.6.dist-info/RECORD,,

{isagellm-0.1.0.6.dist-info → isagellm-0.2.2.0.dist-info}/WHEEL RENAMED

File without changes

{isagellm-0.1.0.6.dist-info → isagellm-0.2.2.0.dist-info}/entry_points.txt RENAMED

File without changes

{isagellm-0.1.0.6.dist-info → isagellm-0.2.2.0.dist-info}/top_level.txt RENAMED

File without changes
