isagellm-core 0.4.0.17__py2.py3-none-any.whl → 0.4.0.19__py2.py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (168)
  1. isagellm_core-0.4.0.19.dist-info/METADATA +718 -0
  2. isagellm_core-0.4.0.19.dist-info/RECORD +174 -0
  3. {isagellm_core-0.4.0.17.dist-info → isagellm_core-0.4.0.19.dist-info}/WHEEL +1 -1
  4. sagellm_core/__init__.py +4 -22
  5. sagellm_core/__init__.pyc +0 -0
  6. sagellm_core/__main__.pyc +0 -0
  7. sagellm_core/__pycache__/__init__.cpython-311.pyc +0 -0
  8. sagellm_core/__pycache__/config.cpython-311.pyc +0 -0
  9. sagellm_core/__pycache__/demo.cpython-311.pyc +0 -0
  10. sagellm_core/__pycache__/engine.cpython-311.pyc +0 -0
  11. sagellm_core/__pycache__/engine_factory.cpython-311.pyc +0 -0
  12. sagellm_core/__pycache__/engine_server.cpython-311.pyc +0 -0
  13. sagellm_core/__pycache__/factory.cpython-311.pyc +0 -0
  14. sagellm_core/__pycache__/health.cpython-311.pyc +0 -0
  15. sagellm_core/__pycache__/llm_engine.cpython-311.pyc +0 -0
  16. sagellm_core/__pycache__/pd_executor.cpython-311.pyc +0 -0
  17. sagellm_core/__pycache__/plugins.cpython-311.pyc +0 -0
  18. sagellm_core/__pycache__/runner.cpython-311.pyc +0 -0
  19. sagellm_core/__pycache__/runtime.cpython-311.pyc +0 -0
  20. sagellm_core/__pycache__/workload.cpython-311.pyc +0 -0
  21. sagellm_core/config.pyc +0 -0
  22. sagellm_core/decoding/__init__.pyc +0 -0
  23. sagellm_core/decoding/__pycache__/__init__.cpython-311.pyc +0 -0
  24. sagellm_core/decoding/__pycache__/base.cpython-311.pyc +0 -0
  25. sagellm_core/decoding/__pycache__/beam_search.cpython-311.pyc +0 -0
  26. sagellm_core/decoding/__pycache__/contrastive.cpython-311.pyc +0 -0
  27. sagellm_core/decoding/__pycache__/greedy.cpython-311.pyc +0 -0
  28. sagellm_core/decoding/__pycache__/sampling.cpython-311.pyc +0 -0
  29. sagellm_core/decoding/base.pyc +0 -0
  30. sagellm_core/decoding/beam_search.pyc +0 -0
  31. sagellm_core/decoding/contrastive.pyc +0 -0
  32. sagellm_core/decoding/greedy.pyc +0 -0
  33. sagellm_core/decoding/sampling.pyc +0 -0
  34. sagellm_core/demo.pyc +0 -0
  35. sagellm_core/distributed/__init__.pyc +0 -0
  36. sagellm_core/distributed/__pycache__/__init__.cpython-311.pyc +0 -0
  37. sagellm_core/distributed/__pycache__/strategies.cpython-311.pyc +0 -0
  38. sagellm_core/distributed/strategies.pyc +0 -0
  39. sagellm_core/engine.pyc +0 -0
  40. sagellm_core/engine_core/__init__.pyc +0 -0
  41. sagellm_core/engine_core/__pycache__/__init__.cpython-311.pyc +0 -0
  42. sagellm_core/engine_core/__pycache__/engine_core.cpython-311.pyc +0 -0
  43. sagellm_core/engine_core/engine_core.pyc +0 -0
  44. sagellm_core/engine_core/scheduler/__init__.py +31 -3
  45. sagellm_core/engine_core/scheduler/__init__.pyc +0 -0
  46. sagellm_core/engine_core/scheduler/__pycache__/__init__.cpython-311.pyc +0 -0
  47. sagellm_core/engine_core/scheduler/__pycache__/base.cpython-311.pyc +0 -0
  48. sagellm_core/engine_core/scheduler/__pycache__/batch.cpython-311.pyc +0 -0
  49. sagellm_core/engine_core/scheduler/__pycache__/metrics.cpython-311.pyc +0 -0
  50. sagellm_core/engine_core/scheduler/__pycache__/scheduler.cpython-311.pyc +0 -0
  51. sagellm_core/engine_core/scheduler/__pycache__/scheduler_kv_bridge.cpython-311.pyc +0 -0
  52. sagellm_core/engine_core/scheduler/base.pyc +0 -0
  53. sagellm_core/engine_core/scheduler/batch.pyc +0 -0
  54. sagellm_core/engine_core/scheduler/metrics.pyc +0 -0
  55. sagellm_core/engine_core/scheduler/policy/__init__.py +6 -0
  56. sagellm_core/engine_core/scheduler/policy/__init__.pyc +0 -0
  57. sagellm_core/engine_core/scheduler/policy/__pycache__/__init__.cpython-311.pyc +0 -0
  58. sagellm_core/engine_core/scheduler/policy/__pycache__/fcfs.cpython-311.pyc +0 -0
  59. sagellm_core/engine_core/scheduler/policy/__pycache__/priority.cpython-311.pyc +0 -0
  60. sagellm_core/engine_core/scheduler/policy/fcfs.pyc +0 -0
  61. sagellm_core/engine_core/scheduler/policy/priority.pyc +0 -0
  62. sagellm_core/engine_core/scheduler/scheduler.pyc +0 -0
  63. sagellm_core/engine_core/scheduler/scheduler_kv_bridge.pyc +0 -0
  64. sagellm_core/engine_factory.pyc +0 -0
  65. sagellm_core/engine_server.pyc +0 -0
  66. sagellm_core/engines/__init__.pyc +0 -0
  67. sagellm_core/engines/__pycache__/__init__.cpython-311.pyc +0 -0
  68. sagellm_core/engines/__pycache__/ascend.cpython-311.pyc +0 -0
  69. sagellm_core/engines/__pycache__/cpu.cpython-311.pyc +0 -0
  70. sagellm_core/engines/__pycache__/embedding.cpython-311.pyc +0 -0
  71. sagellm_core/engines/__pycache__/hf_cuda.cpython-311.pyc +0 -0
  72. sagellm_core/engines/__pycache__/pytorch.cpython-311.pyc +0 -0
  73. sagellm_core/engines/__pycache__/pytorch_engine.cpython-311.pyc +0 -0
  74. sagellm_core/engines/embedding.pyc +0 -0
  75. sagellm_core/executor/__init__.pyc +0 -0
  76. sagellm_core/executor/__pycache__/__init__.cpython-311.pyc +0 -0
  77. sagellm_core/executor/__pycache__/executor_base.cpython-311.pyc +0 -0
  78. sagellm_core/executor/__pycache__/uniproc_executor.cpython-311.pyc +0 -0
  79. sagellm_core/executor/executor_base.pyc +0 -0
  80. sagellm_core/executor/uniproc_executor.pyc +0 -0
  81. sagellm_core/factory.pyc +0 -0
  82. sagellm_core/health.pyc +0 -0
  83. sagellm_core/inputs/__init__.pyc +0 -0
  84. sagellm_core/inputs/__pycache__/__init__.cpython-311.pyc +0 -0
  85. sagellm_core/inputs/__pycache__/processor.cpython-311.pyc +0 -0
  86. sagellm_core/inputs/__pycache__/tokenizer_utils.cpython-311.pyc +0 -0
  87. sagellm_core/inputs/processor.pyc +0 -0
  88. sagellm_core/inputs/tokenizer_utils.pyc +0 -0
  89. sagellm_core/layers/__init__.py +30 -0
  90. sagellm_core/layers/__init__.pyc +0 -0
  91. sagellm_core/layers/__pycache__/__init__.cpython-311.pyc +0 -0
  92. sagellm_core/layers/__pycache__/activation.cpython-311.pyc +0 -0
  93. sagellm_core/layers/__pycache__/base.cpython-311.pyc +0 -0
  94. sagellm_core/layers/__pycache__/embedding.cpython-311.pyc +0 -0
  95. sagellm_core/layers/__pycache__/linear.cpython-311.pyc +0 -0
  96. sagellm_core/layers/__pycache__/normalization.cpython-311.pyc +0 -0
  97. sagellm_core/layers/activation.pyc +0 -0
  98. sagellm_core/layers/base.pyc +0 -0
  99. sagellm_core/layers/embedding.pyc +0 -0
  100. sagellm_core/layers/linear.pyc +0 -0
  101. sagellm_core/layers/normalization.pyc +0 -0
  102. sagellm_core/llm_engine.pyc +0 -0
  103. sagellm_core/model/__init__.py +131 -5
  104. sagellm_core/model/__init__.pyc +0 -0
  105. sagellm_core/model/__pycache__/__init__.cpython-311.pyc +0 -0
  106. sagellm_core/model/__pycache__/base.cpython-311.pyc +0 -0
  107. sagellm_core/model/__pycache__/factory.cpython-311.pyc +0 -0
  108. sagellm_core/model/__pycache__/gpt2.cpython-311.pyc +0 -0
  109. sagellm_core/model/__pycache__/llama.cpython-311.pyc +0 -0
  110. sagellm_core/model/__pycache__/mixtral.cpython-311.pyc +0 -0
  111. sagellm_core/model/__pycache__/model_loader.cpython-311.pyc +0 -0
  112. sagellm_core/model/__pycache__/quantization.cpython-311.pyc +0 -0
  113. sagellm_core/model/__pycache__/qwen2.cpython-311.pyc +0 -0
  114. sagellm_core/model/__pycache__/registry.cpython-311.pyc +0 -0
  115. sagellm_core/model/__pycache__/weight_utils.cpython-311.pyc +0 -0
  116. sagellm_core/model/base.pyc +0 -0
  117. sagellm_core/model/factory.pyc +0 -0
  118. sagellm_core/model/gpt2.pyc +0 -0
  119. sagellm_core/model/llama.pyc +0 -0
  120. sagellm_core/model/mixtral.pyc +0 -0
  121. sagellm_core/model/model_loader.pyc +0 -0
  122. sagellm_core/model/quantization.pyc +0 -0
  123. sagellm_core/model/qwen2.pyc +0 -0
  124. sagellm_core/model/registry.pyc +0 -0
  125. sagellm_core/model/weight_loader/__init__.py +54 -0
  126. sagellm_core/model/weight_loader/__init__.pyc +0 -0
  127. sagellm_core/model/weight_loader/__pycache__/__init__.cpython-311.pyc +0 -0
  128. sagellm_core/model/weight_loader/__pycache__/base.cpython-311.pyc +0 -0
  129. sagellm_core/model/weight_loader/__pycache__/pytorch.cpython-311.pyc +0 -0
  130. sagellm_core/model/weight_loader/__pycache__/quantized.cpython-311.pyc +0 -0
  131. sagellm_core/model/weight_loader/__pycache__/safetensors.cpython-311.pyc +0 -0
  132. sagellm_core/model/weight_loader/base.pyc +0 -0
  133. sagellm_core/model/weight_loader/pytorch.pyc +0 -0
  134. sagellm_core/model/weight_loader/quantized.pyc +0 -0
  135. sagellm_core/model/weight_loader/safetensors.pyc +0 -0
  136. sagellm_core/model/weight_utils.pyc +0 -0
  137. sagellm_core/observability/__init__.pyc +0 -0
  138. sagellm_core/observability/__pycache__/__init__.cpython-311.pyc +0 -0
  139. sagellm_core/observability/__pycache__/logger.cpython-311.pyc +0 -0
  140. sagellm_core/observability/__pycache__/metrics.cpython-311.pyc +0 -0
  141. sagellm_core/observability/logger.pyc +0 -0
  142. sagellm_core/observability/metrics.pyc +0 -0
  143. sagellm_core/pd_executor.pyc +0 -0
  144. sagellm_core/plugins.pyc +0 -0
  145. sagellm_core/runner.pyc +0 -0
  146. sagellm_core/runtime.pyc +0 -0
  147. sagellm_core/sampling/__init__.pyc +0 -0
  148. sagellm_core/sampling/__pycache__/__init__.cpython-311.pyc +0 -0
  149. sagellm_core/sampling/__pycache__/params.cpython-311.pyc +0 -0
  150. sagellm_core/sampling/__pycache__/sampler.cpython-311.pyc +0 -0
  151. sagellm_core/sampling/params.pyc +0 -0
  152. sagellm_core/sampling/sampler.pyc +0 -0
  153. sagellm_core/worker/__init__.pyc +0 -0
  154. sagellm_core/worker/__pycache__/__init__.cpython-311.pyc +0 -0
  155. sagellm_core/worker/__pycache__/worker.cpython-311.pyc +0 -0
  156. sagellm_core/worker/model_runner/__init__.pyc +0 -0
  157. sagellm_core/worker/model_runner/__pycache__/__init__.cpython-311.pyc +0 -0
  158. sagellm_core/worker/model_runner/__pycache__/model_runner.cpython-311.pyc +0 -0
  159. sagellm_core/worker/model_runner/model_runner.pyc +0 -0
  160. sagellm_core/worker/worker.pyc +0 -0
  161. sagellm_core/workload.pyc +0 -0
  162. isagellm_core-0.4.0.17.dist-info/METADATA +0 -308
  163. isagellm_core-0.4.0.17.dist-info/RECORD +0 -122
  164. sagellm_core/__pycache__/base_engine.cpython-311.pyc +0 -0
  165. sagellm_core/__pycache__/mock_engine.cpython-311.pyc +0 -0
  166. sagellm_core/engines/__pycache__/mock.cpython-311.pyc +0 -0
  167. {isagellm_core-0.4.0.17.dist-info → isagellm_core-0.4.0.19.dist-info}/entry_points.txt +0 -0
  168. {isagellm_core-0.4.0.17.dist-info → isagellm_core-0.4.0.19.dist-info}/top_level.txt +0 -0
@@ -1,13 +1,139 @@
-"""Model loading utilities for sageLLM."""
+"""Model definitions using backend kernels.
 
-from __future__ import annotations
+This module provides model architectures that use sagellm-core layers
+instead of direct HuggingFace implementations.
 
+Key models:
+- BaseLLMModel: Abstract base for all LLM models
+- GPT2Model: GPT-2 architecture (for testing)
+- LlamaModel: LLaMA/LLaMA-2/LLaMA-3 architecture
+- Qwen2Model: Qwen2/Qwen2.5 architecture
+- MixtralModel: Mixtral Mixture of Experts architecture
+- create_model_from_config: Factory function to create models
+
+Registry system:
+- MODEL_REGISTRY: Central registry of model types
+- register_model: Decorator to register new models
+- get_model_class: Look up model class by type
+
+Weight loading:
+- WeightLoaderProtocol: Protocol for weight loaders
+- WEIGHT_LOADERS: Registry of weight formats
+- SafetensorsLoader, PytorchLoader: Built-in loaders
+- QuantizedWeightLoader: Quantized weight loading (INT8/AWQ/GPTQ)
+
+Quantization:
+- QuantConfig: Quantization configuration
+- detect_quantization: Auto-detect quantization format
+"""
+
+# Registry (import first to ensure registration works)
+from sagellm_core.model.registry import (
+    MODEL_REGISTRY,
+    get_model_class,
+    is_model_registered,
+    list_registered_models,
+    register_model,
+)
+
+# Base class
+from sagellm_core.model.base import BaseLLMModel
+
+# Factory
+from sagellm_core.model.factory import create_model_from_config, load_model_weights
+
+# Models (import triggers registration)
+from sagellm_core.model.gpt2 import GPT2Model
+from sagellm_core.model.llama import (
+    LlamaModel,
+    LlamaAttention,
+    LlamaMLP,
+    LlamaDecoderLayer,
+    LlamaRotaryEmbedding,
+)
+from sagellm_core.model.qwen2 import (
+    Qwen2Model,
+    Qwen2Attention,
+    Qwen2MLP,
+    Qwen2DecoderLayer,
+)
+from sagellm_core.model.mixtral import (
+    MixtralModel,
+    MixtralSparseMoE,
+    MixtralMLP,
+    MixtralDecoderLayer,
+)
+
+# Quantization
+from sagellm_core.model.quantization import (
+    QuantConfig,
+    QuantizationType,
+    detect_quantization,
+)
+
+# Weight loaders
+from sagellm_core.model.weight_loader import (
+    WEIGHT_LOADERS,
+    WeightLoaderProtocol,
+    WeightNameMapper,
+    SafetensorsLoader,
+    PytorchLoader,
+    detect_weight_format,
+    get_weight_loader,
+    list_weight_loaders,
+    load_weights_into_model,
+    register_weight_loader,
+)
+from sagellm_core.model.weight_loader.quantized import QuantizedWeightLoader
+
+# Legacy compatibility
 from sagellm_core.model.model_loader import ModelLoader, load_model
-from sagellm_core.model.weight_utils import WeightLoader, QuantizedWeightLoader
 
 __all__ = [
+    # Base
+    "BaseLLMModel",
+    # Models
+    "GPT2Model",
+    "LlamaModel",
+    "LlamaAttention",
+    "LlamaMLP",
+    "LlamaDecoderLayer",
+    "LlamaRotaryEmbedding",
+    "Qwen2Model",
+    "Qwen2Attention",
+    "Qwen2MLP",
+    "Qwen2DecoderLayer",
+    "MixtralModel",
+    "MixtralSparseMoE",
+    "MixtralMLP",
+    "MixtralDecoderLayer",
+    # Factory
+    "create_model_from_config",
+    "load_model_weights",
+    # Registry
+    "MODEL_REGISTRY",
+    "register_model",
+    "get_model_class",
+    "list_registered_models",
+    "is_model_registered",
+    # Quantization
+    "QuantConfig",
+    "QuantizationType",
+    "detect_quantization",
+    # Weight loaders
+    "WEIGHT_LOADERS",
+    "WeightLoaderProtocol",
+    "WeightNameMapper",
+    "SafetensorsLoader",
+    "PytorchLoader",
+    "QuantizedWeightLoader",
+    "detect_weight_format",
+    "get_weight_loader",
+    "list_weight_loaders",
+    "load_weights_into_model",
+    "register_weight_loader",
+    # Legacy
     "ModelLoader",
     "load_model",
-    "WeightLoader",
-    "QuantizedWeightLoader",
 ]
+
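
The docstring added in this hunk describes a registry in which a decorator records model classes and importing a model module triggers registration. The following is a minimal, self-contained sketch of that pattern, not the package's implementation: the decorator signature `register_model(model_type)`, the dict-based `MODEL_REGISTRY`, and the `ToyLlamaModel` class are assumptions made for illustration, since the contents of `sagellm_core/model/registry.py` do not appear in this diff.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for the package's MODEL_REGISTRY (real structure not shown in the diff).
MODEL_REGISTRY: Dict[str, type] = {}


def register_model(model_type: str) -> Callable[[type], type]:
    """Return a class decorator that records the class under `model_type`."""

    def decorator(cls: type) -> type:
        if model_type in MODEL_REGISTRY:
            raise ValueError(f"model type {model_type!r} is already registered")
        MODEL_REGISTRY[model_type] = cls
        return cls  # the class itself is returned unchanged

    return decorator


def get_model_class(model_type: str) -> type:
    """Look up a registered class, listing the known types on failure."""
    try:
        return MODEL_REGISTRY[model_type]
    except KeyError:
        known = ", ".join(sorted(MODEL_REGISTRY)) or "<none>"
        raise KeyError(f"unknown model type {model_type!r}; registered: {known}") from None


def is_model_registered(model_type: str) -> bool:
    return model_type in MODEL_REGISTRY


def list_registered_models() -> List[str]:
    return sorted(MODEL_REGISTRY)


# Registration happens when the class statement executes, i.e. at import time
# of the defining module. This is why importing a model module is enough.
@register_model("llama")
class ToyLlamaModel:
    """Placeholder class used only for this illustration."""
```

Under this assumption, the ordering comment in the hunk ("import first to ensure registration works") follows naturally: the registry module has to be importable before any model module applies the decorator.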
Binary file
@@ -0,0 +1,54 @@
+"""Weight loader module.
+
+This module provides utilities for loading model weights from various formats:
+- safetensors: Fast, secure tensor format (recommended)
+- pytorch: Standard PyTorch .bin/.pt checkpoints
+
+Usage:
+    from sagellm_core.model.weight_loader import (
+        get_weight_loader,
+        detect_weight_format,
+        load_weights_into_model,
+    )
+
+    # Auto-detect format and load
+    load_weights_into_model(model, "path/to/model")
+
+    # Or manually:
+    loader = get_weight_loader("safetensors")
+    for name, tensor in loader.load_weights("path/to/model"):
+        print(f"{name}: {tensor.shape}")
+"""
+
+from sagellm_core.model.weight_loader.base import (
+    WEIGHT_LOADERS,
+    WeightLoaderProtocol,
+    WeightNameMapper,
+    detect_weight_format,
+    get_weight_loader,
+    is_weight_loader_registered,
+    list_weight_loaders,
+    load_weights_into_model,
+    register_weight_loader,
+)
+
+# Import loaders to trigger registration
+from sagellm_core.model.weight_loader.safetensors import SafetensorsLoader
+from sagellm_core.model.weight_loader.pytorch import PytorchLoader
+
+__all__ = [
+    # Protocol and registry
+    "WeightLoaderProtocol",
+    "WEIGHT_LOADERS",
+    "register_weight_loader",
+    "get_weight_loader",
+    "list_weight_loaders",
+    "is_weight_loader_registered",
+    # Utilities
+    "detect_weight_format",
+    "WeightNameMapper",
+    "load_weights_into_model",
+    # Concrete loaders
+    "SafetensorsLoader",
+    "PytorchLoader",
+]
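
The new module pairs a loader protocol with a format registry and a format-detection helper. The sketch below shows one way such pieces can fit together using only the standard library. It is an illustration under stated assumptions rather than the code in `weight_loader/base.py`: the signatures, the detection-by-file-extension rule, and the `_FileSizeLoader` placeholder (which yields file sizes, not tensors) are all hypothetical.

```python
from pathlib import Path
from typing import Callable, Dict, Iterator, List, Protocol, Tuple, runtime_checkable


@runtime_checkable
class WeightLoaderProtocol(Protocol):
    """Structural type: anything with load_weights(path) yielding (name, value) pairs."""

    def load_weights(self, path: str) -> Iterator[Tuple[str, object]]: ...


# Hypothetical registry: format name -> loader class.
WEIGHT_LOADERS: Dict[str, type] = {}


def register_weight_loader(fmt: str) -> Callable[[type], type]:
    def decorator(cls: type) -> type:
        WEIGHT_LOADERS[fmt] = cls
        return cls

    return decorator


def get_weight_loader(fmt: str) -> WeightLoaderProtocol:
    if fmt not in WEIGHT_LOADERS:
        raise KeyError(f"no weight loader registered for format {fmt!r}")
    return WEIGHT_LOADERS[fmt]()


def list_weight_loaders() -> List[str]:
    return sorted(WEIGHT_LOADERS)


def detect_weight_format(path: str) -> str:
    """Assumed rule: prefer safetensors files, then fall back to .bin/.pt files."""
    directory = Path(path)
    if any(directory.glob("*.safetensors")):
        return "safetensors"
    if any(directory.glob("*.bin")) or any(directory.glob("*.pt")):
        return "pytorch"
    raise FileNotFoundError(f"no recognised weight files under {path!r}")


class _FileSizeLoader:
    """Placeholder loader: yields (file stem, size in bytes) instead of real tensors."""

    patterns: Tuple[str, ...] = ()

    def load_weights(self, path: str) -> Iterator[Tuple[str, object]]:
        for pattern in self.patterns:
            for file in sorted(Path(path).glob(pattern)):
                yield file.stem, file.stat().st_size


@register_weight_loader("safetensors")
class SafetensorsLoader(_FileSizeLoader):
    patterns = ("*.safetensors",)


@register_weight_loader("pytorch")
class PytorchLoader(_FileSizeLoader):
    patterns = ("*.bin", "*.pt")
```

A call such as `get_weight_loader(detect_weight_format(path))` then mirrors the "auto-detect format and load" flow mentioned in the docstring, with the caveat that a real loader would return tensors rather than file sizes.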
Binary file
sagellm_core/plugins.pyc CHANGED
Binary file
sagellm_core/runner.pyc CHANGED
Binary file
sagellm_core/runtime.pyc CHANGED
Binary file
sagellm_core/workload.pyc CHANGED
Binary file
@@ -1,308 +0,0 @@
-Metadata-Version: 2.4
-Name: isagellm-core
-Version: 0.4.0.17
-Summary: sageLLM core runtime with PD separation (MVP)
-Author: IntelliStream Team
-License: Proprietary - IntelliStream
-Classifier: Development Status :: 3 - Alpha
-Classifier: Programming Language :: Python :: 3
-Classifier: Programming Language :: Python :: 3.10
-Classifier: Programming Language :: Python :: 3.11
-Classifier: Programming Language :: Python :: 3.12
-Requires-Python: ==3.11.*
-Description-Content-Type: text/markdown
-Requires-Dist: pydantic>=2.0.0
-Requires-Dist: pyyaml>=6.0.0
-Requires-Dist: isagellm-protocol<0.5.0,>=0.4.0.0
-Requires-Dist: isagellm-backend<0.5.0,>=0.4.0.0
-Requires-Dist: isagellm-comm<0.5.0,>=0.4.0.0
-Requires-Dist: isagellm-kv-cache<0.5.0,>=0.4.0.0
-Requires-Dist: fastapi>=0.100.0
-Requires-Dist: uvicorn>=0.22.0
-Requires-Dist: torch>=2.0.0
-Requires-Dist: transformers>=4.35.0
-Requires-Dist: accelerate>=0.26.0
-Provides-Extra: dev
-Requires-Dist: pytest>=7.0.0; extra == "dev"
-Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
-Requires-Dist: pytest-timeout>=2.0.0; extra == "dev"
-Requires-Dist: ruff>=0.8.0; extra == "dev"
-Requires-Dist: mypy>=1.0.0; extra == "dev"
-Requires-Dist: types-PyYAML>=6.0.0; extra == "dev"
-Requires-Dist: pre-commit>=3.0.0; extra == "dev"
-Requires-Dist: isage-pypi-publisher>=0.2.0; extra == "dev"
-
-# sagellm-core
-
-## Protocol Compliance (Mandatory)
-
-- MUST follow Protocol v0.1: https://github.com/intellistream/sagellm-docs/blob/main/docs/specs/protocol_v0.1.md
-- Any globally shared definitions (fields, error codes, metrics, IDs, schemas) MUST be added to Protocol first.
-
-[![CI](https://github.com/intellistream/sagellm-core/actions/workflows/ci.yml/badge.svg)](https://github.com/intellistream/sagellm-core/actions/workflows/ci.yml)
-[![PyPI](https://img.shields.io/pypi/v/isagellm-core.svg)](https://pypi.org/project/isagellm-core/)
-[![Python](https://img.shields.io/pypi/pyversions/isagellm-core.svg)](https://pypi.org/project/isagellm-core/)
-[![License](https://img.shields.io/github/license/intellistream/sagellm-core.svg)](https://github.com/intellistream/sagellm-core/blob/main/LICENSE)
-[![Code style: ruff](https://img.shields.io/badge/code%20style-ruff-000000.svg)](https://github.com/astral-sh/ruff)
-
-sageLLM Core - engine coordination layer and runtime system
-
-## Architecture Positioning
-
-```
-┌─────────────────────────────────────────────────────────────────┐
-│  sagellm-core (engine coordination layer)  ← this repository    │
-│  ┌───────────────────────────────────────────────────────────┐  │
-│  │  LLMEngine (Hardware-Agnostic, vLLM v1 style)             │  │
-│  │  • Unified inference API: generate, stream, execute       │  │
-│  │  • Auto backend selection (auto-detect cuda/ascend/cpu)   │  │
-│  │  • Configuration-driven (LLMEngineConfig)                 │  │
-│  └───────────────────────────────────────────────────────────┘  │
-│  ┌───────────────────────────────────────────────────────────┐  │
-│  │  Configuration System (config.py)                         │  │
-│  │  • YAML/JSON configuration parsing                        │  │
-│  │  • Pydantic v2 type validation                            │  │
-│  └───────────────────────────────────────────────────────────┘  │
-├─────────────────────────────────────────────────────────────────┤
-│  sagellm-backend (hardware abstraction layer)                   │
-│  • BackendProvider (CPU/CUDA/Ascend)                            │
-│  • Stream/Event/KVBlock management                              │
-└─────────────────────────────────────────────────────────────────┘
-```
-
-**Separation of responsibilities**:
-- ✅ **Core owns**: LLMEngine (hardware-agnostic), configuration, coordination
-- ✅ **Backend owns**: hardware abstraction, device management, Provider implementations
-
-## Features
-
-- 🔧 Engine abstraction layer and self-describing architecture
-- 🏭 EngineFactory - supports auto-discovery and priority-based selection
-- 🎯 Built-in engine implementations (CPU/CUDA/Embedding)
-- 🔌 Plugin system - extend engines and backends
-- ⚙️ Configuration system - YAML/JSON + Pydantic v2
-- ✅ CPU-First - tests can run without a GPU
-
-## MoE (planned)
-
-- **sagellm-core** owns MoE routing, scheduling, and the execution graph
-- Depends on **sagellm-comm** (all-to-all/topology communication) and **sagellm-backend** (expert operators/kernels)
-- Global fields (e.g. router metadata, expert load metrics) must first be added to the Protocol
-
-## Installation
-
-```bash
-# Install from PyPI (protocol + backend dependencies are installed automatically)
-pip install isagellm-core
-```
-
-## 🚀 Developer Quick Start
-
-```bash
-git clone git@github.com:intellistream/sagellm-core.git
-cd sagellm-core
-./quickstart.sh  # One-step setup of the development environment (including dependencies)
-
-# Or install manually
-pip install -e ".[dev]"
-```
-
-Run the tests:
-```bash
-pytest tests/ -v
-```
-
-> 💡 **Tip**: `isagellm-protocol` and `isagellm-backend` are installed automatically from PyPI.
-> For local integration debugging:
-> ```bash
-> pip install -e ../sagellm-protocol
-> pip install -e ../sagellm-backend
-> ```
-
-## Configuration System
-
-### Usage
-
-```python
-from sagellm_core import load_config, create_backend, create_engine
-
-# Load configuration from YAML/JSON
-config = load_config("config.yaml")
-
-# Create the backend and engine (via plugin discovery)
-backend = create_backend(config.backend)
-engine = create_engine(config.engine, backend)
-```
-
-### Configuration Structure
-
-Main configuration components:
-- `BackendConfig`: Device and backend settings
-- `EngineConfig`: Inference engine configuration
-- `WorkloadConfig`: Workload parameters
-- `OutputConfig`: Output paths and logging
-
-### Configuration Examples
-
-#### Example configuration files
-
-- [config_cpu.yaml](examples/config_cpu.yaml) - CPU mode (CI/development)
-- [config_cuda.yaml](examples/config_cuda.yaml) - CUDA production mode
-- [config_ascend.yaml](examples/config_ascend.yaml) - Ascend production mode
-- [config_minimal.json](examples/config_minimal.json) - Minimal JSON configuration
-
-For more information, see [examples/README.md](examples/README.md)
-
-### Configuration Format
-
-YAML (recommended) and JSON are supported:
-
-```yaml
-backend:
-  kind: cpu
-engine:
-  kind: cpu
-  model: sshleifer/tiny-gpt2
-workload:
-  segments: [short, long, stress]
-  concurrency: 4
-output:
-  metrics_path: ./metrics.json
-```
-
-### Plugin Resolution
-
-When the backend/engine kind specified in the configuration is not installed, a `PluginResolutionError` is raised:
-
-```python
-from sagellm_core import create_backend, BackendConfig, PluginResolutionError
-
-try:
-    backend = create_backend(BackendConfig(kind="ascend_cann", device="npu:0"))
-except PluginResolutionError as e:
-    print(f"Error: {e}")
-    # Output: No implementation found for sagellm.backends kind='ascend_cann'.
-    # Install hint: pip install isagellm-backend-ascend_cann
-```
-
-## Development Guide
-
-### Quick Start
-
-```bash
-# Clone the repository
-git clone https://github.com/intellistream/sagellm-core.git
-cd sagellm-core
-
-# Install development dependencies
-pip install -e ".[dev]"
-
-# Install pre-commit hooks
-pre-commit install
-
-# Verify the environment
-pytest tests/ -v
-```
-
-### Testing and Quality
-
-#### Pre-commit Hooks
-
-Once installed, every `git commit` automatically runs:
-- **Ruff**: code formatting + lint checks
-- **Mypy**: static type checking
-- **YAML/JSON**: configuration file validation
-
-```bash
-# Run all hooks manually
-pre-commit run --all-files
-
-# Bypass hooks (emergencies only)
-git commit --no-verify
-```
-
-#### Running Tests
-
-```bash
-# Run all tests
-pytest tests/ -v
-
-# Run specific test module
-pytest tests/unit/test_demo.py -v
-
-# Generate coverage report
-pytest tests/ --cov=sagellm_core --cov-report=html
-```
-
-#### Continuous Integration
-
-GitHub Actions automatically runs on each PR:
-- Code linting and formatting checks
-- Tests across Python 3.10, 3.11, 3.12
-- Package build verification
-
-### Code Style
-
-This project uses:
-- **Ruff** for formatting and linting
-- **Mypy** for type checking
-- **Type hints** are required for all functions
-
-For detailed guidelines, see [CONTRIBUTING.md](CONTRIBUTING.md)
-
-### Code Checks
-
-```bash
-# Format code
-ruff format .
-
-# Lint
-ruff check .
-
-# Type check
-mypy src/sagellm_core
-
-# Run every check at once
-pre-commit run --all-files
-```
-
-## Dependencies
-
-- `pydantic>=2.0.0`: configuration validation
-- `pyyaml>=6.0.0`: YAML configuration support
-- `isagellm-protocol>=0.1.0,<0.2.0`: protocol definitions (Level 0)
-- `isagellm-backend>=0.1.0,<0.2.0`: backend abstraction (Level 1)
-
-## Related Packages
-
-- `isagellm-protocol` - Protocol definitions
-- `isagellm-backend` - Backend abstraction layer
-- `isagellm` - Main package with CLI
-
-For more packages, see the [sageLLM ecosystem](https://github.com/intellistream/sagellm)
-
-## 🔄 Contributing Guide
-
-Please follow this workflow:
-
-1. **Create an issue** - describe the problem or requirement
-   ```bash
-   gh issue create --title "[Bug] description" --label "bug,sagellm-core"
-   ```
-
-2. **Develop the fix** - work on a local `fix/#123-xxx` branch
-   ```bash
-   git checkout -b fix/#123-xxx origin/main-dev
-   # Develop, test...
-   pytest -v
-   ruff format . && ruff check . --fix
-   ```
-
-3. **Open a PR** - target the `main-dev` branch
-   ```bash
-   gh pr create --base main-dev --title "Fix: description" --body "Closes #123"
-   ```
-
-4. **Merge** - merge into `main-dev` after approval
-
-For more details, see [.github/copilot-instructions.md](.github/copilot-instructions.md)