isagellm-core 0.4.0.18__py2.py3-none-any.whl → 0.4.0.19__py2.py3-none-any.whl
- isagellm_core-0.4.0.19.dist-info/METADATA +718 -0
- isagellm_core-0.4.0.19.dist-info/RECORD +174 -0
- {isagellm_core-0.4.0.18.dist-info → isagellm_core-0.4.0.19.dist-info}/WHEEL +1 -1
- sagellm_core/__init__.py +4 -22
- sagellm_core/__init__.pyc +0 -0
- sagellm_core/__main__.pyc +0 -0
- sagellm_core/__pycache__/__init__.cpython-311.pyc +0 -0
- sagellm_core/__pycache__/config.cpython-311.pyc +0 -0
- sagellm_core/__pycache__/demo.cpython-311.pyc +0 -0
- sagellm_core/__pycache__/engine.cpython-311.pyc +0 -0
- sagellm_core/__pycache__/engine_factory.cpython-311.pyc +0 -0
- sagellm_core/__pycache__/engine_server.cpython-311.pyc +0 -0
- sagellm_core/__pycache__/factory.cpython-311.pyc +0 -0
- sagellm_core/__pycache__/health.cpython-311.pyc +0 -0
- sagellm_core/__pycache__/llm_engine.cpython-311.pyc +0 -0
- sagellm_core/__pycache__/pd_executor.cpython-311.pyc +0 -0
- sagellm_core/__pycache__/plugins.cpython-311.pyc +0 -0
- sagellm_core/__pycache__/runner.cpython-311.pyc +0 -0
- sagellm_core/__pycache__/runtime.cpython-311.pyc +0 -0
- sagellm_core/__pycache__/workload.cpython-311.pyc +0 -0
- sagellm_core/config.pyc +0 -0
- sagellm_core/decoding/__init__.pyc +0 -0
- sagellm_core/decoding/__pycache__/__init__.cpython-311.pyc +0 -0
- sagellm_core/decoding/__pycache__/base.cpython-311.pyc +0 -0
- sagellm_core/decoding/__pycache__/beam_search.cpython-311.pyc +0 -0
- sagellm_core/decoding/__pycache__/contrastive.cpython-311.pyc +0 -0
- sagellm_core/decoding/__pycache__/greedy.cpython-311.pyc +0 -0
- sagellm_core/decoding/__pycache__/sampling.cpython-311.pyc +0 -0
- sagellm_core/decoding/base.pyc +0 -0
- sagellm_core/decoding/beam_search.pyc +0 -0
- sagellm_core/decoding/contrastive.pyc +0 -0
- sagellm_core/decoding/greedy.pyc +0 -0
- sagellm_core/decoding/sampling.pyc +0 -0
- sagellm_core/demo.pyc +0 -0
- sagellm_core/distributed/__init__.pyc +0 -0
- sagellm_core/distributed/__pycache__/__init__.cpython-311.pyc +0 -0
- sagellm_core/distributed/__pycache__/strategies.cpython-311.pyc +0 -0
- sagellm_core/distributed/strategies.pyc +0 -0
- sagellm_core/engine.pyc +0 -0
- sagellm_core/engine_core/__init__.pyc +0 -0
- sagellm_core/engine_core/__pycache__/__init__.cpython-311.pyc +0 -0
- sagellm_core/engine_core/__pycache__/engine_core.cpython-311.pyc +0 -0
- sagellm_core/engine_core/engine_core.pyc +0 -0
- sagellm_core/engine_core/scheduler/__init__.py +31 -3
- sagellm_core/engine_core/scheduler/__init__.pyc +0 -0
- sagellm_core/engine_core/scheduler/__pycache__/__init__.cpython-311.pyc +0 -0
- sagellm_core/engine_core/scheduler/__pycache__/base.cpython-311.pyc +0 -0
- sagellm_core/engine_core/scheduler/__pycache__/batch.cpython-311.pyc +0 -0
- sagellm_core/engine_core/scheduler/__pycache__/metrics.cpython-311.pyc +0 -0
- sagellm_core/engine_core/scheduler/__pycache__/scheduler.cpython-311.pyc +0 -0
- sagellm_core/engine_core/scheduler/__pycache__/scheduler_kv_bridge.cpython-311.pyc +0 -0
- sagellm_core/engine_core/scheduler/base.pyc +0 -0
- sagellm_core/engine_core/scheduler/batch.pyc +0 -0
- sagellm_core/engine_core/scheduler/metrics.pyc +0 -0
- sagellm_core/engine_core/scheduler/policy/__init__.py +6 -0
- sagellm_core/engine_core/scheduler/policy/__init__.pyc +0 -0
- sagellm_core/engine_core/scheduler/policy/__pycache__/__init__.cpython-311.pyc +0 -0
- sagellm_core/engine_core/scheduler/policy/__pycache__/fcfs.cpython-311.pyc +0 -0
- sagellm_core/engine_core/scheduler/policy/__pycache__/priority.cpython-311.pyc +0 -0
- sagellm_core/engine_core/scheduler/policy/fcfs.pyc +0 -0
- sagellm_core/engine_core/scheduler/policy/priority.pyc +0 -0
- sagellm_core/engine_core/scheduler/scheduler.pyc +0 -0
- sagellm_core/engine_core/scheduler/scheduler_kv_bridge.pyc +0 -0
- sagellm_core/engine_factory.pyc +0 -0
- sagellm_core/engine_server.pyc +0 -0
- sagellm_core/engines/__init__.pyc +0 -0
- sagellm_core/engines/__pycache__/__init__.cpython-311.pyc +0 -0
- sagellm_core/engines/__pycache__/ascend.cpython-311.pyc +0 -0
- sagellm_core/engines/__pycache__/cpu.cpython-311.pyc +0 -0
- sagellm_core/engines/__pycache__/embedding.cpython-311.pyc +0 -0
- sagellm_core/engines/__pycache__/hf_cuda.cpython-311.pyc +0 -0
- sagellm_core/engines/__pycache__/pytorch.cpython-311.pyc +0 -0
- sagellm_core/engines/__pycache__/pytorch_engine.cpython-311.pyc +0 -0
- sagellm_core/engines/embedding.pyc +0 -0
- sagellm_core/executor/__init__.pyc +0 -0
- sagellm_core/executor/__pycache__/__init__.cpython-311.pyc +0 -0
- sagellm_core/executor/__pycache__/executor_base.cpython-311.pyc +0 -0
- sagellm_core/executor/__pycache__/uniproc_executor.cpython-311.pyc +0 -0
- sagellm_core/executor/executor_base.pyc +0 -0
- sagellm_core/executor/uniproc_executor.pyc +0 -0
- sagellm_core/factory.pyc +0 -0
- sagellm_core/health.pyc +0 -0
- sagellm_core/inputs/__init__.pyc +0 -0
- sagellm_core/inputs/__pycache__/__init__.cpython-311.pyc +0 -0
- sagellm_core/inputs/__pycache__/processor.cpython-311.pyc +0 -0
- sagellm_core/inputs/__pycache__/tokenizer_utils.cpython-311.pyc +0 -0
- sagellm_core/inputs/processor.pyc +0 -0
- sagellm_core/inputs/tokenizer_utils.pyc +0 -0
- sagellm_core/layers/__init__.py +30 -0
- sagellm_core/layers/__init__.pyc +0 -0
- sagellm_core/layers/__pycache__/__init__.cpython-311.pyc +0 -0
- sagellm_core/layers/__pycache__/activation.cpython-311.pyc +0 -0
- sagellm_core/layers/__pycache__/base.cpython-311.pyc +0 -0
- sagellm_core/layers/__pycache__/embedding.cpython-311.pyc +0 -0
- sagellm_core/layers/__pycache__/linear.cpython-311.pyc +0 -0
- sagellm_core/layers/__pycache__/normalization.cpython-311.pyc +0 -0
- sagellm_core/layers/activation.pyc +0 -0
- sagellm_core/layers/base.pyc +0 -0
- sagellm_core/layers/embedding.pyc +0 -0
- sagellm_core/layers/linear.pyc +0 -0
- sagellm_core/layers/normalization.pyc +0 -0
- sagellm_core/llm_engine.pyc +0 -0
- sagellm_core/model/__init__.py +131 -5
- sagellm_core/model/__init__.pyc +0 -0
- sagellm_core/model/__pycache__/__init__.cpython-311.pyc +0 -0
- sagellm_core/model/__pycache__/base.cpython-311.pyc +0 -0
- sagellm_core/model/__pycache__/factory.cpython-311.pyc +0 -0
- sagellm_core/model/__pycache__/gpt2.cpython-311.pyc +0 -0
- sagellm_core/model/__pycache__/llama.cpython-311.pyc +0 -0
- sagellm_core/model/__pycache__/mixtral.cpython-311.pyc +0 -0
- sagellm_core/model/__pycache__/model_loader.cpython-311.pyc +0 -0
- sagellm_core/model/__pycache__/quantization.cpython-311.pyc +0 -0
- sagellm_core/model/__pycache__/qwen2.cpython-311.pyc +0 -0
- sagellm_core/model/__pycache__/registry.cpython-311.pyc +0 -0
- sagellm_core/model/__pycache__/weight_utils.cpython-311.pyc +0 -0
- sagellm_core/model/base.pyc +0 -0
- sagellm_core/model/factory.pyc +0 -0
- sagellm_core/model/gpt2.pyc +0 -0
- sagellm_core/model/llama.pyc +0 -0
- sagellm_core/model/mixtral.pyc +0 -0
- sagellm_core/model/model_loader.pyc +0 -0
- sagellm_core/model/quantization.pyc +0 -0
- sagellm_core/model/qwen2.pyc +0 -0
- sagellm_core/model/registry.pyc +0 -0
- sagellm_core/model/weight_loader/__init__.py +54 -0
- sagellm_core/model/weight_loader/__init__.pyc +0 -0
- sagellm_core/model/weight_loader/__pycache__/__init__.cpython-311.pyc +0 -0
- sagellm_core/model/weight_loader/__pycache__/base.cpython-311.pyc +0 -0
- sagellm_core/model/weight_loader/__pycache__/pytorch.cpython-311.pyc +0 -0
- sagellm_core/model/weight_loader/__pycache__/quantized.cpython-311.pyc +0 -0
- sagellm_core/model/weight_loader/__pycache__/safetensors.cpython-311.pyc +0 -0
- sagellm_core/model/weight_loader/base.pyc +0 -0
- sagellm_core/model/weight_loader/pytorch.pyc +0 -0
- sagellm_core/model/weight_loader/quantized.pyc +0 -0
- sagellm_core/model/weight_loader/safetensors.pyc +0 -0
- sagellm_core/model/weight_utils.pyc +0 -0
- sagellm_core/observability/__init__.pyc +0 -0
- sagellm_core/observability/__pycache__/__init__.cpython-311.pyc +0 -0
- sagellm_core/observability/__pycache__/logger.cpython-311.pyc +0 -0
- sagellm_core/observability/__pycache__/metrics.cpython-311.pyc +0 -0
- sagellm_core/observability/logger.pyc +0 -0
- sagellm_core/observability/metrics.pyc +0 -0
- sagellm_core/pd_executor.pyc +0 -0
- sagellm_core/plugins.pyc +0 -0
- sagellm_core/runner.pyc +0 -0
- sagellm_core/runtime.pyc +0 -0
- sagellm_core/sampling/__init__.pyc +0 -0
- sagellm_core/sampling/__pycache__/__init__.cpython-311.pyc +0 -0
- sagellm_core/sampling/__pycache__/params.cpython-311.pyc +0 -0
- sagellm_core/sampling/__pycache__/sampler.cpython-311.pyc +0 -0
- sagellm_core/sampling/params.pyc +0 -0
- sagellm_core/sampling/sampler.pyc +0 -0
- sagellm_core/worker/__init__.pyc +0 -0
- sagellm_core/worker/__pycache__/__init__.cpython-311.pyc +0 -0
- sagellm_core/worker/__pycache__/worker.cpython-311.pyc +0 -0
- sagellm_core/worker/model_runner/__init__.pyc +0 -0
- sagellm_core/worker/model_runner/__pycache__/__init__.cpython-311.pyc +0 -0
- sagellm_core/worker/model_runner/__pycache__/model_runner.cpython-311.pyc +0 -0
- sagellm_core/worker/model_runner/model_runner.pyc +0 -0
- sagellm_core/worker/worker.pyc +0 -0
- sagellm_core/workload.pyc +0 -0
- isagellm_core-0.4.0.18.dist-info/METADATA +0 -308
- isagellm_core-0.4.0.18.dist-info/RECORD +0 -122
- sagellm_core/__pycache__/base_engine.cpython-311.pyc +0 -0
- sagellm_core/__pycache__/mock_engine.cpython-311.pyc +0 -0
- sagellm_core/engines/__pycache__/mock.cpython-311.pyc +0 -0
- {isagellm_core-0.4.0.18.dist-info → isagellm_core-0.4.0.19.dist-info}/entry_points.txt +0 -0
- {isagellm_core-0.4.0.18.dist-info → isagellm_core-0.4.0.19.dist-info}/top_level.txt +0 -0
sagellm_core/model/__init__.py
CHANGED
@@ -1,13 +1,139 @@
-"""Model
+"""Model definitions using backend kernels.
 
-
+This module provides model architectures that use sagellm-core layers
+instead of direct HuggingFace implementations.
 
+Key models:
+- BaseLLMModel: Abstract base for all LLM models
+- GPT2Model: GPT-2 architecture (for testing)
+- LlamaModel: LLaMA/LLaMA-2/LLaMA-3 architecture
+- Qwen2Model: Qwen2/Qwen2.5 architecture
+- MixtralModel: Mixtral Mixture of Experts architecture
+- create_model_from_config: Factory function to create models
+
+Registry system:
+- MODEL_REGISTRY: Central registry of model types
+- register_model: Decorator to register new models
+- get_model_class: Look up model class by type
+
+Weight loading:
+- WeightLoaderProtocol: Protocol for weight loaders
+- WEIGHT_LOADERS: Registry of weight formats
+- SafetensorsLoader, PytorchLoader: Built-in loaders
+- QuantizedWeightLoader: Quantized weight loading (INT8/AWQ/GPTQ)
+
+Quantization:
+- QuantConfig: Quantization configuration
+- detect_quantization: Auto-detect quantization format
+"""
+
+# Registry (import first to ensure registration works)
+from sagellm_core.model.registry import (
+    MODEL_REGISTRY,
+    get_model_class,
+    is_model_registered,
+    list_registered_models,
+    register_model,
+)
+
+# Base class
+from sagellm_core.model.base import BaseLLMModel
+
+# Factory
+from sagellm_core.model.factory import create_model_from_config, load_model_weights
+
+# Models (import triggers registration)
+from sagellm_core.model.gpt2 import GPT2Model
+from sagellm_core.model.llama import (
+    LlamaModel,
+    LlamaAttention,
+    LlamaMLP,
+    LlamaDecoderLayer,
+    LlamaRotaryEmbedding,
+)
+from sagellm_core.model.qwen2 import (
+    Qwen2Model,
+    Qwen2Attention,
+    Qwen2MLP,
+    Qwen2DecoderLayer,
+)
+from sagellm_core.model.mixtral import (
+    MixtralModel,
+    MixtralSparseMoE,
+    MixtralMLP,
+    MixtralDecoderLayer,
+)
+
+# Quantization
+from sagellm_core.model.quantization import (
+    QuantConfig,
+    QuantizationType,
+    detect_quantization,
+)
+
+# Weight loaders
+from sagellm_core.model.weight_loader import (
+    WEIGHT_LOADERS,
+    WeightLoaderProtocol,
+    WeightNameMapper,
+    SafetensorsLoader,
+    PytorchLoader,
+    detect_weight_format,
+    get_weight_loader,
+    list_weight_loaders,
+    load_weights_into_model,
+    register_weight_loader,
+)
+from sagellm_core.model.weight_loader.quantized import QuantizedWeightLoader
+
+# Legacy compatibility
 from sagellm_core.model.model_loader import ModelLoader, load_model
-from sagellm_core.model.weight_utils import WeightLoader, QuantizedWeightLoader
 
 __all__ = [
+    # Base
+    "BaseLLMModel",
+    # Models
+    "GPT2Model",
+    "LlamaModel",
+    "LlamaAttention",
+    "LlamaMLP",
+    "LlamaDecoderLayer",
+    "LlamaRotaryEmbedding",
+    "Qwen2Model",
+    "Qwen2Attention",
+    "Qwen2MLP",
+    "Qwen2DecoderLayer",
+    "MixtralModel",
+    "MixtralSparseMoE",
+    "MixtralMLP",
+    "MixtralDecoderLayer",
+    # Factory
+    "create_model_from_config",
+    "load_model_weights",
+    # Registry
+    "MODEL_REGISTRY",
+    "register_model",
+    "get_model_class",
+    "list_registered_models",
+    "is_model_registered",
+    # Quantization
+    "QuantConfig",
+    "QuantizationType",
+    "detect_quantization",
+    # Weight loaders
+    "WEIGHT_LOADERS",
+    "WeightLoaderProtocol",
+    "WeightNameMapper",
+    "SafetensorsLoader",
+    "PytorchLoader",
+    "QuantizedWeightLoader",
+    "detect_weight_format",
+    "get_weight_loader",
+    "list_weight_loaders",
+    "load_weights_into_model",
+    "register_weight_loader",
+    # Legacy
     "ModelLoader",
     "load_model",
-    "WeightLoader",
-    "QuantizedWeightLoader",
 ]
+
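Note on the registry exports: the new `sagellm_core/model/__init__.py` re-exports `MODEL_REGISTRY`, `register_model`, `get_model_class`, `is_model_registered` and `list_registered_models`, and its comments say that importing a model module triggers registration. The diff does not show `registry.py`, so the following is only a minimal sketch of how a decorator-based registry of this shape usually works. The function bodies, the signatures and the `ToyModel` class are assumptions for illustration, not the package's implementation.

```python
"""Hypothetical sketch of a decorator-based model registry.

The names mirror those exported in the diff, but every body and signature
here is an assumption: registry.py itself is not part of the diff.
"""
from typing import Callable, Dict, List

# Maps a model-type string to the class that implements it.
MODEL_REGISTRY: Dict[str, type] = {}


def register_model(model_type: str) -> Callable[[type], type]:
    """Return a class decorator that records the class under model_type."""

    def decorator(cls: type) -> type:
        if model_type in MODEL_REGISTRY:
            raise ValueError(f"model type {model_type!r} is already registered")
        MODEL_REGISTRY[model_type] = cls
        return cls

    return decorator


def is_model_registered(model_type: str) -> bool:
    return model_type in MODEL_REGISTRY


def list_registered_models() -> List[str]:
    return sorted(MODEL_REGISTRY)


def get_model_class(model_type: str) -> type:
    """Look up a registered class, listing known types on failure."""
    try:
        return MODEL_REGISTRY[model_type]
    except KeyError:
        known = ", ".join(list_registered_models()) or "<none>"
        raise KeyError(
            f"unknown model type {model_type!r}; registered: {known}"
        ) from None


# Executing the class statement runs the decorator, so merely importing a
# module that contains a decorated class performs the registration.
@register_model("toy")
class ToyModel:
    """Stand-in for a real architecture such as LlamaModel."""
```

Under this pattern the import order in the hunk matters: the registry module has to be importable before any model module runs its decorator, which matches the "import first to ensure registration works" comment.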
sagellm_core/model/__init__.pyc
CHANGED
Binary file
sagellm_core/model/weight_loader/__init__.py
@@ -0,0 +1,54 @@
+"""Weight loader module.
+
+This module provides utilities for loading model weights from various formats:
+- safetensors: Fast, secure tensor format (recommended)
+- pytorch: Standard PyTorch .bin/.pt checkpoints
+
+Usage:
+    from sagellm_core.model.weight_loader import (
+        get_weight_loader,
+        detect_weight_format,
+        load_weights_into_model,
+    )
+
+    # Auto-detect format and load
+    load_weights_into_model(model, "path/to/model")
+
+    # Or manually:
+    loader = get_weight_loader("safetensors")
+    for name, tensor in loader.load_weights("path/to/model"):
+        print(f"{name}: {tensor.shape}")
+"""
+
+from sagellm_core.model.weight_loader.base import (
+    WEIGHT_LOADERS,
+    WeightLoaderProtocol,
+    WeightNameMapper,
+    detect_weight_format,
+    get_weight_loader,
+    is_weight_loader_registered,
+    list_weight_loaders,
+    load_weights_into_model,
+    register_weight_loader,
+)
+
+# Import loaders to trigger registration
+from sagellm_core.model.weight_loader.safetensors import SafetensorsLoader
+from sagellm_core.model.weight_loader.pytorch import PytorchLoader
+
+__all__ = [
+    # Protocol and registry
+    "WeightLoaderProtocol",
+    "WEIGHT_LOADERS",
+    "register_weight_loader",
+    "get_weight_loader",
+    "list_weight_loaders",
+    "is_weight_loader_registered",
+    # Utilities
+    "detect_weight_format",
+    "WeightNameMapper",
+    "load_weights_into_model",
+    # Concrete loaders
+    "SafetensorsLoader",
+    "PytorchLoader",
+]
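Note on the weight-loader API: the docstring of the new weight loader module describes format auto-detection plus a loader whose `load_weights(path)` yields `(name, tensor)` pairs. The implementation in `base.py` is not part of the diff, so the block below is a hedged sketch of that flow only. The suffix-to-format table, the preference for safetensors when both formats are present, and the toy loader (which yields file sizes rather than tensors) are all assumptions.

```python
"""Hypothetical sketch of a weight-loader registry with format detection.

Names follow the exports in the diff; the logic is assumed, since base.py
is not shown. The toy loader avoids any dependency on torch or safetensors.
"""
from pathlib import Path
from typing import Any, Callable, Dict, Iterator, Protocol, Tuple


class WeightLoaderProtocol(Protocol):
    """Anything with a load_weights method yielding (name, tensor) pairs."""

    def load_weights(self, model_path: str) -> Iterator[Tuple[str, Any]]:
        ...


# Maps a format name to a loader class.
WEIGHT_LOADERS: Dict[str, type] = {}


def register_weight_loader(fmt: str) -> Callable[[type], type]:
    def decorator(cls: type) -> type:
        WEIGHT_LOADERS[fmt] = cls
        return cls

    return decorator


def get_weight_loader(fmt: str) -> WeightLoaderProtocol:
    if fmt not in WEIGHT_LOADERS:
        raise ValueError(f"no weight loader registered for format {fmt!r}")
    return WEIGHT_LOADERS[fmt]()


# Assumed mapping; safetensors is listed first so it wins when both exist.
_SUFFIX_TO_FORMAT = {".safetensors": "safetensors", ".bin": "pytorch", ".pt": "pytorch"}


def detect_weight_format(model_path: str) -> str:
    """Pick a format from the file suffixes found in the model directory."""
    suffixes = {p.suffix for p in Path(model_path).iterdir() if p.is_file()}
    for suffix, fmt in _SUFFIX_TO_FORMAT.items():
        if suffix in suffixes:
            return fmt
    raise FileNotFoundError(f"no known weight files in {model_path!r}")


@register_weight_loader("safetensors")
class ToySafetensorsLoader:
    """Toy loader: yields (file stem, size in bytes) instead of real tensors."""

    def load_weights(self, model_path: str) -> Iterator[Tuple[str, Any]]:
        for path in sorted(Path(model_path).glob("*.safetensors")):
            yield path.stem, path.stat().st_size
```

Yielding pairs lazily, as the docstring's usage example suggests, lets a caller copy one tensor at a time into the model instead of holding a whole checkpoint in memory.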
sagellm_core/pd_executor.pyc
CHANGED
Binary file
sagellm_core/plugins.pyc
CHANGED
Binary file
sagellm_core/runner.pyc
CHANGED
Binary file
sagellm_core/runtime.pyc
CHANGED
Binary file
sagellm_core/sampling/params.pyc
CHANGED
Binary file
sagellm_core/worker/__init__.pyc
CHANGED
Binary file
sagellm_core/worker/worker.pyc
CHANGED
Binary file
sagellm_core/workload.pyc
CHANGED
Binary file
isagellm_core-0.4.0.18.dist-info/METADATA
@@ -1,308 +0,0 @@
-Metadata-Version: 2.4
-Name: isagellm-core
-Version: 0.4.0.18
-Summary: sageLLM core runtime with PD separation (MVP)
-Author: IntelliStream Team
-License: Proprietary - IntelliStream
-Classifier: Development Status :: 3 - Alpha
-Classifier: Programming Language :: Python :: 3
-Classifier: Programming Language :: Python :: 3.10
-Classifier: Programming Language :: Python :: 3.11
-Classifier: Programming Language :: Python :: 3.12
-Requires-Python: ==3.11.*
-Description-Content-Type: text/markdown
-Requires-Dist: pydantic>=2.0.0
-Requires-Dist: pyyaml>=6.0.0
-Requires-Dist: isagellm-protocol<0.5.0,>=0.4.0.0
-Requires-Dist: isagellm-backend<0.5.0,>=0.4.0.0
-Requires-Dist: isagellm-comm<0.5.0,>=0.4.0.0
-Requires-Dist: isagellm-kv-cache<0.5.0,>=0.4.0.0
-Requires-Dist: fastapi>=0.100.0
-Requires-Dist: uvicorn>=0.22.0
-Requires-Dist: torch>=2.0.0
-Requires-Dist: transformers>=4.35.0
-Requires-Dist: accelerate>=0.26.0
-Provides-Extra: dev
-Requires-Dist: pytest>=7.0.0; extra == "dev"
-Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
-Requires-Dist: pytest-timeout>=2.0.0; extra == "dev"
-Requires-Dist: ruff>=0.8.0; extra == "dev"
-Requires-Dist: mypy>=1.0.0; extra == "dev"
-Requires-Dist: types-PyYAML>=6.0.0; extra == "dev"
-Requires-Dist: pre-commit>=3.0.0; extra == "dev"
-Requires-Dist: isage-pypi-publisher>=0.2.0; extra == "dev"
-
-# sagellm-core
-
-## Protocol Compliance (Mandatory)
-
-- MUST follow Protocol v0.1: https://github.com/intellistream/sagellm-docs/blob/main/docs/specs/protocol_v0.1.md
-- Any globally shared definitions (fields, error codes, metrics, IDs, schemas) MUST be added to Protocol first.
-
-[](https://github.com/intellistream/sagellm-core/actions/workflows/ci.yml)
-[](https://pypi.org/project/isagellm-core/)
-[](https://pypi.org/project/isagellm-core/)
-[](https://github.com/intellistream/sagellm-core/blob/main/LICENSE)
-[](https://github.com/astral-sh/ruff)
-
-sageLLM Core - engine coordination layer and runtime system
-
-## Architecture Positioning
-
-```
-┌─────────────────────────────────────────────────────────────┐
-│  sagellm-core (engine coordination layer)   ← this repo     │
-│  ┌─────────────────────────────────────────────────────┐    │
-│  │  LLMEngine (Hardware-Agnostic, vLLM v1 style)       │    │
-│  │  • Unified inference API: generate, stream, execute │    │
-│  │  • Backend selection (auto-detect cuda/ascend/cpu)  │    │
-│  │  • Config-driven (LLMEngineConfig)                  │    │
-│  └─────────────────────────────────────────────────────┘    │
-│  ┌─────────────────────────────────────────────────────┐    │
-│  │  Configuration System (config.py)                   │    │
-│  │  • YAML/JSON config parsing                         │    │
-│  │  • Pydantic v2 type validation                      │    │
-│  └─────────────────────────────────────────────────────┘    │
-├─────────────────────────────────────────────────────────────┤
-│  sagellm-backend (hardware abstraction layer)               │
-│  • BackendProvider (CPU/CUDA/Ascend)                        │
-│  • Stream/Event/KVBlock management                          │
-└─────────────────────────────────────────────────────────────┘
-```
-
-**Separation of responsibilities**:
-- ✅ **Core owns**: LLMEngine (hardware-agnostic), configuration, coordination
-- ✅ **Backend owns**: hardware abstraction, device management, Provider implementations
-
-## Features
-
-- 🔧 Engine abstraction layer and self-describing architecture
-- 🏭 EngineFactory - supports auto-discovery and priority-based selection
-- 🎯 Built-in engine implementations (CPU/CUDA/Embedding)
-- 🔌 Plugin system - extend engines and backends
-- ⚙️ Configuration system - YAML/JSON + Pydantic v2
-- ✅ CPU-First - tests run without a GPU
-
-## MoE (planned)
-
-- MoE routing/scheduling/execution graph is primarily owned by **sagellm-core**
-- Depends on **sagellm-comm** (all-to-all/topology communication) and **sagellm-backend** (expert operators/kernels)
-- Global fields (e.g. router metadata, expert load metrics) must first be added to Protocol
-
-## Installation
-
-```bash
-# Install from PyPI (automatically installs the protocol + backend dependencies)
-pip install isagellm-core
-```
-
-## 🚀 Developer Quick Start
-
-```bash
-git clone git@github.com:intellistream/sagellm-core.git
-cd sagellm-core
-./quickstart.sh   # One-step install of the dev environment (including dependencies)
-
-# Or install manually
-pip install -e ".[dev]"
-```
-
-Run the tests:
-```bash
-pytest tests/ -v
-```
-
-> 💡 **Tip**: `isagellm-protocol` and `isagellm-backend` are installed automatically from PyPI.
-> For local integration debugging:
-> ```bash
-> pip install -e ../sagellm-protocol
-> pip install -e ../sagellm-backend
-> ```
-
-## Configuration System
-
-### Usage
-
-```python
-from sagellm_core import load_config, create_backend, create_engine
-
-# Load configuration from YAML/JSON
-config = load_config("config.yaml")
-
-# Create the backend and engine (via plugin discovery)
-backend = create_backend(config.backend)
-engine = create_engine(config.engine, backend)
-```
-
-### Configuration Structure
-
-Main configuration components:
-- `BackendConfig`: Device and backend settings
-- `EngineConfig`: Inference engine configuration
-- `WorkloadConfig`: Workload parameters
-- `OutputConfig`: Output paths and logging
-
-### Configuration Examples
-
-#### Example configuration files
-
-- [config_cpu.yaml](examples/config_cpu.yaml) - CPU mode (CI/development)
-- [config_cuda.yaml](examples/config_cuda.yaml) - CUDA production mode
-- [config_ascend.yaml](examples/config_ascend.yaml) - Ascend production mode
-- [config_minimal.json](examples/config_minimal.json) - Minimal JSON configuration
-
-For more information, see [examples/README.md](examples/README.md)
-
-### Configuration Format
-
-YAML (recommended) and JSON formats are supported:
-
-```yaml
-backend:
-  kind: cpu
-engine:
-  kind: cpu
-  model: sshleifer/tiny-gpt2
-workload:
-  segments: [short, long, stress]
-  concurrency: 4
-output:
-  metrics_path: ./metrics.json
-```
-
-### Plugin Resolution
-
-When the backend/engine kind specified in the configuration is not installed, a `PluginResolutionError` is raised:
-
-```python
-from sagellm_core import create_backend, BackendConfig, PluginResolutionError
-
-try:
-    backend = create_backend(BackendConfig(kind="ascend_cann", device="npu:0"))
-except PluginResolutionError as e:
-    print(f"Error: {e}")
-    # Output: No implementation found for sagellm.backends kind='ascend_cann'.
-    # Install hint: pip install isagellm-backend-ascend_cann
-```
-
-## Development Guide
-
-### Quick Start
-
-```bash
-# Clone the repository
-git clone https://github.com/intellistream/sagellm-core.git
-cd sagellm-core
-
-# Install development dependencies
-pip install -e ".[dev]"
-
-# Install pre-commit hooks
-pre-commit install
-
-# Verify the environment
-pytest tests/ -v
-```
-
-### Testing and Quality
-
-#### Pre-commit Hooks
-
-Once installed, every `git commit` automatically runs:
-- **Ruff**: code formatting + lint checks
-- **Mypy**: static type checking
-- **YAML/JSON**: configuration file validation
-
-```bash
-# Run all hooks manually
-pre-commit run --all-files
-
-# Bypass hooks (emergencies only)
-git commit --no-verify
-```
-
-#### Running Tests
-
-```bash
-# Run all tests
-pytest tests/ -v
-
-# Run specific test module
-pytest tests/unit/test_demo.py -v
-
-# Generate coverage report
-pytest tests/ --cov=sagellm_core --cov-report=html
-```
-
-#### Continuous Integration
-
-GitHub Actions automatically runs on each PR:
-- Code linting and formatting checks
-- Tests across Python 3.10, 3.11, 3.12
-- Package build verification
-
-### Code Style
-
-This project uses:
-- **Ruff** for formatting and linting
-- **Mypy** for type checking
-- **Type hints** are required for all functions
-
-For detailed guidelines, see [CONTRIBUTING.md](CONTRIBUTING.md)
-
-### Code Checks
-
-```bash
-# Format code
-ruff format .
-
-# Lint
-ruff check .
-
-# Type check
-mypy src/sagellm_core
-
-# Run all checks at once
-pre-commit run --all-files
-```
-
-## Dependencies
-
-- `pydantic>=2.0.0`: configuration validation
-- `pyyaml>=6.0.0`: YAML configuration support
-- `isagellm-protocol>=0.1.0,<0.2.0`: protocol definitions (Level 0)
-- `isagellm-backend>=0.1.0,<0.2.0`: backend abstraction (Level 1)
-
-## Related Packages
-
-- `isagellm-protocol` - Protocol definitions
-- `isagellm-backend` - Backend abstraction layer
-- `isagellm` - Main package with CLI
-
-For more packages, see the [sageLLM ecosystem](https://github.com/intellistream/sagellm)
-
-## 🔄 Contributing Guide
-
-Please follow this workflow:
-
-1. **Create an issue** - describe the problem/requirement
-   ```bash
-   gh issue create --title "[Bug] description" --label "bug,sagellm-core"
-   ```
-
-2. **Develop the fix** - work on a local `fix/#123-xxx` branch
-   ```bash
-   git checkout -b fix/#123-xxx origin/main-dev
-   # Develop, test...
-   pytest -v
-   ruff format . && ruff check . --fix
-   ```
-
-3. **Open a PR** - target the `main-dev` branch
-   ```bash
-   gh pr create --base main-dev --title "Fix: description" --body "Closes #123"
-   ```
-
-4. **Merge** - merge into `main-dev` after approval
-
-For more details, see [.github/copilot-instructions.md](.github/copilot-instructions.md)
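Note on the configuration flow in the removed README: it loads a config with `backend` and `engine` sections, creates a backend by `kind`, and raises `PluginResolutionError` with an install hint when no implementation is installed. The sketch below illustrates that flow under stated assumptions. It uses standard-library dataclasses and JSON instead of the Pydantic v2 models and YAML the README mentions, and an in-process dictionary instead of real plugin discovery. `AppConfig`, `load_config_from_json`, the `_BACKENDS` table and the default `device` value are hypothetical. The error text follows the example output quoted in the README.

```python
"""Hypothetical sketch: typed config parsing plus backend resolution by kind.

This is not the sagellm_core implementation. The real package is described
as using Pydantic v2 and plugin discovery; both are replaced here by
stdlib stand-ins so the example runs on its own.
"""
import json
from dataclasses import dataclass
from typing import Callable, Dict


class PluginResolutionError(LookupError):
    """Raised when no implementation is available for a requested kind."""


@dataclass
class BackendConfig:
    kind: str
    device: str = "cpu"  # assumed default, not taken from the README


@dataclass
class EngineConfig:
    kind: str
    model: str


@dataclass
class AppConfig:  # hypothetical container for the two sections used here
    backend: BackendConfig
    engine: EngineConfig


def load_config_from_json(text: str) -> AppConfig:
    """Parse the backend and engine sections into typed config objects."""
    raw = json.loads(text)
    return AppConfig(
        backend=BackendConfig(**raw["backend"]),
        engine=EngineConfig(**raw["engine"]),
    )


# Stand-in for plugin discovery: kind -> factory. Only "cpu" is "installed".
_BACKENDS: Dict[str, Callable[[BackendConfig], str]] = {
    "cpu": lambda cfg: f"cpu-backend({cfg.device})",
}


def create_backend(cfg: BackendConfig) -> str:
    """Resolve the backend for cfg.kind or fail with an install hint."""
    factory = _BACKENDS.get(cfg.kind)
    if factory is None:
        raise PluginResolutionError(
            f"No implementation found for sagellm.backends kind={cfg.kind!r}. "
            f"Install hint: pip install isagellm-backend-{cfg.kind}"
        )
    return factory(cfg)
```

Failing at resolution time with the package name in the message means a missing backend shows up as a clear configuration error rather than a bare import failure somewhere deeper in the engine.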