xinference 0.9.4__py3-none-any.whl → 0.10.1__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Potentially problematic release.
- xinference/_version.py +3 -3
- xinference/api/oauth2/auth_service.py +47 -18
- xinference/api/oauth2/types.py +1 -0
- xinference/api/restful_api.py +34 -7
- xinference/client/oscar/actor_client.py +4 -3
- xinference/client/restful/restful_client.py +20 -4
- xinference/conftest.py +13 -2
- xinference/core/supervisor.py +48 -1
- xinference/core/worker.py +139 -20
- xinference/deploy/cmdline.py +119 -20
- xinference/model/embedding/core.py +1 -2
- xinference/model/llm/__init__.py +4 -6
- xinference/model/llm/ggml/llamacpp.py +2 -10
- xinference/model/llm/llm_family.json +877 -13
- xinference/model/llm/llm_family.py +15 -0
- xinference/model/llm/llm_family_modelscope.json +571 -0
- xinference/model/llm/pytorch/chatglm.py +2 -0
- xinference/model/llm/pytorch/core.py +22 -26
- xinference/model/llm/pytorch/deepseek_vl.py +232 -0
- xinference/model/llm/pytorch/internlm2.py +2 -0
- xinference/model/llm/pytorch/omnilmm.py +153 -0
- xinference/model/llm/pytorch/qwen_vl.py +2 -0
- xinference/model/llm/pytorch/yi_vl.py +4 -2
- xinference/model/llm/utils.py +53 -5
- xinference/model/llm/vllm/core.py +54 -6
- xinference/model/rerank/core.py +3 -0
- xinference/thirdparty/deepseek_vl/__init__.py +31 -0
- xinference/thirdparty/deepseek_vl/models/__init__.py +28 -0
- xinference/thirdparty/deepseek_vl/models/clip_encoder.py +242 -0
- xinference/thirdparty/deepseek_vl/models/image_processing_vlm.py +208 -0
- xinference/thirdparty/deepseek_vl/models/modeling_vlm.py +170 -0
- xinference/thirdparty/deepseek_vl/models/processing_vlm.py +390 -0
- xinference/thirdparty/deepseek_vl/models/projector.py +100 -0
- xinference/thirdparty/deepseek_vl/models/sam.py +593 -0
- xinference/thirdparty/deepseek_vl/models/siglip_vit.py +681 -0
- xinference/thirdparty/deepseek_vl/utils/__init__.py +18 -0
- xinference/thirdparty/deepseek_vl/utils/conversation.py +348 -0
- xinference/thirdparty/deepseek_vl/utils/io.py +78 -0
- xinference/thirdparty/omnilmm/__init__.py +0 -0
- xinference/thirdparty/omnilmm/chat.py +216 -0
- xinference/thirdparty/omnilmm/constants.py +4 -0
- xinference/thirdparty/omnilmm/conversation.py +332 -0
- xinference/thirdparty/omnilmm/model/__init__.py +1 -0
- xinference/thirdparty/omnilmm/model/omnilmm.py +594 -0
- xinference/thirdparty/omnilmm/model/resampler.py +166 -0
- xinference/thirdparty/omnilmm/model/utils.py +563 -0
- xinference/thirdparty/omnilmm/train/__init__.py +13 -0
- xinference/thirdparty/omnilmm/train/train_utils.py +150 -0
- xinference/thirdparty/omnilmm/utils.py +134 -0
- xinference/types.py +15 -19
- xinference/web/ui/build/asset-manifest.json +3 -3
- xinference/web/ui/build/index.html +1 -1
- xinference/web/ui/build/static/js/main.76ef2b17.js +3 -0
- xinference/web/ui/build/static/js/main.76ef2b17.js.map +1 -0
- xinference/web/ui/node_modules/.cache/babel-loader/15e2cf8cd8d0989719b6349428ff576f9009ff4c2dcc52378be0bd938e82495e.json +1 -0
- xinference/web/ui/node_modules/.cache/babel-loader/35d0e4a317e5582cbb79d901302e9d706520ac53f8a734c2fd8bfde6eb5a4f02.json +1 -0
- xinference/web/ui/node_modules/.cache/babel-loader/3c2f277c93c5f1638e08db38df0d0fb4e58d1c5571aea03241a5c04ff4094704.json +1 -0
- xinference/web/ui/node_modules/.cache/babel-loader/3fa1f69162f9c6dc0f6a6e21b64d49d6b8e6fa8dfa59a82cf829931c5f97d99f.json +1 -0
- xinference/web/ui/node_modules/.cache/babel-loader/44774c783428f952d8e2e4ad0998a9c5bc16a57cd9c68b7c5ff18aaa5a41d65c.json +1 -0
- xinference/web/ui/node_modules/.cache/babel-loader/5393569d846332075b93b55656716a34f50e0a8c970be789502d7e6c49755fd7.json +1 -0
- xinference/web/ui/node_modules/.cache/babel-loader/59ce49eae0f486af4c5034d4d2f9ca77c3ec3a32ecc560085caf5ef482b5f4c9.json +1 -0
- xinference/web/ui/node_modules/.cache/babel-loader/62e257ed9016471035fa1a7da57c9e2a4250974ed566b4d1295873d747c68eb2.json +1 -0
- xinference/web/ui/node_modules/.cache/babel-loader/63a4c48f0326d071c7772c46598215c006ae41fd3d4ff3577fe717de66ad6e89.json +1 -0
- xinference/web/ui/node_modules/.cache/babel-loader/b9cbcb6d77ba21b22c6950b6fb5b305d23c19cf747f99f7d48b6b046f8f7b1b0.json +1 -0
- xinference/web/ui/node_modules/.cache/babel-loader/d06a96a3c9c32e42689094aa3aaad41c8125894e956b8f84a70fadce6e3f65b3.json +1 -0
- xinference/web/ui/node_modules/.cache/babel-loader/d076fd56cf3b15ed2433e3744b98c6b4e4410a19903d1db4de5bba0e1a1b3347.json +1 -0
- xinference/web/ui/node_modules/.cache/babel-loader/daad8131d91134f6d7aef895a0c9c32e1cb928277cb5aa66c01028126d215be0.json +1 -0
- xinference/web/ui/node_modules/.cache/babel-loader/de0299226173b0662b573f49e3992220f6611947073bd66ac079728a8bc8837d.json +1 -0
- xinference/web/ui/node_modules/.cache/babel-loader/e606671420d2937102c3c34b4b04056c11736408c1d3347b8cf42dfe61fb394b.json +1 -0
- xinference/web/ui/node_modules/.cache/babel-loader/e6eccc9aa641e7da833492e27846dc965f9750281420977dc84654ca6ed221e4.json +1 -0
- xinference/web/ui/node_modules/.cache/babel-loader/e9b52d171223bb59fb918316297a051cdfd42dd453e8260fd918e90bc0a4ebdf.json +1 -0
- xinference/web/ui/node_modules/.cache/babel-loader/f16aec63602a77bd561d0e67fa00b76469ac54b8033754bba114ec5eb3257964.json +1 -0
- {xinference-0.9.4.dist-info → xinference-0.10.1.dist-info}/METADATA +25 -12
- {xinference-0.9.4.dist-info → xinference-0.10.1.dist-info}/RECORD +79 -58
- xinference/model/llm/ggml/ctransformers.py +0 -281
- xinference/model/llm/ggml/ctransformers_util.py +0 -161
- xinference/web/ui/build/static/js/main.66b1c4fb.js +0 -3
- xinference/web/ui/build/static/js/main.66b1c4fb.js.map +0 -1
- xinference/web/ui/node_modules/.cache/babel-loader/0bd70b1ecf307e2681318e864f4692305b6350c8683863007f4caf2f9ac33b6e.json +0 -1
- xinference/web/ui/node_modules/.cache/babel-loader/0db651c046ef908f45cde73af0dbea0a797d3e35bb57f4a0863b481502103a64.json +0 -1
- xinference/web/ui/node_modules/.cache/babel-loader/18e5d5422e2464abf4a3e6d38164570e2e426e0a921e9a2628bbae81b18da353.json +0 -1
- xinference/web/ui/node_modules/.cache/babel-loader/3d93bd9a74a1ab0cec85af40f9baa5f6a8e7384b9e18c409b95a81a7b45bb7e2.json +0 -1
- xinference/web/ui/node_modules/.cache/babel-loader/3e055de705e397e1d413d7f429589b1a98dd78ef378b97f0cdb462c5f2487d5e.json +0 -1
- xinference/web/ui/node_modules/.cache/babel-loader/4fd24800544873512b540544ae54601240a5bfefd9105ff647855c64f8ad828f.json +0 -1
- xinference/web/ui/node_modules/.cache/babel-loader/52aa27272b4b9968f62666262b47661cb1992336a2aff3b13994cc36877b3ec3.json +0 -1
- xinference/web/ui/node_modules/.cache/babel-loader/60c4b98d8ea7479fb0c94cfd19c8128f17bd7e27a1e73e6dd9adf6e9d88d18eb.json +0 -1
- xinference/web/ui/node_modules/.cache/babel-loader/7e094845f611802b024b57439cbf911038169d06cdf6c34a72a7277f35aa71a4.json +0 -1
- xinference/web/ui/node_modules/.cache/babel-loader/95c8cc049fadd23085d8623e1d43d70b614a4e52217676f186a417dca894aa09.json +0 -1
- xinference/web/ui/node_modules/.cache/babel-loader/98b7ef307f436affe13d75a4f265b27e828ccc2b10ffae6513abe2681bc11971.json +0 -1
- xinference/web/ui/node_modules/.cache/babel-loader/a8070ce4b780b4a044218536e158a9e7192a6c80ff593fdc126fee43f46296b5.json +0 -1
- xinference/web/ui/node_modules/.cache/babel-loader/b400cfc9db57fa6c70cd2bad055b73c5079fde0ed37974009d898083f6af8cd8.json +0 -1
- xinference/web/ui/node_modules/.cache/babel-loader/bd04667474fd9cac2983b03725c218908a6cc0ee9128a5953cd00d26d4877f60.json +0 -1
- xinference/web/ui/node_modules/.cache/babel-loader/c2124cfe036b26befcbd386d1d17743b1a58d0b7a041a17bb67f9924400d63c3.json +0 -1
- xinference/web/ui/node_modules/.cache/babel-loader/c230a727b8f68f0e62616a75e14a3d33026dc4164f2e325a9a8072d733850edb.json +0 -1
- xinference/web/ui/node_modules/.cache/babel-loader/d44a6eb6106e09082b691a315c9f6ce17fcfe25beb7547810e0d271ce3301cd2.json +0 -1
- xinference/web/ui/node_modules/.cache/babel-loader/e1d9b2ae4e1248658704bc6bfc5d6160dcd1a9e771ea4ae8c1fed0aaddeedd29.json +0 -1
- xinference/web/ui/node_modules/.cache/babel-loader/fd4a8ae5d192331af1bedd1d2d70efcc569708ee6cc4cb479b225d059482aa81.json +0 -1
- xinference/web/ui/node_modules/.cache/babel-loader/fe5db70859503a54cbe71f9637e5a314cda88b1f0eecb733b6e6f837697db1ef.json +0 -1
- /xinference/web/ui/build/static/js/{main.66b1c4fb.js.LICENSE.txt → main.76ef2b17.js.LICENSE.txt} +0 -0
- {xinference-0.9.4.dist-info → xinference-0.10.1.dist-info}/LICENSE +0 -0
- {xinference-0.9.4.dist-info → xinference-0.10.1.dist-info}/WHEEL +0 -0
- {xinference-0.9.4.dist-info → xinference-0.10.1.dist-info}/entry_points.txt +0 -0
- {xinference-0.9.4.dist-info → xinference-0.10.1.dist-info}/top_level.txt +0 -0
{xinference-0.9.4.dist-info → xinference-0.10.1.dist-info}/METADATA

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: xinference
-Version: 0.9.4
+Version: 0.10.1
 Summary: Model Serving Made Easy
 Home-page: https://github.com/xorbitsai/inference
 Author: Qin Xuye
@@ -22,6 +22,7 @@ License-File: LICENSE
 Requires-Dist: xoscar >=0.3.0
 Requires-Dist: torch
 Requires-Dist: gradio >=3.39.0
+Requires-Dist: typer[all] <0.12.0
 Requires-Dist: pillow
 Requires-Dist: click
 Requires-Dist: tqdm >=4.27
@@ -43,12 +44,14 @@ Requires-Dist: aioprometheus[starlette] >=23.12.0
 Requires-Dist: pynvml
 Requires-Dist: async-timeout
 Requires-Dist: peft
+Requires-Dist: timm
+Requires-Dist: opencv-contrib-python
 Provides-Extra: all
 Requires-Dist: chatglm-cpp >=0.3.0 ; extra == 'all'
-Requires-Dist: llama-cpp-python
+Requires-Dist: llama-cpp-python !=0.2.58,>=0.2.25 ; extra == 'all'
 Requires-Dist: transformers >=4.34.1 ; extra == 'all'
 Requires-Dist: torch ; extra == 'all'
-Requires-Dist: accelerate >=0.
+Requires-Dist: accelerate >=0.27.2 ; extra == 'all'
 Requires-Dist: sentencepiece ; extra == 'all'
 Requires-Dist: transformers-stream-generator ; extra == 'all'
 Requires-Dist: bitsandbytes ; extra == 'all'
@@ -60,7 +63,12 @@ Requires-Dist: diffusers ; extra == 'all'
 Requires-Dist: controlnet-aux ; extra == 'all'
 Requires-Dist: orjson ; extra == 'all'
 Requires-Dist: optimum ; extra == 'all'
+Requires-Dist: outlines ==0.0.34 ; extra == 'all'
+Requires-Dist: attrdict ; extra == 'all'
+Requires-Dist: timm >=0.9.16 ; extra == 'all'
+Requires-Dist: torchvision ; extra == 'all'
 Requires-Dist: auto-gptq ; (sys_platform != "darwin") and extra == 'all'
+Requires-Dist: autoawq ; (sys_platform != "darwin") and extra == 'all'
 Requires-Dist: vllm >=0.2.6 ; (sys_platform == "linux") and extra == 'all'
 Requires-Dist: sglang[all] ; (sys_platform == "linux") and extra == 'all'
 Provides-Extra: benchmark
@@ -81,7 +89,7 @@ Requires-Dist: jieba >=0.42.0 ; extra == 'dev'
 Requires-Dist: flake8 >=3.8.0 ; extra == 'dev'
 Requires-Dist: black ; extra == 'dev'
 Requires-Dist: openai >1 ; extra == 'dev'
-Requires-Dist: opencv-python ; extra == 'dev'
+Requires-Dist: opencv-contrib-python ; extra == 'dev'
 Requires-Dist: langchain ; extra == 'dev'
 Requires-Dist: orjson ; extra == 'dev'
 Requires-Dist: sphinx-tabs ; extra == 'dev'
@@ -94,11 +102,12 @@ Requires-Dist: sphinx-intl >=0.9.9 ; extra == 'doc'
 Requires-Dist: sphinx-tabs ; extra == 'doc'
 Requires-Dist: sphinx-design ; extra == 'doc'
 Requires-Dist: prometheus-client ; extra == 'doc'
+Requires-Dist: timm ; extra == 'doc'
+Requires-Dist: opencv-contrib-python ; extra == 'doc'
 Provides-Extra: embedding
 Requires-Dist: sentence-transformers >=2.3.1 ; extra == 'embedding'
 Provides-Extra: ggml
-Requires-Dist: llama-cpp-python
-Requires-Dist: ctransformers ; extra == 'ggml'
+Requires-Dist: llama-cpp-python !=0.2.58,>=0.2.25 ; extra == 'ggml'
 Requires-Dist: chatglm-cpp >=0.3.0 ; extra == 'ggml'
 Provides-Extra: image
 Requires-Dist: diffusers ; extra == 'image'
@@ -111,7 +120,7 @@ Requires-Dist: sglang[all] ; extra == 'sglang'
 Provides-Extra: transformers
 Requires-Dist: transformers >=4.34.1 ; extra == 'transformers'
 Requires-Dist: torch ; extra == 'transformers'
-Requires-Dist: accelerate >=0.
+Requires-Dist: accelerate >=0.27.2 ; extra == 'transformers'
 Requires-Dist: sentencepiece ; extra == 'transformers'
 Requires-Dist: transformers-stream-generator ; extra == 'transformers'
 Requires-Dist: bitsandbytes ; extra == 'transformers'
@@ -119,7 +128,11 @@ Requires-Dist: protobuf ; extra == 'transformers'
 Requires-Dist: einops ; extra == 'transformers'
 Requires-Dist: tiktoken ; extra == 'transformers'
 Requires-Dist: auto-gptq ; extra == 'transformers'
+Requires-Dist: autoawq ; extra == 'transformers'
 Requires-Dist: optimum ; extra == 'transformers'
+Requires-Dist: attrdict ; extra == 'transformers'
+Requires-Dist: timm >=0.9.16 ; extra == 'transformers'
+Requires-Dist: torchvision ; extra == 'transformers'
 Requires-Dist: peft ; extra == 'transformers'
 Provides-Extra: vllm
 Requires-Dist: vllm >=0.2.6 ; extra == 'vllm'
@@ -152,20 +165,20 @@ potential of cutting-edge AI models.
 
 ## 🔥 Hot Topics
 ### Framework Enhancements
+- Support specifying worker and GPU indexes for launching models: [#1195](https://github.com/xorbitsai/inference/pull/1195)
+- Support SGLang backend: [#1161](https://github.com/xorbitsai/inference/pull/1161)
 - Support LoRA for LLM and image models: [#1080](https://github.com/xorbitsai/inference/pull/1080)
 - Support speech recognition model: [#929](https://github.com/xorbitsai/inference/pull/929)
 - Metrics support: [#906](https://github.com/xorbitsai/inference/pull/906)
 - Docker image: [#855](https://github.com/xorbitsai/inference/pull/855)
 - Support multimodal: [#829](https://github.com/xorbitsai/inference/pull/829)
-- Auto recover: [#694](https://github.com/xorbitsai/inference/pull/694)
-- Function calling API: [#701](https://github.com/xorbitsai/inference/pull/701), here's example: https://github.com/xorbitsai/inference/blob/main/examples/FunctionCall.ipynb
 ### New Models
+- Built-in support for [Qwen1.5 MOE](https://huggingface.co/Qwen/Qwen1.5-MoE-A2.7B-Chat): [#1263](https://github.com/xorbitsai/inference/pull/1263)
+- Built-in support for [Qwen1.5 32B](https://huggingface.co/Qwen/Qwen1.5-32B-Chat): [#1249](https://github.com/xorbitsai/inference/pull/1249)
+- Built-in support for [OmniLMM](https://github.com/OpenBMB/OmniLMM): [#1171](https://github.com/xorbitsai/inference/pull/1171)
 - Built-in support for [Gemma](https://github.com/google-deepmind/gemma): [#1024](https://github.com/xorbitsai/inference/pull/1024)
 - Built-in support for [Qwen1.5](https://github.com/QwenLM/Qwen1.5): [#994](https://github.com/xorbitsai/inference/pull/994)
 - Built-in support for [Yi-VL](https://github.com/01-ai/Yi): [#946](https://github.com/xorbitsai/inference/pull/946)
-- Built-in support for [Whisper](https://github.com/openai/whisper): [#929](https://github.com/xorbitsai/inference/pull/929)
-- Built-in support for [Orion-chat](https://huggingface.co/OrionStarAI): [#933](https://github.com/xorbitsai/inference/pull/933)
-- Built-in support for [InternLM2-chat](https://huggingface.co/internlm/internlm2-chat-7b): [#829](https://github.com/xorbitsai/inference/pull/913)
 ### Integrations
 - [Dify](https://docs.dify.ai/advanced/model-configuration/xinference): an LLMOps platform that enables developers (and even non-developers) to quickly build useful applications based on large language models, ensuring they are visual, operable, and improvable.
 - [Chatbox](https://chatboxai.app/): a desktop client for multiple cutting-edge LLM models, available on Windows, Mac and Linux.