liger-kernel-nightly 0.5.2.dev20241212055744__py3-none-any.whl → 0.5.2.dev20241212060541__py3-none-any.whl
- {liger_kernel_nightly-0.5.2.dev20241212055744.dist-info → liger_kernel_nightly-0.5.2.dev20241212060541.dist-info}/METADATA +14 -1
- {liger_kernel_nightly-0.5.2.dev20241212055744.dist-info → liger_kernel_nightly-0.5.2.dev20241212060541.dist-info}/RECORD +6 -6
- {liger_kernel_nightly-0.5.2.dev20241212055744.dist-info → liger_kernel_nightly-0.5.2.dev20241212060541.dist-info}/LICENSE +0 -0
- {liger_kernel_nightly-0.5.2.dev20241212055744.dist-info → liger_kernel_nightly-0.5.2.dev20241212060541.dist-info}/NOTICE +0 -0
- {liger_kernel_nightly-0.5.2.dev20241212055744.dist-info → liger_kernel_nightly-0.5.2.dev20241212060541.dist-info}/WHEEL +0 -0
- {liger_kernel_nightly-0.5.2.dev20241212055744.dist-info → liger_kernel_nightly-0.5.2.dev20241212060541.dist-info}/top_level.txt +0 -0
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: liger_kernel_nightly
-Version: 0.5.2.
+Version: 0.5.2.dev20241212060541
 Summary: Efficient Triton kernels for LLM Training
 License: BSD 2-CLAUSE LICENSE
 Copyright 2024 LinkedIn Corporation
@@ -136,6 +136,19 @@ With one line of code, Liger Kernel can increase throughput by more than 20% and
 > - Benchmark conditions: LLaMA 3-8B, Batch Size = 8, Data Type = `bf16`, Optimizer = AdamW, Gradient Checkpointing = True, Distributed Strategy = FSDP1 on 8 A100s.
 > - Hugging Face models start to OOM at a 4K context length, whereas Hugging Face + Liger Kernel scales up to 16K.
 
+## Optimize post-training with Liger Kernel
+
+
+
+We provide optimized post-training kernels such as DPO, ORPO, SimPO, and more, which can reduce memory usage by up to 80%. You can use them as Python modules.
+
+```python
+from liger_kernel.chunked_loss import LigerFusedLinearORPOLoss
+orpo_loss = LigerFusedLinearORPOLoss()
+y = orpo_loss(lm_head.weight, x, target)
+```
+
+
 ## Examples
 
 | **Use Case** | **Description** |
@@ -58,9 +58,9 @@ liger_kernel/transformers/trainer/__init__.py,sha256=c4OQVJmhNOloj0JYSEc0j_cQuBb
 liger_kernel/transformers/trainer/orpo_trainer.py,sha256=jko6oq_XQdBSmXubp05E-_YXOyhtB5Bj75dg5YNwOsE,7517
 liger_kernel/triton/__init__.py,sha256=yfRe0zMb47QnqjecZWG7LnanfCTzeku7SgWRAwNVmzU,101
 liger_kernel/triton/monkey_patch.py,sha256=5BcGKTtdqeYchypBIBopGIWPx1-cFALz7sOKoEsqXJ0,1584
-liger_kernel_nightly-0.5.2.
-liger_kernel_nightly-0.5.2.
-liger_kernel_nightly-0.5.2.
-liger_kernel_nightly-0.5.2.
-liger_kernel_nightly-0.5.2.
-liger_kernel_nightly-0.5.2.
+liger_kernel_nightly-0.5.2.dev20241212060541.dist-info/LICENSE,sha256=OhzLDHJ0to4a8sodVLELZiCFylZ1NAAYLs-HrjPy0ag,1312
+liger_kernel_nightly-0.5.2.dev20241212060541.dist-info/METADATA,sha256=J64c14dbQAzCW0-j89DnVcgt1VxXesKDC-szl0_2dvU,21001
+liger_kernel_nightly-0.5.2.dev20241212060541.dist-info/NOTICE,sha256=njwnoPZLh9AN8SJQzxvCGLHi-8X__AvWRze6joNXIY8,2066
+liger_kernel_nightly-0.5.2.dev20241212060541.dist-info/WHEEL,sha256=P9jw-gEje8ByB7_hXoICnHtVCrEwMQh-630tKvQWehc,91
+liger_kernel_nightly-0.5.2.dev20241212060541.dist-info/top_level.txt,sha256=2eghu4hA3LnkM7ElW92tQ8zegWKgSbeo-k-aGe1YnvY,13
+liger_kernel_nightly-0.5.2.dev20241212060541.dist-info/RECORD,,
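The RECORD lines in the hunk above follow the wheel format's `path,sha256=<digest>,<size>` layout, where the digest is URL-safe base64 with the padding stripped and the size is in bytes. As a minimal sketch of how such a line could be recomputed from a file's contents (the helper name `record_entry`, the `example.dist-info` path, and the sample bytes are assumptions for illustration; the real file contents are not shown in this diff):

```python
import base64
import hashlib


def record_entry(path: str, data: bytes) -> str:
    """Build one RECORD-style line from a file's path in the wheel and its raw bytes."""
    digest = hashlib.sha256(data).digest()
    # RECORD uses URL-safe base64 with the trailing '=' padding removed.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{path},sha256={encoded},{len(data)}"


# Hypothetical contents for illustration only; not taken from the wheel itself.
sample = b"liger_kernel\n"
entry = record_entry("example.dist-info/top_level.txt", sample)
print(entry)
```

The final `RECORD,,` entry has an empty hash and size because the file cannot contain its own digest. Since METADATA changed between the two builds, its hash and size change in RECORD too, and every `.dist-info` path changes because the directory name embeds the version.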
File without changes
File without changes
File without changes