red-candle 1.0.0 → 1.0.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/README.md +77 -14
- data/lib/candle/version.rb +1 -1
- metadata +3 -3
checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 52c635f005d25a305f99781763a4a3cc03f85fc5b74f0e576e51973ef8306fac
+  data.tar.gz: 1a0ac260a3803f1920ba2d9f71ec361013ae1eb99cf2caed62c0e9aecc583e96
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: d301e6ed0fe8ac144c0735288c687f5dd74e7967dbe5d357e550ca5d467f6a33017b2bd9e7f46081711b6bf13555caa3e044183cd74cfaa89151e15c8cdb04a4
+  data.tar.gz: d296c35002b6d0ed919176375e5cc5d93c70fae0c0ae9a02d5cf86b8a4a49a67898c7fbc96e16350bba6792b18792126c976de156fb854b9b4f3260fa052cd79
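As an aside for readers who want to check a downloaded gem against the digests above: a `.gem` file is a plain tar archive containing `metadata.gz` and `data.tar.gz`, so the published SHA256 values can be verified with Ruby's standard library alone. A minimal sketch, assuming `tar -xf red-candle-1.0.1.gem` has already been run in the current directory:

```ruby
require "digest"

# SHA256 of data.tar.gz as published in checksums.yaml above
expected = "1a0ac260a3803f1920ba2d9f71ec361013ae1eb99cf2caed62c0e9aecc583e96"
actual   = Digest::SHA256.file("data.tar.gz").hexdigest

puts(actual == expected ? "data.tar.gz: checksum OK" : "data.tar.gz: CHECKSUM MISMATCH")
```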
data/README.md CHANGED

@@ -1,9 +1,66 @@
-# red-candle
+# `red-candle` Native LLMs for Ruby 🚀
 
 [![build](https://github.com/assaydepot/red-candle/actions/workflows/build.yml/badge.svg)](https://github.com/assaydepot/red-candle/actions/workflows/build.yml)
 [![Gem Version](https://badge.fury.io/rb/red-candle.svg)](https://badge.fury.io/rb/red-candle)
 
-
+Run state-of-the-art **language models directly from Ruby**. No Python, no APIs, no external services - just Ruby with blazing-fast Rust under the hood. Hardware accelerated with **Metal (Mac)** and **CUDA (NVIDIA)**.
+
+## Install & Chat in 30 Seconds
+
+[Video: Install & Chat in 30 Seconds](https://www.youtube.com/watch?v=hbyFCyh8esk)
+
+```bash
+# Install the gem
+gem install red-candle
+```
+
+```ruby
+require 'candle'
+
+# Download a model (one-time, ~650MB) - Mistral, Llama3, Gemma all work!
+llm = Candle::LLM.from_pretrained("TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF",
+                                  gguf_file: "tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf")
+
+# Chat with it - no API calls, running locally in your Ruby process!
+messages = [
+  { role: "user", content: "Explain Ruby in one sentence" }
+]
+
+puts llm.chat(messages)
+# => "Ruby is a dynamic, object-oriented programming language known for its
+#     simplicity, elegance, and productivity, often used for web development
+#     with frameworks like Rails."
+```
+
+## What Just Happened?
+
+You just ran a 1.1-billion-parameter AI model inside Ruby. The model lives in your process memory, runs on your hardware (CPU/GPU), and responds instantly without network latency.
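The hardware point above has a concrete handle in the gem's API: a later hunk in this diff passes a `device:` keyword to `Candle::LLM.from_pretrained`. A minimal sketch of explicit device selection follows; the `Candle::Device` constructor names are an assumption here, not something shown in this diff:

```ruby
require 'candle'

# Assumed constructors - pick the one that matches your hardware.
device = Candle::Device.cpu     # portable default
# device = Candle::Device.metal # Apple Silicon GPU
# device = Candle::Device.cuda  # NVIDIA GPU

# The device: keyword itself does appear later in this diff.
llm = Candle::LLM.from_pretrained("TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF",
                                  device: device,
                                  gguf_file: "tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf")
```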
+
+## Stream Responses Like a Pro
+
+```ruby
+# Watch the AI think in real-time
+llm.chat_stream(messages) do |token|
+  print token
+end
+```
+
+## Why This Matters
+
+- **Privacy**: Your data never leaves your machine
+- **Speed**: No network overhead, direct memory access
+- **Control**: Fine-tune generation parameters, access raw tokens
+- **Integration**: It's just Ruby objects - use it anywhere Ruby runs
+
+## Supports
+
+- **Tokenizers**: Access the tokenizer directly
+- **EmbeddingModel**: Generate embeddings for text
+- **Reranker**: Rerank documents based on relevance
+- **NER**: Named Entity Recognition directly from Ruby
+- **LLM**: Chat with Large Language Models (e.g., Llama, Mistral, Gemma)
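To make the feature list above concrete, here is a rough sketch of how these pieces might be driven. Only `Candle::LLM.from_pretrained` is confirmed by this diff; every other class name, method name, and model ID below is an assumption for illustration, and the gem's full Usage section (largely unchanged in this release) is authoritative:

```ruby
require 'candle'

# Embeddings - class, method, and model names are assumptions
embedder = Candle::EmbeddingModel.from_pretrained("jinaai/jina-embeddings-v2-base-en")
vector   = embedder.embedding("Red is the warmest color")

# Reranking candidate documents against a query - assumed API
reranker = Candle::Reranker.from_pretrained("cross-encoder/ms-marco-MiniLM-L-12-v2")
ranked   = reranker.rerank("What is Ruby?", ["Ruby is a language.", "Rubies are gems."])

# Named Entity Recognition - assumed API
ner      = Candle::NER.from_pretrained("Babelscape/wikineural-multilingual-ner")
entities = ner.extract_entities("Matz created Ruby in Japan.")
```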
+
+----
 
 ## Usage
 
@@ -671,7 +728,7 @@ All NER methods return entities in a consistent format:
 
 ## Common Runtime Errors
 
-###
+### Weight is negative, too large or not a valid number
 
 **Error:**
 ```
@@ -688,13 +745,12 @@ All NER methods return entities in a consistent format:
 - Q3_K_M (3-bit) - Minimum recommended quantization
 
 ```ruby
-# Instead of Q2_K:
 llm = Candle::LLM.from_pretrained("TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF",
                                   device: device,
                                   gguf_file: "tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf")
 ```
 
-###
+### Cannot find tensor model.embed_tokens.weight
 
 **Error:**
 ```
@@ -713,7 +769,7 @@ Failed to load quantized model: cannot find tensor model.embed_tokens.weight (Ru
 ```
 3. If the error persists, the GGUF file may use an unsupported architecture or format
 
-###
+### No GGUF file found in repository
 
 **Error:**
 ```
@@ -730,7 +786,7 @@ llm = Candle::LLM.from_pretrained("TheBloke/Llama-2-7B-Chat-GGUF",
                                   gguf_file: "llama-2-7b-chat.Q4_K_M.gguf")
 ```
 
-###
+### Failed to download tokenizer
 
 **Error:**
 ```
@@ -741,7 +797,7 @@ Failed to load quantized model: Failed to download tokenizer: request error: HTT
 
 **Solution:** The code now includes fallback tokenizer loading. If you still encounter this error, ensure you're using the latest version of red-candle.
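If the built-in fallback still fails, one workaround worth sketching is loading the tokenizer from the original (non-GGUF) model repository, which usually does serve a `tokenizer.json`. The `tokenizer:` option below is a hypothetical parameter, not confirmed anywhere in this diff:

```ruby
require 'candle'

# Hypothetical tokenizer: override pointing at the upstream model repo.
llm = Candle::LLM.from_pretrained("TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF",
                                  gguf_file: "tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf",
                                  tokenizer: "TinyLlama/TinyLlama-1.1B-Chat-v1.0")
```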
 
-###
+### Missing metadata in GGUF file
 
 **Error:**
 ```
@@ -770,17 +826,24 @@ Failed to load GGUF model: cannot find llama.attention.head_count in metadata (R
 FORK IT!
 
 ```
-git clone https://github.com/
+git clone https://github.com/assaydepot/red-candle
 cd red-candle
 bundle
 bundle exec rake compile
 ```
 
-Implemented with [Magnus](https://github.com/matsadler/magnus), with reference to [Polars Ruby](https://github.com/ankane/polars-ruby)
-
 Pull requests are welcome.
 
-
+## Release
+
+1. Update version number in `lib/candle/version.rb` and commit.
+2. `bundle exec rake build`
+3. `git tag VERSION_NUMBER`
+4. `git push --follow-tags`
+5. `gem push pkg/red-candle-1.0.0.gem`
+
+## See Also
 
-- [
-- [
+- [Candle](https://github.com/huggingface/candle)
+- [Magnus](https://github.com/matsadler/magnus)
+- [Outlines-core](https://github.com/dottxt-ai/outlines-core)
data/lib/candle/version.rb CHANGED
metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: red-candle
 version: !ruby/object:Gem::Version
-  version: 1.0.0
+  version: 1.0.1
 platform: ruby
 authors:
 - Christopher Petersen
@@ -216,9 +216,9 @@ required_rubygems_version: !ruby/object:Gem::Requirement
 - !ruby/object:Gem::Version
   version: 3.3.26
 requirements:
-- Rust >= 1.
+- Rust >= 1.65
 rubygems_version: 3.5.3
 signing_key:
 specification_version: 4
-summary: huggingface/candle for
+summary: huggingface/candle for Ruby
 test_files: []