llguidance-0.7.20.tar.gz → llguidance-0.7.22.tar.gz

This diff shows the changes between two publicly released versions of the package, as they appear in the public registry. It is provided for informational purposes only.

Files changed (174)

{llguidance-0.7.20 → llguidance-0.7.22}/CHANGELOG.md RENAMED Viewed

@@ -4,6 +4,16 @@ All notable changes to this project will be documented in this file. Dates are d
 If a release doesn't introduce any interesting changes (build fixes etc.), it's skipped.
+#### [0.7.22](https://github.com/guidance-ai/llguidance/compare/v0.7.21...0.7.22) 2025-05-21
+- Keep EOS token bytes in `TokenizerWrapper` [`#178`](https://github.com/guidance-ai/llguidance/pull/178)
+- Stop using prefix/sentinel strings for `TokenizerWrapper` [`#175`](https://github.com/guidance-ai/llguidance/pull/175)
+- avoid taking poisoned locks, see [`#174`](https://github.com/guidance-ai/llguidance/issues/174) [`d41aa9a`](https://github.com/guidance-ai/llguidance/commit/d41aa9a4427967708a951506b2bc0e395871b6c8); thanks [@g-eoj](https://github.com/g-eoj)
+#### [0.7.21](https://github.com/guidance-ai/llguidance/compare/v0.7.20...0.7.21) 2025-05-20
+- include parser state in errors [`82e34da`](https://github.com/guidance-ai/llguidance/commit/82e34da704d22f04979d8cbc54a0ac00885a277d)
+- tighten email format in JSON schema [`7454ea9`](https://github.com/guidance-ai/llguidance/commit/7454ea9df958f8bcc42e6bb986d6de397de65b3e)
 #### [0.7.20](https://github.com/guidance-ai/llguidance/compare/v0.7.19...0.7.20) 2025-05-15

{llguidance-0.7.20 → llguidance-0.7.22}/Cargo.lock RENAMED Viewed

@@ -1174,7 +1174,7 @@ checksum = "23fb14cb19457329c82206317a5663005a4d404783dc74f4252769b0d5f42856"
 [[package]]
 name = "llguidance"
-version = "0.7.20"
+version = "0.7.22"
 dependencies = [
  "anyhow",
  "derivre",
@@ -1193,7 +1193,7 @@ dependencies = [
 [[package]]
 name = "llguidance_py"
-version = "0.7.20"
+version = "0.7.22"
 dependencies = [
  "anyhow",
  "bytemuck",
@@ -2336,7 +2336,7 @@ dependencies = [
 [[package]]
 name = "toktrie"
-version = "0.7.20"
+version = "0.7.22"
 dependencies = [
  "anyhow",
  "bytemuck",
@@ -2347,7 +2347,7 @@ dependencies = [
 [[package]]
 name = "toktrie_hf_downloader"
-version = "0.7.20"
+version = "0.7.22"
 dependencies = [
  "anyhow",
  "hf-hub",
@@ -2358,7 +2358,7 @@ dependencies = [
 [[package]]
 name = "toktrie_hf_tokenizers"
-version = "0.7.20"
+version = "0.7.22"
 dependencies = [
  "anyhow",
  "log",

{llguidance-0.7.20 → llguidance-0.7.22}/PKG-INFO RENAMED Viewed

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: llguidance
-Version: 0.7.20
+Version: 0.7.22
 License-File: LICENSE
 Summary: Bindings for the Low-level Guidance (llguidance) Rust library for use within Guidance
 Author: Michal Moskal
@@ -20,6 +20,7 @@ Project-URL: issue_tracker, https://github.com/microsoft/llguidance/issues
 ---
+* 2025-05-20 LLGuidance [shipped](https://x.com/OpenAIDevs/status/1924915341052019166) in [OpenAI](https://x.com/OpenAIDevs/status/1924915343677653014) for JSON Schema
 * 2025-04-11 integration [merged](https://github.com/chromium/chromium/commit/07ca6337c2f714ba0477202414bd2b1692e70594) into Chromium
 * 2025-03-25 integration [merged](https://github.com/vllm-project/vllm/pull/14779) into vLLM (v0.8.2)
 * 2025-02-26 integration [merged](https://github.com/sgl-project/sglang/pull/3298) into SGLang (v0.4.4)
@@ -59,6 +60,7 @@ The library can be used from:
 The library is currently integrated in:
 - [Guidance](https://github.com/guidance-ai/guidance) - library for interacting with LLMs
+- [OpenAI models](https://x.com/OpenAIDevs/status/1924915343677653014) - LLGuidance powers [Structured Output](https://platform.openai.com/docs/guides/structured-outputs) (JSON Schema only)
 - [llama.cpp](https://github.com/ggerganov/llama.cpp/pull/10224) -
   available via `-DLLAMA_LLGUIDANCE=ON` option for `cmake`;
   llama.cpp can be also used Guidance Python package

{llguidance-0.7.20 → llguidance-0.7.22}/README.md RENAMED Viewed

@@ -8,6 +8,7 @@
 ---
+* 2025-05-20 LLGuidance [shipped](https://x.com/OpenAIDevs/status/1924915341052019166) in [OpenAI](https://x.com/OpenAIDevs/status/1924915343677653014) for JSON Schema
 * 2025-04-11 integration [merged](https://github.com/chromium/chromium/commit/07ca6337c2f714ba0477202414bd2b1692e70594) into Chromium
 * 2025-03-25 integration [merged](https://github.com/vllm-project/vllm/pull/14779) into vLLM (v0.8.2)
 * 2025-02-26 integration [merged](https://github.com/sgl-project/sglang/pull/3298) into SGLang (v0.4.4)
@@ -47,6 +48,7 @@ The library can be used from:
 The library is currently integrated in:
 - [Guidance](https://github.com/guidance-ai/guidance) - library for interacting with LLMs
+- [OpenAI models](https://x.com/OpenAIDevs/status/1924915343677653014) - LLGuidance powers [Structured Output](https://platform.openai.com/docs/guides/structured-outputs) (JSON Schema only)
 - [llama.cpp](https://github.com/ggerganov/llama.cpp/pull/10224) -
   available via `-DLLAMA_LLGUIDANCE=ON` option for `cmake`;
   llama.cpp can be also used Guidance Python package

{llguidance-0.7.20 → llguidance-0.7.22}/parser/Cargo.toml RENAMED Viewed

@@ -1,6 +1,6 @@
 [package]
 name = "llguidance"
-version = "0.7.20"
+version = "0.7.22"
 edition = "2021"
 license = "MIT"
 description = "Super-fast Structured Outputs"

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/constraint.rs RENAMED Viewed

@@ -137,7 +137,7 @@ impl Constraint {
     /// The splice is never returned when ff_tokens are disabled in InferenceCapabilities.
     /// After this returns, commit_token() must be called with the sampled token if any.
     pub fn compute_mask(&mut self) -> Result<&StepResult> {
-        panic_utils::catch_unwind(std::panic::AssertUnwindSafe(|| self.compute_mask_inner()))
+        self.catch_unwind(|s| s.compute_mask_inner())
             .map(|_| &self.last_res)
     }
@@ -185,6 +185,14 @@ impl Constraint {
         self.parser.validate_tokens_raw(tokens)
     }
+    fn catch_unwind<F, R>(&mut self, f: F) -> Result<R>
+    where
+        F: FnOnce(&mut Self) -> Result<R>,
+    {
+        panic_utils::catch_unwind(std::panic::AssertUnwindSafe(|| f(self)))
+            .map_err(|e| anyhow::anyhow!(self.parser.augment_err(e)))
+    }
     /// commit_token() is a top-level method in this file and is called by
     /// the LLInterpreter::commit_token().
     ///
@@ -194,9 +202,7 @@ impl Constraint {
     /// It only returns 'STOP' if previous compute_mask() already returned 'STOP'
     /// (in which case there's little point calling commit_token()).
     pub fn commit_token(&mut self, sampled_token: Option<TokenId>) -> Result<CommitResult> {
-        panic_utils::catch_unwind(std::panic::AssertUnwindSafe(|| {
-            self.commit_token_inner(sampled_token)
-        }))
+        self.catch_unwind(|s| s.commit_token_inner(sampled_token))
     }
     fn commit_token_inner(&mut self, sampled_token: Option<TokenId>) -> Result<CommitResult> {

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/earley/parser.rs RENAMED Viewed

@@ -2827,4 +2827,10 @@ impl Parser {
         copy.shared = Arc::new(Mutex::new(shared.clone()));
         copy
     }
+    pub fn test_trigger_lexer_error(&mut self) -> Result<()> {
+        self.with_shared(|_state| {
+            panic!("synthetic error");
+        })
+    }
 }

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/json/formats.rs RENAMED Viewed

@@ -27,9 +27,15 @@ pub fn lookup_format(name: &str) -> Option<&str> {
         "duration" => {
             r"P(?:(?P<dur_date>(?:(?P<dur_year>[0-9]+Y(?:[0-9]+M(?:[0-9]+D)?)?)|(?P<dur_month>[0-9]+M(?:[0-9]+D)?)|(?P<dur_day>[0-9]+D))(?:T(?:(?P<dur_hour>[0-9]+H(?:[0-9]+M(?:[0-9]+S)?)?)|(?P<dur_minute>[0-9]+M(?:[0-9]+S)?)|(?P<dur_second>[0-9]+S)))?)|(?P<dur_time>T(?:(?P<dur_hour2>[0-9]+H(?:[0-9]+M(?:[0-9]+S)?)?)|(?P<dur_minute2>[0-9]+M(?:[0-9]+S)?)|(?P<dur_second2>[0-9]+S)))|(?P<dur_week>[0-9]+W))"
         }
-        "email" => {
-            r"(?P<local_part>(?P<dot_string>[^\s@\.]+(\.[^\s@\.]+)*))@((?P<domain>(?P<sub_domain>[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?)(\.(?P<sub_domain2>[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?))*)|\[(?P<ipv4>((([0-9])|(([1-9])[0-9]|(25[0-5]|(2[0-4]|(1)[0-9])[0-9])))\.){3}(([0-9])|(([1-9])[0-9]|(25[0-5]|(2[0-4]|(1)[0-9])[0-9]))))\])"
-        }
+        // https://www.rfc-editor.org/rfc/inline-errata/rfc5321.html 4.1.2 -> Mailbox
+        "email" => concat!(
+            r"(?P<local_part>(?P<dot_string>[a-zA-Z0-9!#$%&'*+\-/=?\^_`{|}~]+(\.[a-zA-Z0-9!#$%&'*+\-/=?\^_`{|}~]+)*))",
+            r"@(",
+            r"(?P<domain>(?P<sub_domain>[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?)(\.(?P<sub_domain2>[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?))*)",
+            r"|",
+            r"\[(?P<ipv4>((([0-9])|(([1-9])[0-9]|(25[0-5]|(2[0-4]|(1)[0-9])[0-9])))\.){3}(([0-9])|(([1-9])[0-9]|(25[0-5]|(2[0-4]|(1)[0-9])[0-9]))))\]",
+            r")"
+        ),
         "hostname" => {
             r"[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(\.[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
         }
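
The effect of the tightened local part can be checked outside the library. The sketch below copies the old and new patterns from the hunk above and runs them through Python's `re` module with `fullmatch`. This is an assumption for illustration only: llguidance compiles these patterns with its own Rust regex engine, and the helper name `is_email` is hypothetical.

```python
import re

# Domain and IP-literal parts are identical in both versions (copied from the diff).
DOMAIN = r"(?P<domain>(?P<sub_domain>[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?)(\.(?P<sub_domain2>[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?))*)"
IPV4 = r"\[(?P<ipv4>((([0-9])|(([1-9])[0-9]|(25[0-5]|(2[0-4]|(1)[0-9])[0-9])))\.){3}(([0-9])|(([1-9])[0-9]|(25[0-5]|(2[0-4]|(1)[0-9])[0-9]))))\]"
TAIL = "@(" + DOMAIN + "|" + IPV4 + ")"

# 0.7.20: local part is anything except whitespace, '@' and '.'.
OLD_EMAIL = r"(?P<local_part>(?P<dot_string>[^\s@\.]+(\.[^\s@\.]+)*))" + TAIL
# 0.7.22: local part is restricted to an explicit character set.
NEW_EMAIL = (
    r"(?P<local_part>(?P<dot_string>[a-zA-Z0-9!#$%&'*+\-/=?\^_`{|}~]+"
    r"(\.[a-zA-Z0-9!#$%&'*+\-/=?\^_`{|}~]+)*))" + TAIL
)


def is_email(pattern: str, value: str) -> bool:
    """Hypothetical helper: True if the whole string matches the pattern."""
    return re.fullmatch(pattern, value) is not None


if __name__ == "__main__":
    for value in ["foo+bar@example-123.com", "test@[192.168.1.1]",
                  "test:2@example.com", "test[2]@example.com"]:
        print(value, is_email(OLD_EMAIL, value), is_email(NEW_EMAIL, value))
```

Under these assumptions, strings such as `test:2@example.com` and `test[2]@example.com` pass the old pattern but not the new one, while the ordinary addresses pass both.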

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/matcher.rs RENAMED Viewed

@@ -1,4 +1,4 @@
-use anyhow::{anyhow, ensure, Result};
+use anyhow::{anyhow, bail, ensure, Result};
 use toktrie::{SimpleVob, TokEnv, TokenId};
 use crate::{api::StopReason, earley::ParserStats, panic_utils, TokenParser};
@@ -48,8 +48,9 @@ impl Matcher {
                 match r {
                     Ok(r) => Ok(r),
                     Err(e) => {
-                        self.0 = MatcherState::Error(e.to_string());
-                        Err(e)
+                        let msg = inner.parser.augment_err(e);
+                        self.0 = MatcherState::Error(msg.clone());
+                        bail!(msg);
                     }
                 }
             }
@@ -85,6 +86,10 @@ impl Matcher {
         self.consume_tokens(&[token])
     }
+    pub fn test_trigger_lexer_error(&mut self) -> Result<()> {
+        self.with_inner(|inner| inner.parser.parser.test_trigger_lexer_error())
+    }
     pub fn rollback(&mut self, num_tokens: usize) -> Result<()> {
         self.with_inner(|inner| inner.parser.rollback(num_tokens))
     }
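
The matcher change above stores the augmented message in the error state, so later calls report the same text. A minimal Python sketch of that "sticky error" pattern follows. The class `StickyMatcher` and its toy lowercase-only grammar are hypothetical and are not the library's API.

```python
class StickyMatcher:
    """Toy matcher: accepts lowercase ASCII letters, one character per call.

    After the first failure the augmented message is stored, and every
    later call fails with exactly that message.
    """

    def __init__(self) -> None:
        self._error = None      # stored augmented message, if any
        self._consumed = []     # characters accepted so far

    def _dump_state(self) -> str:
        text = "".join(self._consumed)
        return f"Consumed: {text!r}"

    def consume(self, ch: str) -> None:
        if self._error is not None:
            # Already in the error state: repeat the stored message.
            raise ValueError(self._error)
        if not ("a" <= ch <= "z"):
            # Build the message once, with the state attached, then store it.
            msg = f"unexpected character {ch!r}\n<state>\n{self._dump_state()}\n</state>"
            self._error = msg
            raise ValueError(msg)
        self._consumed.append(ch)


if __name__ == "__main__":
    m = StickyMatcher()
    m.consume("a")
    for ch in ["1", "b"]:
        try:
            m.consume(ch)
        except ValueError as e:
            print(e)
```

Storing the already-augmented string, rather than the raw error, means the state snapshot reflects the moment of the first failure.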

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/tokenparser.rs RENAMED Viewed

@@ -1,4 +1,4 @@
-use std::{hint::black_box, panic::AssertUnwindSafe, sync::Arc, time::Duration};
+use std::{fmt::Display, hint::black_box, panic::AssertUnwindSafe, sync::Arc, time::Duration};
 use crate::{
     api::{GrammarInit, ParserLimits, StopReason},
@@ -22,6 +22,9 @@ pub struct TokenParser {
     max_step_stats: ParserStats,
     eos_token: TokenId,
+    had_rollback: bool,
+    had_backtrack: bool,
     is_accepting_cache: Option<bool>,
     ff_tokens_cache: Option<(Vec<TokenId>, Vec<u8>)>,
     stop_reason: StopReason,
@@ -110,6 +113,8 @@ impl TokenParser {
             max_tokens_total: max_tokens,
             last_bias_time: Duration::from_secs(0),
             is_fresh: true,
+            had_backtrack: false,
+            had_rollback: false,
         })
     }
@@ -268,6 +273,36 @@ impl TokenParser {
         res_prompt
     }
+    pub fn augment_err(&self, e: impl Display) -> String {
+        format!("{e}\n<state>\n{}\n</state>", self.dump_state())
+    }
+    pub fn dump_state(&self) -> String {
+        // make sure not take self.parser.shared lock
+        // for example, self.parser.lexer_stats() takes it
+        // if we take it after panic, it will be poisoned
+        format!(
+            "Tokens: {}\n{} tokens, {} bytes; grm_prefix: {:?}\nFlags:{}{}\nParser: {}\nStop: {}\nError: {}",
+            self.tok_trie().tokens_dbg(&self.llm_tokens),
+            self.llm_tokens.len(),
+            self.llm_bytes.len(),
+            String::from_utf8_lossy(&self.grm_prefix),
+            if self.had_backtrack {
+                " had_backtrack"
+            } else {
+                ""
+            },
+            if self.had_rollback {
+                " had_rollback"
+            } else {
+                ""
+            },
+            self.parser.stats(),
+            self.stop_reason,
+            self.error_message.as_deref().unwrap_or("None"),
+        )
+    }
     fn clear_caches(&mut self) {
         self.is_accepting_cache = None;
         self.ff_tokens_cache = None;
@@ -332,6 +367,8 @@ impl TokenParser {
         // this will fail in case we're in error state or not initialized
         self.check_initialized("rollback")?;
+        self.had_rollback = true;
         let new_len = self.llm_tokens.len() - n_tokens;
         let mut bytes_to_drop = 0;
         for tok in &self.llm_tokens[new_len..] {
@@ -522,6 +559,7 @@ impl TokenParser {
                 self.llm_bytes.extend_from_slice(tok_bytes);
                 if backtrack_bytes0 != 0 {
+                    self.had_backtrack = true;
                     let mut backtrack_bytes: isize = backtrack_bytes0.try_into().unwrap();
                     let mut backtrack_tokens = 0;
                     while backtrack_bytes > 0 {
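
For readers who want to see the shape of the augmented error text, here is a Python sketch of the format strings used by `augment_err` and `dump_state` above. The `ParserState` dataclass and its field names are hypothetical stand-ins for the Rust struct, the default `stop_reason` value is a placeholder, and `json.dumps` only approximates Rust's `{:?}` quoting of the grammar prefix.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParserState:
    """Hypothetical stand-in for the fields that dump_state() reads."""
    tokens_dbg: str
    n_tokens: int
    n_bytes: int
    grm_prefix: str = ""
    had_backtrack: bool = False
    had_rollback: bool = False
    parser_stats: str = ""
    stop_reason: str = "NotStopped"  # placeholder value
    error_message: Optional[str] = None

    def dump_state(self) -> str:
        # Flags are appended only when set, each with a leading space.
        flags = (" had_backtrack" if self.had_backtrack else "") + (
            " had_rollback" if self.had_rollback else ""
        )
        error = self.error_message if self.error_message is not None else "None"
        return (
            f"Tokens: {self.tokens_dbg}\n"
            f"{self.n_tokens} tokens, {self.n_bytes} bytes; "
            f"grm_prefix: {json.dumps(self.grm_prefix)}\n"
            f"Flags:{flags}\n"
            f"Parser: {self.parser_stats}\n"
            f"Stop: {self.stop_reason}\n"
            f"Error: {error}"
        )

    def augment_err(self, e: object) -> str:
        # Original message first, then the state wrapped in <state> tags.
        return f"{e}\n<state>\n{self.dump_state()}\n</state>"


if __name__ == "__main__":
    state = ParserState("⟦fo‧bar⟧", 2, 5, had_rollback=True)
    print(state.augment_err("synthetic error"))
```

The `<state>` and `</state>` delimiters are what the new tests in `test_raw_parser.rs` look for in the error string.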

{llguidance-0.7.20 → llguidance-0.7.22}/pyproject.toml RENAMED Viewed

@@ -1,6 +1,6 @@
 [project]
 name = "llguidance"
-version = "0.7.20"
+version = "0.7.22"
 description = "Bindings for the Low-level Guidance (llguidance) Rust library for use within Guidance"
 requires-python = ">=3.9"
 license = "MIT"

{llguidance-0.7.20 → llguidance-0.7.22}/python/llguidance/_tokenizer.py RENAMED Viewed

@@ -23,10 +23,6 @@ class TokenizerWrapper:
             gtokenizer(b"test")
         except:
             self._accepts_bytes = False
-        # If the tokenizer used bytes, then b"\xff" would be better (since it's invalid UTF-8)
-        # For now, we'll settle for "\x02" as assume it doesn't start any other token
-        self._prefix_string = "\x02"
-        self._prefix_tokens = self._encode_string(self._prefix_string)
     def _encode_string(self, s: str) -> List[TokenId]:
         r: List[TokenId]
@@ -37,7 +33,4 @@ class TokenizerWrapper:
         return r
     # required by LLTokenizer
-    def __call__(self, s: str) -> List[TokenId]:
-        tokens = self._encode_string(self._prefix_string + s)
-        assert tokens[: len(self._prefix_tokens)] == self._prefix_tokens
-        return tokens[len(self._prefix_tokens) :]
+    __call__ = _encode_string
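
The removed code relied on the assumption, stated in its own comment, that `"\x02"` never starts another token. The sketch below contrasts the two approaches using two toy encoders. `OldWrapper`, `NewWrapper`, `char_encode` and `pair_encode` are hypothetical names for illustration and are not the real `TokenizerWrapper`.

```python
from typing import Callable, List

Encoder = Callable[[str], List[int]]


def char_encode(s: str) -> List[int]:
    """Toy encoder: one token per character."""
    return [ord(c) for c in s]


def pair_encode(s: str) -> List[int]:
    """Toy encoder that merges characters in pairs, so a prefix can fuse with the text."""
    return [int.from_bytes(s[i:i + 2].encode("latin-1"), "big")
            for i in range(0, len(s), 2)]


class OldWrapper:
    """Mirrors the removed 0.7.20 logic: encode with a prefix, then strip it."""

    def __init__(self, encode: Encoder) -> None:
        self._encode = encode
        self._prefix_string = "\x02"
        self._prefix_tokens = self._encode_string(self._prefix_string)

    def _encode_string(self, s: str) -> List[int]:
        return list(self._encode(s))

    def __call__(self, s: str) -> List[int]:
        tokens = self._encode_string(self._prefix_string + s)
        # Fails if the tokenizer merges the prefix with the following text.
        assert tokens[: len(self._prefix_tokens)] == self._prefix_tokens
        return tokens[len(self._prefix_tokens):]


class NewWrapper:
    """Mirrors the 0.7.22 logic: calling the wrapper encodes the string directly."""

    def __init__(self, encode: Encoder) -> None:
        self._encode = encode

    def _encode_string(self, s: str) -> List[int]:
        return list(self._encode(s))

    # Class-level alias: instances become callable through _encode_string.
    __call__ = _encode_string


if __name__ == "__main__":
    print(OldWrapper(char_encode)("hi"), NewWrapper(char_encode)("hi"))
    print(NewWrapper(pair_encode)("hi"))
```

With the character-level encoder both wrappers agree. With the pair-merging encoder the old wrapper's assertion fails, because the prefix fuses with the first character, while the new wrapper simply returns the encoder's output.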

{llguidance-0.7.20 → llguidance-0.7.22}/python/llguidance/cli.py RENAMED Viewed

@@ -1,7 +1,7 @@
 import argparse
 import json
 import huggingface_hub
-from transformers import AutoTokenizer  # type: ignore[attr-defined]
+from transformers import AutoTokenizer
 import llguidance

{llguidance-0.7.20 → llguidance-0.7.22}/python/llguidance/hf.py RENAMED Viewed

@@ -13,11 +13,11 @@ def from_tokenizer(
     """
     Create a new tokenizer from a fast Hugging Face tokenizer.
     This is an expensive operation (~1s), so the result should be cached.
-    It also currently creates a non-canonical tokenizer, which means it cannot
-    produce fast-forward tokens (though it can produce fast-forward bytes).
+    It currently only supports fast tokenizers, which are then handled
+    by the Rust tokenizers library.
     Args:
-        hf_tokenizer: transformers.PreTrainedTokenizerBase - the tokenizer to wrap
+        hf_tokenizer: transformers.PreTrainedTokenizerFast - the tokenizer to wrap
         n_vocab: int - override the size of the vocabulary
         eos_token: int - override the EOS token
         slices: List[str] - configuration for slicer optimization; pass [] to disable,

{llguidance-0.7.20 → llguidance-0.7.22}/python/torch_tests/test_hf.py RENAMED Viewed

@@ -17,7 +17,7 @@ from llguidance import LLMatcher, LLTokenizer, LLExecutor
 import llguidance.hf
-from transformers import AutoTokenizer  # type: ignore[attr-defined]
+from transformers import AutoTokenizer
 def _build_tokenizer() -> LLTokenizer:

{llguidance-0.7.20 → llguidance-0.7.22}/python_ext/Cargo.toml RENAMED Viewed

@@ -1,6 +1,6 @@
 [package]
 name = "llguidance_py"
-version = "0.7.20"
+version = "0.7.22"
 edition = "2021"
 license = "MIT"
 description = "Super-fast Structured Outputs"

{llguidance-0.7.20 → llguidance-0.7.22}/python_ext/src/py.rs RENAMED Viewed

@@ -241,14 +241,7 @@ impl PyTokenizer {
             }
         }
-        // we want decode_bytes([EOS]) etc to be empty
-        tokens[tok_eos as usize] = vec![];
-        // if let Some(t) = tok_bos {
-        //     tokens[t as usize] = vec![];
-        // }
         let info = TokRxInfo::new(tokens.len() as u32, tok_eos);
         let tok_trie = TokTrie::from(&info, &tokens);
         Ok(PyTokenizer {
             tok_trie: Arc::new(tok_trie),

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/tests/test_lark.rs RENAMED Viewed

@@ -1306,3 +1306,37 @@ fn test_json_min_max_properties() {
         ],
     );
 }
+#[test]
+fn test_json_format_email() {
+    json_test_many(
+        &json!({
+            "type": "string",
+            "format": "email",
+        }),
+        &[
+            json!("test@example.com"),
+            json!("foo.bar@example.com"),
+            json!("foo.bar@example-123.com"),
+            json!("foo+bar@example-123.com"),
+            json!("f$o#o`b-a!r@example-123.com"),
+            json!("fo%o#bar@example-123.com"),
+            json!("test@[192.168.1.1]"),
+        ],
+        &[
+            json!(""),
+            json!(" @example.com"),
+            json!("test@"),
+            json!("@example.com"),
+            json!("test@.com"),
+            json!("test@com"),
+            json!("test@com."),
+            json!("test@example..com"),
+            json!("test@example.c"),
+            json!("test@example.c."),
+            json!("test@.example.com"),
+            json!("test:2@example.com"),
+            json!("test[2]@example.com"),
+        ],
+    );
+}

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/tests/test_raw_parser.rs RENAMED Viewed

@@ -3,7 +3,7 @@ use llguidance::{
     api::TopLevelGrammar,
     earley::SlicedBiasComputer,
     toktrie::{InferenceCapabilities, TokEnv},
-    ParserFactory, TokenParser,
+    Matcher, ParserFactory, TokenParser,
 };
 use serde_json::{json, Value};
@@ -207,3 +207,64 @@ fn test_ff_early() {
         parser.consume_token(*tok).unwrap();
     }
 }
+#[test]
+fn test_err_state() {
+    let lark = r#"
+        start: /[a-z]*/
+    "#;
+    let tokens = get_tok_env().tokenize("fobarbazqu123");
+    let mut t2 = vec![];
+    for _ in 0..100 {
+        t2.push(tokens[0]);
+        t2.push(tokens[1]);
+        t2.push(tokens[2]);
+    }
+    t2.extend_from_slice(&tokens);
+    let mut matcher = Matcher::new(Ok(make_parser(lark)));
+    for tok in t2.iter() {
+        if let Err(e) = matcher.consume_token(*tok) {
+            let e = e.to_string();
+            println!("Error: {}", e);
+            assert!(e.contains("<state>"));
+            assert!(e.contains("Tokens:"));
+            return;
+        }
+    }
+    unreachable!();
+}
+#[test]
+fn test_trigger_lexer_error() {
+    let lark = r#"
+        start: /[a-z]*/
+    "#;
+    let tokens = get_tok_env().tokenize("fobarbazqu");
+    let mut matcher = Matcher::new(Ok(make_parser(lark)));
+    for tok in tokens.iter() {
+        matcher.consume_token(*tok).unwrap();
+    }
+    if let Err(e) = matcher.test_trigger_lexer_error() {
+        let e = e.to_string();
+        println!("Error: {}", e);
+        assert!(e.contains("<state>"));
+        assert!(e.contains("synthetic error"));
+    } else {
+        unreachable!();
+    }
+    // now all calls should return the same error
+    if let Err(e) = matcher.consume_token(123) {
+        let e = e.to_string();
+        println!("Error: {}", e);
+        assert!(e.contains("<state>"));
+        assert!(e.contains("synthetic error"));
+    } else {
+        unreachable!();
+    }
+}

{llguidance-0.7.20 → llguidance-0.7.22}/scripts/install-deps.sh RENAMED Viewed

@@ -2,7 +2,7 @@
 # installing guidance for deps
 pip install pytest guidance huggingface_hub tokenizers jsonschema maturin[zig] \
-    torch transformers bitsandbytes ipython psutil mypy
+    torch transformers==4.52.1 bitsandbytes ipython psutil mypy
 pip uninstall -y guidance
 # print out versions

{llguidance-0.7.20 → llguidance-0.7.22}/toktrie/Cargo.toml RENAMED Viewed

@@ -1,6 +1,6 @@
 [package]
 name = "toktrie"
-version = "0.7.20"
+version = "0.7.22"
 edition = "2021"
 license = "MIT"
 description = "LLM Token Trie library"

{llguidance-0.7.20 → llguidance-0.7.22}/toktrie/src/toktree.rs RENAMED Viewed

@@ -295,26 +295,25 @@ impl TokTrie {
     }
     fn tokens_dbg_ext(&self, toks: &[u32], quote: bool) -> String {
+        // if the token list is too long, we are typically interested in the most recent ones
         let (limited, toks) = if toks.len() > Self::MAX_DBG_TOKENS {
-            (true, &toks[0..Self::MAX_DBG_TOKENS])
+            ("…", &toks[toks.len() - Self::MAX_DBG_TOKENS..])
         } else {
-            (false, toks)
+            ("", toks)
         };
-        let mut joined = toks
+        let joined = toks
             .iter()
             .map(|t| self.token_dbg_ext(*t, false))
             .collect::<Vec<_>>()
             .join("‧");
-        if limited {
-            joined.push('…');
-        }
         if quote {
-            format!("⟦{}⟧", joined)
-        } else {
+            format!("⟦{}{}⟧", limited, joined)
+        } else if limited.is_empty() {
             joined
+        } else {
+            format!("{}{}", limited, joined)
         }
     }
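
The hunk above changes which end of a long token list survives truncation: the most recent tokens are kept and the ellipsis moves to the front. A small Python sketch of the new behaviour follows. The limit of 5 and the use of `str(t)` to render a token are assumptions, since the real value of `MAX_DBG_TOKENS` and the output of `token_dbg_ext` are not shown in this diff.

```python
from typing import List

MAX_DBG_TOKENS = 5  # assumed small limit for illustration


def tokens_dbg(tokens: List[int], quote: bool = True,
               max_tokens: int = MAX_DBG_TOKENS) -> str:
    """Render a token list, keeping only the last max_tokens entries."""
    if len(tokens) > max_tokens:
        # Keep the tail; the marker shows that earlier tokens were dropped.
        marker, shown = "…", tokens[len(tokens) - max_tokens:]
    else:
        marker, shown = "", tokens
    joined = "‧".join(str(t) for t in shown)
    if quote:
        return f"⟦{marker}{joined}⟧"
    return f"{marker}{joined}"


if __name__ == "__main__":
    print(tokens_dbg(list(range(8))))
    print(tokens_dbg([1, 2], quote=False))
```

Keeping the tail suits error reports, where the tokens consumed just before a failure are usually the relevant ones.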
@@ -1037,9 +1036,12 @@ impl TrieHash {
         self.children.sort_by_key(|e| e.byte);
         for entry in &mut self.children {
             num_ch -= 1;
+            assert!(num_parents < 0xff);
             entry.serialize(data, if num_ch == 0 { num_parents + 1 } else { 1 });
         }
-        data[idx].bits2 |= ((data.len() - idx) as u32) << 8;
+        let subtree_size = data.len() - idx;
+        assert!(subtree_size < 0x100_0000);
+        data[idx].bits2 |= (subtree_size as u32) << 8;
     }
 }
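
The two new assertions guard a packed 32-bit field: the subtree size is shifted left by 8 bits, so it has to fit in 24 bits. The Python sketch below illustrates why an unchecked value would be corrupted silently. The function names are hypothetical, and treating the low 8 bits as a separate one-byte field is an assumption based on the `num_parents < 0xff` check.

```python
U32_MASK = 0xFFFF_FFFF


def pack_bits2(low_byte: int, subtree_size: int) -> int:
    """Pack an 8-bit field and a 24-bit subtree size into one 32-bit word."""
    assert 0 <= low_byte <= 0xFF            # must fit in the low 8 bits
    assert 0 <= subtree_size < 0x100_0000   # must fit in the upper 24 bits
    return ((subtree_size << 8) | low_byte) & U32_MASK


def unpack_bits2(bits2: int):
    """Inverse of pack_bits2: returns (low_byte, subtree_size)."""
    return bits2 & 0xFF, bits2 >> 8


if __name__ == "__main__":
    word = pack_bits2(3, 1000)
    print(hex(word), unpack_bits2(word))
    # Without the size check, a 32-bit shift would drop the high bits:
    print(((0x100_0000 << 8) & U32_MASK) == 0)
```

A subtree of exactly `0x100_0000` nodes would be stored as size 0 after the shift, which is why the check happens before packing.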

{llguidance-0.7.20 → llguidance-0.7.22}/toktrie_hf_downloader/Cargo.toml RENAMED Viewed

@@ -1,6 +1,6 @@
 [package]
 name = "toktrie_hf_downloader"
-version = "0.7.20"
+version = "0.7.22"
 edition = "2021"
 license = "MIT"
 description = "HuggingFace Hub download library support for toktrie and llguidance"

{llguidance-0.7.20 → llguidance-0.7.22}/toktrie_hf_tokenizers/Cargo.toml RENAMED Viewed

@@ -1,6 +1,6 @@
 [package]
 name = "toktrie_hf_tokenizers"
-version = "0.7.20"
+version = "0.7.22"
 edition = "2021"
 license = "MIT"
 description = "HuggingFace tokenizers library support for toktrie and llguidance"

{llguidance-0.7.20 → llguidance-0.7.22}/.github/workflows/rust.yml RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/.github/workflows/wheels.yml RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/.gitignore RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/CODE_OF_CONDUCT.md RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/Cargo.toml RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/LICENSE RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/SECURITY.md RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/SUPPORT.md RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/c_sample/Makefile RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/c_sample/README.md RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/c_sample/c_sample.cpp RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/docs/fast_forward.md RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/docs/json_schema.md RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/docs/mask_plot.png RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/docs/optimizations.md RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/docs/special_tokens.md RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/docs/syntax.md RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/docs/toktrie.md RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/json_stats/Cargo.toml RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/json_stats/expected_maskbench.json RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/json_stats/jstats.sh RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/json_stats/scripts/split-stats.sh RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/json_stats/scripts/split_plot.py RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/json_stats/src/json_stats.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/json_stats/src/lib.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/json_stats/src/stats.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/LICENSE RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/README.md RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/build.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/cbindgen.toml RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/grammars/character.json RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/grammars/json.json RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/llguidance.h RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/api.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/earley/from_guidance.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/earley/grammar.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/earley/lexer.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/earley/lexerspec.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/earley/mod.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/earley/perf.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/earley/regexvec.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/earley/slicer.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/factory.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/ffi.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/ffi_par.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/grammar_builder.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/json/README.md RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/json/compiler.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/json/context_ref.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/json/context_simple/context.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/json/context_simple/draft.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/json/context_simple/mod.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/json/mod.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/json/numeric.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/json/schema.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/json/shared_context.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/json_validation.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/lark/README.md RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/lark/ast.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/lark/common.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/lark/compiler.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/lark/lexer.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/lark/mod.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/lark/parser.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/lib.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/logging.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/output.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/panic_utils.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/regex_rewrite.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/stop_controller.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/substring.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/parser/src/tokenizer_json.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/plan.md RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/python/llguidance/__init__.py RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/python/llguidance/_grammar_from.py RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/python/llguidance/_lib.pyi RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/python/llguidance/_struct_tag.py RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/python/llguidance/_util.py RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/python/llguidance/gbnf_to_lark.py RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/python/llguidance/mlx.py RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/python/llguidance/numpy.py RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/python/llguidance/py.typed RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/python/llguidance/torch.py RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/python/mypy.ini RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/python/torch_tests/__init__.py RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/python/torch_tests/test_bitmask.py RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/python/torch_tests/test_matcher.py RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/python_ext/src/lib.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/python_ext/src/llinterpreter.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/python_ext/src/llmatcher.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/python_ext/src/parserlimits.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/python_ext/src/pyjson.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/Cargo.toml RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/README.md RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/cli.sh RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/data/blog.sample.json RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/data/blog.schema.json RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/data/blog.schema.ll.json RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/data/from-llama.cpp/README.md RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/data/from-llama.cpp/arithmetic.gbnf RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/data/from-llama.cpp/c.gbnf RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/data/from-llama.cpp/chess.gbnf RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/data/from-llama.cpp/english.gbnf RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/data/from-llama.cpp/japanese.gbnf RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/data/from-llama.cpp/json.gbnf RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/data/from-llama.cpp/json_arr.gbnf RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/data/from-llama.cpp/list.gbnf RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/data/from-llama.cpp/vllm-sql.gbnf RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/data/lark.lark RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/data/rfc.lark RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/data/rfc.xml RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/data/ulysses.md RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/gtest.sh RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/lark.sh RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/run.sh RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/src/lib.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/src/minimal.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/src/sample_parser.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/tests/test_ll.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/sample_parser/tests/test_stop.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/scripts/annotate_asm.js RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/scripts/bump.py RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/scripts/cbindgen.sh RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/scripts/checklinks.py RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/scripts/checklinks.sh RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/scripts/ci-publish.py RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/scripts/disasm.sh RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/scripts/gbnf_to_lark.py RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/scripts/gen-testcase.py RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/scripts/git-version.sh RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/scripts/jsonschema-stats.js RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/scripts/remote-guidance-test.sh RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/scripts/rust-size.js RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/scripts/rust_size.py RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/scripts/test-guidance.sh RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/scripts/tokenizer_test.py RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/scripts/update-git.py RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/toktrie/LICENSE RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/toktrie/README.md RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/toktrie/src/bytes.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/toktrie/src/lib.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/toktrie/src/recognizer.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/toktrie/src/rng.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/toktrie/src/svob.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/toktrie/src/tokenv.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/toktrie/tests/test_svob.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/toktrie_hf_downloader/LICENSE RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/toktrie_hf_downloader/src/lib.rs RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/toktrie_hf_tokenizers/LICENSE RENAMED Viewed

File without changes

{llguidance-0.7.20 → llguidance-0.7.22}/toktrie_hf_tokenizers/src/lib.rs RENAMED Viewed

File without changes
