PyPI - llguidance - Versions diffs - 0.7.27__tar.gz → 0.7.30__tar.gz - Mend

llguidance 0.7.27.tar.gz → 0.7.30.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (184)

{llguidance-0.7.27 → llguidance-0.7.30}/CHANGELOG.md RENAMED

@@ -4,6 +4,19 @@ All notable changes to this project will be documented in this file. Dates are d
 If a release doesn't introduce any interesting changes (build fixes etc.), it's skipped.
+#### [0.7.30](https://github.com/guidance-ai/llguidance/compare/v0.7.29...0.7.30) 2025-06-23
+- parametric grammars [`#192`](https://github.com/guidance-ai/llguidance/pull/192)
+- allow for tokens up to ~2k bytes; fixes #188 [`#188`](https://github.com/guidance-ai/llguidance/issues/188)
+#### [0.7.29](https://github.com/guidance-ai/llguidance/compare/v0.7.28...0.7.29) 2025-06-06
+- cargo fmt
+#### [0.7.28](https://github.com/guidance-ai/llguidance/compare/v0.7.27...0.7.28) 2025-06-06
+- fix lexer_stack=... panic with numeric tokens [`4e91b0f`](https://github.com/guidance-ai/llguidance/commit/4e91b0fa0c03572a5fc221ac0e0b05035af9dcfa)
 #### [0.7.27](https://github.com/guidance-ai/llguidance/compare/v0.7.26...0.7.27) 2025-06-04
 - add toktrie_tiktoken and llguidance.tiktoken.lltokenizer_from_encoding [`#154`](https://github.com/guidance-ai/llguidance/issues/154)

{llguidance-0.7.27 → llguidance-0.7.30}/Cargo.lock RENAMED

@@ -1211,7 +1211,7 @@ checksum = "23fb14cb19457329c82206317a5663005a4d404783dc74f4252769b0d5f42856"
 [[package]]
 name = "llguidance"
-version = "0.7.27"
+version = "0.7.30"
 dependencies = [
  "anyhow",
  "derivre",
@@ -1230,7 +1230,7 @@ dependencies = [
 [[package]]
 name = "llguidance_py"
-version = "0.7.27"
+version = "0.7.30"
 dependencies = [
  "anyhow",
  "bytemuck",
@@ -2395,7 +2395,7 @@ dependencies = [
 [[package]]
 name = "toktrie"
-version = "0.7.27"
+version = "0.7.30"
 dependencies = [
  "anyhow",
  "bytemuck",
@@ -2406,7 +2406,7 @@ dependencies = [
 [[package]]
 name = "toktrie_hf_downloader"
-version = "0.7.27"
+version = "0.7.30"
 dependencies = [
  "anyhow",
  "hf-hub",
@@ -2417,7 +2417,7 @@ dependencies = [
 [[package]]
 name = "toktrie_hf_tokenizers"
-version = "0.7.27"
+version = "0.7.30"
 dependencies = [
  "anyhow",
  "log",
@@ -2429,7 +2429,7 @@ dependencies = [
 [[package]]
 name = "toktrie_tiktoken"
-version = "0.7.27"
+version = "0.7.30"
 dependencies = [
  "anyhow",
  "log",

{llguidance-0.7.27 → llguidance-0.7.30}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: llguidance
-Version: 0.7.27
+Version: 0.7.30
 License-File: LICENSE
 Summary: Bindings for the Low-level Guidance (llguidance) Rust library for use within Guidance
 Author: Michal Moskal
@@ -20,6 +20,7 @@ Project-URL: issue_tracker, https://github.com/microsoft/llguidance/issues
 ---
+* 2025-06-11 [Making Structured Outputs Go Brrr](https://guidance-ai.github.io/llguidance/llg-go-brrr) blog post released
 * 2025-05-20 LLGuidance [shipped](https://x.com/OpenAIDevs/status/1924915341052019166) in [OpenAI](https://x.com/OpenAIDevs/status/1924915343677653014) for JSON Schema
 * 2025-04-11 integration [merged](https://github.com/chromium/chromium/commit/07ca6337c2f714ba0477202414bd2b1692e70594) into Chromium
 * 2025-03-25 integration [merged](https://github.com/vllm-project/vllm/pull/14779) into vLLM (v0.8.2)
@@ -76,6 +77,9 @@ The library is currently integrated in:
 ## Technical details
+See [Making Structured Outputs Go Brrr](https://guidance-ai.github.io/llguidance/llg-go-brrr) for an overview of the library,
+including the design decisions, performance, and how it compares to other approaches.
 Given a context-free grammar, a tokenizer, and a prefix of tokens, llguidance computes a token mask - a set of tokens from the tokenizer - that, when added to the current token prefix, can lead to a valid string in the language defined by the grammar. Mask computation takes approximately 50μs of single-core CPU time for a tokenizer with 128k tokens. While this timing depends on the exact grammar, it holds, for example, for grammars derived from JSON schemas. There is no significant startup cost.
 The library implements a context-free grammar parser using Earley’s algorithm on top of a lexer based on [derivatives of regular expressions](https://github.com/microsoft/derivre). Mask computation is achieved by traversing the [prefix tree (trie)](./docs/toktrie.md) of all possible tokens, leveraging [highly optimized](./docs/optimizations.md) code.

{llguidance-0.7.27 → llguidance-0.7.30}/README.md RENAMED

@@ -8,6 +8,7 @@
 ---
+* 2025-06-11 [Making Structured Outputs Go Brrr](https://guidance-ai.github.io/llguidance/llg-go-brrr) blog post released
 * 2025-05-20 LLGuidance [shipped](https://x.com/OpenAIDevs/status/1924915341052019166) in [OpenAI](https://x.com/OpenAIDevs/status/1924915343677653014) for JSON Schema
 * 2025-04-11 integration [merged](https://github.com/chromium/chromium/commit/07ca6337c2f714ba0477202414bd2b1692e70594) into Chromium
 * 2025-03-25 integration [merged](https://github.com/vllm-project/vllm/pull/14779) into vLLM (v0.8.2)
@@ -64,6 +65,9 @@ The library is currently integrated in:
 ## Technical details
+See [Making Structured Outputs Go Brrr](https://guidance-ai.github.io/llguidance/llg-go-brrr) for an overview of the library,
+including the design decisions, performance, and how it compares to other approaches.
 Given a context-free grammar, a tokenizer, and a prefix of tokens, llguidance computes a token mask - a set of tokens from the tokenizer - that, when added to the current token prefix, can lead to a valid string in the language defined by the grammar. Mask computation takes approximately 50μs of single-core CPU time for a tokenizer with 128k tokens. While this timing depends on the exact grammar, it holds, for example, for grammars derived from JSON schemas. There is no significant startup cost.
 The library implements a context-free grammar parser using Earley’s algorithm on top of a lexer based on [derivatives of regular expressions](https://github.com/microsoft/derivre). Mask computation is achieved by traversing the [prefix tree (trie)](./docs/toktrie.md) of all possible tokens, leveraging [highly optimized](./docs/optimizations.md) code.
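
Reviewer's sketch of what a "token mask" means, in the sense of the Technical details paragraph above. This is a brute-force illustration, not llguidance's method: the library traverses a token trie with an Earley parser and a derivative-based lexer, whereas this toy checks every token against a finite language. The names `token_mask`, `vocab`, and `language` are hypothetical and do not come from the package.

```python
def token_mask(vocab, prefix, language):
    """Return ids of tokens that keep `prefix` extendable to a string in `language`.

    `language` is a finite set of valid strings standing in for a grammar.
    """
    mask = set()
    for token_id, text in enumerate(vocab):
        candidate = prefix + text
        # A token is allowed if some valid string starts with prefix + token.
        if any(s.startswith(candidate) for s in language):
            mask.add(token_id)
    return mask


vocab = ["a", "b", "ab", "ba", "c"]
language = {"abc", "bac"}

print(token_mask(vocab, "", language))    # ids of "a", "b", "ab", "ba"
print(token_mask(vocab, "a", language))   # only "b" keeps "abc" reachable
print(token_mask(vocab, "ab", language))  # only "c" completes "abc"
```

The cost of this naive version grows with vocabulary size times language size, which is exactly what the trie traversal described above is meant to avoid.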

llguidance-0.7.30/docs/parametric.md ADDED

@@ -0,0 +1,134 @@
+# Parametric grammars
+In llguidance, [grammar rules](./syntax.md) can be parameterized by a 64-bit integer.
+This allows for expressing concepts like "permutation of N elements", "unique selection of N elements",
+and other combinatorial structures in a concise way.
+The parameterized grammars are technically still context-free, just very large:
+each parameterized rule is treated as if it were expanded for each 64-bit integer value.
+Of course, the grammar is only materialized lazily, during Earley parsing.
+For example, this grammar describes permutations of 3 elements `a`, `b`, and `c`:
+```lark
+start    :  perm::0x0
+perm::_  :  ""                       %if is_ones([0:3])
+         |  "a" perm::set_bit(0)     %if bit_clear(0)
+         |  "b" perm::set_bit(1)     %if bit_clear(1)
+         |  "c" perm::set_bit(2)     %if bit_clear(2)
+```
+The `start` rule starts with an empty set of bits (`0x0`), and the `perm` rule expands to either an empty string
+(if all bits are set, i.e., all elements have been seen)
+or a choice of one of the remaining elements followed by a recursive call to `perm`
+with the corresponding bit set in the parameter (using `set_bit(k)`).
+Think of the `perm::_ : ...` syntax as:
+```lark
+perm(p)  :  ""                       %if p[0:3] == 0b111
+         |  "a" perm(p | (1 << 0))   %if p[0:1] == 0b0
+         |  "b" perm(p | (1 << 1))   %if p[1:2] == 0b0
+         |  "c" perm(p | (1 << 2))   %if p[2:3] == 0b0
+```
+Here `p[x:y]` is the bit range from `x` inclusive to `y` exclusive in the parameter `p`, that is, `(p >> x) & ((1 << (y - x)) - 1)`.
+Currently, there is always a single parameter for each rule, and it is always a 64-bit integer.
+## Function reference
+The following functions are available in rule parameters. Assume the current parameter is `p`,
+`v` is a 64-bit integer literal using decimal or hexadecimal notation,
+`k`, `x`, and `y` are bit indices (0-based).
+Additionally, `_` can be used to refer to `[0:64]`.
+- `_ => p` (self-reference)
+- `set_bit(k) => p | (1 << k)` sets the k-th bit in the parameter
+- `clear_bit(k) => p & ~(1 << k)` clears the k-th bit in the parameter
+- `bit_and(v) => p & v`
+- `bit_or(v) => p | v`
+- `incr([x:y]) => p[x:y] == 0b11...1 ? p : p + (1 << x)` - saturating increment of bits in the range `[x:y]`
+- `decr([x:y]) => p[x:y] == 0 ? p : p - (1 << x)` - saturating decrement of bits in the range `[x:y]`
+The following functions are available in rule conditions (`c` is a condition expression).
+All comparisons treat integers as unsigned.
+- `true` and `true()` (always true)
+- `bit_clear(k) => p[k:k+1] == 0` (checks if the k-th bit is clear)
+- `bit_set(k) => p[k:k+1] == 1` (checks if the k-th bit is set)
+- `is_ones([x:y]) => p[x:y] == ((1 << (y - x)) - 1)` (checks if all bits in the range `[x:y]` are set)
+- `is_zeros([x:y]) => p[x:y] == 0` (checks if all bits in the range `[x:y]` are clear)
+- `eq([x:y], v) => p[x:y] == v` (checks if bits in the range `[x:y]` are equal to `v`)
+- `ne([x:y], v) => p[x:y] != v`
+- `lt([x:y], v) => p[x:y] < v`
+- `le([x:y], v) => p[x:y] <= v`
+- `gt([x:y], v) => p[x:y] > v`
+- `ge([x:y], v) => p[x:y] >= v`
+- `bit_count_eq([x:y], k) => bin(p[x:y]).count('1') == k` (checks if the number of set bits in the range `[x:y]` is equal to `k`)
+- `bit_count_ne([x:y], k) => bin(p[x:y]).count('1') != k`
+- `bit_count_lt([x:y], k) => bin(p[x:y]).count('1') < k`
+- `bit_count_le([x:y], k) => bin(p[x:y]).count('1') <= k`
+- `bit_count_gt([x:y], k) => bin(p[x:y]).count('1') > k`
+- `bit_count_ge([x:y], k) => bin(p[x:y]).count('1') >= k`
+- `and(c, c)` (logical AND of two conditions)
+- `or(c, c)` (logical OR of two conditions)
+- `not(c)` (logical negation of a condition)
+## Examples
+Any sequence of `a`, `b`, and `c` where each element occurs at least once:
+```lark
+start    :  perm::0x0
+perm::_  :  ""                       %if is_ones([0:3])
+         |  "a" perm::set_bit(0)
+         |  "b" perm::set_bit(1)
+         |  "c" perm::set_bit(2)
+```
+A sequence `s` matching `/a*b*/` where `len(s) <= 20`:
+```lark
+start  : aa::0
+aa::_  : "a" aa::incr(_)    %if lt(_, 20)
+       | bb::_
+bb::_  : "b" bb::incr(_)    %if lt(_, 20)
+       | ""
+```
+```
+A sequence of `a`, `b`, and `c` in any order,
+where `a` and `b` can occur at most 5 times each, and `c` at most 6 times.
+Note that you have to allocate enough bits for each element.
+```lark
+start  : lst::0x0
+lst::_ : "a" lst::incr([0:3])  %if lt([0:3], 5)
+       | "b" lst::incr([3:6])  %if lt([3:6], 5)
+       | "c" lst::incr([6:9])  %if lt([6:9], 6)
+       | ""
+```
+Pick at least 1 and at most 3 elements from `a`, `b`, `c`, `d`, `e`;
+each element can occur at most once.
+```lark
+start    :  perm::0x0
+perm::_  :  ""                       %if bit_count_ge(_, 1)
+         |  "a" perm::set_bit(0)     %if and(bit_clear(0), bit_count_lt(_, 3))
+         |  "b" perm::set_bit(1)     %if and(bit_clear(1), bit_count_lt(_, 3))
+         |  "c" perm::set_bit(2)     %if and(bit_clear(2), bit_count_lt(_, 3))
+         |  "d" perm::set_bit(3)     %if and(bit_clear(3), bit_count_lt(_, 3))
+         |  "e" perm::set_bit(4)     %if and(bit_clear(4), bit_count_lt(_, 3))
+```
+## Performance considerations
+All the rules above are right-recursive, which is generally [not ideal](./syntax.md#recursive-rules) for Earley parsing.
+The problem is, for a list of length `N` it will generate `O(N^2)` items during parsing
+(for item number `i`, it will generate about `i` items).
+However, if you were to make them left-recursive, it may generate `O(2^K)` items
+where `K` is the number of bits used, so do not do that.
+Practically, this means the rules will not work for lists longer than about 2000 elements.
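
Reviewer's sketch: the parameter functions and conditions documented in `docs/parametric.md` above can be modeled in plain Python to check how a grammar behaves. This is not llguidance code. It takes `p[x:y]` to be the `(y - x)`-bit field starting at bit `x`, which is what the `is_ones` definition implies, and the helper names `bits` and `perm` are hypothetical. `perm` enumerates the strings of the 3-element permutation grammar by direct recursion rather than by Earley parsing.

```python
MASK64 = (1 << 64) - 1  # parameters are 64-bit unsigned integers


def bits(p, x, y):
    """p[x:y]: the field from bit x (inclusive) to bit y (exclusive)."""
    return (p >> x) & ((1 << (y - x)) - 1)


def set_bit(p, k):
    return (p | (1 << k)) & MASK64


def incr(p, x, y):
    """Saturating increment of the field p[x:y]."""
    all_ones = (1 << (y - x)) - 1
    return p if bits(p, x, y) == all_ones else p + (1 << x)


def is_ones(p, x, y):
    return bits(p, x, y) == (1 << (y - x)) - 1


def bit_clear(p, k):
    return bits(p, k, k + 1) == 0


def perm(p, elements):
    """Yield every string derivable from perm::p in the permutation grammar."""
    if is_ones(p, 0, len(elements)):
        yield ""  # all elements seen: the empty alternative applies
    for k, ch in enumerate(elements):
        if bit_clear(p, k):  # element k not used yet
            for rest in perm(set_bit(p, k), elements):
                yield ch + rest


print(sorted(perm(0, "abc")))  # the 6 permutations of a, b, c

# Counter fields as in the "at most 5 times" example: bits [3:6] count "b".
p = incr(incr(0, 3, 6), 3, 6)
print(bits(p, 0, 3), bits(p, 3, 6))  # the [0:3] field is untouched
```

The recursion depth equals the number of elements here, so this toy says nothing about the Earley item counts discussed under "Performance considerations"; it only mirrors the bit arithmetic.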

{llguidance-0.7.27 → llguidance-0.7.30}/docs/syntax.md RENAMED

@@ -105,20 +105,34 @@ with a definition of a Python string, depending on how the model was trained.
 ### Reasoning/thinking
 Yet another example is "thinking" or reasoning models distilled from DeepSeek-R1.
-A grammar for forcing JSON may look like this:
+A grammar for forcing a JSON-formatted address may look like this:
 ```lark
-start: <think> "\n" /(.|\n)*/ </think> json
-json: %json { ... }
+start: <think> "\n" /(.|\n)*/ </think> address
+address: %json {
+    "type": "object",
+    "properties": {
+        "street": { "type": "string" },
+        "city": { "type": "string" },
+        "zip": { "type": "number" }
+    },
+    "required": ["street", "city", "zip"]
+}
 ```
 Often, the chat format already includes initial `<think>\n` - in these cases
-you can use `start: /(.|\n)*/ </think> json` as the grammar.
+you can use `start: /(.|\n)*/ </think> address` as the grammar.
 You can also use `/(.|\n){1000,3000}/` to place lower and upper bounds on the thinking amount.
 This assumes `<think>` is a special token. If it was just a string, you would need
 to use [`suffix="</think>"`](#lazy-lexemes).
+### Parametric grammars (unique lists, permutations etc.)
+Rules can be parameterized by a 64-bit value, which allows for expressing
+combinatorial structures like unique lists and permutations.
+See [Parametric grammars](./parametric.md) for details.
 ### Lexeme options
 Some of these features (especially `stop`) are primarily for compatibility with [Guidance](https://github.com/guidance-ai/guidance).

{llguidance-0.7.27 → llguidance-0.7.30}/parser/Cargo.toml RENAMED

@@ -1,6 +1,6 @@
 [package]
 name = "llguidance"
-version = "0.7.27"
+version = "0.7.30"
 edition = "2021"
 license = "MIT"
 description = "Super-fast Structured Outputs"

{llguidance-0.7.27 → llguidance-0.7.30}/parser/src/earley/from_guidance.rs RENAMED

@@ -182,9 +182,9 @@ impl GrammarInit {
         extra_lexemes: Vec<String>,
     ) -> Result<Arc<CGrammar>> {
         let t0 = Instant::now();
-        let (grammar, mut lexer_spec) = self.to_internal(tok_env, limits)?;
+        let (grammar, mut lexer_spec) = self.to_internal(tok_env, limits.clone())?;
         lexer_spec.add_extra_lexemes(&extra_lexemes);
-        compile_grammar(t0, grammar, lexer_spec, logger)
+        compile_grammar(t0, grammar, lexer_spec, logger, &limits)
     }
 }
@@ -193,6 +193,7 @@ fn compile_grammar(
     mut grammar: Grammar,
     lexer_spec: LexerSpec,
     logger: &mut Logger,
+    limits: &ParserLimits,
 ) -> Result<Arc<CGrammar>> {
     let log_grammar = logger.level_enabled(3) || (logger.level_enabled(2) && grammar.is_small());
     if log_grammar {
@@ -226,7 +227,7 @@ fn compile_grammar(
         writeln!(logger.info_logger(), "  ==> {}", grammar.stats()).unwrap();
     }
-    let grammars = Arc::new(grammar.compile(lexer_spec));
+    let grammars = Arc::new(grammar.compile(lexer_spec, limits)?);
     loginfo!(
         logger,
