node-llama-cpp 2.8.11 → 2.8.12
- package/llama/addon.cpp +0 -1
- package/llama/binariesGithubRelease.json +1 -1
- package/llama/gitRelease.bundle +0 -0
- package/llama/grammars/README.md +48 -5
- package/llama/grammars/json.gbnf +3 -3
- package/llama/grammars/json_arr.gbnf +3 -3
- package/llamaBins/linux-arm64/llama-addon.node +0 -0
- package/llamaBins/linux-armv7l/llama-addon.node +0 -0
- package/llamaBins/linux-x64/llama-addon.node +0 -0
- package/llamaBins/mac-arm64/default.metallib +0 -0
- package/llamaBins/mac-arm64/llama-addon.node +0 -0
- package/llamaBins/mac-x64/default.metallib +0 -0
- package/llamaBins/mac-x64/llama-addon.node +0 -0
- package/llamaBins/win-x64/llama-addon.exp +0 -0
- package/llamaBins/win-x64/llama-addon.lib +0 -0
- package/llamaBins/win-x64/llama-addon.node +0 -0
- package/package.json +1 -1
package/llama/addon.cpp
CHANGED
package/llama/gitRelease.bundle
CHANGED
Binary file
package/llama/grammars/README.md
CHANGED
@@ -59,9 +59,13 @@ Parentheses `()` can be used to group sequences, which allows for embedding alte
 
 ## Repetition and Optional Symbols
 
-- `*` after a symbol or sequence means that it can be repeated zero or more times.
-- `+` denotes that the symbol or sequence should appear one or more times.
-- `?` makes the preceding symbol or sequence optional.
+- `*` after a symbol or sequence means that it can be repeated zero or more times (equivalent to `{0,}`).
+- `+` denotes that the symbol or sequence should appear one or more times (equivalent to `{1,}`).
+- `?` makes the preceding symbol or sequence optional (equivalent to `{0,1}`).
+- `{m}` repeats the precedent symbol or sequence exactly `m` times
+- `{m,}` repeats the precedent symbol or sequence at least `m` times
+- `{m,n}` repeats the precedent symbol or sequence at between `m` and `n` times (included)
+- `{0,n}` repeats the precedent symbol or sequence at most `n` times (included)
 
 ## Comments and newlines
 
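The bounded repetition operators added in this hunk use the same `{m}`, `{m,}`, `{m,n}` notation as regular-expression quantifiers. As a rough analogy only (Python's `re` module matches strings, whereas a GBNF grammar constrains sampling), the sketch below checks the equivalences the README lines state. The helper names `agree_on` and `accepted_lengths` are made up for this illustration and are not part of llama.cpp or node-llama-cpp.

```python
import re

# Shorthand operators and the bounded forms the README hunk calls equivalent.
EQUIVALENT_FORMS = {"*": "{0,}", "+": "{1,}", "?": "{0,1}"}


def agree_on(shorthand: str, samples: list[str], symbol: str = "a") -> bool:
    """Return True if `symbol<shorthand>` and its bounded form accept the same samples."""
    short_re = re.compile(symbol + shorthand)
    bounded_re = re.compile(symbol + EQUIVALENT_FORMS[shorthand])
    return all(
        (short_re.fullmatch(s) is None) == (bounded_re.fullmatch(s) is None)
        for s in samples
    )


def accepted_lengths(quantifier: str, max_len: int = 6) -> list[int]:
    """List the repeat counts of "a" (0..max_len) accepted by `a<quantifier>`."""
    pattern = re.compile("a" + quantifier)
    return [n for n in range(max_len + 1) if pattern.fullmatch("a" * n)]
```

For instance, `accepted_lengths("{2,4}")` lists the counts 2 through 4, matching the "between `m` and `n` times" wording, with both bounds included.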
@@ -87,9 +91,11 @@ item ::= [^\n]+ "\n"
 
 This guide provides a brief overview. Check out the GBNF files in this directory (`grammars/`) for examples of full grammars. You can try them out with:
 ```
-./
+./llama-cli -m <model> --grammar-file grammars/some-grammar.gbnf -p 'Some prompt'
 ```
 
+`llama.cpp` can also convert JSON schemas to grammars either ahead of time or at each request, see below.
+
 ## Troubleshooting
 
 Grammars currently have performance gotchas (see https://github.com/ggerganov/llama.cpp/issues/4218).
@@ -98,4 +104,41 @@ Grammars currently have performance gotchas (see https://github.com/ggerganov/ll
 
 A common pattern is to allow repetitions of a pattern `x` up to N times.
 
-While semantically correct, the syntax `x? x? x?.... x?` (with N repetitions)
+While semantically correct, the syntax `x? x? x?.... x?` (with N repetitions) may result in extremely slow sampling. Instead, you can write `x{0,N}` (or `(x (x (x ... (x)?...)?)?)?` w/ N-deep nesting in earlier llama.cpp versions).
+
+## Using GBNF grammars
+
+You can use GBNF grammars:
+
+- In [llama-server](../examples/server)'s completion endpoints, passed as the `grammar` body field
+- In [llama-cli](../examples/main), passed as the `--grammar` & `--grammar-file` flags
+- With [llama-gbnf-validator](../examples/gbnf-validator) tool, to test them against strings.
+
+## JSON Schemas → GBNF
+
+`llama.cpp` supports converting a subset of https://json-schema.org/ to GBNF grammars:
+
+- In [llama-server](../examples/server):
+    - For any completion endpoints, passed as the `json_schema` body field
+    - For the `/chat/completions` endpoint, passed inside the `result_format` body field (e.g. `{"type", "json_object", "schema": {"items": {}}}`)
+- In [llama-cli](../examples/main), passed as the `--json` / `-j` flag
+- To convert to a grammar ahead of time:
+    - in CLI, with [examples/json_schema_to_grammar.py](../examples/json_schema_to_grammar.py)
+    - in JavaScript with [json-schema-to-grammar.mjs](../examples/server/public/json-schema-to-grammar.mjs) (this is used by the [server](../examples/server)'s Web UI)
+
+Take a look at [tests](../../tests/test-json-schema-to-grammar.cpp) to see which features are likely supported (you'll also find usage examples in https://github.com/ggerganov/llama.cpp/pull/5978, https://github.com/ggerganov/llama.cpp/pull/6659 & https://github.com/ggerganov/llama.cpp/pull/6555).
+
+Here is also a non-exhaustive list of **unsupported** features:
+
+- `additionalProperties`: to be fixed in https://github.com/ggerganov/llama.cpp/pull/7840
+- `minimum`, `exclusiveMinimum`, `maximum`, `exclusiveMaximum`
+- `integer` constraints to be implemented in https://github.com/ggerganov/llama.cpp/pull/7797
+- Remote `$ref`s in the C++ version (Python & JavaScript versions fetch https refs)
+- Mixing `properties` w/ `anyOf` / `oneOf` in the same type (https://github.com/ggerganov/llama.cpp/issues/7703)
+- `string` formats `uri`, `email`
+- [`contains`](https://json-schema.org/draft/2020-12/json-schema-core#name-contains) / `minContains`
+- `uniqueItems`
+- `$anchor` (cf. [dereferencing](https://json-schema.org/draft/2020-12/json-schema-core#name-dereferencing))
+- [`not`](https://json-schema.org/draft/2020-12/json-schema-core#name-not)
+- [Conditionals](https://json-schema.org/draft/2020-12/json-schema-core#name-keywords-for-applying-subsche) `if` / `then` / `else` / `dependentSchemas`
+- [`patternProperties`](https://json-schema.org/draft/2020-12/json-schema-core#name-patternproperties)
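The README hunk above says llama-server's completion endpoints accept either a `grammar` or a `json_schema` body field. A minimal sketch of building such a request with the Python standard library follows. The base URL, the `/completion` path, and the helper name `build_completion_request` are assumptions made for this illustration, so check them against your own server setup. The request is only constructed here, not sent.

```python
import json
import urllib.request

# Assumption: a llama-server instance listening locally; adjust to your setup.
BASE_URL = "http://localhost:8080"


def build_completion_request(prompt, *, grammar=None, json_schema=None, base_url=BASE_URL):
    """Build (but do not send) a POST request carrying one of the two body fields."""
    if (grammar is None) == (json_schema is None):
        raise ValueError("pass exactly one of grammar or json_schema")
    body = {"prompt": prompt}
    if grammar is not None:
        body["grammar"] = grammar          # raw GBNF text
    else:
        body["json_schema"] = json_schema  # converted to a grammar by the server
    return urllib.request.Request(
        base_url + "/completion",  # assumed endpoint path
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate makes the body easy to inspect before any network call.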
package/llama/grammars/json.gbnf
CHANGED
@@ -16,10 +16,10 @@ array ::=
 string ::=
 "\"" (
 [^"\\\x7F\x00-\x1F] |
-"\\" (["
+"\\" (["\\bfnrt] | "u" [0-9a-fA-F]{4}) # escapes
 )* "\"" ws
 
-number ::= ("-"? ([0-9] | [1-9] [0-9]
+number ::= ("-"? ([0-9] | [1-9] [0-9]{0,15})) ("." [0-9]+)? ([eE] [-+]? [0-9] [1-9]{0,15})? ws
 
 # Optional space: by convention, applied in this grammar after literal chars when allowed
-ws ::=
+ws ::= | " " | "\n" [ \t]{0,20}
package/llama/grammars/json_arr.gbnf
CHANGED
@@ -25,10 +25,10 @@ array ::=
 string ::=
 "\"" (
 [^"\\\x7F\x00-\x1F] |
-"\\" (["
+"\\" (["\\bfnrt] | "u" [0-9a-fA-F]{4}) # escapes
 )* "\"" ws
 
-number ::= ("-"? ([0-9] | [1-9] [0-9]
+number ::= ("-"? ([0-9] | [1-9] [0-9]{0,15})) ("." [0-9]+)? ([eE] [-+]? [1-9] [0-9]{0,15})? ws
 
 # Optional space: by convention, applied in this grammar after literal chars when allowed
-ws ::=
+ws ::= | " " | "\n" [ \t]{0,20}
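To see what the new bounded `number` and `ws` rules accept, the sketch below hand-transcribes the added lines of the hunk directly above (the variant with exponent `[1-9] [0-9]{0,15}`) into a Python regular expression. This is an approximation for experimenting with inputs, not how llama.cpp evaluates grammars, and the name `is_number` is made up for this example.

```python
import re

# Hand transcription of the added lines in the hunk above:
#   ws     ::= | " " | "\n" [ \t]{0,20}
#   number ::= ("-"? ([0-9] | [1-9] [0-9]{0,15})) ("." [0-9]+)?
#              ([eE] [-+]? [1-9] [0-9]{0,15})? ws
WS = r"(?:| |\n[ \t]{0,20})"
NUMBER = re.compile(
    r"-?(?:[0-9]|[1-9][0-9]{0,15})"      # integer part: at most 16 digits, no leading zero
    r"(?:\.[0-9]+)?"                     # optional fraction (unbounded)
    r"(?:[eE][-+]?[1-9][0-9]{0,15})?"    # optional exponent, bounded as well
    + WS                                 # optional trailing whitespace
)


def is_number(text: str) -> bool:
    """Return True if the whole string matches the transcribed `number` rule."""
    return NUMBER.fullmatch(text) is not None
```

Under this transcription, a 16-digit integer is accepted while a 17-digit one is not, and trailing whitespace is limited to nothing, a single space, or a newline followed by up to 20 spaces or tabs.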
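package/llamaBins/linux-arm64/llama-addon.node
CHANGED
Binary file
package/llamaBins/linux-armv7l/llama-addon.node
CHANGED
Binary file
package/llamaBins/linux-x64/llama-addon.node
CHANGED
Binary file
package/llamaBins/mac-arm64/default.metallib
CHANGED
Binary file
package/llamaBins/mac-arm64/llama-addon.node
CHANGED
Binary file
package/llamaBins/mac-x64/default.metallib
CHANGED
Binary file
package/llamaBins/mac-x64/llama-addon.node
CHANGED
Binary file
package/llamaBins/win-x64/llama-addon.exp
CHANGED
Binary file
package/llamaBins/win-x64/llama-addon.lib
CHANGED
Binary file
package/llamaBins/win-x64/llama-addon.node
CHANGED
Binary file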
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "node-llama-cpp",
-"version": "2.8.
+"version": "2.8.12",
 "description": "Run AI models locally on your machine with node.js bindings for llama.cpp. Force a JSON schema on the model output on the generation level",
 "main": "dist/index.js",
 "type": "module",