This module converts from Lark-like syntax to llguidance. It makes it easier to get started with a new grammar and provides a familiar syntax; however, it is not a drop-in replacement for Lark.
The following extensions to Lark syntax are supported:
- when several grammars are passed in one request (`grammars` field), the ones using Lark can reference others using syntax like `@17`, referring to the grammar at index 17 in the `grammars` list, or `@my_grammar`, referring to the grammar with `"name": "my_grammar"`
- you can specify the temperature for a subgrammar by referencing it via the `my_temp_json[temperature=0.7]: @json` syntax
- you can also inline a JSON schema in the lark grammar, using the `%json { ... }` syntax (it behaves like a non-terminal)
- special tokens can be referenced via the `<token_name>` syntax, for example `<|ENDOFTEXT|>`; they cannot be used inside of terminals, but can be used in regular rules; the exact syntax depends on the tokenizer
- it is also possible to use numeric token ids, as in `<[128010]>` for llama's `<|python_tag|>`; you can also use ranges like `<[128000-128255]>` for all llama special tokens, or even lists of ranges like `<[128000-128100,128130-128170]>`; ranges are inclusive
- `max_tokens`, `temperature` and `stop` can be specified on rules, but the rule body must be a token expression, for example: `mygen[stop="\n", max_tokens=10, temperature=0.7]: /.*/`
- if `stop` is specified (possibly as `""`), the rule is treated as `gen()` in Guidance; otherwise it is treated as `lexeme()`
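To make the inclusive-range rule for numeric token ids concrete, here is a minimal Python sketch that expands a reference such as `<[128000-128100,128130-128170]>` into the set of token ids it covers. The helper `expand_token_ranges` is hypothetical and written only for illustration; it is not part of llguidance and does not reflect how llguidance parses these references internally.

```python
import re


def expand_token_ranges(spec: str) -> set[int]:
    """Expand a numeric token reference like '<[128000-128100,128130]>'.

    Hypothetical helper for illustration only; not an llguidance API.
    """
    match = re.fullmatch(r"<\[(.+)\]>", spec.strip())
    if match is None:
        raise ValueError(f"not a numeric token reference: {spec!r}")
    ids: set[int] = set()
    for part in match.group(1).split(","):
        low, sep, high = part.strip().partition("-")
        start = int(low)
        end = int(high) if sep else start  # a single id is a one-element range
        if end < start:
            raise ValueError(f"empty range: {part!r}")
        ids.update(range(start, end + 1))  # +1 because both ends are inclusive
    return ids


single = expand_token_ranges("<[128010]>")
all_special = expand_token_ranges("<[128000-128255]>")
two_ranges = expand_token_ranges("<[128000-128100,128130-128170]>")
print(len(single), len(all_special), len(two_ranges))
```

Because both ends are included, `128000-128255` covers 256 ids rather than 255.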
The following are currently not supported:
- lookarounds in lexer regexes
- lazy modifier (`?`) in lexer regexes; in llguidance the whole lexeme is either lazy or greedy
- priorities
- templates
- imports (other than built-in `%import common`)
- regexes use Rust `regex` crate syntax, not Python's `re` (though they are similar)
- certain string syntax, see issue
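When porting an existing Lark grammar, it can help to scan its regexes for the two unsupported regex features listed above. The following is a rough heuristic sketch, not a real regex parser: `unsupported_regex_features` and both patterns are assumptions made for this example, and llguidance performs its own validation when the grammar is compiled.

```python
import re

# Heuristic patterns (assumptions for this sketch, not llguidance's own checks).
LOOKAROUND = re.compile(r"\(\?<?[=!]")     # (?=  (?!  (?<=  (?<!
LAZY_QUANTIFIER = re.compile(r"[*+?}]\?")  # *?  +?  ??  }?


def unsupported_regex_features(pattern: str) -> list[str]:
    """Return the names of unsupported features that appear in `pattern`."""
    # Drop escaped characters, then character classes, so that literals
    # such as \? or [*?] are not mistaken for operators.
    stripped = re.sub(r"\\.", "", pattern, flags=re.DOTALL)
    stripped = re.sub(r"\[\^?\]?[^\]]*\]", "", stripped)
    problems = []
    if LOOKAROUND.search(stripped):
        problems.append("lookaround")
    if LAZY_QUANTIFIER.search(stripped):
        problems.append("lazy modifier")
    return problems


print(unsupported_regex_features(r"foo(?=bar)"))
print(unsupported_regex_features(r"a.*?b"))
print(unsupported_regex_features(r"(.|\n)*"))
```

A pattern that passes this check may still be rejected for other reasons, since the Rust `regex` crate syntax differs from Python's `re` in further details.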
The following features of llguidance are currently not exposed in Lark syntax:
- per-lexeme contextual and lazy flags
Here, we restrict the output to either a normal text response or a tool call to Brave or Wolfram Alpha.
start: normal_text | brave | wolfram
normal_text: /(.|\n)*/
brave: <|python_tag|> "brave_search.call(query=" JSON_STRING ")" <|eom_id|>
wolfram: <|python_tag|> "wolfram_alpha.call(query=" JSON_STRING ")" <|eom_id|>
JSON_STRING_CHAR: /(\\([\"\\\/bfnrt]|u[a-fA-F0-9]{4})|[^\"\\\x00-\x1F\x7F])/
JSON_STRING: "\"" JSON_STRING_CHAR* "\""
Note that, just as in Lark, uppercase identifiers define grammar lexemes
(also often called tokens); they can't be recursive
because they are compiled to regular expressions.
This has performance implications; in particular, you should avoid short lexemes.
If the grammar used `json_string` instead of `JSON_STRING`,
then each `json_string` would consist of a `"` lexeme, followed
by any number of single-character lexemes, followed by another `"` lexeme.
Such a grammar would be very slow to run.
With upper-case `JSON_STRING`, the whole string is a single lexeme.
In this case you may also want to replace the JSON string with a Python string, depending on how the model was trained.
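The difference between the two spellings can be illustrated with a small Python sketch. It collapses `JSON_STRING` into one regular expression, as happens for an uppercase lexeme, and compares that with the number of lexemes a lowercase `json_string` rule would produce. Python's `re` is used here only for illustration (llguidance uses the Rust `regex` crate, though this particular pattern is accepted by both), and `lexeme_counts` is a hypothetical helper.

```python
import re

# JSON_STRING_CHAR from the grammar above, copied as a Python raw string.
JSON_STRING_CHAR = r'(\\([\"\\\/bfnrt]|u[a-fA-F0-9]{4})|[^\"\\\x00-\x1F\x7F])'
# JSON_STRING: "\"" JSON_STRING_CHAR* "\"" collapses into a single regex.
JSON_STRING = re.compile('"' + JSON_STRING_CHAR + '*"')
CHAR = re.compile(JSON_STRING_CHAR)


def lexeme_counts(text: str) -> tuple[int, int]:
    """Return (lexemes with JSON_STRING, lexemes with a json_string rule)."""
    if JSON_STRING.fullmatch(text) is None:
        raise ValueError(f"not a JSON string: {text!r}")
    body = text[1:-1]
    # As a rule: opening quote + one lexeme per character or escape + closing quote.
    per_char = sum(1 for _ in CHAR.finditer(body))
    return 1, per_char + 2


print(lexeme_counts('"hello"'))
print(lexeme_counts(r'"a\nb"'))  # the two-character escape \n counts as one lexeme
```

The lowercase variant grows linearly with the string length, while the uppercase variant stays at one lexeme regardless of length.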
You can also use Lark-like syntax to combine JSON schemas with regular output. In that case, you pass the JSON schemas as additional grammars, with the lark grammar being the top-level one.
start: normal_text | fun_call
// @fun0, @fun1 refer to other sub-grammars, see below
fun_call: <|python_tag|> ( @fun0 | @fun1 ) <|eom_id|>
normal_text: /(.|\n)*/
{
"grammars": [
{
"lark_grammar": "...the lark above..."
},
{"name": "fun0", "json_schema": { ... }},
{"name": "fun1", "json_schema": { ... }}
]
}
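As a rough sketch of how such a request could be assembled in Python, the snippet below builds the `grammars` list and checks that every `@name` or `@index` reference in the lark grammar points at an existing entry. The helpers `build_request` and `unresolved_references` and both schemas are invented for this example; the real schemas are application-specific, and the reference check is a plain text scan rather than a grammar parser.

```python
import json
import re

LARK_GRAMMAR = r"""
start: normal_text | fun_call
fun_call: <|python_tag|> ( @fun0 | @fun1 ) <|eom_id|>
normal_text: /(.|\n)*/
"""

# Placeholder schemas, invented for this example only.
FUN0_SCHEMA = {"type": "object", "properties": {"query": {"type": "string"}}}
FUN1_SCHEMA = {"type": "object", "properties": {"city": {"type": "string"}}}


def build_request(lark_grammar: str, named_schemas: dict) -> dict:
    """Put the lark grammar first (top-level), followed by the named schemas."""
    grammars = [{"lark_grammar": lark_grammar}]
    for name, schema in named_schemas.items():
        grammars.append({"name": name, "json_schema": schema})
    return {"grammars": grammars}


def unresolved_references(request: dict) -> list[str]:
    """List @references that match neither a grammar name nor a valid index."""
    grammars = request["grammars"]
    names = {g["name"] for g in grammars if "name" in g}
    missing = []
    for grammar in grammars:
        for ref in re.findall(r"@(\w+)", grammar.get("lark_grammar", "")):
            found = int(ref) < len(grammars) if ref.isdigit() else ref in names
            if not found:
                missing.append(ref)
    return missing


request = build_request(LARK_GRAMMAR, {"fun0": FUN0_SCHEMA, "fun1": FUN1_SCHEMA})
print(json.dumps(request, indent=2))
print(unresolved_references(request))
```

Serializing with `json.dumps` also takes care of escaping the backslashes and newlines in the lark grammar string.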
It is also possible to inline the JSON schema in the lark grammar, like so:
start: normal_text | fun_call
fun_call: <|python_tag|> ( fun0 | fun1 ) <|eom_id|>
normal_text: /(.|\n)*/
fun0: %json {
"type": "object",
...
}
fun1: %json {
"type": "object",
...
}
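If the schemas already exist as Python dictionaries, the inline form can be generated rather than written by hand. This is a minimal sketch under the assumption that `%json` accepts any valid JSON object text, including pretty-printed output from `json.dumps`; the helper `inline_json_rule` and the schema are hypothetical.

```python
import json


def inline_json_rule(rule_name: str, schema: dict) -> str:
    """Render a lark rule whose body is an inlined JSON schema.

    Hypothetical helper; assumes %json accepts any valid JSON object text.
    """
    return f"{rule_name}: %json {json.dumps(schema, indent=2)}"


# Placeholder schema, invented for this example only.
schema = {"type": "object", "properties": {"query": {"type": "string"}}}
rule = inline_json_rule("fun0", schema)
print(rule)
```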