This module converts grammars written in a Lark-like syntax to llguidance. It makes it easier to get started with a new grammar and provides a familiar syntax, but it is not a drop-in replacement for Lark.
The following extensions to Lark syntax are available:
- when several grammars are passed in one request (the `grammars` field), those written in Lark syntax can reference the others: `@17` refers to the grammar at index 17 in the `grammars` list, and `@my_grammar` refers to the grammar with `"name": "my_grammar"`
- you can specify the temperature for a subgrammar by referencing it with the `my_temp_json[temperature=0.7]: @json` syntax
- special tokens can be referenced with the `<token_name>` syntax, for example `<|ENDOFTEXT|>`; they cannot be used inside terminals, but can be used in regular rules; the exact syntax depends on the tokenizer
- `max_tokens`, `temperature` and `stop` can be specified on rules, but the rule body must be a token expression, for example: `mygen[stop="\n", max_tokens=10, temperature=0.7]: /.*/`
- if `stop` is specified (possibly as `""`) the rule is treated as `gen()` in Guidance; otherwise it is treated as `lexeme()`
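The extensions above can be sketched as a request payload built in Python. This is a minimal illustration, not the definitive request format: the `grammars` field, the `"name"` key and the two rule forms come from the list above, while the `lark_grammar` and `json_schema` key names, the `start` rule and the placeholder schema are assumptions made for the example.

```python
import json

# Lark-like grammar text. The raw string keeps `\n` as the two characters
# backslash + n, which is how the stop string is written in the grammar.
MAIN_LARK = r'''
start: mygen my_temp_json
mygen[stop="\n", max_tokens=10, temperature=0.7]: /.*/
my_temp_json[temperature=0.7]: @my_grammar
'''

request = {
    "grammars": [
        # Index 0: the Lark-like grammar.
        # ASSUMPTION: the key name "lark_grammar" is illustrative.
        {"lark_grammar": MAIN_LARK},
        # Index 1: reachable from the grammar above as @my_grammar (by name)
        # or as @1 (by position in the list).
        # ASSUMPTION: the key name "json_schema" and the schema are illustrative.
        {"name": "my_grammar", "json_schema": {"type": "object"}},
    ]
}

# Serialize the request body; how it is sent depends on the host application.
payload = json.dumps(request, indent=2)
```

Because `mygen` specifies `stop`, it would be treated as `gen()`; dropping `stop` from its attribute list would make it a `lexeme()` instead.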

The following are currently not supported:

- lookarounds in lexer regexes
- lazy modifier (`?`) in lexer regexes; in llguidance the whole lexeme is either lazy or greedy
- priorities
- templates
- imports (other than built-in `%import common`)
- regexes use Rust `regex` crate syntax, not Python's `re` (though they are similar)
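When porting an existing Lark grammar, it can help to flag regexes that rely on the unsupported lookarounds or lazy modifiers before converting them. The helper below is a hypothetical, purely textual heuristic written for this example (it is not part of llguidance); it ignores escaping and character classes, so it can report false positives.

```python
import re

# Heuristic patterns only: they do not parse the regex, so an escaped
# sequence such as r"\??" or a class such as r"[*?]" may be misreported.
_LOOKAROUND = re.compile(r"\(\?<?[=!]")   # (?=  (?!  (?<=  (?<!
_LAZY = re.compile(r"(?:[*+?]|\})\?")     # *?  +?  ??  {m,n}?


def unsupported_regex_features(pattern: str) -> list[str]:
    """Return the names of unsupported features that `pattern` appears to use."""
    found = []
    if _LOOKAROUND.search(pattern):
        found.append("lookaround")
    if _LAZY.search(pattern):
        found.append("lazy modifier")
    return found


if __name__ == "__main__":
    for candidate in [r"foo(?=bar)", r'".*?"', r"[a-z]+\d{2,3}"]:
        print(candidate, unsupported_regex_features(candidate))
```

A regex flagged for a lazy modifier can often be replaced by a rule with a `stop` string, since laziness applies to the whole lexeme rather than to a single quantifier.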

The following llguidance features are not currently exposed in Lark syntax:
- per-lexeme contextual and lazy flags