Multimodal support with Phi 3 Vision + Transformers #1020
Open
nking-1 wants to merge 318 commits into guidance-ai:main from nking-1:phi3vision
Changes from all commits (318)
ae1baa6 int -> bytes (hudson-ai)
077e4b9 Explicitly pass kwargs to EngineCallResponse (hudson-ai)
ede3333 matched (hudson-ai)
ed4b220 Add some comments (hudson-ai)
d26536c Merge branch 'main' into lazy_grammars (hudson-ai)
6f98db5 done is callable now (hudson-ai)
7848c09 LLGUIDANCE_LOG_LEVEL (hudson-ai)
2351793 Prelim greedy json (hudson-ai)
c8f89fc Move temperature to get_logits (hudson-ai)
ecb902c captures already decoded (hudson-ai)
46c1cfb More helpful exceptions (hudson-ai)
1fd2445 Consume bytes in init (hudson-ai)
791ce57 valid_next_bytes (hudson-ai)
6293930 next_byte_mask (hudson-ai)
ffd51b3 adapt parser tests (hudson-ai)
97a91fc Fix ParserExceptions (hudson-ai)
161cd9f Serialize ByteRange as if they are wrapped in GenCommitPoint (hudson-ai)
eadd79e Typo (hudson-ai)
126aaa1 Epsilon for repr of grammars with null (hudson-ai)
55b3079 Byte(b".") -> b"." (hudson-ai)
3531c43 black (hudson-ai)
c59d1a2 more inclusive number schema (hudson-ai)
36301e8 Byte(b".") -> b"." (hudson-ai)
bfe67c2 no more Byte/ByteRange in tests (hudson-ai)
0ebb59c use eos token as stop token (hudson-ai)
4423906 fix LLGUIDANCE_LOG_LEVEL (mmoskal)
ee133e7 make string lexemes contextual (hudson-ai)
7f35ed0 cache json definitions (hudson-ai)
60855b6 Pass max_tokens to json (hudson-ai)
a6e7f33 <Bandaid> inject BOS token after process_prompt (hudson-ai)
d40c5ae refactor sample_with_temperature to take mask (hudson-ai)
acddbc6 Make mock produce valid tokenizations (hudson-ai)
a4ec9a4 Temporary fix/hack for whitespace validation in tests (hudson-ai)
217f8ee Azure guidance temperature tests (hudson-ai)
7580624 xfails (hudson-ai)
36a4c7e fix recursion check in commit_point() (mmoskal)
e677c14 mark flaky tests as xpass (mmoskal)
063a011 Make mock always sample greedily (hudson-ai)
85908b0 further fix for recursive regex (mmoskal)
5d966e0 Merge branch 'lazy_grammars' of https://github.com/hudson-ai/guidance… (mmoskal)
a8469fc remove some xfails (failed due to mock not forcing next token in byte… (hudson-ai)
815ed6b Using commit point as a regex (ish) (hudson-ai)
e1645c1 remove debug out (mmoskal)
3dd9019 add tests for LLParser (mmoskal)
a031f20 add docs (mmoskal)
34aa88b add nimble fighter test (mmoskal)
6b82dbf more comments (mmoskal)
a88456f more dolphin tests (mmoskal)
2e40d43 test ll backtrack (mmoskal)
4b8e423 more backtrack testing (mmoskal)
3aca8e2 pop tokens test (mmoskal)
20db158 add nullable lexeme test (mmoskal)
f8ef812 add ll_max_tokens test; fix test names (mmoskal)
feb6712 Remove 12 xfails (hudson-ai)
4acc833 check gen() with different max_tokens (mmoskal)
ac05ebe force parse from matched to done when getting captures (hudson-ai)
959f2b9 remove more xfails (hudson-ai)
6372a15 number as its own lexeme (hudson-ai)
0a5a5d1 disambiguate json number lexemes (hudson-ai)
b545d04 fix test (hudson-ai)
28b2309 Revert "disambiguate json number lexemes" (hudson-ai)
1afc9bb remove more xfails (hudson-ai)
9ef9b07 fix https://github.com/hudson-ai/guidance/issues/18 (added test) (mmoskal)
87c80b8 a few tests for grammar.match (hudson-ai)
b00a695 greedy tests for ambiguous ending lexemes (hudson-ai)
cf56a9e nullable final lexeme test (hudson-ai)
606b208 Delete code in model (hudson-ai)
546a013 Remove all non-gen_mode branches in _gen.py (hudson-ai)
1b26e13 Fix non-recursive selects inside commit points (hudson-ai)
28171f0 serialize substring as regex (hudson-ai)
34fd004 add test for https://github.com/hudson-ai/guidance/issues/20 (mmoskal)
c4c4872 send EOS at end of Mock in substring tests (hudson-ai)
e4f37de refresh xfails (hudson-ai)
25f9caf add test for https://github.com/hudson-ai/guidance/issues/19 (mmoskal)
43a105f Revert "Revert "disambiguate json number lexemes"" (hudson-ai)
bb9e742 remove more xfails (hudson-ai)
6b30a97 whitespace test misspec (hudson-ai)
84c1445 whitespace test misspec (remove xfail) (hudson-ai)
6f81987 add nice man tests (mmoskal)
f76b82f Add failing variant of 'nice man' test (hudson-ai)
b60641b Make test less flaky (hudson-ai)
733da2c Add explicit xfails for commit points (hudson-ai)
85f80b6 Change test to be less dependent on idiosyncratic Mock behavior (hudson-ai)
86cd01b Remove test case that was completely dependent on idiosyncratic Mock … (hudson-ai)
142dfa9 Remove xfail from file (hudson-ai)
cb216f9 compact flag to json generation (hudson-ai)
68f256b add negative tests for compact/whitespace-flexible modes (hudson-ai)
b7d892a string never returns singleton bytes (hudson-ai)
724cd28 compact/flexible pydantic negative cases (hudson-ai)
9dc7b86 consolidate generate_and_check implementations a bit (hudson-ai)
61b857c make tests pass with latest phi3 (mmoskal)
925e0f5 further test fixes (mmoskal)
23433ee Slight refactor of Engine __call__, removing 'next' (hudson-ai)
9187500 Add extra layer of indirection -- 'get_next_token' (hudson-ai)
8bf616c Start refactoring get_logits w/o forced_butes or current_temp (hudson-ai)
c724445 drop unused attr (hudson-ai)
578fb2c annotations (hudson-ai)
4bc1e3f rough changes to GrammarlessEngine to get it working (hudson-ai)
d59cfd5 speed things up by sharing a trie (hudson-ai)
d66648d remove forced_bytes code entirely (trie does the trick) (hudson-ai)
722d682 mark _stop_ tests as xfail for now with az guidance (mmoskal)
b05fd96 fix log messages (hudson-ai)
c15877c refactor get_logits function signature (hudson-ai)
22e5704 pass lazy field in Gen (mmoskal)
3b2e762 mark failing tool test as xfail (mmoskal)
b0972ca use cpp ByteTrie (hudson-ai)
e8b57f0 remove old skip.txt (hudson-ai)
e33588e Merge branch 'main' into lazy_grammars (hudson-ai)
3703396 llguidance post_process (hudson-ai)
653bf78 dedent ll_fighter (hudson-ai)
28c786d Revert most GrammarlessEngine changes while still making it work (hudson-ai)
1a46468 adjust to new mid_process() api (mmoskal)
e122cd1 add test for stop token (mmoskal)
fbb6293 also check for lone stop tokens (mmoskal)
7ec0168 No need to prime generator internally (hudson-ai)
4798474 narrow types (hudson-ai)
2c35ad9 don't print logs, other than warnings from server llguidance (mmoskal)
857f5e3 the stop tests now pass (though they don't hide the stop from the model) (mmoskal)
c848b22 llguidance schemas (hudson-ai)
c2d3afb more exceptions (hudson-ai)
9408a98 move schemas to _schema.py (hudson-ai)
d6bcb3e some unused imports (hudson-ai)
51814ec black (hudson-ai)
d84292a ByteTrie to get next token in Mock (hudson-ai)
281cab5 Remove common base Parser class of LLParser and ByteParser (hudson-ai)
64e9574 typing (hudson-ai)
198da5b remove some unused code from gen (hudson-ai)
a32a9e3 add llguidance dependency (hudson-ai)
c58c275 mypy (hudson-ai)
df2a3d9 make smoke test more exact (hudson-ai)
a389885 Restore commit_point failure in tool_call test (hudson-ai)
cb61d81 Merge branch 'main' into lazy_grammars (hudson-ai)
37cf618 Allow engine to take serialized or unserialized grammars (hudson-ai)
77c5dbd Terminate byte patterns (mock semantics changed) (hudson-ai)
181ab71 Higher max_tokens (mock tokenizations changed) (hudson-ai)
f6bd641 metrics (maybe slightly fudged) for azure guidance (hudson-ai)
9268784 EngineCallResponse protobuf->pydantic (hudson-ai)
174c481 Handle stripping __LIST_APPEND: in validator (hudson-ai)
215bd30 Move GuidanceEngineMetrics to schemas (hudson-ai)
fe8df69 remove protobuf code (hudson-ai)
8a6b637 remove protobuf dep (hudson-ai)
3b31b2c remove protobuf tests (hudson-ai)
fae0e9c fix test getattr (hudson-ai)
8ef75f8 remove grammar from parser (hudson-ai)
ea5d4e9 move google.generativeai import up to module level and add ignore[imp… (hudson-ai)
dcc86f9 simplify Engine.__call__ since we already have an EngineCallResponse (hudson-ai)
7064c68 Add some exception types (hudson-ai)
101eeae Make sure _parse generator actually returns (hudson-ai)
fd2ca41 Basic stop-reason exception handling (hudson-ai)
53d3df6 no temperature or captures in regex serialized grammars (hudson-ai)
53171e1 more type:ignore for google... (hudson-ai)
0d0e249 Merge branch 'main' into lazy_grammars (hudson-ai)
1b9774e mypy (hudson-ai)
96ec5ce llguidance now on pypi (hudson-ai)
8f9bede Drop lazy/greedy NestedGrammar distinction, rename to Subgrammar (hudson-ai)
9eec277 GenLexeme -> Lexeme (hudson-ai)
3461a2b Make subgrammar and lexeme more private for now (hudson-ai)
e7abb2b remove regex implementation in favor of one based on gen (hudson-ai)
312262d Adjust regex tests for unicode (hudson-ai)
1b3b360 black (hudson-ai)
772ecdc Merge branch 'main' into lazy_grammars (hudson-ai)
ba3f7f9 allow for "Authorization" header in az guidance (mmoskal)
3bda4d7 make it user LLGUIDANCE_LOG_LEVEL (mmoskal)
9118bb3 temporarily narrow exception handler to bet better error in CI (hudson-ai)
e9048c3 Revert "temporarily narrow exception handler to bet better error in CI" (hudson-ai)
f9b4195 add protobuf and sentencepiece as temporary test dependencies until i… (hudson-ai)
44fed2f Merge branch 'lazy_grammars' of https://github.com/hudson-ai/guidance… (mmoskal)
aad4b5a \d -> [0-9] to prevent max_tokens from cutting off non-unicode digits (hudson-ai)
1914c0b DIVIDE by temperature... (hudson-ai)
acce0ef hard-code unicode start bytes (hudson-ai)
c5b6997 compress representations a bit (hudson-ai)
65adc62 Remove commit_point attr (hudson-ai)
99d2406 Remove hidden attr (hudson-ai)
24c9820 Remove nullable attr (hudson-ai)
cdf18e8 GenCommitPoint -> RegularGrammar (hudson-ai)
0e78ae9 Placeholder init (hudson-ai)
9235709 LLParser -> TokenParser (hudson-ai)
b896ca0 Make gen match multiline by default (hudson-ai)
36fca3c Restore old Mock code but flesh out its tokenizer and make sure it pr… (hudson-ai)
b9ac1e9 Restore original behavior in which "illegal" tokens terminate grammar… (hudson-ai)
4ee9cff Only make illegal tokens EOS at end of grammar (hudson-ai)
df1146e matched -> is_accepting (hudson-ai)
e51ae59 commit_point -> as_regular_grammar (hudson-ai)
024bd33 Temporarily(fingers crossed) deprecate tool calling (hudson-ai)
7afa96a allow http urls (mmoskal)
9fd36e1 Merge branch 'main' into lazy_grammars (hudson-ai)
8475d88 Merge branch 'main' into lazy_grammars (hudson-ai)
7e72483 Initial attempts (hudson-ai)
f77f733 subgrammar tool call impl (hudson-ai)
017313b back to tool_args (hudson-ai)
95314e1 Infer tool call prefix (hudson-ai)
2edb750 Remove xfail (hudson-ai)
9699fa1 test tools actually called (hudson-ai)
fa5d578 More leading text in tool call (hudson-ai)
125028f Be less clever (hudson-ai)
f5b5cfc clean up gen captures a bit (hudson-ai)
30163a2 Add no-tool option back in (hudson-ai)
b2a249a temperature when tool calling (hudson-ai)
616b366 Merge branch 'main' into lazy_grammars (hudson-ai)
7eda809 Move GenData to _schema.py, keep mask as bytes (hudson-ai)
bf03d78 Merge branch 'main' into lazy_grammars (hudson-ai)
6bd9fb3 Remove special case for allowing EOS in grammarless (logic now in all… (hudson-ai)
aea428a fix types (hudson-ai)
824f836 directly return eos_token_id rather than putting eos_token in dqueue … (hudson-ai)
012ac7c revert substring test to restore 'failure' case (hudson-ai)
b85f74d Merge branch 'main' into lazy_grammars (hudson-ai)
7027d77 Fix tool-call tests in light of fixed token_count underflow (hudson-ai)
c3f128f Test multiple tool calls (hudson-ai)
da2cf45 Simplify test (hudson-ai)
278cc42 Some ideas to implement prompt parts and multimodal components (nking-1)
be35d3b sketching out how we can process multimodal data down to get_logits (nking-1)
6619093 Eliminating some things from grammarless (prototype) (nking-1)
e0b010e More work toward reworking grammarless (nking-1)
29a1ba6 Saving draft of how multimodal might look for openai (nking-1)
f922ccc rename tests so pytest -k works (mmoskal)
5fa82db fix duplicate warning printing and add request data logging (at 4-5 l… (mmoskal)
2772297 Rework the structure and strategy for storing prompt with multimodal … (nking-1)
22a2827 Saving phi3vision dev notebook (nking-1)
3982ccb Revert some previous changes to tokenization in preparation for next … (nking-1)
7fbd550 Refactor token parser to be more flexible in initialization (nking-1)
0dd84b7 Refactor parser to give more control of usage when needed (nking-1)
ec6f43f Phi 3 vision with transformers- draft (nking-1)
5755165 Undo grammarless changes (nking-1)
d5e0ac8 Rename and export phi 3 vision model (nking-1)
04fcd9f Fix phi3 vision fixture import errors (nking-1)
f61d620 Merge branch 'main' into lazy_grammars (hudson-ai)
22c493b Attempting a fix for phi 3 vision tokenization (nking-1)
30a8559 Add phi 3 vision chat template (nking-1)
3520114 add xfail for now (covered by llguidance issue #7) (hudson-ai)
e6260c2 add more explicit test for list append with no explicit stop (xfailed… (hudson-ai)
d046ef5 add back image pattern (nking-1)
178cda5 save phi3vision dev notebook (nking-1)
a72352b Merge remote-tracking branch 'hudson/lazy_grammars' into multimodal_2 (nking-1)
2a0f3f7 image loading and passing to model (nking-1)
9384b25 save phi-3 vision notebook (nking-1)
c7b89fc Merge remote-tracking branch 'upstream/main' into multimodal_2 (nking-1)
d474e6a Fix tokenizer issue with phi3vision (hack, probably needs review) (nking-1)
38cecb1 phi 3 vision chat template (nking-1)
b4a2947 Merge branch 'main' into multimodal_2 (nking-1)
d7c5c10 dev notebooks for llguidance prompt processing (nking-1)
cc8ac87 experimental phi 3 vision testing scripts (nking-1)
73fa881 constraints tests for guidance + img (nking-1)
761326b Refactoring and cleanup of transformers & phi3v code (nking-1)
a135311 Merge branch 'main' into multimodal_2 (nking-1)
160a449 KV caching for phi 3 vision (nking-1)
2b7410b Code cleanup - remove dev code (nking-1)
4b46880 Small fixes to parameter types and logic (nking-1)
105d648 Minor code cleanup (nking-1)
af9d11b Merge remote-tracking branch 'upstream/main' into phi3vision (nking-1)
ee91785 parser PR feedback (nking-1)
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
These tokens will likely need to be recoded before being sent to the LLM for logits (I think only in the case that we just added a BOS token). You could probably just put that line in `create_token_parser`, after calling `process_prompt`, since you'll have access to a tokenizer there. See tests/model_integration/test_model.py::test_associativity.
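To illustrate what "recoding" could mean here: after a BOS token is prepended, the existing token boundaries may no longer match what the tokenizer would produce for the same byte string, so one option is to decode the tokens back to bytes and re-encode them canonically. This is a hypothetical sketch, not guidance's actual API — `ToyTokenizer` and `recode_tokens` are illustrative stand-ins for whatever tokenizer object is available in `create_token_parser`:

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyTokenizer:
    """Minimal greedy longest-match tokenizer, for illustration only."""
    vocab: Dict[int, bytes]  # token id -> byte string
    bos_token_id: int = 0
    _ids_by_bytes: Dict[bytes, int] = field(init=False)

    def __post_init__(self) -> None:
        self._ids_by_bytes = {v: k for k, v in self.vocab.items()}

    def decode(self, ids: List[int]) -> bytes:
        # The BOS token is special and has no byte representation here.
        return b"".join(self.vocab[i] for i in ids if i != self.bos_token_id)

    def encode(self, data: bytes) -> List[int]:
        # Greedy longest-match tokenization over the byte string.
        ids: List[int] = []
        i = 0
        max_len = max(len(v) for v in self.vocab.values())
        while i < len(data):
            for length in range(min(max_len, len(data) - i), 0, -1):
                tok = self._ids_by_bytes.get(data[i : i + length])
                if tok is not None:
                    ids.append(tok)
                    i += length
                    break
            else:
                raise ValueError("byte sequence not representable in vocab")
        return ids


def recode_tokens(tokenizer: ToyTokenizer, ids: List[int]) -> List[int]:
    """Decode to bytes and re-encode so token boundaries are canonical,
    preserving a leading BOS token if one was injected."""
    had_bos = ids[:1] == [tokenizer.bos_token_id]
    recoded = tokenizer.encode(tokenizer.decode(ids))
    return ([tokenizer.bos_token_id] if had_bos else []) + recoded
```

For example, with a vocab containing `b"a"`, `b"b"`, and a merged `b"ab"`, the sequence `[BOS, "a", "b"]` recodes to `[BOS, "ab"]`, which is what the tokenizer would have produced from scratch. The real fix would presumably reuse the tokenizer's own encode/decode rather than this toy logic.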