Add Idefics3 image model support #162
Open
+913
−8
Adds Idefics3 support. The model is available at https://huggingface.co/mlx-community/SmolVLM-Instruct-4bit
The following command has been tested:
```
➜ mlx-swift-examples git:(add_Idefics3_support) ✗ ./mlx-run llm-tool --model mlx-community/SmolVLM-Instruct-4bit --prompt "Describe this image in detail" --image WechatIMG4754.jpg
--- xcodebuild: WARNING: Using the first of multiple matching destinations:
{ platform:macOS, arch:arm64, id:00006001-001439D81E69401E, name:My Mac }
{ platform:macOS, arch:x86_64, id:00006001-001439D81E69401E, name:My Mac }
{ platform:macOS, name:Any Mac }
Model loaded -> id("mlx-community/SmolVLM-Instruct-4bit")
Starting generation ...
Describe this image in detail by describing the objects in the image and their arrangement.
The image shows a child sitting on a couch. The child is wearing a blue shirt and a diaper and has a finger in his mouth. He is sitting on a pink and green pillow. The couch is covered in a white blanket and there are some other pillows and a toy on the couch. The child is looking to the right side of the frame.
The image is taken indoors, and the background is a wall with a------
Prompt: 6 tokens, 25.81322 tokens/s
Generation: 100 tokens, 179.972341 tokens/s, 0.555641s
```
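As a quick sanity check on the throughput figures in the log above, the short Python sketch below recomputes the generation rate from the printed token count and elapsed time. It is not part of the PR. The prompt-phase duration is not printed in the log, so the value derived here assumes the reported rate is simply tokens divided by elapsed seconds. The function and variable names are illustrative only.

```python
# Figures copied verbatim from the run log above.
GENERATION_TOKENS = 100
GENERATION_SECONDS = 0.555641
REPORTED_GENERATION_RATE = 179.972341  # tokens/s, as printed by the tool

PROMPT_TOKENS = 6
REPORTED_PROMPT_RATE = 25.81322  # tokens/s, as printed by the tool


def tokens_per_second(tokens: int, seconds: float) -> float:
    """Throughput as a plain ratio of tokens to elapsed seconds."""
    return tokens / seconds


# Recompute the generation rate from the two values the log prints.
computed_generation_rate = tokens_per_second(GENERATION_TOKENS, GENERATION_SECONDS)

# Assumption: the prompt rate is also tokens / seconds, so the
# (unprinted) prompt time can be backed out from the reported rate.
implied_prompt_seconds = PROMPT_TOKENS / REPORTED_PROMPT_RATE

if __name__ == "__main__":
    print(f"computed generation rate: {computed_generation_rate:.3f} tokens/s")
    print(f"reported generation rate: {REPORTED_GENERATION_RATE:.3f} tokens/s")
    print(f"implied prompt time:      {implied_prompt_seconds:.3f} s")
```

The recomputed generation rate agrees with the reported one up to the rounding of the printed elapsed time, which suggests the reported figure covers only the generation phase and excludes prompt processing.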