# Releases: ml-explore/mlx-swift-examples
## 2.21.2

### What's Changed
- Add VLM support and refactor common LM code into `MLXLMCommon` (breaking API changes) by @davidkoski in #151
  - based on models from https://github.com/Blaizzy/mlx-vlm
  - for #132
### Xcode 16

Xcode 16 is required to build the example applications and tools. Older versions of Xcode can still build the libraries via SwiftPM, so there is no change in requirements for applications or libraries that depend on them.

This change is required because the xcodeproj now refers to the local Package.swift file, keeping its builds consistent with what external users get. If needed we can switch back to using the xcodeproj for internal library builds and SwiftPM for external library builds; if this causes a problem, please file an issue and it can be considered.
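For reference, here is a minimal sketch of a SwiftPM manifest that depends on the libraries from this repo; the platform requirements and the `branch` pin are assumptions, so adjust them to your project:

```swift
// swift-tools-version: 5.9
// Minimal sketch of a manifest depending on the mlx-swift-examples libraries.
// The platform versions and the branch pin below are assumptions.
import PackageDescription

let package = Package(
    name: "MyApp",
    platforms: [.macOS(.v14), .iOS(.v16)],
    dependencies: [
        .package(
            url: "https://github.com/ml-explore/mlx-swift-examples",
            branch: "main")
    ],
    targets: [
        .executableTarget(
            name: "MyApp",
            dependencies: [
                // MLXLMCommon is pulled in by MLXLLM/MLXVLM; listing it
                // explicitly makes `import MLXLMCommon` available directly.
                .product(name: "MLXLLM", package: "mlx-swift-examples"),
                .product(name: "MLXVLM", package: "mlx-swift-examples"),
                .product(name: "MLXLMCommon", package: "mlx-swift-examples"),
            ])
    ]
)
```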
### Additions
There are two new libraries:

- `MLXVLM` contains vision language models that combine images and text prompts to produce text results, e.g. "describe this image"
- `MLXLMCommon` contains the `LanguageModel` code that is shared between `MLXLLM` and `MLXVLM`

The API between LLM and VLM is identical aside from the preparation of the `UserInput`:
```swift
let parameters = GenerateParameters()

// LLM prompt
let input = UserInput(prompt: "tell me a story")

// VLM prompt
let input = UserInput(prompt: "describe the image", images: [.url(url)])

// inference is identical
let result = try await modelContainer.perform { [generate, input] context in
    let input = try await context.processor.prepare(input: input)
    return try generate(input: input, parameters: parameters, context: context) { token in
        // print tokens as they are generated, stop early, etc.
        return .more
    }
}
```
VLM example code is available in the `llm-tool` example:
```
./mlx-run llm-tool eval --help
OVERVIEW: evaluate prompt and images to generate text (VLM)

USAGE: llm-tool eval <options>

OPTIONS:
  --model <model>         Name of the huggingface model or absolute path to directory
  -p, --prompt <prompt>   The message to be processed by the model. Use @path,@path to load from files, e.g. @/tmp/prompt.txt
  --resize <resize>       Resize images to this size (width, height)
  --image <image>         Paths or urls for input images
  ...
```
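A concrete invocation might look like the following; the model id and image path are placeholders for illustration, not values taken from the release notes:

```
./mlx-run llm-tool eval \
    --model mlx-community/Qwen2-VL-2B-Instruct-4bit \
    --prompt "describe the image" \
    --image /tmp/test.jpg
```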
### Breaking Changes

These changes probably have no effect on code external to this repo:

- the mlx-swift-examples.xcodeproj now references the local `Package.swift` to build the libraries
- the example code now uses naming that matches external uses of mlx-swift-examples, e.g. `import LLM` -> `import MLXLLM`
- the library directories are renamed to match their target names, e.g. `LLM` -> `MLXLLM`
Breaking:

- some code will now need to import both `MLXLLM` and `MLXLMCommon` (particularly code that loads models). `MLXLMCommon` contains the common API between LLM and VLM:

```swift
import MLXLLM
import MLXLMCommon
```
- constants for models have moved from `ModelConfiguration` to `ModelRegistry`; this is `MLXLLM.ModelRegistry` and there is also `MLXVLM.ModelRegistry`

```diff
- let modelConfiguration = ModelConfiguration.phi3_5_4bit
+ let modelConfiguration = ModelRegistry.phi3_5_4bit
```
- the `loadModelContainer()` function is now `LLMModelFactory.shared.loadContainer()`
- there is a new `VLMModelFactory` with identical methods for loading VLMs

```diff
- let modelContainer = try await LLM.loadModelContainer(configuration: modelConfiguration)
- {
+ let modelContainer = try await LLMModelFactory.shared.loadContainer(
+     configuration: modelConfiguration
+ ) {
```
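Filled out, a complete call might look like the sketch below; the assumption here is that the trailing closure is a download progress handler receiving a Foundation `Progress`:

```swift
// Sketch: load a model container via the new factory API.
// Assumes the trailing closure reports download progress (Foundation Progress).
let modelConfiguration = ModelRegistry.phi3_5_4bit
let modelContainer = try await LLMModelFactory.shared.loadContainer(
    configuration: modelConfiguration
) { progress in
    print("Downloading: \(Int(progress.fractionCompleted * 100))%")
}
```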
- `ModelContainer.perform` is now throwing (and lives in MLXLMCommon):

```diff
- let result = await modelContainer.perform { model, tokenizer in
-     LLM.generate(
+ let result = try await modelContainer.perform { model, tokenizer in
+     try MLXLMCommon.generate(
```
- `ModelConfiguration` previously had a way to register new configurations. This is now on `LLMModelFactory` (and `VLMModelFactory` has the same):

```swift
LLMModelFactory.shared.modelRegistry.register(configurations: [modelConfiguration])
```
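As a sketch of how that might be used to add a model outside the built-in registry (the model id and initializer arguments here are hypothetical):

```swift
// Sketch: register a hypothetical model configuration with the LLM factory.
// The id and defaultPrompt values are illustrative, not a real model listing.
let custom = ModelConfiguration(
    id: "mlx-community/MyModel-4bit",
    defaultPrompt: "hello"
)
LLMModelFactory.shared.modelRegistry.register(configurations: [custom])
```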
### Deprecations

An example at the end shows all of these deprecations in context.

Prefer to use the `ModelContext.processor` to prepare prompts. Previously callers would pass in a bare `[Int]` of tokens, but in order to support more complex inputs (VLMs) the use of bare `[Int]` is deprecated and callers should use `UserInput` and `LMInput`.
For example, previously callers might have done something like this:

```swift
let messages = [["role": "user", "content": prompt]]
let promptTokens = try await modelContainer.perform { _, tokenizer in
    try tokenizer.applyChatTemplate(messages: messages)
}
```
Now that should be:

```swift
let input = try await context.processor.prepare(input: .init(prompt: prompt))
```

This will initialize a `UserInput` from the prompt text and produce an `LMInput` that can be used to generate tokens.
This call to `generate()` is now deprecated:

```swift
public func generate(
    promptTokens: [Int], parameters: GenerateParameters, model: any LanguageModel,
    tokenizer: Tokenizer, extraEOSTokens: Set<String>? = nil,
    didGenerate: ([Int]) -> GenerateDisposition
) throws -> GenerateResult
```
This consumed the `[Int]` variety of tokens. Now this is preferred:

```swift
public func generate(
    input: LMInput, parameters: GenerateParameters, context: ModelContext,
    didGenerate: ([Int]) -> GenerateDisposition
) throws -> GenerateResult
```
This method on `ModelContainer` is now deprecated:

```swift
/// Perform an action on the model and/or tokenizer. Callers _must_ eval any `MLXArray` before returning as
/// `MLXArray` is not `Sendable`.
@available(*, deprecated, message: "prefer perform(_:) that uses a ModelContext")
public func perform<R>(_ action: @Sendable (any LanguageModel, Tokenizer) throws -> R) rethrows -> R
```
Use this one instead (though the former still works):

```swift
/// Perform an action on the ``ModelContext``. Callers _must_ eval any `MLXArray` before returning as
/// `MLXArray` is not `Sendable`.
public func perform<R>(_ action: @Sendable (ModelContext) async throws -> R) async rethrows -> R
```
### Example

Putting all of these deprecations together, previously you might have generated text like this:

```swift
let messages = [["role": "user", "content": prompt]]
let promptTokens = try await modelContainer.perform { _, tokenizer in
    try tokenizer.applyChatTemplate(messages: messages)
}
let result = await modelContainer.perform { model, tokenizer in
    LLM.generate(
        promptTokens: promptTokens, parameters: generateParameters, model: model,
        tokenizer: tokenizer, extraEOSTokens: modelConfiguration.extraEOSTokens
    ) { tokens in ... }
}
```
Now do this:

```swift
let result = try await modelContainer.perform { context in
    let input = try await context.processor.prepare(input: .init(prompt: prompt))
    return try MLXLMCommon.generate(
        input: input, parameters: generateParameters, context: context
    ) { tokens in ... }
}
```
**Full Changelog**: 1.18.2...2.21.2
## 1.18.2

Last tag before the breaking API changes from #151 (VLM support).

### What's Changed
- Fix DynamicNTKScalingRoPE by @DePasqualeOrg in #154
- Update Gemma and Gemma 2 to more closely follow Python implementations by @DePasqualeOrg in #156
- fixes from python side of stable-diffusion by @davidkoski in #158
- Add Embedders/Encoders support. by @anishbasu in #157
- Update `asData` for StableDiffusion Image by @LiYanan2004 in #159
### New Contributors
- @anishbasu made their first contribution in #157
- @LiYanan2004 made their first contribution in #159
**Full Changelog**: 1.18.1...1.18.2
## 1.18.1

Release matching mlx-swift 0.18.1.

### What's Changed
- Remove AsyncAlgorithms package by @DePasqualeOrg in #100
- fix #102 -- extra eval of non-model/mlxarray data by @davidkoski in #103
- Fix MNIST predictions by @rounak in #110
- Add SuScaledRotaryEmbedding for Phi 3.5 by @DePasqualeOrg in #107
- refactor llm model load code to build inside the actor by @davidkoski in #108
- add kvcache, async eval, etc for #93 by @davidkoski in #109
- fix for #114 -- incorrect shape used in preload by @davidkoski in #115
- make sure config.json is downloaded by @davidkoski in #121
- Support Dynamic ModelType by @johnmai-dev in #123
- Support InternLM2 by @johnmai-dev in #124
- update swift-transformers by @johnmai-dev in #118
- pick up swiftpm change from #118 by @davidkoski in #122
- implement stable diffusion example by @davidkoski in #120
- Bump swift transformers by @awni in #129
- fix #128 -- use a DISAMBIGUATOR to have a unique bundleId by @davidkoski in #131
- fix #113 -- fall back to local files if offline by @davidkoski in #133
- chore: update Models.swift by @eltociear in #134
- Add LLama 3.2 1B and 3B lightweight models descriptions by @jsmp in #130
- Fix parameter count for quantized models by @awni in #137
- fix swift 6 warnings - thread safe tokenizer and model config by @davidkoski in #126
- use updated url by @davidkoski in #142
- Use chat template by @DePasqualeOrg in #135
- Add Phi 3.5 MoE by @DePasqualeOrg in #116
- Move models to subdirectory by @DePasqualeOrg in #148
- fix: add StableDiffusion to package products by @nanguoyu in #149
- update mlx-swift to 0.18.1 by @davidkoski in #147
### New Contributors

**Full Changelog**: 1.16.0...1.18.1
## updates for swift 6

Note: this release has some breaking API changes, so it bumps the major version number to 1. It uses mlx-swift 0.16.0.
## v0.12.1

Release matching 0.12.1 on mlx-swift and mlx.