
Releases: ml-explore/mlx-swift-examples

2.21.2

10 Dec 19:02
6ef303b

What's Changed

  • add VLM support, refactor common LM code into MLXLMCommon. breaking API changes by @davidkoski in #151

Xcode 16

Xcode 16 is required to build the example applications and tools. Older versions of Xcode can still build the libraries via SwiftPM, so requirements are unchanged for any applications or libraries that depend on them.

This change is required because the xcodeproj now references the local Package.swift file, so builds are consistent with what external users see. If needed we can switch back to using the xcodeproj for internal library builds and SwiftPM for external ones -- if there is a problem please file an issue and it can be considered.
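
For external users nothing changes: the libraries can still be consumed as a regular SwiftPM dependency. A minimal sketch, assuming the product names match the target names (MLXLLM, MLXVLM, MLXLMCommon) and adjusting the platforms to your deployment target:

// swift-tools-version: 5.9
// Sketch only: product names are assumed to match the target names.
import PackageDescription

let package = Package(
    name: "MyTool",
    platforms: [.macOS(.v14)],
    dependencies: [
        .package(url: "https://github.com/ml-explore/mlx-swift-examples", branch: "main")
    ],
    targets: [
        .executableTarget(
            name: "MyTool",
            dependencies: [
                .product(name: "MLXLLM", package: "mlx-swift-examples")
            ]
        )
    ]
)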

Additions

There are two new libraries:

  • MLXVLM contains vision language models that combine images and text prompts to produce text results, e.g. "describe this image"
  • MLXLMCommon contains the LanguageModel code that is shared between MLXLLM and MLXVLM

The API between LLM and VLM is identical aside from the preparation of the UserInput.

let parameters = GenerateParameters()

// LLM prompt
let input = UserInput(prompt: "tell me a story")

// VLM prompt (an alternative: the same UserInput, with images attached)
let input = UserInput(prompt: "describe the image", images: [.url(url)])

// inference is identical
let result = try await modelContainer.perform { [input] context in
    let input = try await context.processor.prepare(input: input)
    return try MLXLMCommon.generate(
        input: input, parameters: parameters, context: context
    ) { tokens in
        // print tokens as they are generated, stop early, etc.
        return .more
    }
}

VLM example code is available in the llm-tool example:

./mlx-run llm-tool eval --help
OVERVIEW: evaluate prompt and images to generate text (VLM)

USAGE: llm-tool eval <options>

OPTIONS:
  --model <model>         Name of the huggingface model or absolute path to directory
  -p, --prompt <prompt>   The message to be processed by the model.  Use @path,@path to load from files, e.g. @/tmp/prompt.txt
  --resize <resize>       Resize images to this size (width, height)
  --image <image>         Paths or urls for input images
...
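
For example, a hypothetical invocation (the model id and image path are placeholders, and --resize is optional -- check its exact value format against the help above):

./mlx-run llm-tool eval \
    --model mlx-community/some-vlm-4bit \
    --prompt "describe the image" \
    --image /tmp/photo.jpg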

Breaking Changes

These probably have no effect on code external to this repo:

  • the mlx-swift-examples.xcodeproj now references the local Package.swift to build the libraries
  • the example code now uses naming that matches external use of mlx-swift-examples, e.g. import LLM -> import MLXLLM
  • the library directories are now renamed to match their target names, e.g. LLM -> MLXLLM

Breaking:

  • some code will now need to import both MLXLLM and MLXLMCommon (particularly code that loads models)
  • MLXLMCommon contains the common API between LLM and VLM
import MLXLLM
import MLXLMCommon
  • constants for models have moved from ModelConfiguration to ModelRegistry
  • this is MLXLLM.ModelRegistry and there is also MLXVLM.ModelRegistry
-    let modelConfiguration = ModelConfiguration.phi3_5_4bit
+    let modelConfiguration = ModelRegistry.phi3_5_4bit
  • the loadModelContainer() function is now LLMModelFactory.shared.loadContainer()
  • there is a new VLMModelFactory with identical methods for loading VLMs (see the sketch after this list)
-     let modelContainer = try await LLM.loadModelContainer(configuration: modelConfiguration)
-    {
+     let modelContainer = try await LLMModelFactory.shared.loadContainer(
+          configuration: modelConfiguration
+    ) {
  • ModelContainer.perform is now throwing (and in MLXLMCommon):
-     let result = await modelContainer.perform { model, tokenizer in
-          LLM.generate(
+     let result = try await modelContainer.perform { model, tokenizer in
+          try MLXLMCommon.generate(
  • ModelConfiguration previously had a way to register new configurations. This is now on LLMModelFactory (and VLMModelFactory has the same):
LLMModelFactory.shared.modelRegistry.register(configurations: [modelConfiguration])
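
Putting the last two items together, a minimal sketch of loading a VLM through the new factory and registering a custom configuration; the registry entry and model id below are assumptions for illustration:

import MLXLMCommon
import MLXVLM

// VLMModelFactory mirrors LLMModelFactory
let container = try await VLMModelFactory.shared.loadContainer(
    configuration: MLXVLM.ModelRegistry.qwen2VL2BInstruct4Bit  // assumed registry entry
)

// registering a custom configuration now goes through the factory
let custom = ModelConfiguration(id: "mlx-community/MyVLM-4bit")  // hypothetical model id
VLMModelFactory.shared.modelRegistry.register(configurations: [custom])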

Deprecations

An example at the end shows all of these deprecations in context.

Prefer the ModelContext.processor to prepare prompts. Previously, callers would pass in a bare [Int] of tokens; to support more complex inputs (VLMs), the bare [Int] form is deprecated and callers should use UserInput and LMInput.

For example, previously callers might have done something like this:

let messages = [["role": "user", "content": prompt]]
let promptTokens = try await modelContainer.perform { _, tokenizer in
    try tokenizer.applyChatTemplate(messages: messages)
}

Now that should be:

let input = try await context.processor.prepare(input: .init(prompt: prompt))

This initializes a UserInput from the prompt text and produces an LMInput that can be used to generate tokens.

This call to generate() is now deprecated:

public func generate(
    promptTokens: [Int], parameters: GenerateParameters, model: any LanguageModel,
    tokenizer: Tokenizer,
    extraEOSTokens: Set<String>? = nil,
    didGenerate: ([Int]) -> GenerateDisposition
) throws -> GenerateResult

This consumed the bare [Int] tokens. Now this form is preferred:

public func generate(
    input: LMInput, parameters: GenerateParameters, context: ModelContext,
    didGenerate: ([Int]) -> GenerateDisposition
) throws -> GenerateResult

This method on ModelContainer is now deprecated:

    /// Perform an action on the model and/or tokenizer.  Callers _must_ eval any `MLXArray` before returning as
    /// `MLXArray` is not `Sendable`.
    @available(*, deprecated, message: "prefer perform(_:) that uses a ModelContext")
    public func perform<R>(_ action: @Sendable (any LanguageModel, Tokenizer) throws -> R) rethrows
        -> R

Use this one instead (though the former still works):

    /// Perform an action on the ``ModelContext``.  Callers _must_ eval any `MLXArray` before returning as
    /// `MLXArray` is not `Sendable`.
    public func perform<R>(_ action: @Sendable (ModelContext) async throws -> R) async rethrows -> R
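
For instance, a minimal sketch of the new form, using only the tokenizer from the context (the messages are illustrative; any MLXArray result would still need eval before returning):

    let tokenCount = try await modelContainer.perform { context in
        // the full ModelContext is available: model, processor, tokenizer, configuration
        let tokens = try context.tokenizer.applyChatTemplate(
            messages: [["role": "user", "content": "hello"]])
        return tokens.count
    }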

Example

Putting all of these deprecations together, previously you might have generated text like this:

let messages = [["role": "user", "content": prompt]]
let promptTokens = try await modelContainer.perform { _, tokenizer in
    try tokenizer.applyChatTemplate(messages: messages)
}

let result = await modelContainer.perform { model, tokenizer in
    LLM.generate(
        promptTokens: promptTokens, parameters: generateParameters, model: model,
        tokenizer: tokenizer, extraEOSTokens: modelConfiguration.extraEOSTokens
    ) { tokens in ... }
}

Now do this:

let result = try await modelContainer.perform { context in
    let input = try await context.processor.prepare(input: .init(prompt: prompt))
    return try MLXLMCommon.generate(
        input: input, parameters: generateParameters, context: context
    ) { tokens in ... }
}

Full Changelog: 1.18.2...2.21.2

1.18.2

10 Dec 18:59
318044f

Last tag before the breaking API changes from #151 (VLM support).

Full Changelog: 1.18.1...1.18.2

1.18.1

13 Nov 22:30
7baf9bc

Release matching mlx-swift 0.18.1

Full Changelog: 1.16.0...1.18.1

updates for swift 6

02 Aug 17:41
fb5ee82

Note: this release has some breaking API changes, so it bumps the major version number to 1. It uses mlx-swift 0.16.0.

v0.12.1

10 Jun 22:45
61c0703

Release matching 0.12.1 on mlx-swift and mlx.