
Apple Silicone Neural Engine: Core ML model package format support #7105

Open
1 task done
qdrddr opened this issue Apr 25, 2024 · 5 comments
Comments

@qdrddr

qdrddr commented Apr 25, 2024

Duplicates

  • I have searched the existing issues

Summary 💡

Problem

Please consider adding Core ML model package format support to utilize the Apple Silicon Neural Engine + GPU.

Examples 🌈

Additional context
List of Core ML package format models

https://github.com/likedan/Awesome-CoreML-Models

Motivation 🔦

Utilize both the ANE and the GPU, not just the GPU, on Apple Silicon

@ntindle
Member

ntindle commented Apr 25, 2024

Do you have a quick start on this? Haven't looked into Core ML at all

@qdrddr
Author

qdrddr commented Apr 25, 2024

Hi, yes. @ntindle

This is about running LLMs locally on Apple Silicon. Core ML is a framework that can distribute a workload across the CPU, GPU, and Neural Engine (ANE). The ANE is available on all modern Apple devices: iPhones and Macs (A14 or newer and M1 or newer). Ideally, we want to run LLMs on the ANE only, as it is optimized for ML tasks compared to the GPU. Apple claims that "deploying your Transformer models on Apple devices with an A14 or newer and M1 or newer chip" achieves "up to 10 times faster and 14 times lower peak memory consumption compared to baseline implementations".

  1. To utilize Core ML, you first need to convert a model from TensorFlow or PyTorch to the Core ML model package format using coremltools (or simply use existing models already in Core ML package format).
  2. Second, you must use that converted package with an implementation designed for Apple devices. Here is the Apple Xcode reference PyTorch implementation.

https://machinelearning.apple.com/research/neural-engine-transformers

@ntindle
Member

ntindle commented Apr 25, 2024

We don't package any models with our code. Is it possible to use tools like Llamafile to do this?

@qdrddr
Author

qdrddr commented Apr 25, 2024

That was a general overview; you don't have to package models, you just need to be able to use Core ML-packaged models.

There is work in progress on a Core ML implementation for whisper.cpp (ggerganov/whisper.cpp#548), which you might be interested in. They see roughly 3x performance improvements for some models.

You might also be interested in another implementation, Swift Transformers. An example Core ML application:
https://github.com/huggingface/swift-chat

@qdrddr qdrddr changed the title Apple Silicone Nural Engine: Core ML model package format support Apple Silicone Neural Engine: Core ML model package format support Apr 26, 2024
@kcze
Contributor

kcze commented Apr 30, 2024

I'll be interested to look into this at some point
