Update ORT GenAI examples (#1150)
### Description

This PR updates the ORT GenAI examples in the main branch so that they
run correctly again.

### Motivation and Context

The examples in the main branch are now out-of-date after many recent
PRs.

---------

Co-authored-by: Chester Liu <[email protected]>
kunal-vaishnavi and skyline75489 authored Dec 20, 2024
1 parent a82ca82 commit daefc4f
Showing 29 changed files with 353 additions and 200 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/linux-gpu-x64-build.yml
@@ -119,7 +119,7 @@ jobs:
docker run \
--gpus all \
--rm \
- --volume /data/ortgenai_pytorch_models:/data/ortgenai_pytorch_models \
+ --volume /data/ortgenai/pytorch:/data/ortgenai/pytorch \
--volume $GITHUB_WORKSPACE:/ort_genai_src \
-e HF_TOKEN=$HF_TOKEN \
-w /ort_genai_src onnxruntimecudabuildx64 bash -c " \
@@ -146,6 +146,6 @@ jobs:
docker run \
--gpus all \
--rm \
- --volume /data/ortgenai_pytorch_models:/data/ortgenai_pytorch_models \
+ --volume /data/ortgenai/pytorch:/data/ortgenai/pytorch \
--volume $GITHUB_WORKSPACE:/ort_genai_src \
-w /ort_genai_src onnxruntimecudabuildx64 bash -c "ORTGENAI_LOG_ORT_LIB=1 LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/ort_genai_src/build/cuda/ /ort_genai_src/build/cuda/unit_tests"
6 changes: 3 additions & 3 deletions .pipelines/nuget-publishing.yml
@@ -51,17 +51,17 @@ parameters:
- name: ort_version
displayName: 'OnnxRuntime version'
type: string
- default: '1.20.0-dev-20241023-1635-2d00351d7b'
+ default: '1.20.1'

- name: ort_cuda_version
displayName: 'OnnxRuntime GPU version'
type: string
- default: '1.20.0-dev-20241022-1606-2d00351d7b'
+ default: '1.20.1'

- name: ort_dml_version
displayName: 'OnnxRuntime DML version'
type: string
- default: '1.20.0-dev-20241023-1635-2d00351d7b'
+ default: '1.20.1'

- name: cuda_version
displayName: 'CUDA version'
6 changes: 3 additions & 3 deletions .pipelines/stages/jobs/steps/nuget-validation-step.yml
@@ -34,7 +34,7 @@ steps:
Copy-Item -Force -Recurse -Verbose $(Build.BinariesDirectory)/nuget/* -Destination ${{ parameters.CsprojFolder }}
cd ${{ parameters.CsprojFolder }}
dotnet restore -r $(os)-$(arch) /property:Configuration=${{ parameters.CsprojConfiguration }} --source https://api.nuget.org/v3/index.json --source https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/nuget/v3/index.json --source $PWD --disable-parallel --verbosity detailed
- dotnet run -r $(os)-$(arch) --configuration ${{ parameters.CsprojConfiguration }} --no-restore --verbosity normal -- -m ./${{ parameters.LocalFolder }}/${{ parameters.ModelFolder }} --non-interactive
+ dotnet run -r $(os)-$(arch) --configuration ${{ parameters.CsprojConfiguration }} --no-restore --verbosity normal -- -m ./${{ parameters.LocalFolder }}/${{ parameters.ModelFolder }} -e $(ep) --non-interactive
displayName: 'Run ${{ parameters.CsprojName }} With Artifact on Windows'
workingDirectory: '$(Build.Repository.LocalPath)'
condition: eq(variables['os'], 'win')
@@ -73,7 +73,7 @@ steps:
export ORTGENAI_LOG_ORT_LIB=1 && \
cd /ort_genai_src/${{ parameters.CsprojFolder }} && \
chmod +x ./bin/Release_Cuda/net6.0/linux-x64/${{ parameters.CsprojName }} && \
- ./bin/Release_Cuda/net6.0/linux-x64/${{ parameters.CsprojName }} -m ./${{ parameters.LocalFolder }}/${{ parameters.ModelFolder }} --non-interactive"
+ ./bin/Release_Cuda/net6.0/linux-x64/${{ parameters.CsprojName }} -m ./${{ parameters.LocalFolder }}/${{ parameters.ModelFolder }} -e $(ep) --non-interactive"
displayName: 'Run ${{ parameters.CsprojName }} With Artifact on Linux CUDA'
workingDirectory: '$(Build.Repository.LocalPath)'
@@ -82,7 +82,7 @@ steps:
- bash: |
export ORTGENAI_LOG_ORT_LIB=1
cd ${{ parameters.CsprojFolder }}
- dotnet run -r $(os)-$(arch) --configuration ${{ parameters.CsprojConfiguration }} --no-build --verbosity normal -- -m ./${{ parameters.LocalFolder }}/${{ parameters.ModelFolder }} --non-interactive
+ dotnet run -r $(os)-$(arch) --configuration ${{ parameters.CsprojConfiguration }} --no-build --verbosity normal -- -m ./${{ parameters.LocalFolder }}/${{ parameters.ModelFolder }} -e $(ep) --non-interactive
displayName: 'Run ${{ parameters.CsprojName }} With Artifact on Linux/macOS CPU'
workingDirectory: '$(Build.Repository.LocalPath)'
condition: and(or(eq(variables['os'], 'linux'), eq(variables['os'], 'osx')), eq(variables['ep'], 'cpu'))
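
The only functional change in this file is that the execution provider is now passed to the example app explicitly via `-e`. For a local sanity check outside the pipeline, a hedged sketch of the equivalent run, where the project folder, model path, and provider value are assumptions:

```bash
# Run from the C# example project folder (e.g. examples/csharp/HelloPhi);
# pick the provider your build supports (cpu, cuda, dml).
dotnet run --configuration Release -- -m ./models/phi-3.5-mini -e cpu --non-interactive
```
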
8 changes: 4 additions & 4 deletions .pipelines/stages/jobs/steps/python-validation-step.yml
@@ -43,9 +43,9 @@ steps:
python -m pip install --no-index --find-links=$(Build.BinariesDirectory)/wheel $(pip_package_name)
if ("$(ep)" -eq "directml") {
- python ${{ parameters.PythonScriptName }} -m .\${{ parameters.LocalFolder }}\${{ parameters.ModelFolder }} --provider dml --non-interactive
+ python ${{ parameters.PythonScriptName }} -m .\${{ parameters.LocalFolder }}\${{ parameters.ModelFolder }} -e dml --non-interactive
} else {
- python ${{ parameters.PythonScriptName }} -m .\${{ parameters.LocalFolder }}\${{ parameters.ModelFolder }} --provider $(ep) --non-interactive
+ python ${{ parameters.PythonScriptName }} -m .\${{ parameters.LocalFolder }}\${{ parameters.ModelFolder }} -e $(ep) --non-interactive
}
displayName: 'Run ${{ parameters.PythonScriptName }} With Artifact on Windows'
workingDirectory: '$(Build.Repository.LocalPath)'
@@ -72,7 +72,7 @@ steps:
$python_exe -m pip install -r /ort_genai_src/test/python/cuda/ort/requirements.txt && \
cd /ort_genai_src/${{ parameters.PythonScriptFolder }} && \
$python_exe -m pip install --no-index --find-links=/ort_genai_binary/wheel $(pip_package_name) && \
- $python_exe ${{ parameters.PythonScriptName }} -m ./${{ parameters.LocalFolder }}/${{ parameters.ModelFolder }} --provider $(ep) --non-interactive"
+ $python_exe ${{ parameters.PythonScriptName }} -m ./${{ parameters.LocalFolder }}/${{ parameters.ModelFolder }} -e $(ep) --non-interactive"
displayName: 'Run ${{ parameters.PythonScriptName }} With Artifact on Linux CUDA'
workingDirectory: '$(Build.Repository.LocalPath)'
@@ -91,7 +91,7 @@ steps:
fi
cd ${{ parameters.PythonScriptFolder }}
python -m pip install --no-index --find-links=$(Build.BinariesDirectory)/wheel $(pip_package_name)
- python ${{ parameters.PythonScriptName }} -m ./${{ parameters.LocalFolder }}/${{ parameters.ModelFolder }} --provider $(ep) --non-interactive
+ python ${{ parameters.PythonScriptName }} -m ./${{ parameters.LocalFolder }}/${{ parameters.ModelFolder }} -e $(ep) --non-interactive
displayName: 'Run ${{ parameters.PythonScriptName }} With Artifact on Linux/macOS CPU'
workingDirectory: '$(Build.Repository.LocalPath)'
condition: and(or(eq(variables['os'], 'linux'), eq(variables['os'], 'osx')), eq(variables['ep'], 'cpu'))
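
The Python validation mirrors the same flag change: the example scripts now take the execution provider via `-e` instead of `--provider`. A comparable local invocation, with an assumed script name and model path, would be:

```bash
python examples/python/model-qa.py -m ./models/phi-3.5-mini -e cpu --non-interactive
```
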
6 changes: 2 additions & 4 deletions README.md
@@ -1,18 +1,16 @@
- # ONNX Runtime generate() API
+ # ONNX Runtime GenAI

## *The main branch contains new API changes, and the examples in the main branch reflect these changes. For example scripts compatible with the current release (0.5.2), [see the release branch](https://github.com/microsoft/onnxruntime-genai/tree/rel-0.5.2).*


[![Latest version](https://img.shields.io/nuget/vpre/Microsoft.ML.OnnxRuntimeGenAI.Managed?label=latest)](https://www.nuget.org/packages/Microsoft.ML.OnnxRuntimeGenAI.Managed/absoluteLatest)

- Run Llama, Phi, Gemma, Mistral with ONNX Runtime.
+ Run generative AI models with ONNX Runtime.

This API gives you an easy, flexible, and performant way of running LLMs on device.

It implements the generative AI loop for ONNX models, including pre- and post-processing, inference with ONNX Runtime, logits processing, search and sampling, and KV cache management.

You can call a high-level `generate()` method to generate all of the output at once, or stream the output one token at a time.
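
As a concrete illustration of that loop, here is a minimal streaming sketch using the Python package. It assumes a local model folder and the post-0.5 API surface; method names such as `append_tokens` may differ between releases:

```python
import onnxruntime_genai as og

model = og.Model("./models/phi-3.5-mini")   # folder containing genai_config.json
tokenizer = og.Tokenizer(model)
stream = tokenizer.create_stream()

params = og.GeneratorParams(model)
params.set_search_options(max_length=256)

generator = og.Generator(model, params)
generator.append_tokens(tokenizer.encode("<|user|>\nWhy is the sky blue?<|end|>\n<|assistant|>"))

# Generate and print one token at a time until the search completes.
while not generator.is_done():
    generator.generate_next_token()
    print(stream.decode(generator.get_next_tokens()[0]), end="", flush=True)
```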

See documentation at https://onnxruntime.ai/docs/genai.

|Support matrix|Supported now|Under development|On the roadmap|
116 changes: 59 additions & 57 deletions examples/c/README.md
@@ -1,4 +1,6 @@
- # ONNX Runtime generate() API C Example
+ # ONNX Runtime GenAI C Example

+ Note: ONNX Runtime GenAI currently needs to be built from source, because these examples have been updated to run with the latest changes. The built headers and shared libraries then need to be copied into the appropriate folders (i.e. the `include` and `lib` folders). Once the next version of ONNX Runtime GenAI is released, the instructions below will be accurate again.
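
A rough outline of that setup follows; the build output location is an assumption and varies by platform and by the flags passed to `build.py`:

```bash
# Build ONNX Runtime GenAI from source, then stage the headers and the
# shared library where these examples expect them (include/ and lib/).
git clone https://github.com/microsoft/onnxruntime-genai.git
cd onnxruntime-genai
python build.py --config Release
cp src/ort_genai.h src/ort_genai_c.h examples/c/include
cp build/*/Release/libonnxruntime-genai.so examples/c/lib  # assumed output path
```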

## Setup

@@ -16,16 +18,16 @@ mkdir include
mkdir lib
```

- ## Phi-3 mini
+ ## Phi-3.5 mini

### Download model

- This example uses the [Phi-3 mini model](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct).
+ This example uses the [Phi-3.5 mini model](https://huggingface.co/microsoft/Phi-3.5-mini-instruct).

- You can clone this entire model repository or download individual model variants. To download individual variants, you need to install the HuggingFace CLI.
+ You can clone this entire model repository or download individual model variants. To download individual variants, you need to install the Hugging Face CLI.

```bash
- huggingface-cli download microsoft/Phi-3-mini-4k-instruct-onnx --include cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4/* --local-dir .
+ huggingface-cli download microsoft/Phi-3.5-mini-instruct-onnx --include cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4/* --local-dir .
```
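
If `huggingface-cli` is not already installed, it ships with the `huggingface_hub` package:

```bash
pip install "huggingface_hub[cli]"
```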

### Windows x64 CPU
@@ -37,21 +39,21 @@ Change into the `onnxruntime-genai\examples\c` folder.
1. Install onnxruntime

```cmd
- curl -L https://github.com/microsoft/onnxruntime/releases/download/v1.19.2/onnxruntime-win-x64-1.19.2.zip -o onnxruntime-win-x64-1.19.2.zip
- tar xvf onnxruntime-win-x64-1.19.2.zip
- copy onnxruntime-win-x64-1.19.2\include\* include
- copy onnxruntime-win-x64-1.19.2\lib\* lib
+ curl -L https://github.com/microsoft/onnxruntime/releases/download/v1.20.1/onnxruntime-win-x64-1.20.1.zip -o onnxruntime-win-x64-1.20.1.zip
+ tar -xvf onnxruntime-win-x64-1.20.1.zip
+ copy onnxruntime-win-x64-1.20.1\include\* include
+ copy onnxruntime-win-x64-1.20.1\lib\* lib
```

2. Install onnxruntime-genai

```cmd
- curl -L https://github.com/microsoft/onnxruntime-genai/releases/download/v0.4.0/onnxruntime-genai-win-cpu-x64-capi.zip -o onnxruntime-genai-win-cpu-x64-capi.zip
- tar xvf onnxruntime-genai-win-cpu-x64-capi.zip
+ curl -L https://github.com/microsoft/onnxruntime-genai/releases/download/v0.6.0/onnxruntime-genai-win-cpu-x64-capi.zip -o onnxruntime-genai-win-cpu-x64-capi.zip
+ tar -xvf onnxruntime-genai-win-cpu-x64-capi.zip
cd onnxruntime-genai-win-cpu-x64-capi
- tar xvf onnxruntime-genai-0.4.0-win-x64.zip
- copy onnxruntime-genai-0.4.0-win-x64\include\* ..\include
- copy onnxruntime-genai-0.4.0-win-x64\lib\* ..\lib
+ tar -xvf onnxruntime-genai-0.6.0-win-x64.zip
+ copy onnxruntime-genai-0.6.0-win-x64\include\* ..\include
+ copy onnxruntime-genai-0.6.0-win-x64\lib\* ..\lib
cd ..
```

@@ -81,8 +83,8 @@ Change into the `onnxruntime-genai\examples\c` folder.
```cmd
mkdir onnxruntime-win-x64-directml
cd onnxruntime-win-x64-directml
- curl -L https://github.com/microsoft/onnxruntime/releases/download/v1.19.2/Microsoft.ML.OnnxRuntime.DirectML.1.19.2.nupkg -o Microsoft.ML.OnnxRuntime.DirectML.1.19.2.nupkg
- tar xvf Microsoft.ML.OnnxRuntime.DirectML.1.19.2.nupkg
+ curl -L https://github.com/microsoft/onnxruntime/releases/download/v1.20.1/Microsoft.ML.OnnxRuntime.DirectML.1.20.1.nupkg -o Microsoft.ML.OnnxRuntime.DirectML.1.20.1.nupkg
+ tar -xvf Microsoft.ML.OnnxRuntime.DirectML.1.20.1.nupkg
copy build\native\include\* ..\include
copy runtimes\win-x64\native\* ..\lib
cd ..
@@ -91,12 +93,12 @@ Change into the `onnxruntime-genai\examples\c` folder.
2. Install onnxruntime-genai

```cmd
- curl -L https://github.com/microsoft/onnxruntime-genai/releases/download/v0.4.0/onnxruntime-genai-win-directml-x64-capi.zip -o onnxruntime-genai-win-directml-x64-capi.zip
- tar xvf onnxruntime-genai-win-directml-x64-capi.zip
+ curl -L https://github.com/microsoft/onnxruntime-genai/releases/download/v0.6.0/onnxruntime-genai-win-directml-x64-capi.zip -o onnxruntime-genai-win-directml-x64-capi.zip
+ tar -xvf onnxruntime-genai-win-directml-x64-capi.zip
cd onnxruntime-genai-win-directml-x64-capi
- tar xvf onnxruntime-genai-0.4.0-win-x64-dml.zip
- copy onnxruntime-genai-0.4.0-win-x64-dml\include\* ..\include
- copy onnxruntime-genai-0.4.0-win-x64-dml\lib\* ..\lib
+ tar -xvf onnxruntime-genai-0.6.0-win-x64-dml.zip
+ copy onnxruntime-genai-0.6.0-win-x64-dml\include\* ..\include
+ copy onnxruntime-genai-0.6.0-win-x64-dml\lib\* ..\lib
cd ..
```

@@ -124,21 +126,21 @@ Change into the `onnxruntime-genai\examples\c` folder.
1. Install onnxruntime

```cmd
- curl -L https://github.com/microsoft/onnxruntime/releases/download/v1.19.2/onnxruntime-win-arm64-1.19.2.zip -o onnxruntime-win-arm64-1.19.2.zip
- tar xvf onnxruntime-win-arm64-1.19.2.zip
- copy onnxruntime-win-arm64-1.19.2\include\* include
- copy onnxruntime-win-arm64-1.19.2\lib\* lib
+ curl -L https://github.com/microsoft/onnxruntime/releases/download/v1.20.1/onnxruntime-win-arm64-1.20.1.zip -o onnxruntime-win-arm64-1.20.1.zip
+ tar -xvf onnxruntime-win-arm64-1.20.1.zip
+ copy onnxruntime-win-arm64-1.20.1\include\* include
+ copy onnxruntime-win-arm64-1.20.1\lib\* lib
```

2. Install onnxruntime-genai

```cmd
- curl -L https://github.com/microsoft/onnxruntime-genai/releases/download/v0.4.0/onnxruntime-genai-win-cpu-arm64-capi.zip -o onnxruntime-genai-win-cpu-arm64-capi.zip
- tar xvf onnxruntime-genai-win-cpu-arm64-capi.zip
+ curl -L https://github.com/microsoft/onnxruntime-genai/releases/download/v0.6.0/onnxruntime-genai-win-cpu-arm64-capi.zip -o onnxruntime-genai-win-cpu-arm64-capi.zip
+ tar -xvf onnxruntime-genai-win-cpu-arm64-capi.zip
cd onnxruntime-genai-win-cpu-arm64-capi
- tar xvf onnxruntime-genai-0.4.0-win-arm64.zip
- copy onnxruntime-genai-0.4.0-win-arm64\include\* ..\include
- copy onnxruntime-genai-0.4.0-win-arm64\lib\* ..\lib
+ tar -xvf onnxruntime-genai-0.6.0-win-arm64.zip
+ copy onnxruntime-genai-0.6.0-win-arm64\include\* ..\include
+ copy onnxruntime-genai-0.6.0-win-arm64\lib\* ..\lib
cd ..
```

@@ -168,8 +170,8 @@ Change into the `onnxruntime-genai\examples\c` folder.
```cmd
mkdir onnxruntime-win-arm64-directml
cd onnxruntime-win-arm64-directml
- curl -L https://github.com/microsoft/onnxruntime/releases/download/v1.19.2/Microsoft.ML.OnnxRuntime.DirectML.1.19.2.nupkg -o Microsoft.ML.OnnxRuntime.DirectML.1.19.2.nupkg
- tar xvf Microsoft.ML.OnnxRuntime.DirectML.1.19.2.nupkg
+ curl -L https://github.com/microsoft/onnxruntime/releases/download/v1.20.1/Microsoft.ML.OnnxRuntime.DirectML.1.20.1.nupkg -o Microsoft.ML.OnnxRuntime.DirectML.1.20.1.nupkg
+ tar -xvf Microsoft.ML.OnnxRuntime.DirectML.1.20.1.nupkg
copy build\native\include\* ..\include
copy runtimes\win-arm64\native\* ..\lib
cd ..
@@ -178,12 +180,12 @@ Change into the `onnxruntime-genai\examples\c` folder.
2. Install onnxruntime-genai

```cmd
- curl -L https://github.com/microsoft/onnxruntime-genai/releases/download/v0.4.0/onnxruntime-genai-win-directml-arm64-capi.zip -o onnxruntime-genai-win-directml-arm64-capi.zip
- tar xvf onnxruntime-genai-win-directml-arm64-capi.zip
+ curl -L https://github.com/microsoft/onnxruntime-genai/releases/download/v0.6.0/onnxruntime-genai-win-directml-arm64-capi.zip -o onnxruntime-genai-win-directml-arm64-capi.zip
+ tar -xvf onnxruntime-genai-win-directml-arm64-capi.zip
cd onnxruntime-genai-win-directml-arm64-capi
- tar xvf onnxruntime-genai-0.4.0-win-arm64-dml.zip
- copy onnxruntime-genai-0.4.0-win-arm64-dml\include\* ..\include
- copy onnxruntime-genai-0.4.0-win-arm64-dml\lib\* ..\lib
+ tar -xvf onnxruntime-genai-0.6.0-win-arm64-dml.zip
+ copy onnxruntime-genai-0.6.0-win-arm64-dml\include\* ..\include
+ copy onnxruntime-genai-0.6.0-win-arm64-dml\lib\* ..\lib
cd ..
```

@@ -212,10 +214,10 @@ Change into the onnxruntime-genai directory.

```bash
cd examples/c
- curl -L https://github.com/microsoft/onnxruntime/releases/download/v1.19.2/onnxruntime-linux-x64-1.19.2.tgz -o onnxruntime-linux-x64-1.19.2.tgz
- tar xvzf onnxruntime-linux-x64-1.19.2.tgz
- cp onnxruntime-linux-x64-1.19.2/include/* include
- cp onnxruntime-linux-x64-1.19.2/lib/* lib
+ curl -L https://github.com/microsoft/onnxruntime/releases/download/v1.20.1/onnxruntime-linux-x64-1.20.1.tgz -o onnxruntime-linux-x64-1.20.1.tgz
+ tar xvzf onnxruntime-linux-x64-1.20.1.tgz
+ cp onnxruntime-linux-x64-1.20.1/include/* include
+ cp onnxruntime-linux-x64-1.20.1/lib/* lib
cd ../..
```

@@ -258,16 +260,17 @@ cmake --build . --config Release
./phi3 path_to_model
```

- ## Phi-3 vision
+ ## Phi-3.5 vision

### Download model

- You can use one of the following models for this sample:
- * [Phi-3 vision model for CPU](https://huggingface.co/microsoft/Phi-3-vision-128k-instruct-onnx-cpu)
- * [Phi-3 vision model for CUDA](https://huggingface.co/microsoft/Phi-3-vision-128k-instruct-onnx-cuda)
- * [Phi-3 vision model for DirectML](https://huggingface.co/microsoft/Phi-3-vision-128k-instruct-onnx-directml)
- Clone one of the models above.

+ This example uses the [Phi-3.5 vision model](https://huggingface.co/microsoft/Phi-3.5-vision-instruct).

+ You can clone this entire model repository or download individual model variants. To download individual variants, you need to install the Hugging Face CLI.

```bash
+ huggingface-cli download microsoft/Phi-3.5-vision-instruct-onnx --include cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4/* --local-dir .
```

### Run on Windows

@@ -279,10 +282,10 @@ Change into the onnxruntime-genai folder.

```cmd
cd examples\c
- curl -L https://github.com/microsoft/onnxruntime/releases/download/v1.19.2/onnxruntime-win-x64-1.19.2.zip -o onnxruntime-win-x64-1.19.2.zip
- tar xvf onnxruntime-win-x64-1.19.2.zip
- copy onnxruntime-win-x64-1.19.2\include\* include
- copy onnxruntime-win-x64-1.19.2\lib\* lib
+ curl -L https://github.com/microsoft/onnxruntime/releases/download/v1.20.1/onnxruntime-win-x64-1.20.1.zip -o onnxruntime-win-x64-1.20.1.zip
+ tar -xvf onnxruntime-win-x64-1.20.1.zip
+ copy onnxruntime-win-x64-1.20.1\include\* include
+ copy onnxruntime-win-x64-1.20.1\lib\* lib
```

2. Install onnxruntime-genai
@@ -324,10 +327,10 @@ Change into the onnxruntime-genai directory.

```bash
cd examples/c
- curl -L https://github.com/microsoft/onnxruntime/releases/download/v1.19.2/onnxruntime-linux-x64-1.19.2.tgz -o onnxruntime-linux-x64-1.19.2.tgz
- tar xvzf onnxruntime-linux-x64-1.19.2.tgz
- cp onnxruntime-linux-x64-1.19.2/include/* include
- cp onnxruntime-linux-x64-1.19.2/lib/* lib
+ curl -L https://github.com/microsoft/onnxruntime/releases/download/v1.20.1/onnxruntime-linux-x64-1.20.1.tgz -o onnxruntime-linux-x64-1.20.1.tgz
+ tar xvzf onnxruntime-linux-x64-1.20.1.tgz
+ cp onnxruntime-linux-x64-1.20.1/include/* include
+ cp onnxruntime-linux-x64-1.20.1/lib/* lib
cd ../..
```

@@ -368,4 +371,3 @@ cmake --build . --config Release
cd build/Release
./phi3v path_to_model
```
