Releases: bentoml/OpenLLM
v0.2.17
Installation
pip install openllm==0.2.17
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.2.17
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it --entrypoint=/bin/bash -P ghcr.io/bentoml/openllm:0.2.17 openllm --help
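Once a server is running (by default on port 3000), it can be queried over HTTP. A minimal sketch; the /v1/generate path and the "prompt" field reflect the HTTP API of this release line and should be treated as assumptions if your version differs:

```shell
# Start an OPT server in one terminal (listens on port 3000 by default)
openllm start opt

# In another terminal, send a generation request over HTTP.
curl -X POST http://localhost:3000/v1/generate \
  -H 'Content-Type: application/json' \
  -d '{"prompt": "What is a large language model?"}'
```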
Find more information about this release in the CHANGELOG.md
What's Changed
- feat: optimize model saving and loading on single GPU by @aarnphm in #183
- fix(ci): update version correctly [skip ci] by @aarnphm in #184
- fix(models): setup xformers in base container and loading PyTorch meta weights by @aarnphm in #185
- infra(generation): initial work for generating tokens by @aarnphm in #186
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in #187
- feat: --force-push to allow force push to bentocloud by @aarnphm in #188
Full Changelog: v0.2.16...v0.2.17
v0.2.16
Fixes a regression introduced between 0.2.13 and 0.2.15 that prevented vLLM from running correctly within the Docker container.
Full Changelog: v0.2.13...v0.2.16
v0.2.13
What's Changed
Fixes the auto-gptq CUDA kernel within the base container.
Adds support for all vLLM models and updates vLLM to the latest stable commit.
Full Changelog: v0.2.12...v0.2.13
v0.2.12
News
OpenLLM now releases a base container containing all compiled kernels, removing the need to build kernels with openllm build
when using vLLM or auto-gptq.
vLLM support (experimental)
Currently, only OPT and Llama 2 support vLLM. Simply set OPENLLM_LLAMA_FRAMEWORK=vllm
to start openllm runners with vLLM.
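For example, to serve Llama 2 with the vLLM backend (a sketch; the --model-id value is an illustrative Hugging Face model name and depends on which Llama variant you have access to):

```shell
# Select the vLLM backend for the Llama runner via the environment variable,
# then start the server as usual. Requires a GPU with compiled vLLM kernels.
OPENLLM_LLAMA_FRAMEWORK=vllm openllm start llama --model-id meta-llama/Llama-2-7b-hf
```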
Installation
pip install openllm==0.2.12
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.2.12
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it --entrypoint=/bin/bash -P ghcr.io/bentoml/openllm:0.2.12 openllm --help
Find more information about this release in the CHANGELOG.md
New Contributors
- @RichardScottOZ made their first contribution in #155
Full Changelog: v0.2.11...v0.2.12
v0.2.11
Installation
pip install openllm==0.2.11
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.2.11
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
Find more information about this release in the CHANGELOG.md
What's Changed
- fix(ci): correct tag for checkout by @aarnphm in #150
- fix: disable auto fixes by @aarnphm in #151
- chore: add nous to example default id as non-gated Llama by @aarnphm in #152
- feat: supports embeddings for T5 and ChatGLM family generation by @aarnphm in #153
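The new embeddings support can be exercised through the Python client. A minimal sketch, assuming a server started with a T5- or ChatGLM-family model on the default port; the client.embed call mirrors the embeddings API added in this release, but treat the exact method name and signature as assumptions:

```python
import openllm

# Connect to a running OpenLLM server (e.g. started with `openllm start flan-t5`).
client = openllm.client.HTTPClient("http://localhost:3000")

# Request embeddings for a batch of sentences; the server is expected to
# return one vector per input string.
embeddings = client.embed(["OpenLLM now supports embeddings", "Hello world"])
print(len(embeddings))
```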
Full Changelog: v0.2.10...v0.2.11
v0.2.10
Installation
pip install openllm==0.2.10
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.2.10
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
Find more information about this release in the CHANGELOG.md
What's Changed
- feat(ci): automatic release semver + git archival installation by @aarnphm in #143
- docs: remove extraneous whitespace by @aarnphm in #144
- docs: update fine tuning model support by @aarnphm in #145
- fix(build): running from container choosing models correctly by @aarnphm in #141
- feat(client): embeddings by @aarnphm in #146
Full Changelog: v0.2.9...v0.2.10
v0.2.9
Installation
pip install openllm==0.2.9
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.2.9
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
Find more information about this release in the CHANGELOG.md
What's Changed
- ci: release python earlier than building binary wheels by @aarnphm in #138
- docs: Update README.md by @parano in #139
Full Changelog: v0.2.8...v0.2.9
v0.2.8
Installation
pip install openllm==0.2.8
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.2.8
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
Find more information about this release in the CHANGELOG.md
What's Changed
- feat(service): provisional API by @aarnphm in #133
- chore(deps): update bitsandbytes requirement from <0.40 to <0.42 by @dependabot in #137
- feat: vLLM integration for PagedAttention by @aarnphm in #134
Full Changelog: v0.2.7...v0.2.8
v0.2.7
Installation
pip install openllm==0.2.7
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.2.7
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
Find more information about this release in the CHANGELOG.md
Full Changelog: v0.2.6...v0.2.7
v0.2.6
Installation
pip install openllm==0.2.6
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.2.6
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
Find more information about this release in the CHANGELOG.md
What's Changed
- chore(ci): better release flow by @aarnphm in #131
- perf(serialisation): implement wrapper to reduce callstack by @aarnphm in #132
Full Changelog: v0.2.5...v0.2.6