Releases: bentoml/OpenLLM

v0.2.17

08 Aug 05:46

Installation

pip install openllm==0.2.17

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.2.17

Usage

All available models: openllm models

To start an LLM: python -m openllm start opt

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it --entrypoint=/bin/bash -P ghcr.io/bentoml/openllm:0.2.17 openllm --help

Find more information about this release in the CHANGELOG.md

What's Changed

  • feat: optimize model saving and loading on single GPU by @aarnphm in #183
  • fix(ci): update version correctly [skip ci] by @aarnphm in #184
  • fix(models): setup xformers in base container and loading PyTorch meta weights by @aarnphm in #185
  • infra(generation): initial work for generating tokens by @aarnphm in #186
  • ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in #187
  • feat: --force-push to allow force push to bentocloud by @aarnphm in #188

Full Changelog: v0.2.16...v0.2.17

v0.2.16

04 Aug 17:36

Fixes a regression introduced between 0.2.13 and 0.2.15 that prevented vLLM from running correctly within the Docker container.

Full Changelog: v0.2.13...v0.2.16

v0.2.13

03 Aug 06:27

What changes?

  • Fixes the auto-gptq CUDA kernel within the base container.
  • Adds support for all vLLM models and updates vLLM to its latest stable commit.

Full Changelog: v0.2.12...v0.2.13

v0.2.12

02 Aug 03:12

News

OpenLLM now releases a base container containing all compiled kernels, removing the need to build kernels with openllm build when using vLLM or auto-gptq.

vLLM support (experimental)

Currently, only OPT and Llama 2 support vLLM. Simply set OPENLLM_LLAMA_FRAMEWORK=vllm to start OpenLLM runners with vLLM.
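As a minimal sketch of the above, the environment variable can be set inline when launching the Llama runner (the --model-id value here is illustrative and assumes you have access to the weights):

```shell
# Select the vLLM backend for the Llama model family (experimental),
# then start the server; the model id below is only an example.
OPENLLM_LLAMA_FRAMEWORK=vllm openllm start llama --model-id meta-llama/Llama-2-7b-hf
```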

Installation

pip install openllm==0.2.12

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.2.12

Usage

All available models: openllm models

To start an LLM: python -m openllm start opt

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it --entrypoint=/bin/bash -P ghcr.io/bentoml/openllm:0.2.12 openllm --help

Find more information about this release in the CHANGELOG.md

Full Changelog: v0.2.11...v0.2.12

v0.2.11

28 Jul 00:15

Installation

pip install openllm==0.2.11

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.2.11

Usage

All available models: openllm models

To start an LLM: python -m openllm start opt

Find more information about this release in the CHANGELOG.md

What's Changed

  • fix(ci): correct tag for checkout by @aarnphm in #150
  • fix: disable auto fixes by @aarnphm in #151
  • chore: add nous to example default id as non-gated Llama by @aarnphm in #152
  • feat: supports embeddings for T5 and ChatGLM family generation by @aarnphm in #153

Full Changelog: v0.2.10...v0.2.11

v0.2.10

25 Jul 17:49

Installation

pip install openllm==0.2.10

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.2.10

Usage

All available models: openllm models

To start an LLM: python -m openllm start opt

Find more information about this release in the CHANGELOG.md

What's Changed

  • feat(ci): automatic release semver + git archival installation by @aarnphm in #143
  • docs: remove extraneous whitespace by @aarnphm in #144
  • docs: update fine tuning model support by @aarnphm in #145
  • fix(build): running from container choosing models correctly by @aarnphm in #141
  • feat(client): embeddings by @aarnphm in #146

Full Changelog: v0.2.9...v0.2.10

v0.2.9

24 Jul 23:35

Installation

pip install openllm==0.2.9

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.2.9

Usage

All available models: openllm models

To start an LLM: python -m openllm start opt

Find more information about this release in the CHANGELOG.md

What's Changed

  • ci: release python earlier than building binary wheels by @aarnphm in #138
  • docs: Update README.md by @parano in #139

Full Changelog: v0.2.8...v0.2.9

v0.2.8

24 Jul 20:00

Installation

pip install openllm==0.2.8

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.2.8

Usage

All available models: openllm models

To start an LLM: python -m openllm start opt

Find more information about this release in the CHANGELOG.md

What's Changed

  • feat(service): provisional API by @aarnphm in #133
  • chore(deps): update bitsandbytes requirement from <0.40 to <0.42 by @dependabot in #137
  • feat: vLLM integration for PagedAttention by @aarnphm in #134

Full Changelog: v0.2.7...v0.2.8

v0.2.7

23 Jul 01:22

Installation

pip install openllm==0.2.7

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.2.7

Usage

All available models: openllm models

To start an LLM: python -m openllm start opt

Find more information about this release in the CHANGELOG.md

Full Changelog: v0.2.6...v0.2.7

v0.2.6

22 Jul 21:33

Installation

pip install openllm==0.2.6

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.2.6

Usage

All available models: openllm models

To start an LLM: python -m openllm start opt

Find more information about this release in the CHANGELOG.md

What's Changed

  • chore(ci): better release flow by @aarnphm in #131
  • perf(serialisation): implement wrapper to reduce callstack by @aarnphm in #132

Full Changelog: v0.2.5...v0.2.6