Releases: bentoml/OpenLLM

v0.4.29

26 Nov 07:59

Installation

pip install openllm==0.4.29

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.4.29

Usage

All available models: openllm models

To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.29 start HuggingFaceH4/zephyr-7b-beta
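Once the server is up, it exposes an OpenAI-compatible REST API. A minimal sketch of querying it from Python, assuming the server listens on BentoML's default address http://localhost:3000 and the standard /v1/chat/completions route (the helper names here are illustrative, not part of OpenLLM):

```python
import json
import urllib.request

def build_chat_request(model, prompt, max_tokens=128):
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def post_chat(payload, base_url="http://localhost:3000"):
    """POST the payload to the server's OpenAI-compatible route."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_chat_request("HuggingFaceH4/zephyr-7b-beta", "What is BentoML?")
print(json.dumps(payload, indent=2))
```

The same payload works with any OpenAI-client library by pointing its base URL at the local server.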

Find more information about this release in the CHANGELOG.md

Full Changelog: v0.4.28...v0.4.29

v0.4.28

24 Nov 07:20

Installation

pip install openllm==0.4.28

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.4.28

Usage

All available models: openllm models

To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.28 start HuggingFaceH4/zephyr-7b-beta

Find more information about this release in the CHANGELOG.md

Full Changelog: v0.4.27...v0.4.28

v0.4.26

22 Nov 11:58

Installation

pip install openllm==0.4.26

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.4.26

Usage

All available models: openllm models

To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.26 start HuggingFaceH4/zephyr-7b-beta

Find more information about this release in the CHANGELOG.md

What's Changed

  • fix(infra): setup higher timer for building container images by @aarnphm in #723
  • fix(client): correct schemas parser from correct response output by @aarnphm in #724
  • feat(openai): chat templates and complete control of prompt generation by @aarnphm in #725
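The chat-template work in #725 means prompts are rendered from OpenAI-style message lists rather than hard-coded strings. As an illustration only (this is not OpenLLM's internal API), a Zephyr-style template flattens messages like this:

```python
def render_zephyr_prompt(messages):
    """Render OpenAI-style messages with a Zephyr-style chat template.

    Illustrative sketch: OpenLLM applies the model's own template; this
    mirrors the <|role|> ... </s> layout Zephyr models use.
    """
    parts = []
    for m in messages:
        parts.append(f"<|{m['role']}|>\n{m['content']}</s>\n")
    parts.append("<|assistant|>\n")  # generation continues from here
    return "".join(parts)

prompt = render_zephyr_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```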

Full Changelog: v0.4.25...v0.4.26

v0.4.25

22 Nov 09:34

Installation

pip install openllm==0.4.25

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.4.25

Usage

All available models: openllm models

To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.25 start HuggingFaceH4/zephyr-7b-beta

Find more information about this release in the CHANGELOG.md

What's Changed

  • fix(openai): correct stop tokens and finish_reason state by @aarnphm in #722
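For context on the finish_reason fix: OpenAI-style responses report why generation ended. A simplified sketch of that decision, using hypothetical names rather than OpenLLM's actual helpers:

```python
def finish_reason(generated_tokens, stop_token_ids, max_tokens):
    """Decide the OpenAI-style finish_reason for a generation.

    "stop"   - generation ended on a stop/EOS token
    "length" - generation exhausted the max_tokens budget
    None     - still generating (streaming case)
    """
    if generated_tokens and generated_tokens[-1] in stop_token_ids:
        return "stop"
    if len(generated_tokens) >= max_tokens:
        return "length"
    return None

print(finish_reason([5, 9, 2], stop_token_ids={2}, max_tokens=16))  # -> stop
```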

Full Changelog: v0.4.24...v0.4.25

v0.4.24

22 Nov 06:50

Installation

pip install openllm==0.4.24

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.4.24

Usage

All available models: openllm models

To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.24 start HuggingFaceH4/zephyr-7b-beta

Find more information about this release in the CHANGELOG.md

Full Changelog: v0.4.23...v0.4.24

v0.4.23

22 Nov 06:25

Installation

pip install openllm==0.4.23

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.4.23

Usage

All available models: openllm models

To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.23 start HuggingFaceH4/zephyr-7b-beta

Find more information about this release in the CHANGELOG.md

What's Changed

  • chore: cleanup unused prompt templates by @aarnphm in #713
  • feat(generation): add support for eos_token_id by @aarnphm in #714
  • fix(ci): tests by @aarnphm in #715
  • refactor: delete unused code by @aarnphm in #716
  • chore(logger): fix logger and streamline style by @aarnphm in #717
  • chore(strategy): compact and add stubs by @aarnphm in #718
  • chore(types): append additional types change by @aarnphm in #719
  • fix(base-image): update base image to include cuda for now by @aarnphm in #720
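The eos_token_id support in #714 boils down to ending the decode loop when the model emits its end-of-sequence token. A minimal, self-contained sketch, where step_fn is a stand-in for one decoding step:

```python
def generate(step_fn, eos_token_id, max_new_tokens):
    """Collect tokens from step_fn, stopping early on eos_token_id."""
    tokens = []
    for _ in range(max_new_tokens):
        tok = step_fn()
        tokens.append(tok)
        if tok == eos_token_id:
            break  # model signalled end of sequence
    return tokens

stream = iter([4, 8, 2, 7])
out = generate(lambda: next(stream), eos_token_id=2, max_new_tokens=10)
print(out)  # -> [4, 8, 2]
```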

Full Changelog: v0.4.22...v0.4.23

v0.4.22

21 Nov 01:49

Installation

pip install openllm==0.4.22

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.4.22

Usage

All available models: openllm models

To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.22 start HuggingFaceH4/zephyr-7b-beta

Find more information about this release in the CHANGELOG.md

What's Changed

  • refactor: update runner helpers and add max_model_len by @aarnphm in #712
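max_model_len bounds the total context window (prompt plus generated tokens). A sketch of the kind of clamping such a setting enables, with hypothetical names (not OpenLLM's actual runner code):

```python
def clamp_max_new_tokens(prompt_len, requested_new_tokens, max_model_len):
    """Cap the new-token budget so prompt + generation fits the context."""
    available = max_model_len - prompt_len
    if available <= 0:
        raise ValueError("prompt already exceeds max_model_len")
    return min(requested_new_tokens, available)

# A 4000-token prompt leaves only 96 tokens of a 4096-token window.
print(clamp_max_new_tokens(prompt_len=4000, requested_new_tokens=512,
                           max_model_len=4096))  # -> 96
```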

Full Changelog: v0.4.21...v0.4.22

v0.4.21

20 Nov 22:49

Installation

pip install openllm==0.4.21

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.4.21

Usage

All available models: openllm models

To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.21 start HuggingFaceH4/zephyr-7b-beta

Find more information about this release in the CHANGELOG.md

What's Changed

  • ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in #711
  • chore(deps): bump taiki-e/install-action from 2.21.11 to 2.21.17 by @dependabot in #709
  • chore(deps): bump docker/build-push-action from 5.0.0 to 5.1.0 by @dependabot in #708
  • chore(deps): bump github/codeql-action from 2.22.5 to 2.22.7 by @dependabot in #707

Full Changelog: v0.4.20...v0.4.21

v0.4.20

20 Nov 22:18

Installation

pip install openllm==0.4.20

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.4.20

Usage

All available models: openllm models

To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.20 start HuggingFaceH4/zephyr-7b-beta

Find more information about this release in the CHANGELOG.md

Full Changelog: v0.4.19...v0.4.20

v0.4.19

20 Nov 08:19

Installation

pip install openllm==0.4.19

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.4.19

Usage

All available models: openllm models

To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.19 start HuggingFaceH4/zephyr-7b-beta

Find more information about this release in the CHANGELOG.md

Full Changelog: v0.4.18...v0.4.19