Releases: bentoml/OpenLLM

v0.2.6

22 Jul 21:33

Installation

pip install openllm==0.2.6

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.2.6

Usage

All available models: openllm models

To start an LLM: python -m openllm start opt

Find more information about this release in CHANGELOG.md.

What's Changed

  • chore(ci): better release flow by @aarnphm in #131
  • perf(serialisation): implement wrapper to reduce callstack by @aarnphm in #132

Full Changelog: v0.2.5...v0.2.6

v0.2.5

21 Jul 18:14

Installation

pip install openllm==0.2.5

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.2.5

Usage

All available models: python -m openllm.models

To start an LLM: python -m openllm start opt

Find more information about this release in CHANGELOG.md.

What's Changed

  • feat(cli): query with per request instruction by @aarnphm in #130

Full Changelog: v0.2.4...v0.2.5

v0.2.4

21 Jul 08:34

Installation

pip install openllm==0.2.4

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.2.4

Usage

All available models: python -m openllm.models

To start an LLM: python -m openllm start opt

Find more information about this release in CHANGELOG.md.

Full Changelog: v0.2.3...v0.2.4

v0.2.2

21 Jul 01:07

Patch releases

Fixes pip install "openllm[llama]" on CPU machines so that it no longer pulls in vLLM.

Users who want vLLM can install it with pip install "openllm[vllm]".

Added a fine-tuning script for Llama 2, along with a few CLI utility functions under openllm utils.

Full Changelog: v0.2.0...v0.2.2

v0.2.0

20 Jul 01:11

LLaMA, Baichuan, and GPT-NeoX are now supported!

Llama 2 is also supported:

openllm start llama --model-id meta-llama/Llama-2-13b-hf
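
Once the server is running, you can send a test prompt from another terminal. A minimal sketch, assuming the server listens on the default http://localhost:3000 (the prompt is illustrative):

export OPENLLM_ENDPOINT=http://localhost:3000
openllm query 'What is the difference between a llama and an alpaca?'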

Full Changelog: v0.1.20...v0.2.0

v0.1.20

05 Jul 16:00

Installation

pip install openllm==0.1.20

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.1.20

Usage

All available models: python -m openllm.models

To start an LLM: python -m openllm start dolly-v2

Find more information about this release in CHANGELOG.md.

Full Changelog: v0.1.19...v0.1.20

v0.1.19

29 Jun 05:05

Models

MPT is now supported, including both its fine-tuned and pre-trained variants.

Fixes bugs when loading local modules and addresses several model-loading issues.

OpenLLM now tentatively releases binary distributions for macOS, Windows, and Linux.

openllm.LLMConfig now supports the dict() protocol:

import openllm

# Build the config for a given model; LLMConfig now behaves like a mapping.
config = openllm.LLMConfig.for_model("opt")

print(config.items())
print(config.values())
print(config.keys())
print(dict(config))

See #85

Installation

pip install openllm==0.1.19

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.1.19

Usage

All available models: python -m openllm.models

To start a LLM: python -m openllm start dolly-v2

Find more information about this release in the CHANGELOG.md

Full Changelog: v0.1.18...v0.1.19

v0.1.17

27 Jun 18:11

Installation

pip install openllm==0.1.17

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.1.17

Usage

All available models: python -m openllm.models

To start an LLM: python -m openllm start dolly-v2

Find more information about this release in CHANGELOG.md.

Full Changelog: v0.1.16...v0.1.17

v0.1.15

26 Jun 23:23

Features

Fine-tuning support (Experimental)

One can serve OpenLLM models with any PEFT-compatible adapter layers via --adapter-id:

openllm start opt --model-id facebook/opt-6.7b --adapter-id aarnphm/opt-6-7b-quotes

Adapters can also be loaded from a custom path:

openllm start opt --model-id facebook/opt-6.7b --adapter-id /path/to/adapters

To use multiple adapters, use the following format:

openllm start opt --model-id facebook/opt-6.7b --adapter-id aarnphm/opt-6.7b-lora --adapter-id aarnphm/opt-6.7b-lora:french_lora

By default, the first --adapter-id becomes the default LoRA layer, but users can optionally switch which LoRA layer to use for inference via the /v1/adapters endpoint:

curl -X POST http://localhost:3000/v1/adapters --json '{"adapter_name": "vn_lora"}'

Note that with multiple adapter-name and adapter-id pairs, it is recommended to set the default adapter before sending inference requests, to avoid any performance degradation.
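
The same switch can be made programmatically. A minimal sketch using the requests library, assuming the server runs on the default port and that french_lora was registered via --adapter-id at startup (as in the example above):

import requests

# Point at the running OpenLLM server (default port, as shown above).
BASE_URL = "http://localhost:3000"

# Activate a LoRA layer by name; the name must match an adapter
# registered with --adapter-id when the server was started.
resp = requests.post(f"{BASE_URL}/v1/adapters", json={"adapter_name": "french_lora"})
resp.raise_for_status()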

To include adapters in the resulting Bento, provide a --adapter-id to openllm build as well:

openllm build opt --model-id facebook/opt-6.7b --adapter-id ...

I will start rolling out support and scripts for more models, so stay tuned!

Better GPU support (experimental)

0.1.15 comes with better GPU support: it now respects CUDA_VISIBLE_DEVICES, giving users full control over how they serve their models.

0.1.15 also brings experimental support for AMD GPUs. Since ROCm supports CUDA_VISIBLE_DEVICES, OpenLLM respects the same behaviour on the ROCm platform.
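
For example, to pin the server to the first two GPUs (a sketch; the model choice is illustrative):

CUDA_VISIBLE_DEVICES=0,1 openllm start opt --model-id facebook/opt-6.7b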

Installation

pip install openllm==0.1.15

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.1.15

Usage

All available models: python -m openllm.models

To start an LLM: python -m openllm start dolly-v2

Find more information about this release in CHANGELOG.md.

Full Changelog: v0.1.14...v0.1.15

v0.1.14

26 Jun 02:20

Installation

pip install openllm==0.1.14

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.1.14

Usage

All available models: python -m openllm.models

To start an LLM: python -m openllm start dolly-v2

Find more information about this release in CHANGELOG.md.

Full Changelog: v0.1.13...v0.1.14