Releases: bentoml/OpenLLM

v0.2.6

22 Jul 21:33

Installation

pip install openllm==0.2.6

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.2.6

Usage

All available models: openllm models

To start an LLM: python -m openllm start opt

Find more information about this release in CHANGELOG.md.

What's Changed

  • chore(ci): better release flow by @aarnphm in #131
  • perf(serialisation): implement wrapper to reduce callstack by @aarnphm in #132

Full Changelog: v0.2.5...v0.2.6

v0.2.5

21 Jul 18:14

Installation

pip install openllm==0.2.5

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.2.5

Usage

All available models: python -m openllm.models

To start an LLM: python -m openllm start opt

Find more information about this release in CHANGELOG.md.

What's Changed

  • feat(cli): query with per request instruction by @aarnphm in #130

Full Changelog: v0.2.4...v0.2.5

v0.2.4

21 Jul 08:34

Installation

pip install openllm==0.2.4

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.2.4

Usage

All available models: python -m openllm.models

To start an LLM: python -m openllm start opt

Find more information about this release in CHANGELOG.md.

Full Changelog: v0.2.3...v0.2.4

v0.2.2

21 Jul 01:07

Patch releases

Fixes pip install "openllm[llama]" on CPU machines so that it no longer pulls in vLLM.

Users who want vLLM can install it with pip install "openllm[vllm]".

Added a fine-tuning script for Llama 2, along with a few CLI utility functions under openllm utils.

Full Changelog: v0.2.0...v0.2.2

v0.2.0

20 Jul 01:11

LLaMA, Baichuan, and GPT-NeoX are now supported!

Llama 2 is also supported:

openllm start llama --model-id meta-llama/Llama-2-13b-hf
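
Once the server is running, you can send a test prompt from another terminal. A minimal sketch, assuming the server listens on the default http://localhost:3000 (the prompt is illustrative):

export OPENLLM_ENDPOINT=http://localhost:3000
openllm query 'What is the difference between a llama and an alpaca?'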

Full Changelog: v0.1.20...v0.2.0

v0.1.20

05 Jul 16:00

Installation

pip install openllm==0.1.20

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.1.20

Usage

All available models: python -m openllm.models

To start an LLM: python -m openllm start dolly-v2

Find more information about this release in CHANGELOG.md.

Full Changelog: v0.1.19...v0.1.20

v0.1.19

29 Jun 05:05

Models

MPT is now supported, including both its fine-tuned and pre-trained variants.

Fixes bugs when loading local modules and addresses several model-loading issues.

OpenLLM now tentatively releases binary distributions for macOS, Windows, and Linux.

openllm.LLMConfig now supports the dict() protocol:

import openllm

# Build the config for a given model; LLMConfig now behaves like a mapping.
config = openllm.LLMConfig.for_model("opt")

print(config.items())
print(config.values())
print(config.keys())
print(dict(config))

See #85

Installation

pip install openllm==0.1.19

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.1.19

Usage

All available models: python -m openllm.models

To start a LLM: python -m openllm start dolly-v2

Find more information about this release in the CHANGELOG.md

Full Changelog: v0.1.18...v0.1.19

v0.1.17

27 Jun 18:11

Installation

pip install openllm==0.1.17

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.1.17

Usage

All available models: python -m openllm.models

To start an LLM: python -m openllm start dolly-v2

Find more information about this release in CHANGELOG.md.

Full Changelog: v0.1.16...v0.1.17

v0.1.15

26 Jun 23:23

Features

Fine-tuning support (Experimental)

One can serve OpenLLM models with any PEFT-compatible adapter layers via --adapter-id:

openllm start opt --model-id facebook/opt-6.7b --adapter-id aarnphm/opt-6-7b-quotes

Adapters can also be loaded from a custom path:

openllm start opt --model-id facebook/opt-6.7b --adapter-id /path/to/adapters

To use multiple adapters, use the following format:

openllm start opt --model-id facebook/opt-6.7b --adapter-id aarnphm/opt-6.7b-lora --adapter-id aarnphm/opt-6.7b-lora:french_lora

By default, the first --adapter-id becomes the default LoRA layer, but users can optionally switch which LoRA layer to use for inference via the /v1/adapters endpoint:

curl -X POST http://localhost:3000/v1/adapters --json '{"adapter_name": "vn_lora"}'

Note that with multiple adapter-name and adapter-id pairs, it is recommended to set the default adapter before sending inference requests, to avoid any performance degradation.
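
The same switch can be made programmatically. A minimal sketch using the requests library, assuming the server runs on the default port and that french_lora was registered via --adapter-id at startup (as in the example above):

import requests

# Point at the running OpenLLM server (default port, as shown above).
BASE_URL = "http://localhost:3000"

# Activate a LoRA layer by name; the name must match an adapter
# registered with --adapter-id when the server was started.
resp = requests.post(f"{BASE_URL}/v1/adapters", json={"adapter_name": "french_lora"})
resp.raise_for_status()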

To include adapters in the resulting Bento, provide a --adapter-id to openllm build as well:

openllm build opt --model-id facebook/opt-6.7b --adapter-id ...

I will start rolling out support and scripts for more models, so stay tuned!

Better GPU support (experimental)

0.1.15 comes with better GPU support: it now respects CUDA_VISIBLE_DEVICES, giving users full control over how they serve their models.

0.1.15 also brings experimental support for AMD GPUs. Since ROCm supports CUDA_VISIBLE_DEVICES, OpenLLM respects the same behaviour on the ROCm platform.
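
For example, to pin the server to the first two GPUs (a sketch; the model choice is illustrative):

CUDA_VISIBLE_DEVICES=0,1 openllm start opt --model-id facebook/opt-6.7b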

Installation

pip install openllm==0.1.15

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.1.15

Usage

All available models: python -m openllm.models

To start an LLM: python -m openllm start dolly-v2

Find more information about this release in CHANGELOG.md.

Full Changelog: v0.1.14...v0.1.15

v0.1.14

26 Jun 02:20

Installation

pip install openllm==0.1.14

To upgrade from a previous version, use the following command:

pip install --upgrade openllm==0.1.14

Usage

All available models: python -m openllm.models

To start an LLM: python -m openllm start dolly-v2

Find more information about this release in CHANGELOG.md.

Full Changelog: v0.1.13...v0.1.14