Releases: bentoml/OpenLLM
v0.2.6
Installation
pip install openllm==0.2.6
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.2.6
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
Find more information about this release in the CHANGELOG.md
What's Changed
- chore(ci): better release flow by @aarnphm in #131
- perf(serialisation): implement wrapper to reduce callstack by @aarnphm in #132
Full Changelog: v0.2.5...v0.2.6
v0.2.5
Installation
pip install openllm==0.2.5
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.2.5
Usage
All available models: python -m openllm.models
To start an LLM: python -m openllm start opt
Find more information about this release in the CHANGELOG.md
What's Changed
Full Changelog: v0.2.4...v0.2.5
v0.2.4
Installation
pip install openllm==0.2.4
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.2.4
Usage
All available models: python -m openllm.models
To start an LLM: python -m openllm start opt
Find more information about this release in the CHANGELOG.md
Full Changelog: v0.2.3...v0.2.4
v0.2.2
Patched releases
Fixes pip install "openllm[llama]" on CPU so that it no longer includes vLLM
If users want to use vLLM, they can install it with pip install "openllm[vllm]"
Added a fine-tuning script for LLaMA 2 and a few CLI utility functions under openllm utils
Full Changelog: v0.2.0...v0.2.2
v0.2.0
LLaMA, Baichuan, and GPT-NeoX are now supported!
LLaMA 2 is also supported:
openllm start llama --model-id meta-llama/Llama-2-13b-hf
What's Changed
- feat: GPTNeoX by @aarnphm in #106
- feat(test): snapshot testing by @aarnphm in #107
- fix(resource): correctly parse CUDA_VISIBLE_DEVICES by @aarnphm in #114
- feat(models): Baichuan by @hetaoBackend in #115
- fix: add the requirements for baichuan by @hetaoBackend in #117
- fix: build isolation by @aarnphm in #116
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in #119
- feat: GPTQ + vLLM and LlaMA by @aarnphm in #113
New Contributors
- @hetaoBackend made their first contribution in #115
Full Changelog: v0.1.20...v0.2.0
v0.1.20
Installation
pip install openllm==0.1.20
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.1.20
Usage
All available models: python -m openllm.models
To start an LLM: python -m openllm start dolly-v2
Find more information about this release in the CHANGELOG.md
What's Changed
- fix: running MPT on CPU by @aarnphm in #92
- tests: add sanity check for openllm.client by @aarnphm in #93
- feat: custom dockerfile templates by @aarnphm in #95
- feat(llm): fine-tuning Falcon by @aarnphm in #98
- feat: add citation by @aarnphm in #103
- peft: improve speed and quality by @aarnphm in #102
- chore: fix mpt loading on single GPU by @aarnphm in #105
Full Changelog: v0.1.19...v0.1.20
v0.1.19
Models
MPT is now supported, with both its fine-tuned and pre-trained variants
Fixes bugs when loading local modules and addresses several other loading issues
OpenLLM now tentatively releases binary distributions for macOS, Windows, and Linux
openllm.LLMConfig now supports the dict() protocol:
import openllm

# Build the default configuration for a given model.
config = openllm.LLMConfig.for_model("opt")

# LLMConfig supports the dict() protocol, so the usual mapping
# accessors all work:
print(config.items())
print(config.values())
print(config.keys())
print(dict(config))
See #85
Installation
pip install openllm==0.1.19
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.1.19
Usage
All available models: python -m openllm.models
To start an LLM: python -m openllm start dolly-v2
Find more information about this release in the CHANGELOG.md
Full Changelog: v0.1.18...v0.1.19
v0.1.17
Installation
pip install openllm==0.1.17
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.1.17
Usage
All available models: python -m openllm.models
To start an LLM: python -m openllm start dolly-v2
Find more information about this release in the CHANGELOG.md
What's Changed
Full Changelog: v0.1.16...v0.1.17
v0.1.15
Features
Fine-tuning support (Experimental)
One can serve OpenLLM models with any PEFT-compatible adapter layers via --adapter-id:
openllm start opt --model-id facebook/opt-6.7b --adapter-id aarnphm/opt-6-7b-quotes
It also supports loading adapters from a custom path:
openllm start opt --model-id facebook/opt-6.7b --adapter-id /path/to/adapters
To use multiple adapters, use the following format:
openllm start opt --model-id facebook/opt-6.7b --adapter-id aarnphm/opt-6.7b-lora --adapter-id aarnphm/opt-6.7b-lora:french_lora
By default, the first adapter-id will be the default LoRA layer, but users can optionally change which LoRA layer to use for inference via /v1/adapters:
curl -X POST http://localhost:3000/v1/adapters --json '{"adapter_name": "vn_lora"}'
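The same switch can be made from Python; below is a minimal sketch using the requests library, assuming the server is running locally on port 3000 and an adapter named vn_lora was registered at startup:

import requests

# Ask the running server to route subsequent inference through the
# "vn_lora" adapter (the hypothetical adapter name from the example above).
response = requests.post(
    "http://localhost:3000/v1/adapters",
    json={"adapter_name": "vn_lora"},
)
response.raise_for_status()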
Note that when using multiple adapter names and IDs, it is recommended to switch back to the default adapter before sending inference requests, to avoid any performance degradation
To include this in the Bento, one can also provide a --adapter-id to openllm build:
openllm build opt --model-id facebook/opt-6.7b --adapter-id ...
I will start rolling out support and scripts for more models, so stay tuned!
Better GPU support (experimental)
0.1.15 comes with better GPU support, meaning it respects CUDA_VISIBLE_DEVICES and gives users full control over how they want to serve their models.
0.1.15 also brings experimental support for AMD GPUs. ROCm supports CUDA_VISIBLE_DEVICES, so OpenLLM respects this behaviour on ROCm platforms as well.
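For example, to pin serving to the first two GPUs (a minimal sketch; the model flags mirror the adapter examples above):
CUDA_VISIBLE_DEVICES=0,1 openllm start opt --model-id facebook/opt-6.7b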
Installation
pip install openllm==0.1.15
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.1.15
Usage
All available models: python -m openllm.models
To start an LLM: python -m openllm start dolly-v2
Find more information about this release in the CHANGELOG.md
What's Changed
- chore: better gif quality by @aarnphm in #71
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #74
- feat: cascading resource strategies by @aarnphm in #72
Full Changelog: v0.1.14...v0.1.15
v0.1.14
Installation
pip install openllm==0.1.14
To upgrade from a previous version, use the following command:
pip install --upgrade openllm==0.1.14
Usage
All available models: python -m openllm.models
To start an LLM: python -m openllm start dolly-v2
Find more information about this release in the CHANGELOG.md
What's Changed
- models: migrate away from pipelines by @aarnphm in #60
- fix(test): robustness by @aarnphm in #64
- fix: converting envvar to string by @aarnphm in #68
- chore: add more test matrices by @aarnphm in #70
- feat: release binary distribution by @aarnphm in #66
Full Changelog: v0.1.13...v0.1.14