
Support vllm quantization #7297

Open · wants to merge 4 commits into main
Conversation

ivanvykopal (Contributor) commented:
Title

I have implemented a feature for loading quantized models with vllm. The current version does not support quantized models with vllm.

Relevant issues

There is no relevant issue.

Type

🆕 New Feature

Changes

I changed `validate_environment` in vllm/completion/handler.py to support loading quantized versions of models. This was done by defining several of vllm's default parameters and updating them with the parameters from `optional_params`.
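A minimal sketch of the merge described above (the specific default keys such as `quantization` and `dtype` are illustrative assumptions, not the PR's literal code):

```python
from typing import Optional


# Hypothetical illustration of merging vllm defaults with caller overrides;
# the default keys and values below are assumptions for illustration only.
def build_vllm_params(optional_params: Optional[dict]) -> dict:
    defaults = {
        "quantization": None,  # e.g. "awq" or "gptq" for quantized checkpoints
        "dtype": "auto",
        "trust_remote_code": False,
    }
    # Values supplied by the caller take precedence over the defaults.
    defaults.update(optional_params or {})
    return defaults
```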

[REQUIRED] Testing - Attach a screenshot of any new tests passing locally

If UI changes, send a screenshot/GIF of working UI fixes

vercel bot commented Dec 18, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

| Name | Status | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| litellm | ✅ Ready | Visit Preview | 💬 Add feedback | Dec 20, 2024 5:17pm |

@ivanvykopal ivanvykopal changed the title feat: support vllm quantization Support vllm quantization Dec 18, 2024
```diff
@@ -27,14 +27,31 @@ def __init__(self, status_code, message):


 # check if vllm is installed
-def validate_environment(model: str):
+def validate_environment(model: str, optional_params: Union[dict, None]):
```
krrishdholakia (Contributor) commented:

this should be implemented in transformation.py -

`class VLLMConfig(HostedVLLMChatConfig):`
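For context, a rough sketch of what moving the defaults into a `VLLMConfig(HostedVLLMChatConfig)` class could look like (the import path, attribute names, and `get_config` helper are assumptions modeled on common LiteLLM config patterns, not this PR's actual code):

```python
from typing import Optional

# Import path is an assumption for illustration.
from litellm.llms.hosted_vllm.chat.transformation import HostedVLLMChatConfig


class VLLMConfig(HostedVLLMChatConfig):
    # Defaults for quantized loading; names and values are illustrative assumptions.
    quantization: Optional[str] = None
    dtype: str = "auto"

    @classmethod
    def get_config(cls) -> dict:
        # Collect plain (non-method, non-dunder) class attributes as the
        # default parameter dict that the handler can then merge with
        # caller-supplied optional_params.
        return {
            k: v
            for k, v in cls.__dict__.items()
            if not k.startswith("__")
            and not isinstance(v, (classmethod, staticmethod))
            and not callable(v)
        }
```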

```diff
@@ -142,7 +159,7 @@ def batch_completions(
     )
     """
     try:
-        llm, SamplingParams = validate_environment(model=model)
+        llm, SamplingParams, optional_params = validate_environment(model=model, optional_params=optional_params)
```
krrishdholakia (Contributor) commented:

can you add a mock test w/ screenshot of this working?

similar to this -

async def test_azure_ai_with_image_url():

ideally this test would not add the vllm SDK as a dependency on the CI/CD pipeline (so maybe use a MagicMock object here)
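A sketch of such a mocked test (the import path and the returned tuple shape are assumptions based on the handler file and signature change shown in this PR):

```python
import sys
from unittest.mock import MagicMock

# Register a fake vllm module up front so the handler's import succeeds
# without adding the real vllm SDK to the CI environment.
sys.modules.setdefault("vllm", MagicMock())

# Import path is an assumption based on the file touched in this PR.
from litellm.llms.vllm.completion.handler import validate_environment


def test_validate_environment_merges_optional_params():
    llm, sampling_params, params = validate_environment(
        model="facebook/opt-125m",
        optional_params={"quantization": "awq"},  # hypothetical override
    )
    # The caller-supplied value should survive the merge with the defaults.
    assert params["quantization"] == "awq"
```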

ivanvykopal (Contributor, Author) commented:

Hi @krrishdholakia, thank you for your comments.

I have moved the default parameters to transformation.py and added tests for vllm, similar to the Azure example you provided.

Here is the screenshot of the tests:
[screenshot of passing tests]

@ivanvykopal ivanvykopal marked this pull request as draft December 19, 2024 19:46
@ivanvykopal ivanvykopal marked this pull request as ready for review December 19, 2024 20:03