vllm-project / llm-compressor Public

Fork 67
Star 814

Pull requests: vllm-project/llm-compressor

Labels 11 Milestones 0

33 Open 250 Closed

Pull requests list

Update ReadMe and test for cpu_offloading

#1013 opened Dec 23, 2024 by dsikka

2

[Test Run] Merge of tests using AutoModelForCausalLM - transformers latest release

#1011 opened Dec 23, 2024 by horheynm • Draft

1

Dataset Processing Args

#1006 opened Dec 20, 2024 by kylesayrs • Draft

1

[E2E Testing] KV-Cache

#1004 opened Dec 20, 2024 by horheynm

2

Remove Neural Magic copyright from files

#992 opened Dec 18, 2024 by kylesayrs

1

Add example for fp8 kv cache of phi3.5 and gemma2

#991 opened Dec 18, 2024 by mgoin

1

[Test Run] Merge of tests using AutoModelForCausalLM - transformers main

#985 opened Dec 16, 2024 by horheynm • Draft

1

[Test Fix] Sparse model reload

#974 opened Dec 11, 2024 by horheynm • Draft

2

[Test Fix] Fix Consecutive oneshot

#971 opened Dec 11, 2024 by horheynm • Draft

2

[Test Fix] Fix/update test_run_compressed

#970 opened Dec 11, 2024 by horheynm • Draft

9

[Test Fix] Add Quantization then finetune tests

#964 opened Dec 9, 2024 by horheynm • Draft

4

Bitmask test

#956 opened Dec 5, 2024 by rahul-tuli • Draft

Dataset split fallbacks

#953 opened Dec 4, 2024 by kylesayrs

3

Composability with sparse and quantization compressors

#948 opened Dec 2, 2024 by rahul-tuli

6

Add int8 discussion section in readme

#944 opened Nov 29, 2024 by kylesayrs

1

Remove uses of get_observer

#939 opened Nov 27, 2024 by kylesayrs

2

[E2E Testing] Add recipe check vllm e2e

#929 opened Nov 21, 2024 by horheynm

9

Move SparseGPTModifier location with backwards compatibility

#919 opened Nov 16, 2024 by kylesayrs • Draft

2

[Bugfix] Support model offloading SparseGPTQ

#918 opened Nov 16, 2024 by kylesayrs • Draft

1

VLM Support via GPTQ Hooks and Sequential Data Pipeline

#914 opened Nov 13, 2024 by kylesayrs

2

2

Allow Shortcutting Min-max Observer

#887 opened Nov 1, 2024 by kylesayrs

3

[WIP] Adding un-/semi-structured pruning via ALPS

#874 opened Oct 29, 2024 by kayhanbehdin • Draft

FSDP utils cleanup

#854 opened Oct 19, 2024 by kylesayrs

Awq re implementation

#824 opened Oct 7, 2024 by rahul-tuli • Draft

2

Extend usability of calculate_offload_device_map

#768 opened Oct 2, 2024 by kylesayrs • Draft
