VRAM memory leak for Refact.AI 1.6B #332
Thanks for reporting. I don't think we do anything that can cause memory leaks. Hmm, maybe it's the torch version or CUDA version or something like that 🤔
Same with deepseek-coder/1.3b/base (fine-tuned): starts at ~3 GB and after one hour is up to 7 GB. OS: Linux Mint
I'll try to reproduce.
I left 1.6b (regular backend) running for a day; memory settled at 6.19 GB. I additionally sent 750 completion requests today and it's still 6.19 GB. I'm not saying there's no leak, only that I tried and I don't see one in my setup 🤔 Not sure what to do...
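For anyone wanting to repeat this kind of check, here is a minimal sketch of a load test that fires completion requests while polling VRAM via nvidia-smi. The endpoint URL, model name, and payload fields are illustrative assumptions, not the confirmed Refact API:

```python
# Hedged sketch: send repeated completion requests and watch VRAM.
# The endpoint and payload shape below are assumptions for illustration.
import subprocess
import requests

def vram_used_mib() -> int:
    # Ask nvidia-smi for used GPU memory in MiB (no header, no units).
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used",
         "--format=csv,noheader,nounits"]
    )
    return int(out.split()[0])

for i in range(750):
    requests.post(
        "http://localhost:8008/v1/completions",  # assumed endpoint
        json={"model": "Refact/1.6B", "prompt": "def hello():",
              "max_tokens": 50},
        timeout=60,
    )
    if i % 50 == 0:
        print(f"request {i}: {vram_used_mib()} MiB VRAM in use")
```

If usage plateaus after the first few requests, that points to cache growth rather than a leak.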
Called for help from @mitya52
@d3v2a it looks like normal behavior. On start the model allocates 3 GB, but when you start using it with a large context (on large files, for example) it allocates additional memory. I see no memory leak in your case.
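For context, the growth described here is consistent with KV-cache allocation, which scales linearly with context length. A back-of-envelope sketch (the layer/head numbers below are illustrative assumptions, not the published Refact 1.6B configuration):

```python
# Rough KV-cache size: two tensors (K and V) per layer, each of shape
# [seq_len, kv_heads * head_dim], stored in fp16 (2 bytes per value).
# Architecture numbers here are illustrative assumptions.
def kv_cache_bytes(seq_len: int, layers: int = 32, kv_heads: int = 16,
                   head_dim: int = 64, dtype_bytes: int = 2) -> int:
    return 2 * layers * seq_len * kv_heads * head_dim * dtype_bytes

for ctx in (512, 2048, 4096):
    print(f"{ctx:>5} tokens -> {kv_cache_bytes(ctx) / 2**20:.0f} MiB per sequence")
```

Under these assumptions, going from short prompts to 4k-token files adds hundreds of MiB per active sequence, which matches memory climbing well above the 3 GB allocated at startup.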
hmm, now I see 11.9 GB on my setup 🤔
The problem seems not to be present in the latest version
Cool!
I've updated to the latest sha256:f1968874 and it works OK. Usage stabilized around 9.6 GB on a 10 GB VRAM GPU and there seem to be no issues.
Windows 11 fully updated.
WSL2 updated.
Docker Desktop for Windows (latest); GPU works in Docker. nvidia-smi reports the GPU fine.
NVIDIA CUDA 12.2 Toolkit
Newest NVIDIA drivers.
RTX 3080, 10 GB VRAM.
AMD Ryzen 7 5800X3D, 32 GB RAM
No other GPU software running.
At first everything looks good: the model loads and serves requests, but after some time memory utilization grows to 10 GB and GPU load then stays at 100% for prolonged periods; the model times out, and I can only restart the Docker container to fix it.
Actually, it rises to 10 GB of VRAM use pretty quickly. This is for the 1.6B Refact.ai model.
Docker also runs 'thenlper/gte-base'. When I delete it to gain a little VRAM, responsiveness comes back for just a couple more queries.
JetBrains IDEA Refact.AI plugin.
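To correlate the 100% GPU load with the memory climb, one option is a timestamped VRAM log; a minimal sketch, assuming nvidia-smi is on the PATH (the reporter confirmed the GPU is visible to it) and that one-minute polling is enough resolution:

```python
# Hedged sketch: log used VRAM and GPU utilization once a minute so the
# moment usage climbs toward 10 GB can be matched against completion requests.
import datetime
import subprocess
import time

while True:
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=memory.used,utilization.gpu",
         "--format=csv,noheader,nounits"]
    ).decode().strip()
    stamp = datetime.datetime.now().isoformat(timespec="seconds")
    print(stamp, out)  # e.g. "2024-01-01T12:00:00 9874, 100"
    time.sleep(60)
```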