I'm using the Unsloth Colab notebook to fine-tune the unsloth/llama-3-8b-bnb-4bit model on data with a max context length of 18000. Whenever I kick off training, it runs out of memory. That doesn't seem to happen with the yahma/alpaca example. Here's the error:
==((====))== Unsloth - 2x faster free finetuning | Num GPUs = 1
\\ /| Num examples = 102 | Num Epochs = 5
O^O/ \_/ \ Batch size per device = 2 | Gradient Accumulation steps = 4
\ / Total batch size = 8 | Total steps = 60
"-____-" Number of trainable parameters = 41,943,040
---------------------------------------------------------------------------
OutOfMemoryError Traceback (most recent call last)
<ipython-input-7-3d62c575fcfd> in <cell line: 1>()
----> 1 trainer_stats = trainer.train()
13 frames
/usr/local/lib/python3.10/dist-packages/accelerate/utils/operations.py in _convert_to_fp32(tensor)
779
780 def _convert_to_fp32(tensor):
--> 781 return tensor.float()
782
783 def _is_fp16_bf16_tensor(tensor):
OutOfMemoryError: CUDA out of memory. Tried to allocate 9.47 GiB. GPU 0 has a total capacity of 14.75 GiB of which 3.78 GiB is free. Process 2116 has 10.95 GiB memory in use. Of the allocated memory 10.79 GiB is allocated by PyTorch, and 23.53 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
Is the longer context length the reason this runs out of memory? What's the recommended way to make this fine-tuning job possible?
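For context, I noticed the end of the traceback suggests a fragmentation workaround. A minimal sketch of applying it (the variable name and value are taken directly from the error message; it must be set before `torch` initializes CUDA):

```python
import os

# Let the CUDA caching allocator use expandable segments instead of
# fixed-size blocks, which can reduce fragmentation-related OOMs.
# Must be set before importing torch / starting training.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"
```

I'm not sure whether this alone would free enough memory for an 18k context, or whether reducing the per-device batch size is also needed.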