RuntimeError when resuming training #66
Hi, I was training on my own dataset, but when I ran train.py with the --resume option I got this error:

```
Traceback (most recent call last):
  File "tools/train.py", line 239, in <module>
    trainer.train()
  File "tools/train.py", line 147, in train
    self.optimizer.step()
  File "/home/cpiedrahita/anaconda3/envs/segmentron/lib/python3.6/site-packages/torch/optim/lr_scheduler.py", line 66, in wrapper
    return wrapped(*args, **kwargs)
  File "/home/cpiedrahita/anaconda3/envs/segmentron/lib/python3.6/site-packages/torch/optim/sgd.py", line 106, in step
    p.data.add_(-group['lr'], d_p)
RuntimeError: value cannot be converted to type float without overflow: (6.33039e-07,-2.05687e-07)
```

My environment: Python 3.6, PyTorch 1.4, CUDA 10.1

Thanks!

Comments

Hello, I have hit the same problem — did you solve it? Thank you very much!

Hi, I didn't solve it, sorry.
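One possible lead for anyone debugging this: the pair `(6.33039e-07,-2.05687e-07)` in the error message is how PyTorch prints a complex scalar, so a tensor loaded from the resumed checkpoint (for example an optimizer momentum buffer) may have ended up with a complex dtype that SGD's `p.data.add_(-group['lr'], d_p)` cannot handle. The sketch below is not from this repository — the checkpoint layout and helper name are illustrative assumptions — but it shows how one could scan a loaded checkpoint for complex-dtype tensors before resuming:

```python
# Hypothetical diagnostic, assuming the checkpoint is a nested dict of
# tensors (as produced by model.state_dict() / optimizer.state_dict()).
# It walks the structure and reports any tensor with a complex dtype.
import torch

def find_complex_tensors(obj, prefix=""):
    """Recursively collect dotted names of complex-dtype tensors in a checkpoint."""
    bad = []
    if isinstance(obj, torch.Tensor):
        if obj.is_complex():
            bad.append(prefix.rstrip("."))
    elif isinstance(obj, dict):
        for key, value in obj.items():
            bad.extend(find_complex_tensors(value, prefix=f"{prefix}{key}."))
    return bad

if __name__ == "__main__":
    # Toy checkpoint with one deliberately corrupted (complex) buffer.
    ckpt = {
        "state_dict": {"conv.weight": torch.zeros(3)},
        "optimizer": {
            "state": {0: {"momentum_buffer": torch.zeros(3, dtype=torch.complex64)}}
        },
    }
    print(find_complex_tensors(ckpt))  # should flag the momentum buffer
```

In practice one would load the real file with `ckpt = torch.load(path, map_location="cpu")` and run the scan on that before calling the training script, to see whether the corruption is in the saved file or introduced during loading.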