Training stops before epoch 0 after loading best.torch #218
Comments
Hi, have you found a fix?
I'm also trying to train on my own images starting from either of the contest weights (unet or scoring_model), but I'm not sure how to load those weights so that training continues from them. @zeciro, running that command with either unet or unet_weighted gives basically the same console output as the OP shows (no errors, but no output seems to be produced).
Good day,
I am trying to train on my own dataset, as in issue #215. I opted to load the weights from the model trained on the crowdAI dataset and then continue training on my own images from there.
Using issue #160 as reference, I loaded the weights from best.torch.
(By the way, is it correct to use self.load('.../experiments/mapping_challenge_baseline/checkpoints/unet/best.torch')?) I also set self._initialize_model_weights = None.
However, it threw an error:
'module' object has no attribute '_rebuild_tensor_v2'
Which I was able to fix via this thread.
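The linked thread is not preserved here, so the exact fix used is unknown. This error typically appears when a checkpoint saved with a newer PyTorch release is loaded under an older one that lacks `torch._utils._rebuild_tensor_v2`. A commonly cited workaround is to add a small shim before loading; the sketch below is one way to write it, and the helper name `ensure_rebuild_tensor_v2` is hypothetical, not part of PyTorch or this project.

```python
import types


def ensure_rebuild_tensor_v2(utils_module):
    """Add a `_rebuild_tensor_v2` shim to `utils_module` if it is missing.

    `utils_module` is expected to be `torch._utils` (an assumption of this
    sketch). Returns True if the shim was installed, False if the attribute
    already existed.
    """
    if hasattr(utils_module, "_rebuild_tensor_v2"):
        return False

    def _rebuild_tensor_v2(storage, storage_offset, size, stride,
                           requires_grad, backward_hooks):
        # Delegate to the older helper, then restore the two extra fields
        # that the newer checkpoint format carries.
        tensor = utils_module._rebuild_tensor(storage, storage_offset, size, stride)
        tensor.requires_grad = requires_grad
        tensor._backward_hooks = backward_hooks
        return tensor

    utils_module._rebuild_tensor_v2 = _rebuild_tensor_v2
    return True


# Intended usage (requires PyTorch, so left commented out here):
# import torch._utils
# ensure_rebuild_tensor_v2(torch._utils)  # call once, before self.load(...)
```

Upgrading PyTorch to match the version that wrote best.torch avoids the need for the shim altogether, if that is an option in your environment.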
Another error then occurred, which I also fixed via this thread.
Now, running
python main.py train --pipeline_name unet_weighted
no longer throws any errors, but training does not seem to start at all (epoch 0 is never printed). Here is the full printout of the console:
No errors are reported but the training does not seem to start. Do you have any ideas for why this is the case? Thank you.