an error with the training set direction #22
Comments
The data loading process is not related to the GPU. Therefore, I don't think you need to modify any data loading function for your GPU (3060). I have never seen this issue before. Have you run the data preparation scripts (like |
To be honest, this is really weird. I need to wait until I have a spare GPU server to show what the correct logging looks like. You could check the content of the data loaded into the training loop, after these lines: Lines 192 to 199 in 7de13f4
You can add code like this:
print(batch["infos"][0])  # the GTs
print(batch["imgs"][0][0].shape)  # the image's shape
# Or others
You could analyze it based on the results or upload it here.
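The print statements above can be wrapped in a small helper so that empty ground truth is easy to spot. This is only a sketch: it assumes that batch["infos"][0] is a sequence of per-frame dicts with an "ids" entry and that batch["imgs"][0] is the matching sequence of images. The helper names summarize_batch and has_empty_ground_truth are hypothetical and not part of the repository.

```python
def summarize_batch(batch):
    """Return one (num_gt_ids, image_shape) pair per frame of the first sample.

    Assumption: batch["infos"][0] holds per-frame dicts with an "ids" entry,
    and batch["imgs"][0] holds the corresponding images.
    """
    infos = batch["infos"][0]  # per-frame ground-truth dicts of the first sample
    imgs = batch["imgs"][0]    # per-frame images of the first sample
    summary = []
    for info, img in zip(infos, imgs):
        num_ids = len(info["ids"])  # len() works for lists and 1-D tensors
        shape = tuple(getattr(img, "shape", ()))  # empty tuple if no .shape
        summary.append((num_ids, shape))
    return summary


def has_empty_ground_truth(batch):
    """True if every frame of the first sample has zero annotated objects."""
    return all(num_ids == 0 for num_ids, _ in summarize_batch(batch))
```

Calling print(summarize_batch(batch)) inside the training loop gives one line per batch. If has_empty_ground_truth(batch) is true for every batch, the annotations are probably not being read at all.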
Hello, I followed your suggestion and added the print statements. The current output is shown in the image below (Image 1). The result is tensor([], dtype=torch.int64), which has 0 elements, and ids, areas, and labels all indicate that no objects were detected. I'm not quite sure why this is happening.
I noticed that the train_mot17.yaml file contains the setting "USE_CROWDHUMAN: True". I initially suspected that the issue was caused by training with MOT17 while CrowdHuman was included in the configuration, so I changed "USE_CROWDHUMAN: True" to False, but this resulted in an error (Image 2).
I also tried some of the commands related to Submit and Evaluation, but I encountered a small issue. When using eval mode, I got the following error (Image 3), even though I don't have any related files. Could you please advise whether this file is supposed to be generated automatically? If so, did I make a mistake somewhere?
Sorry for the multiple questions, and I truly appreciate your help. Thank you very much.
According to (Image 2), it seems that you did not successfully load any images or annotations from MOT17. You can add some breakpoints in the data loading process to determine where the problem is. For example, here: Lines 59 to 68 in 7de13f4
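Before stepping through the loader with breakpoints, it may help to confirm that the dataset directory actually contains files where the configuration points. The following is a minimal sketch using only the standard library. The function name count_files_by_suffix is hypothetical, and the expected directory layout depends on the repository's data preparation scripts, which are not shown here.

```python
from collections import Counter
from pathlib import Path


def count_files_by_suffix(root):
    """Count files under `root` (recursively), grouped by lowercase extension."""
    root = Path(root)
    if not root.is_dir():
        # A missing or misspelled root is a common cause of an empty dataset.
        raise FileNotFoundError(f"Dataset root does not exist: {root}")
    return Counter(p.suffix.lower() for p in root.rglob("*") if p.is_file())
```

For example, print(count_files_by_suffix("/path/to/MOT17")) with your own dataset path should show nonzero counts for the image and annotation file types. If a count is zero or the call raises, the problem is in the paths rather than in the training code.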
Hello, I recently installed this model to train on a custom dataset. The environment setup is complete, and I first tried the MOT17 dataset to test whether the training process works properly. However, during training I encountered some abnormal values, and I was wondering if you could provide any guidance on how to resolve this issue.
Currently, I have downloaded both the CrowdHuman and MOT17 datasets. While training, I noticed that all the loss values are zero, which seems to suggest that the data is not being loaded properly. To check the data loading path, I added the following line of code: print(f"Frame path: {frame_path}"). The result shows that the dataset being loaded is CrowdHuman, even though I set the dataset to MOT17 when I issued the command. I'm not entirely sure where the problem lies. Could you kindly take a look? Thank you very much.
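Rather than reading individual "Frame path" lines, the paths can be collected and tallied per dataset. The sketch below assumes that the dataset name appears somewhere in each frame path, which may not hold for every directory layout. The function name tally_by_dataset is hypothetical.

```python
from collections import Counter


def tally_by_dataset(frame_paths, dataset_names=("MOT17", "CrowdHuman")):
    """Count how many frame paths mention each dataset name (case-insensitive).

    Paths that match none of the names are counted under "unknown".
    """
    counts = Counter({name: 0 for name in dataset_names})
    counts["unknown"] = 0
    for path in frame_paths:
        lowered = str(path).lower()
        for name in dataset_names:
            if name.lower() in lowered:
                counts[name] += 1
                break  # attribute each path to the first matching dataset
        else:
            counts["unknown"] += 1
    return dict(counts)
```

Appending each frame_path to a list during one epoch and printing tally_by_dataset(paths) shows whether any MOT17 frames are loaded at all, or only CrowdHuman ones.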
Also, just to mention, my computer has only one GPU: an NVIDIA GeForce RTX 3060. Since my GPU is limited, do I need to modify any lambda functions in the code? I appreciate your help.
If there's anything that I didn't explain clearly, please feel free to let me know, and I will provide any additional details you may need.
Thank you again.
Here is the related information regarding the failed training.