RasterVision + Lightning experiment results much worse than rastervision pipeline results #1770
Unanswered
aerotractjack asked this question in Q&A
Hi, my colleague and I are using RasterVision to detect objects in aerial imagery. To get started, we built a pipeline using the rastervision pipeline setup, and it achieved great results with little work from us. This is super cool! In an effort to gain more visibility into the pipeline and more ability to customize it, we want to recreate the experiment using rastervision to handle our datasets and predictions, and pytorch/lightning for our model.
Before I get into my question: I know this may not be a RasterVision issue at all. It could just be a model training/optimization or dataset question, in which case this is not the place to ask, but I'm curious whether I'm missing something obvious on the rastervision side. Following the lightning tutorial on the website, and all the great help you've provided me in the discussions over the past week, I finally got a training+prediction pipeline working the other day. But strangely, even though my loss converges to 0 fairly smoothly, my mAP and mAP50 metrics level off around 50-60%, and my actual prediction boxes are all over the place.

Colleague's working pipeline repository
My not-so-working RV+lightning repo
Both of us are using a pretrained version of ResNet50 and training on the same data. The only major difference is that he is using the pipeline and I am not. [Here is a link to his pipeline configuration](https://github.com/aerotractjack/rvml/blob/seth/rvpipeline.py). One important snippet is how he defines his model:
And here is how I create my model:
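Roughly, both boil down to something like the following sketch (generic torchvision code to show the shape of it, not either of our exact snippets; the class count is a hypothetical placeholder):

```python
# Rough sketch: a COCO-pretrained Faster R-CNN with a ResNet-50 FPN
# backbone, with the box predictor head swapped out for our own class
# count. num_classes = 2 is a hypothetical placeholder
# (1 object class + background).
from torchvision.models.detection import (
    FasterRCNN_ResNet50_FPN_Weights,
    fasterrcnn_resnet50_fpn,
)
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

num_classes = 2

# Load the COCO_V1 pretrained weights.
model = fasterrcnn_resnet50_fpn(
    weights=FasterRCNN_ResNet50_FPN_Weights.COCO_V1)

# Replace the classification/regression head to match our classes.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
```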
My LightningModel is very similar to the ObjectDetectionLearner; you can see the code here: LightningModel
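In outline, it does something like this (a stripped-down sketch assuming a torchvision detection model and batches in its (images, targets) format, not the full class; the optimizer, learning rate, and metric wiring are illustrative):

```python
# Stripped-down sketch of the LightningModel; hyperparameters and
# metric wiring here are illustrative, not the real code.
import pytorch_lightning as pl
import torch
from torchmetrics.detection.mean_ap import MeanAveragePrecision


class LightningModel(pl.LightningModule):
    def __init__(self, model, lr=1e-4):
        super().__init__()
        self.model = model
        self.lr = lr
        self.map_metric = MeanAveragePrecision()

    def training_step(self, batch, batch_idx):
        images, targets = batch
        # In train mode, torchvision detection models take
        # (images, targets) and return a dict of losses.
        loss_dict = self.model(images, targets)
        loss = sum(loss_dict.values())
        self.log('train_loss', loss)
        return loss

    def validation_step(self, batch, batch_idx):
        images, targets = batch
        # In eval mode, the model returns per-image predictions
        # (boxes, labels, scores), which feed the mAP metric.
        preds = self.model(images)
        self.map_metric.update(preds, targets)

    def on_validation_epoch_end(self):
        metrics = self.map_metric.compute()
        self.log('map', metrics['map'])
        self.log('map_50', metrics['map_50'])
        self.map_metric.reset()

    def configure_optimizers(self):
        return torch.optim.Adam(self.model.parameters(), lr=self.lr)
```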
At first, I wasn't setting the weights for my ResNet, so I wasn't actually using a pretrained model. Once I realized this, I set up my model to use the default COCOv1 weights, which helped my mAP improve, but I still hit a wall at around 50%, as you can see in tensorboard. I've included some images of my data, predictions, and tensorboard output. This is all being done on a small sample of my data, in an attempt to narrow down the problem.
Here is a link to a sample training data chip
Here is a link to some sample predictions (RED) overlaid on the ground truth (GREEN)
Here is a snapshot of my tensorboard training loss value
Here is a snapshot of my tensorboard validation mAP and mAP50 values
As you can see in the images, my loss curve approaches 0, which is promising, and my mAP/mAP50 values increase for the first few epochs and then level off. But my predictions look essentially random: even with small loss values and an OK mAP, they are not correct at all.
So far I've tried different ResNet variants, different learning-rate/optimizer combinations, and training for much longer on more data. All the results end up similar. My colleague's pipeline performs very well: during training, his loss curve looks similar to mine (though smoother), but his mAP and mAP50 values quickly approach 95% within a few (5-10) epochs.
Do you have any ideas about where I could be going wrong? Am I leaving something out when constructing my model? Looking at the rastervision source code, I don't see any major differences between how I build my model and how the pipeline does it. Any help is super appreciated. Thank you!
Replies: 1 comment · 1 reply

Hi, I have not had the chance to look at this closely yet, but here are some initial thoughts: