Using a different image size for linemod detection for KinectV2 #28

Open
JimmyDaSilva opened this issue Mar 22, 2016 · 17 comments

Comments

@JimmyDaSilva

Linemod in opencv is using a linearMemoryPyramid with levels {5,8} by default:
https://github.com/Itseez/opencv_contrib/blob/master/modules/rgbd/src/linemod.cpp#L1834:L1840

5 and 8 work fine for 640x480 images, but fail with every image size provided by the KinectV2 (512x424, 960x540, 1920x1080).
The problem has been discussed here:
opencv/opencv#4593
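
A minimal sketch of the size check that appears to be behind the assertion. It assumes that pyramid level 0 works on the full image, that each further level halves width and height, and that both dimensions must be divisible by that level's T. The function name `tLevelsFitImage` is made up for illustration and is not part of OpenCV.

```cpp
#include <cstddef>
#include <vector>

// Returns true if, at every pyramid level, both image dimensions are
// divisible by that level's T.
// Assumption: level 0 is the full image and each further level halves
// width and height (integer division).
bool tLevelsFitImage(int width, int height, const std::vector<int>& tLevels)
{
    for (std::size_t level = 0; level < tLevels.size(); ++level) {
        const int t = tLevels[level];
        if (t <= 0 || width % t != 0 || height % t != 0) {
            return false;
        }
        // Move on to the next, half-sized pyramid level.
        width /= 2;
        height /= 2;
    }
    return true;
}
```

Under these assumptions, `tLevelsFitImage(640, 480, {5, 8})` holds, while the same levels are rejected for 512x424, 960x540 and 1920x1080.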

So I am trying to avoid getDefaultLINEMOD and create my own detector using these lines in linemod_detect.cpp:

static const int T_LVLS[] = {3, 4};
std::vector< cv::Ptr<cv::linemod::Modality> > modalities;
modalities.push_back(new cv::linemod::ColorGradient());
modalities.push_back(new cv::linemod::DepthNormal());
detector_ = new cv::linemod::Detector(modalities, std::vector<int>(T_LVLS, T_LVLS +2));

instead of
detector_ = cv::linemod::getDefaultLINEMOD();

For some image sizes the assertion then passes and the code runs. Unfortunately no objects are found...

@vrabaud I guess you know a bit more about the image processing behind this. Do you think this is something that can actually work? Which values for T would you pick? How many pyramid levels do you think are necessary?

Thanks for the help !
Jimmy

@nlyubova
Contributor

Hi,

I've done it once, but it is not completely adapted to different image scales. You can check it in ecto_image_pipeline:

nlyubova/ecto_image_pipeline@e855706

@nlyubova
Contributor

Regarding "For some image sizes the assertion then passes and the code runs. Unfortunately no objects are found..."

There can be another problem. You could try training with the same image size that you will use for recognition (it worked for me). For example, set this in the training config file:
renderer_width: 320
renderer_height: 240

@JimmyDaSilva
Author

I got some positive results yesterday. I will come back to this later. But thanks for the help! :)

@nlyubova
Contributor

Did you try to change the threshold?

@nlyubova
Contributor

Setting the image size in the training config definitely helps, especially if training and test images are very different in size (they differed by a factor of two in my case).

@JimmyDaSilva
Author

@nlyubova Don't worry any more. It works great :)
You were absolutely right about adding the renderer_width and renderer_height in the training script. It changes everything!
I am now playing with the T_LVLS values in my code above for linemod_detect.cpp to get the best and fastest results.
I will post a new PR once I am done.

I didn't have to add the renderer_width and renderer_height for the detection script... so I don't really understand how this works.
But anyway it seems to work now !

@JimmyDaSilva
Author

Hooray ! Linemod on Kinect2 is coming for ORK :)
[Image: linemod_kinect2]

@nlyubova
Contributor

Wow! Good job! So what did you change?

@JimmyDaSilva
Author

In linemod_detect.cpp:

// detector_ = cv::linemod::getDefaultLINEMOD();
static const int T_LVLS[] = {4, 15};
std::vector< cv::Ptr<cv::linemod::Modality> > modalities;
modalities.push_back(new cv::linemod::ColorGradient());
modalities.push_back(new cv::linemod::DepthNormal());
detector_ = new cv::linemod::Detector(modalities, std::vector<int>(T_LVLS, T_LVLS +2));

And as you advised I have added in training.ork:

renderer_width: 960
renderer_height: 540

These params are for QHD images only.
For SD, I set T={2,4}, renderer_width: 512 and renderer_height: 424.
Detection for HD takes forever and never finishes.

@giacomodabisias

Good Job.
I still believe that the code should be fixed in such a way that you can easily insert size and data type in order to use it with any depth camera. I will have a look soon and see if I can contribute.

@mikearmstrong800

Thanks Jimmy and Nlyubova!

Thanks to you I get great linemod detection of the can with kinect2_bridge on ROS Kinetic on Ubuntu 16.04.

Mike

@wmy101

wmy101 commented May 3, 2017

Hi, I have tried the modifications to the linemod and training files but am still having trouble detecting the coke can. Most of the time I just see "publishing to topic:/recognized_object_array". Very rarely I catch a glimpse of the mesh, when it mistakes a random object for the coke can. With Kinect1, however, detection works fine. I am just not sure what I am doing wrong with KinectV2.

@TrinhNC

TrinhNC commented Aug 6, 2017

Hi @JimmyDaSilva,
How did you get such precise detection? I am using both linemod and tabletop, and when it detects the coke can the position jumps around the image, just like in this problem:
https://groups.google.com/forum/#!searchin/object-recognition-kitchen/calibration%7Csort:relevance/object-recognition-kitchen/8LrJj2OLDKQ/bEjY45ytCgAJ . Do you have any idea why?

@MoranChen

Hi @JimmyDaSilva
I have a problem detecting the coke can with Kinect2 after modifying train.ork and linemod_detect.cpp. There is no rostopic named /recognized_object_array when I run the detection command. I am a ROS beginner and really need your help. Thanks.

@JimmyDaSilva
Author

Sorry @MoranChen, I haven't used ROS since 2016.
I advise starting with my fork and branch, https://github.com/JimmyDaSilva/linemod/tree/fix_kinect2.

Very sorry, but I don't have much time to spend on this project.

To go back to the problem: I agree with @giacomodabisias, this should be improved upstream, inside OpenCV.
@vrabaud: Would you know of a proper way to use linemod on both 4:3 and 16:9 ratio images?

@waltejon

waltejon commented Jan 25, 2019

Hello @JimmyDaSilva,

I am also trying to detect a coke can/mug using the Kinect V2. For this I cloned the most recent master branch of this repository.

I changed the code lines in linemod_detect.cpp according to your advice. I also added renderer_width: 960 and renderer_height: 540 to training.ork.

A special feature of my setup is that the Kinect V2 is not connected to the Ubuntu computer. Instead, the camera is connected to a Windows machine. This Windows machine publishes the depth image and the RGB image to separate ROS topics (using an implementation of ROS#). This works fine. I also publish the camera info as sensor_msgs/CameraInfo, so the input to LINEMOD should be fine.

Now my key question is: do you use a depth image (512 x 424 pixels) and an RGB image (960 x 540 pixels)? Or is the resolution of the RGB image 1920 x 1080 pixels?

When I use a depth image (512 x 424 pixels), an RGB image (960 x 540 pixels) and the changed files from above, I get an assertion failure. It reports an error within an OpenCV file.

Do I have to change any other files apart from training.ork and linemod_detect.cpp?

Thanks a lot for your help in advance!

@JimmyDaSilva
Author

@waltejon Sorry, but I think that's all I changed.
I'm pretty sure I was using the QHD images for both.
