SUN RGB-D is not in millimeters #12
Thank you for pointing this out -- it is important to figure this out for a more general depth model. Could you please also check LanguageBind and their uploaded NYU-D? If it works on your own data, I will look into their preprocessing pipeline instead of following ImageBind.
I will look into LanguageBind. I will say this: I updated the processing in my pipeline to match the circular shift, quantization, and camera intrinsics of the NYU data. The results on our data are still not very good. My suspicion is that SUN RGB-D has no people in it, and the text labels I am trying to match are about the locations of people in the scene.
Here is a summary of the transformation pipeline in LanguageBind. The starting format is depth in mm (NOT disparity). I ran their inference example from the git homepage, and max_depth is configured to 10. So in summary: read in the data in mm, convert to meters, clamp between 0.01 and 10 meters, divide by 10 meters, resize and center crop to 224, and normalize by OPENAI_DATASET_MEAN and OPENAI_DATASET_STD. I tried running on the SUN RGB-D versions of the NYUv2 data directly, and LanguageBind gave bad outputs. When I did a circular shift (to put it back into mm) it gave good results, so they are doing some preprocessing to convert the NYU data to mm first.
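For concreteness, a minimal sketch of that sequence of steps in PyTorch/torchvision. This is not the actual LanguageBind code: the helper name, the PIL/torchvision calls, and repeating the single depth channel to three channels (so the RGB mean/std apply) are assumptions.

```python
# Sketch of the described preprocessing: mm PNG -> meters -> clamp -> /10
# -> resize + center crop to 224 -> CLIP-style normalization.
import numpy as np
import torch
import torchvision.transforms.functional as TF
from PIL import Image

# Values of open_clip's OPENAI_DATASET_MEAN / OPENAI_DATASET_STD
OPENAI_DATASET_MEAN = (0.48145466, 0.4578275, 0.40821073)
OPENAI_DATASET_STD = (0.26862954, 0.26130258, 0.27577711)

def preprocess_depth_mm(path, max_depth=10.0):
    depth_mm = np.array(Image.open(path)).astype(np.float32)  # depth in mm, NOT disparity
    depth_m = depth_mm / 1000.0                                # mm -> meters
    depth_m = np.clip(depth_m, 0.01, max_depth)                # clamp to [0.01, 10] m
    depth = torch.from_numpy(depth_m / max_depth)[None]        # scale to [0, 1], shape (1, H, W)
    depth = TF.resize(depth, 224, antialias=True)              # resize shorter side to 224
    depth = TF.center_crop(depth, 224)                         # center crop to 224 x 224
    depth = depth.repeat(3, 1, 1)                              # assumption: repeat to 3 channels
    return TF.normalize(depth, OPENAI_DATASET_MEAN, OPENAI_DATASET_STD)
```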
Got it, thanks @jbrownkramer! I will look into this.
I was trying to apply this model to my own data and not getting good results. I ran the NYUv2 dataset through my code, and the results seem to be in line with those reported in the ViT-Lens paper.
Digging into it, the issue is - at least partly - that the NYUv2 data is not in millimeters. The SUNRGBDtoolbox (https://rgbd.cs.princeton.edu/) includes MATLAB code for converting the PNG files to mm: the data in the PNG files is the depth in mm circularly shifted left by 3 bits (which for most data is just multiplying by 8), and the toolbox undoes that shift.
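For reference, here is a small sketch of that decoding in Python/NumPy (a hypothetical helper, not taken from the toolbox). It rotates each 16-bit value right by 3 bits to recover depth in mm:

```python
import numpy as np
from PIL import Image

def sunrgbd_png_to_mm(path):
    """Undo the 3-bit circular left shift: rotate each 16-bit value right by 3."""
    d = np.array(Image.open(path)).astype(np.uint32)   # 16-bit depth PNG
    depth_mm = ((d >> 3) | (d << 13)) & 0xFFFF         # circular shift right by 3 bits
    return depth_mm.astype(np.uint16)                  # depth in millimeters
```

For values below 2^13 mm (about 8.19 m) the rotation is exactly division by 8, which matches the "multiplying by 8" observation above.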
I mention this because the code in #9 seems to assume that the data is in mm. This could matter if other datasets get used that are in mm rather than in the SUN RGB-D format.