Team 7: Lounis Bouzit, Ruohua Li, Xun Tu, Ziyi Liu
ROB 530 Final Project
The goal of this project was to integrate deep learning methods into existing SLAM algorithms and, in turn, improve their performance. Modern computer vision is dominated by deep learning techniques, so we proposed replacing the existing pose estimation and loop closure steps with learned models. We trained and tested two models: DeepVO for visual odometry and ResNet-50 for place recognition. In one approach, we modified ORB-SLAM2 to communicate with a server running our own models and compared the performance. In the other, we formulated an optimization problem initialized from the outputs of our deep-learning-based visual odometry.
Helper files for training and running the DeepVO model. The results are exported to deepvo/poses for ease of use in other parts of our pipeline.
- Inside deepvo/ change self.data_dir to your path to KITTI dataset
- Add pre-trained model to deepvo/model (provided by alexart13)
from deepvo_handler import DeepVOHandler
dvoh = DeepVOHandler('00') # KITTI sequence number 00
N = dvoh.get_len()
# Note: prev_pose must be defined before the first call to get_pose
for i in range(N):
    rel_pose, abs_pose = dvoh.get_pose(i, prev_pose) # Gets pose relative to previous frame
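The loop above returns both a relative and an absolute pose per frame. As a minimal sketch of how the two relate, assuming poses are represented as 4x4 homogeneous transforms (an assumption; the internal format used by DeepVOHandler is not shown here), relative poses can be chained into absolute ones by matrix multiplication. The helpers `make_transform` and `accumulate` are hypothetical names used only for illustration.

```python
import numpy as np

def make_transform(yaw, t):
    """Build a 4x4 homogeneous transform from a rotation about the y axis (rad) and a translation."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:3, :3] = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    T[:3, 3] = t
    return T

def accumulate(rel_poses, start=None):
    """Chain relative transforms T_{k-1,k} into absolute poses T_{0,k}."""
    current = np.eye(4) if start is None else start.copy()
    trajectory = [current.copy()]
    for rel in rel_poses:
        # Right-multiply: the relative motion is expressed in the previous camera frame.
        current = current @ rel
        trajectory.append(current.copy())
    return trajectory

# Toy input (not DeepVO output): move 1 m forward along z, then turn 90 degrees, four times.
step = make_transform(np.pi / 2, [0.0, 0.0, 1.0])
trajectory = accumulate([step] * 4)
positions = np.array([T[:3, 3] for T in trajectory])
```

With this toy input the camera traces a 1 m square and ends where it started, which is a quick sanity check on the multiplication order.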
Demo files showing how we built, tested, and deployed the ResNet-50 model, including: a demo of our network, a demo of using our model as a feature extractor, a demo of the socket communication, and a demo of our own loop closing algorithm.
- PyTorch
- torchvision
- OpenCV
Please check the readme files contained in the related folders.
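To illustrate how a feature extractor can feed a loop detector, here is a minimal sketch that compares per-frame descriptors by cosine similarity. It is not the loop closing algorithm from the demo folders: the function name `find_loop_candidates`, the `min_gap` and `threshold` parameters, and the toy one-hot descriptors are all assumptions for illustration. In practice the descriptors would come from the network.

```python
import numpy as np

def find_loop_candidates(descriptors, min_gap=50, threshold=0.9):
    """Return (query, match, score) for frames whose best sufficiently old match passes threshold."""
    d = np.asarray(descriptors, dtype=np.float64)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)  # unit-normalize each descriptor
    sim = d @ d.T                                     # pairwise cosine similarity
    candidates = []
    for i in range(min_gap, len(d)):
        # Only consider frames at least min_gap steps in the past,
        # so that neighbouring frames are not reported as loops.
        earlier = sim[i, : i - min_gap + 1]
        j = int(np.argmax(earlier))
        if earlier[j] >= threshold:
            candidates.append((i, j, float(earlier[j])))
    return candidates

# Toy descriptors: frame 5 revisits the place seen at frame 1.
places = [0, 1, 2, 3, 4, 1]
descriptors = np.eye(5)[places]
candidates = find_loop_candidates(descriptors, min_gap=3, threshold=0.9)
```

The `min_gap` guard matters because consecutive frames look nearly identical and would otherwise dominate the matches.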
Two main scripts apply GTSAM optimization to the poses produced by DeepVO (export the poses from deepvo/ first).
- Change variable kitti_path to your path to KITTI dataset
- Uses deepvo/poses; make sure those are exported first by following the directions above
python3 [SEQ_NUM]
python3 [SEQ_NUM]
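For readers unfamiliar with this kind of optimization, a generic pose-graph objective is written out below. It is a sketch under assumptions, not necessarily the exact problem solved by the scripts above: the symbols, the loop closure set, and the noise models are all introduced here for illustration.

```latex
\hat{T}_{0:N} = \arg\min_{T_{0:N}}
  \sum_{k=1}^{N} \left\| \operatorname{Log}\!\left( \tilde{T}_{k-1,k}^{-1}\, T_{k-1}^{-1} T_k \right) \right\|_{\Sigma_k}^{2}
  + \sum_{(i,j) \in \mathcal{L}} \left\| \operatorname{Log}\!\left( \tilde{T}_{i,j}^{-1}\, T_i^{-1} T_j \right) \right\|_{\Sigma_{ij}}^{2}
```

Here $T_k \in SE(3)$ is the absolute pose of frame $k$, $\tilde{T}_{k-1,k}$ is a relative pose measurement (for example, from visual odometry), $\mathcal{L}$ is an assumed set of loop closure pairs, $\Sigma$ denotes the measurement covariances, and $\operatorname{Log}$ maps an $SE(3)$ element to its 6-vector of local coordinates. An initial guess for each $T_k$ can be obtained by chaining the odometry measurements, which corresponds to initializing from the visual odometry outputs.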
ORB-SLAM2 run using DeepVO as the motion model for visual odometry.
python3 [SEQ_NUM]
./Examples/Stereo/stereo_kitti Vocabulary/ORBvoc.txt Examples/Stereo/KITTIX.yaml PATH_TO_DATASET_FOLDER/dataset/sequences/SEQUENCE_NUMBER
ORB-SLAM2 run using ResNet-50 as the feature extractor in the loop detection step. (We keep the two pipelines separate because our two models are not directly compatible. When ResNet-50 is used as the feature extractor for loop detection, the rest of the loop closing algorithm is unchanged, so loop closing still depends on keyframes. Our DeepVO model, however, does not generate keyframes, so we evaluate the two models separately. This is also part of the reason we tried to develop our own SLAM pipeline that merges the two models.)
System: Ubuntu 20.04 LTS, OpenCV: 3.2.0, Eigen: 3.2.10
In the ORB-SLAM2 directory, open a terminal and type
Open a separate window and type
./Examples/Stereo/stereo_kitti Vocabulary/ORBvoc.txt Examples/Stereo/KITTIX.yaml PATH_TO_DATASET_FOLDER/dataset/sequences/SEQUENCE_NUMBER
as explained in the readme file in ORB_SLAM2_resnet.
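The two-terminal setup reflects a client and a model server exchanging data over a socket. The actual message format is described in the ORB_SLAM2_resnet readme and is not reproduced here; the following is only a generic sketch of length-prefixed framing over a stream socket, with hypothetical helper names (`send_msg`, `recv_msg`) and made-up payloads. It uses `socket.socketpair()` so that it runs in a single process.

```python
import socket
import struct

def send_msg(sock, payload: bytes):
    """Send one message as a 4-byte big-endian length followed by the payload."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exact(sock, n):
    """Read exactly n bytes; recv() may return fewer bytes than requested."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        buf.extend(chunk)
    return bytes(buf)

def recv_msg(sock):
    """Receive one length-prefixed message."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)

# In-process stand-ins for the SLAM client and the model server.
client, server = socket.socketpair()
send_msg(client, b"frame-000042")               # client sends a request
request = recv_msg(server)                      # server reads it
send_msg(server, b"descriptor-for-" + request)  # server replies
reply = recv_msg(client)
client.close()
server.close()
```

A length prefix is one common way to recover message boundaries, since TCP delivers a byte stream rather than discrete messages.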
- In the ORB-SLAM2 folder, if running ./ for ORB-SLAM2 fails with:
error: ‘decay_t’ is not a member of ‘std’
try changing "-std=c++11" to "-std=c++14" in CMakeLists.txt
- In the ORB-SLAM2 folder, if the program fails with "no module 'Pangolin' or 'Eigen' is found" even though you have already installed them, try replacing this line in CMakeLists.txt
find_package(Eigen3 3.1.0 REQUIRED)
with
list(APPEND CMAKE_INCLUDE_PATH "/usr/local/include")
find_package (Eigen3 3.3 REQUIRED NO_MODULE)
- The files "", "resnet50_places365.pth.tar", and "categories_places365.txt" are included only as a brief preview. They are NOT used by the ORB-SLAM2 pipeline. To see how they are used, go to the "ORB-SLAM2 ResNet" pipeline.