Running OpenVINS for very long durations #481

tianyilim · 2024-11-02T15:14:33Z

Hello, thanks for the good work and great code!

I've been trying to get OpenVINS to run on large-scale, long datasets, in particular the Vision Benchmark in Rome dataset. This dataset is quite interesting as it contains handheld sequences > 20min and > 2km in length.

Initially, OpenVINS seems to do quite well, but as the trajectory gets longer, I find the covariance estimates also increase, and I guess without place recognition techniques to "anchor" it, the trajectory eventually drifts away Youtube video (skip to ~3:15 to see filter divergence).

Maybe this is a limitation of a pure EKF-based VIO system, but are there any things that i can try to tackle this issue? E.g force reinitialization when covariance increases, tuning configs, etc.? Thanks in advance!

Overall Config

%YAML:1.0 # need to specify the file type at the top!

verbosity: "INFO" # ALL, DEBUG, INFO, WARNING, ERROR, SILENT

use_fej: true # if first-estimate Jacobians should be used (enable for good consistency)
integration: "rk4" # discrete, rk4, analytical (if rk4 or analytical used then analytical covariance propagation is used)
use_stereo: true # if we have more than 1 camera, if we should try to track stereo constraints between pairs
max_cameras: 2 # how many cameras we have 1 = mono, 2 = stereo, >2 = binocular (all mono tracking)

calib_cam_extrinsics: true # if the transform between camera and IMU should be optimized R_ItoC, p_CinI
calib_cam_intrinsics: true # if camera intrinsics should be optimized (focal, center, distortion)
calib_cam_timeoffset: true # if timeoffset between camera and IMU should be optimized
calib_imu_intrinsics: true # if imu intrinsics should be calibrated (rotation and skew-scale matrix)
calib_imu_g_sensitivity: false # if gyroscope gravity sensitivity (Tg) should be calibrated

max_clones: 20 # how many clones in the sliding window
# ! set a bit larger hopefully for local 'loop closure'
max_slam: 100 # number of features in our state vector
max_slam_in_update: 25 # update can be split into sequential updates of batches, how many in a batch
max_msckf_in_update: 50 # how many MSCKF features to use in the update
# ! VBR Spagna initializes okay
dt_slam_delay: 0.0 # delay before initializing (helps with stability from bad initialization...)
# ! VBR Spagna weirdness in grav vector
gravity_mag: 9.7 # magnitude of gravity in this location

feat_rep_msckf: "GLOBAL_3D"
feat_rep_slam: "ANCHORED_MSCKF_INVERSE_DEPTH"
feat_rep_aruco: "ANCHORED_MSCKF_INVERSE_DEPTH"

# zero velocity update parameters we can use
# we support either IMU-based or disparity detection.
try_zupt: false
zupt_chi2_multipler: 0 # set to 0 for only disp-based
zupt_max_velocity: 0.05
zupt_noise_multiplier: 50
zupt_max_disparity: 0.5 # set to 0 for only imu-based
zupt_only_at_beginning: true

# ==================================================================
# ==================================================================

init_window_time: 2.0 # how many seconds to collect initialization information
init_imu_thresh: 1.5 # threshold for variance of the accelerometer to detect a "jerk" in motion
init_max_disparity: 10.0 # max disparity to consider the platform stationary (dependent on resolution)
init_max_features: 50 # how many features to track during initialization (saves on computation)

init_dyn_use: true # if dynamic initialization should be used
init_dyn_mle_opt_calib: false # if we should optimize calibration during intialization (not recommended)
init_dyn_mle_max_iter: 50 # how many iterations the MLE refinement should use (zero to skip the MLE)
init_dyn_mle_max_time: 0.05 # how many seconds the MLE should be completed in
init_dyn_mle_max_threads: 8 # how many threads the MLE should use
init_dyn_num_pose: 6 # number of poses to use within our window time (evenly spaced)
init_dyn_min_deg: 10.0 # orientation change needed to try to init

init_dyn_inflation_ori: 10 # what to inflate the recovered q_GtoI covariance by
init_dyn_inflation_vel: 100 # what to inflate the recovered v_IinG covariance by
init_dyn_inflation_bg: 10 # what to inflate the recovered bias_g covariance by
init_dyn_inflation_ba: 100 # what to inflate the recovered bias_a covariance by
init_dyn_min_rec_cond: 1e-12 # reciprocal condition number thresh for info inversion

init_dyn_bias_g: [ 0.0053, -0.0014, -0.0000 ] # initial gyroscope bias guess
init_dyn_bias_a: [ 0.1118,  0.1242,  0.3905 ] # initial accelerometer bias guess
# init_dyn_bias_g: [ 0.0, 0.0, 0.0 ] # initial gyroscope bias guess
# init_dyn_bias_a: [ 0.0, 0.0, 0.0 ] # initial accelerometer bias guess

# ==================================================================
# ==================================================================

record_timing_information: false # if we want to record timing information of the method
record_timing_filepath: "/tmp/traj_timing.txt" # https://docs.openvins.com/eval-timing.html#eval-ov-timing-flame

# if we want to save the simulation state and its diagional covariance
# use this with rosrun ov_eval error_simulation
save_total_state: false
filepath_est: "/tmp/ov_estimate.txt"
filepath_std: "/tmp/ov_estimate_std.txt"
filepath_gt: "/tmp/ov_groundtruth.txt"

# ==================================================================
# ==================================================================

# our front-end feature tracking parameters
# ! "real time" performance is not key so I increased here a lot
# we have a KLT and descriptor based (KLT is better implemented...)
use_klt: true # if true we will use KLT, otherwise use a ORB descriptor + robust matching
num_pts: 400 # number of points (per camera) we will extract and try to track
fast_threshold: 10 # threshold for fast extraction (warning: lower threshs can be expensive)
grid_x: 5 # extraction sub-grid count for horizontal direction (uniform tracking)
grid_y: 5 # extraction sub-grid count for vertical direction (uniform tracking)
min_px_dist: 15 # distance between features (features near each other provide less information)
knn_ratio: 0.65 # descriptor knn threshold for the top two descriptor matches
track_frequency: 30.0 # frequency we will perform feature tracking at (in frames per second / hertz)
# ! raw resolution is quite high. Found that this helps tracking
downsample_cameras: true # will downsample image in half if true
num_opencv_threads: 4 # -1: auto, 0-1: serial, >1: number of threads
histogram_method: "CLAHE" # NONE, HISTOGRAM, CLAHE

# aruco tag tracker for the system
# DICT_6X6_1000 from https://chev.me/arucogen/
use_aruco: false
num_aruco: 1024
downsize_aruco: true

# ==================================================================
# ==================================================================

# camera noises and chi-squared threshold multipliers
up_msckf_sigma_px: 1
up_msckf_chi2_multipler: 1
up_slam_sigma_px: 1
up_slam_chi2_multipler: 1
up_aruco_sigma_px: 1
up_aruco_chi2_multipler: 1

# masks for our images
use_mask: false

# imu and camera spacial-temporal
# imu config should also have the correct noise values
relative_config_imu: "vbr_imu_chain.yaml"
relative_config_imucam: "vbr_imucam_chain.yaml"

IMU Config

%YAML:1.0

imu0:
  T_i_b:
    - [1.0, 0.0, 0.0, 0.0]
    - [0.0, 1.0, 0.0, 0.0]
    - [0.0, 0.0, 1.0, 0.0]
    - [0.0, 0.0, 0.0, 1.0]
  
  # ! these are the raw values from dataset calib
  # accelerometer_noise_density: 0.001602090212250569  # [ m / s^2 / sqrt(Hz) ]   ( accel "white noise" )
  # accelerometer_random_walk: 1.4361954439703917e-05  # [ m / s^3 / sqrt(Hz) ].  ( accel bias diffusion )
  # gyroscope_noise_density: 0.00013361278165885748    # [ rad / s / sqrt(Hz) ]   ( gyro "white noise" )
  # gyroscope_random_walk: 3.1760920521174444e-06      # [ rad / s^2 / sqrt(Hz) ] ( gyro bias diffusion )

  # ! Practically speaking, try to increase the noise values to make the filter more robust to noise (lets try 10x)
  accelerometer_noise_density: 0.01602090212250569   # [ m / s^2 / sqrt(Hz) ]   ( accel "white noise" )
  accelerometer_random_walk: 1.4361954439703917e-04  # [ m / s^3 / sqrt(Hz) ].  ( accel bias diffusion )
  gyroscope_noise_density: 0.0013361278165885748     # [ rad / s / sqrt(Hz) ]   ( gyro "white noise" )
  gyroscope_random_walk: 3.1760920521174444e-05      # [ rad / s^2 / sqrt(Hz) ] ( gyro bias diffusion )

  rostopic: /imu/data
  time_offset: 0.0

  update_rate: 100.0
  # three different modes supported:
  # "calibrated" (same as "kalibr"), "kalibr", "rpng"
  model: "kalibr"
  # how to get from Kalibr imu.yaml result file:
  #   - Tw is imu0:gyroscopes:M:
  #   - R_IMUtoGYRO: is imu0:gyroscopes:C_gyro_i:
  #   - Ta is imu0:accelerometers:M:
  #   - R_IMUtoACC not used by Kalibr
  #   - Tg is imu0:gyroscopes:A:
  Tw:
    - [ 1.0, 0.0, 0.0 ]
    - [ 0.0, 1.0, 0.0 ]
    - [ 0.0, 0.0, 1.0 ]
  R_IMUtoGYRO:
    - [ 1.0, 0.0, 0.0 ]
    - [ 0.0, 1.0, 0.0 ]
    - [ 0.0, 0.0, 1.0 ]
  Ta:
    - [ 1.0, 0.0, 0.0 ]
    - [ 0.0, 1.0, 0.0 ]
    - [ 0.0, 0.0, 1.0 ]
  R_IMUtoACC:
    - [ 1.0, 0.0, 0.0 ]
    - [ 0.0, 1.0, 0.0 ]
    - [ 0.0, 0.0, 1.0 ]
  Tg:
    - [ 0.0, 0.0, 0.0 ]
    - [ 0.0, 0.0, 0.0 ]
    - [ 0.0, 0.0, 0.0 ]

Camera Config

%YAML:1.0

# ! values taken from the dataset calibration
cam0:
  T_imu_cam:
    - [ 0.037625047406, -0.007947906881, 0.999260319728, 0.108169265686]
    - [-0.999241088127, -0.010385921748, 0.037541715839, 0.245263320981]
    - [ 0.010079861425, -0.999914478046, -0.008332645918, -0.060574570772]
    - [ 0.000000000000, 0.000000000000, 0.000000000000, 1.000000000000]
  cam_overlaps: [1]
  camera_model: pinhole
  distortion_coeffs: [-0.15442153959026667, 0.22415792664801637, 0.00028611532170761544, -0.0011553024948397013]
  distortion_model: radtan
  intrinsics: [1265.4157426547047, 1259.125033829179, 649.1764873455104, 349.11875004082356]
  resolution: [1388, 700]
  rostopic: /camera_left/image_raw
  timeshift_cam_imu: 0.0014439866415333613
cam1:
  T_imu_cam:
    - [ 0.047533765306, 0.002997724519, 0.998865133441, 0.117518620394]
    - [ -0.998851236543, -0.005926227938, 0.047550889366, -0.235758955147]
    - [ 0.006062046927, -0.999977946492, 0.002712584915, -0.066699484195]
    - [ 0.000000000000, 0.000000000000, 0.000000000000, 1.000000000000]
  cam_overlaps: [0]
  camera_model: pinhole
  distortion_coeffs: [-0.13964919973164264, 0.3032246444678759, -0.0006841993380782645, -0.0017521213083201822]
  distortion_model: radtan
  intrinsics: [1270.980680475146, 1266.6167801504557, 692.2080529147544, 346.53773474480096]
  resolution: [1388, 700]
  rostopic: /camera_right/image_raw
  timeshift_cam_imu: 0.0015213183603639074

tianyilim · 2024-11-03T20:19:29Z

As an initial strategy, I wanted to reset the filter state, and use the last-known state as a initialization point.

Looking at #398, re-starting the ROS node is not an option for me.

I modified the Serial ROS1 bag reader code to attempt this, I'm not too sure if I did this properly -- but if someone could sanity check my impl, it would be greatly appreciated :)

https://github.com/tianyilim/open_vins/blob/3d7cec971e78a82aa1fd6f9c535e1b79b2b79630/ov_msckf/src/ros/ROS1Visualizer.cpp#L487

Essentially, I modified the ROS1Visualizer. After feeding a camera measurement, we can monitor the covariance and see if it has exceeded a threshold. If so, we trigger filter resetting

    // After processing, we need to check if the filter has diverged. We do this by a simple check of the covariance for now.
    std::shared_ptr<State> state = _app->get_state();
    std::vector<std::shared_ptr<Type>> statevars;
    statevars.push_back(state->_imu->pose()->p());
    statevars.push_back(state->_imu->pose()->q());
    Eigen::Matrix<double, 6, 6> covariance_posori = StateHelper::get_marginal_covariance(_app->get_state(), statevars);

    // Check if the covariance is too large
    const double frobnorm = covariance_posori.norm();
    PRINT_INFO(CYAN "@ %0.4f | Covariance Frobenius Norm: %.4f from VIOManager %s\n" RESET, message.timestamp, frobnorm,
               _app->name.c_str());
    
    // TODO to be tuned!
    constexpr double threshold_covariance = 1.0;
    if (frobnorm > threshold_covariance) {
      PRINT_ERROR("Covariance is too large! (%.3f) Resetting...\n", frobnorm);

      // Get rid of all features and clones in the state so that it is as "clean" as possible
      StateHelper::marginalize_all_clones(state);
      StateHelper::marginalize_all_slam(state);
      camera_queue.clear();

      // Naive way, but we need to reset the state
      reset_state = std::make_unique<Eigen::Matrix<double, 17, 1>>();
      reset_state->block(1, 0, 4, 1) = state->_imu->quat(); // Apart from position, we re-initialize with the curr state
      // reset_state->block(5, 0, 3, 1) = state->_imu->pos();  // cannot reset position so easily??
      reset_state->block(5, 0, 3, 1) = Eigen::Vector3d::Zero(); // Reset position
      reset_state->block(8, 0, 3, 1) = state->_imu->vel();     // take the current velocity
      reset_state->block(11, 0, 3, 1) = state->_imu->bias_g(); // take the current biases
      reset_state->block(14, 0, 3, 1) = state->_imu->bias_a(); //

      // Destroy the current application and create a new one to ensure all state is "reset"
      VioManagerOptions currParams = _app->get_params();
      _app.reset();
      _app = std::make_shared<VioManager>(currParams, "reset"+std::to_string(reset_counter++));
      PRINT_INFO(YELLOW "Reset VIOManager %s\n" RESET, _app->name.c_str());
    }

Knowing the previous state, we can then "abuse" the initialize_with_gt function. This is done here (only in stereo for now):

https://github.com/tianyilim/open_vins/blob/3d7cec971e78a82aa1fd6f9c535e1b79b2b79630/ov_msckf/src/ros/ROS1Visualizer.cpp#L583

  // check if we need to reinit
  if (reset_state) {
    (*reset_state)(0, 0) = timestamp;
    _app->initialize_with_gt(*reset_state);
    reset_state.reset(); // Clear this so we do not reinit again
    PRINT_INFO(YELLOW "Reinitialize VIOManager %s\n" RESET, _app->name.c_str());
    PRINT_DEBUG(YELLOW "Number of states after reset: IMU clones: %d, SLAM features: %d\n" RESET, _app->get_state()->_clones_IMU.size(),
               _app->get_state()->_features_SLAM.size());
  }

omeredemen · 2024-11-11T17:05:29Z

it is a good work thanks for sharing. i wanted to do a reset state like you did. but i haven't started yet.
can you share what was the result you get with your updates?

goldbattle · 2024-11-23T20:03:29Z

This might be the covariance is becoming ill-defined, or the statistical chi2 checks start to fail at that point. Could you take a look at that point and see if the filter stops getting feature updates (MSCKF and SLAM)?

I think one concern with your approach is that you will remove all the tracked features, so basically have non-continuous pose tracking. Probably a better method is to "update" with a fake GPS / mocap sensors to reduce the uncertainty of the system to remove this instability (if you want to avoid this issue with the global uncertainty being unbounded).

goldbattle added user-platform User has trouble running on their own platform. dataset Dataset issue request or issue labels Nov 23, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Running OpenVINS for very long durations #481

Running OpenVINS for very long durations #481

tianyilim commented Nov 2, 2024 •

edited

Loading

tianyilim commented Nov 3, 2024 •

edited

Loading

omeredemen commented Nov 11, 2024

goldbattle commented Nov 23, 2024

Running OpenVINS for very long durations #481

Running OpenVINS for very long durations #481

Comments

tianyilim commented Nov 2, 2024 • edited Loading

tianyilim commented Nov 3, 2024 • edited Loading

omeredemen commented Nov 11, 2024

goldbattle commented Nov 23, 2024

tianyilim commented Nov 2, 2024 •

edited

Loading

tianyilim commented Nov 3, 2024 •

edited

Loading