Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIGINT handling #476

Open
PhilippPolterauer opened this issue Sep 26, 2022 · 5 comments
Open

SIGINT handling #476

PhilippPolterauer opened this issue Sep 26, 2022 · 5 comments
Labels
bug Something isn't working enhancement New feature or request
Milestone

Comments

@PhilippPolterauer
Copy link
Contributor

Bug Description
when trying to stop a running launch file via CTRL+C the ros driver does not terminate cleanly.
We encountered this issue when writing a C++ plugin which utilizes RAII and found out that objects which are created inside the C++ plugin are not properly destructed.

Steps to Reproduce

  1. Install recent version
    1. mkdir -p ~/dev_ws/src/; cd ~/dev_ws/src; git clone --recursive https://github.com/cyberbotics/webots_ros2.git; cd ..
    2. colcon build --symlink-install
  2. Launch a webots simulation (e.g. tesla)
    source install/local_setup.bash; ros2 launch webots_ros2_tesla robot_launch.py
  3. press Ctrl-C
  4. observe error below
philpolt@vanlinnb12001:~/ros/webots_shutdown_issue$ ros2 launch webots_ros2_tesla robot_launch.py 
[INFO] [launch]: All log files can be found below /home/philpolt/.ros/log/2022-09-26-10-19-37-228324-vanlinnb12001-413857
[INFO] [launch]: Default logging verbosity is set to INFO
[INFO] [webots-1]: process started with pid [413860]
[INFO] [ros2_supervisor.py-2]: process started with pid [413862]
[INFO] [lane_follower-3]: process started with pid [413864]
[INFO] [driver-4]: process started with pid [413868]
[driver-4] Cannot connect to Webots instance on socket "/tmp/webots-255926/ipc/vehicle/extern", retrying in 0 second...
[ros2_supervisor.py-2] Cannot connect to Webots instance on socket "/tmp/webots-1234/ipc/Ros2Supervisor/extern", retrying in 0 second...
[driver-4] Cannot connect to Webots instance on socket "/tmp/webots-1234/ipc/vehicle/extern", retrying in 1 second...
[ros2_supervisor.py-2] Cannot connect to Webots instance on socket "/tmp/webots-1234/ipc/Ros2Supervisor/extern", retrying in 1 second...
[driver-4] Cannot connect to Webots instance on socket "/tmp/webots-1234/ipc/vehicle/extern", retrying in 2 seconds...
[ros2_supervisor.py-2] Cannot connect to Webots instance on socket "/tmp/webots-1234/ipc/Ros2Supervisor/extern", retrying in 2 seconds...
[driver-4] Cannot connect to Webots instance on socket "/tmp/webots-1234/ipc/vehicle/extern", retrying in 3 seconds...
[ros2_supervisor.py-2] Cannot connect to Webots instance on socket "/tmp/webots-1234/ipc/Ros2Supervisor/extern", retrying in 3 seconds...
^C[WARNING] [launch]: user interrupted with ctrl-c (SIGINT)
[driver-4] [INFO] [1664180389.155356030] [rclcpp]: signal_handler(signal_value=2)
[lane_follower-3] Traceback (most recent call last):
[lane_follower-3]   File "/home/philpolt/ros/webots_shutdown_issue/install/webots_ros2_tesla/lib/webots_ros2_tesla/lane_follower", line 11, in <module>
[lane_follower-3]     load_entry_point('webots-ros2-tesla', 'console_scripts', 'lane_follower')()
[lane_follower-3]   File "/home/philpolt/ros/webots_shutdown_issue/build/webots_ros2_tesla/webots_ros2_tesla/lane_follower.py", line 76, in main
[lane_follower-3]     rclpy.spin(follower)
[lane_follower-3]   File "/opt/ros/galactic/lib/python3.8/site-packages/rclpy/__init__.py", line 196, in spin
[lane_follower-3]     executor.spin_once()
[lane_follower-3]   File "/opt/ros/galactic/lib/python3.8/site-packages/rclpy/executors.py", line 704, in spin_once
[lane_follower-3]     handler, entity, node = self.wait_for_ready_callbacks(timeout_sec=timeout_sec)
[lane_follower-3]   File "/opt/ros/galactic/lib/python3.8/site-packages/rclpy/executors.py", line 690, in wait_for_ready_callbacks
[lane_follower-3]     return next(self._cb_iter)
[lane_follower-3]   File "/opt/ros/galactic/lib/python3.8/site-packages/rclpy/executors.py", line 588, in _wait_for_ready_callbacks
[lane_follower-3]     wait_set.wait(timeout_nsec)
[lane_follower-3] KeyboardInterrupt
[ERROR] [ros2_supervisor.py-2]: process has died [pid 413862, exit code -2, cmd '/home/philpolt/ros/webots_shutdown_issue/install/webots_ros2_driver/lib/webots_ros2_driver/ros2_supervisor.py --ros-args'].
[INFO] [driver-4]: process has finished cleanly [pid 413868]
[INFO] [webots-1]: process has finished cleanly [pid 413860]
[ERROR] [lane_follower-3]: process has died [pid 413864, exit code -2, cmd '/home/philpolt/ros/webots_shutdown_issue/install/webots_ros2_tesla/lib/webots_ros2_tesla/lane_follower --ros-args'].

Error:
[ERROR] [ros2_supervisor.py-2]: process has died [pid 413862, exit code -2, cmd '/home/philpolt/ros/webots_shutdown_issue/install/webots_ros2_driver/lib/webots_ros2_driver/ros2_supervisor.py --ros-args'].

Expected behavior
The signal SIGINT should be properly handled inside the driver node. The launch file process ros2_supervisor.py-2 should finish cleanly!

System

  • Webots Version: R2022b
  • ROS Version: galactic and humble
  • Operating System: Linux Ubuntu 20.04
  • Graphics Card: NVIDIA Quadro RTX 3000
@angel-ayala
Copy link
Contributor

This seems to be an issue with Ros2Supervisor main loop. Apparently, python plugins are forced to be stopped without proper exit handling. Similarly in #655 , when the Ros2Supervisor.__supervisor_step_callback enter the condition of wb_robot_step() == -1 here, the only return triggered to end the execution happens on WebotsNode C++ side only, leaving the Python side without proper stop and cleaning procedure.
In the side of ROS2 client, at first sight, can be a resource leak when the simulation is resetted because Nodes associated results are never removed and always reinstanciated at the Python side.

Any thoughts, @ygoumaz ?
I was tempting a workaround with a topic publish from Ros2Supervisor to notify the wb_robot_step() == -1 condition without success. Then I add the wb_robot_step() == -1 verification in the Python plugins, calling sys.exit(0) and only works inside step() finishing the driver process cleanly.

Other thing to try maybe can be add a shutdown() method to the plugins.
At the ROS2 side also have, lifecycle which can also be a good option to create a management for node states.

@angel-ayala
Copy link
Contributor

angel-ayala commented May 31, 2023

Adding the shutdown() methods seems feasible.

In PythonPlugin.cpp

void PythonPlugin::stop() {
    PyObject_CallMethod(mPyPlugin, "stop", "");
    std::cout << "PythonPlugin::stop" << std::endl;
//     PyErr_Print();
    std::cout << "PythonPlugin::stop::print" << std::endl;
    Py_Finalize();
    std::cout << "PythonPlugin::stop::finalize" << std::endl;
  }

the Python method

def stop(self):
    self.__node.get_logger().info('stopping')
    self.__node.destroy_node()
    sys.exit(0)

In WebotsNode.cpp

int WebotsNode::step() {
    if (gShutdownSignalReceived && !mWaitingForUrdfRobotToBeRemoved) {
      mRemoveUrdfRobotPublisher->publish(mRemoveUrdfRobotMessage);
      mWaitingForUrdfRobotToBeRemoved = true;
    }

    const int result = wb_robot_step(mStep);
    if (result == -1) {    
      // Check if the plugin is actually an instance of PythonPlugin
      for (std::shared_ptr<PluginInterface> plugin : mPlugins) {
        std::shared_ptr<webots_ros2_driver::PythonPlugin> pythonPlugin = std::dynamic_pointer_cast<webots_ros2_driver::PythonPlugin>(plugin);
        if(pythonPlugin) {
          std::cout << "stopssss" << std::endl;
          pythonPlugin->stop();  // stop only for python plugins
        }
      }
      return result;
    
    }
    for (std::shared_ptr<PluginInterface> plugin : mPlugins)
      plugin->step();

    return result;
  }

Using this code in Driver.cpp

std::shared_ptr<webots_ros2_driver::WebotsNode> node = std::make_shared<webots_ros2_driver::WebotsNode>(robotName);
  node->init();
  RCLCPP_INFO(node->get_logger(), "Controller successfully connected to robot in Webots simulation.");
  while (true) {
    if (node->step() == -1)
      break;
    rclcpp::spin_some(node);
  }
  std::cout << "outlook" << std::endl;
//   rclcpp::spin_some(node);
//   std::cout << "last spin" << std::endl;
  RCLCPP_INFO(node->get_logger(), "Controller successfully disconnected from robot in Webots simulation.");
  std::cout << "shutting down" << std::endl;
  rclcpp::shutdown();
  std::cout << "clean" << std::endl;
  wb_robot_cleanup();
  std::cout << "return" << std::endl;
  return 0;

However, the print trace reach until std::cout << "clean" << std::endl; before the process exit with code 1, with both Python plugins.
Apparently there is an unhandled event when wb_robot_cleanup(); is called.

UPDATED
Apparently, seems to be a twice call of the wb_robot_cleanup() function, called once at Robot.__del__ Python method, and at the end of Driver.cpp

@ygoumaz
Copy link
Contributor

ygoumaz commented Jun 2, 2023

Regarding issue #655, I will implement a shutdown handler in the launch file that will clean all other started nodes, excluding Webots, and then respawn them. However, before proceeding with this implementation in Iron, I am awaiting the merge of #760.

Regarding this issue, I believe your approach is correct. While I haven't examined it in depth yet, you could potentially invoke wb_robot_cleanup in the driver only if the plugin is not a Python plugin. Otherwise, we can adjust the Python API similarly to what was done for the constructor and wb_robot_init, thus avoiding a redundant call in Robot.del.

@angel-ayala
Copy link
Contributor

I will test the wb_robot_cleanup condition, initial idea is itering again the WebotsNode.plugin elements to correctly choose a shutdown method.

In this regard, the Python API seems to require a more in depth examination.
I developed a Gym environment using webots_ros2 and encountered that the info gathered at the environment side is delayed. Initial findings show the lack of spin_some() method at the side of rlcpy (the mantainers already know this), since the C++ ROS nodes appears to be faster than Python ROS nodes in using the DDS layer due at each step the driver call spin_some() in the C++ side and only spin_once(timeout_sec=0) in the Python side, this last requiring more time to finish all the ROS callbacks in one step.

@ygoumaz
Copy link
Contributor

ygoumaz commented Jul 26, 2023

Still related to #788.

@ygoumaz ygoumaz added this to the 2023.1.2 milestone Jul 26, 2023
@ygoumaz ygoumaz added bug Something isn't working enhancement New feature or request labels Jul 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request
Development

No branches or pull requests

3 participants