Camera-Based Human Tracking and Following Algorithm for Mobile Robots
Autonomous ground vehicle (AGV) following system using CNNs and Range Imaging
Overview
Developed a camera-based human tracking pipeline that lets airport robots be relocated safely and efficiently. The pipeline detects and tracks people in real time using a CNN and a 3D depth camera, and uses the tracked person's position to create a dynamic path plan.
Technical Implementation
⚠️ Note on Source Code
Due to employer confidentiality, source code cannot be shared. This web page contains a sanitized technical overview and demo materials.
- ROS2 node: Developed a ROS2 node that subscribes to an Intel RealSense D455's RGB and depth streams, along with the camera info message that carries the camera intrinsics
- Tracking model: Ran Ultralytics' YOLO11 in tracking mode on the RGB stream; the node publishes an annotated image and an ENU pose
- Simulation: Created a Python script to generate custom Gazebo SDF worlds for simulation
- Deployment: Built and optimized CUDA-enabled Dockerfiles to enable GPU acceleration on an NVIDIA Jetson Orin NX
- Operator UI: Built an app using HTML and JavaScript to adjust parameters, toggle tracking, and visualize the person currently being tracked in real time
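Since the source code cannot be shared, the following is only a minimal sketch of how a detected pixel and its depth reading might be turned into a 3D point using the camera intrinsics. It assumes an ideal pinhole model with no lens distortion and a depth image already aligned to the RGB image. The `Intrinsics` class, the function names, and the numeric values are hypothetical and not taken from the original node. The final axis swap follows the common ROS convention of an optical frame (x right, y down, z forward) versus a body frame (x forward, y left, z up); how the original node maps this into its ENU pose is not stated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole parameters (hypothetical container for fx, fy, cx, cy from the camera info K matrix)."""
    fx: float  # focal length in pixels, x axis
    fy: float  # focal length in pixels, y axis
    cx: float  # principal point, x coordinate in pixels
    cy: float  # principal point, y coordinate in pixels


def deproject(u, v, depth_m, k):
    """Back-project pixel (u, v) with depth in metres into the camera optical frame."""
    x = (u - k.cx) * depth_m / k.fx  # positive to the right of the image centre
    y = (v - k.cy) * depth_m / k.fy  # positive below the image centre
    return (x, y, depth_m)


def optical_to_body(point):
    """Re-express an optical-frame point (x right, y down, z forward)
    in a body-style frame (x forward, y left, z up)."""
    x, y, z = point
    return (z, -x, -y)


if __name__ == "__main__":
    # Illustrative intrinsics only; real values come from the camera info message.
    k = Intrinsics(fx=600.0, fy=600.0, cx=320.0, cy=240.0)
    # Example: centre pixel of a person's bounding box, 2 m away.
    print(optical_to_body(deproject(620, 240, 2.0, k)))
```

In practice the depth value would more likely be a robust statistic (for example, a median over part of the bounding box) than a single pixel, since single depth readings can be missing or noisy.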
Key Features
- Multi-person tracking: Distinguishes between multiple people in a scene
- Real-time inference: Detections are published at 10 Hz with GPU acceleration enabled
- Following behavior: Maintains a fixed distance between the robot and the tracked person, similar to adaptive cruise control
- Navigation integration: Integrates with the ROS navigation stack
- Validation: Tested in simulation and on the physical robot
- Tunable parameters: Tracking parameters can be adjusted before tracking is toggled
- Live visualization: The operator UI shows the person being tracked in real time
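The fixed-distance following behavior can be illustrated with a simple proportional controller. This is a swapped-in technique for illustration, not the original implementation, which may instead send goals to the navigation stack. The function names, gains, speed limits, and the 1.5 m following distance are all hypothetical. The target position is assumed to be expressed in the robot's body frame (x forward, y left).

```python
import math


def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def follow_command(target_x, target_y, follow_dist=1.5,
                   k_lin=0.8, k_ang=1.5, max_lin=1.0, max_ang=1.0):
    """Return (linear, angular) velocity for following a target.

    All default gains and limits are illustrative assumptions.
    """
    distance = math.hypot(target_x, target_y)  # range to the person in metres
    bearing = math.atan2(target_y, target_x)   # angle to the person, positive to the left
    # Drive forward only when farther away than the desired gap; never reverse.
    linear = clamp(k_lin * (distance - follow_dist), 0.0, max_lin)
    # Turn toward the person, with a saturated turn rate.
    angular = clamp(k_ang * bearing, -max_ang, max_ang)
    return linear, angular


if __name__ == "__main__":
    # Person 2.5 m straight ahead: the robot should move forward without turning.
    print(follow_command(2.5, 0.0))
```

Clamping the linear command at zero means the robot stops rather than backs up when the person steps closer than the following distance, which is one conservative choice for operation around people.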
Results & Future Work
The system worked both in simulation and on the real robot. Planned enhancements include gesture-based control, which would remove the dependence on a carried mobile device and enable more intuitive human-robot interaction. Additionally, facial recognition would allow the system to identify authorized personnel so that only approved people can be followed, improving overall security.
Stack