Abstract
Visual tracking of humans is one of the most important and active areas of machine vision and has attracted a great deal of attention in recent decades. Despite significant progress, RGB-based object tracking remains challenging in complex conditions such as low illumination, background clutter, and bad weather (rain, fog, smoke, etc.). Thermal cameras record the infrared radiation emitted by any object whose temperature is above absolute zero, so using thermal data alongside visible data gives the tracker more information about the scene and can improve its performance. These cameras can also capture images in the dark, and they are more robust than visible-spectrum cameras to changes in brightness and to shadows. With RGB-TIR data, the tracker can therefore rely on visible images under normal illumination and weather conditions and on thermal images in darkness and low illumination.
In this study, the proposed tracker uses two adapters, a general adapter and a modality adapter, to exploit visible and thermal images simultaneously. The general adapter takes both images and extracts the features common to the two modalities. The modality adapter consists of two branches, one per image, each of which encodes modality-specific information. These features are then concatenated, and a RoIAlign layer extracts the features of each region of interest. An instance adapter then takes these features and determines the location of the target. Results on recent challenging benchmarks show that the proposed tracking approach is more efficient than state-of-the-art trackers while achieving good speed and accuracy.
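The data flow described above can be illustrated with a minimal sketch in plain Python. It is not the tracker's actual implementation: the learned convolutional adapters are replaced by a toy per-pixel scaling, RoIAlign is simplified to one bilinear sample per output bin, the instance adapter is reduced to a mean-response score, and all function names, weights, and the choice to concatenate every general and modality-specific map along the channel axis are assumptions made only for illustration.

```python
from typing import List, Sequence, Tuple

Image = List[List[float]]                 # H x W, single channel for brevity
Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2) in pixel coordinates


def scale(image: Image, weight: float) -> Image:
    """Toy stand-in (assumption) for a learned convolutional layer."""
    return [[weight * v for v in row] for row in image]


def extract_features(rgb: Image, tir: Image) -> List[Image]:
    """Return the channel-wise concatenation of all adapter outputs."""
    shared_w = 0.5            # general adapter: one weight shared by both modalities
    rgb_w, tir_w = 1.0, 2.0   # modality adapter: one weight per branch
    general = [scale(rgb, shared_w), scale(tir, shared_w)]
    specific = [scale(rgb, rgb_w), scale(tir, tir_w)]
    return general + specific  # concatenation along the channel axis


def bilinear(channel: Image, x: float, y: float) -> float:
    """Bilinearly interpolate one channel at a real-valued position."""
    h, w = len(channel), len(channel[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    dx, dy = x - x0, y - y0
    top = channel[y0][x0] * (1 - dx) + channel[y0][x1] * dx
    bottom = channel[y1][x0] * (1 - dx) + channel[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(features: Sequence[Image], box: Box, out_size: int = 2) -> List[float]:
    """Simplified RoIAlign: one bilinear sample at the centre of each bin."""
    x1, y1, x2, y2 = box
    bin_w = (x2 - x1) / out_size
    bin_h = (y2 - y1) / out_size
    pooled = []
    for channel in features:
        for i in range(out_size):
            for j in range(out_size):
                pooled.append(
                    bilinear(channel, x1 + (j + 0.5) * bin_w, y1 + (i + 0.5) * bin_h)
                )
    return pooled


def locate_target(features: Sequence[Image], candidates: Sequence[Box]) -> Box:
    """Toy instance adapter (assumption): pick the box with the highest mean response."""
    def score(box: Box) -> float:
        pooled = roi_align(features, box)
        return sum(pooled) / len(pooled)
    return max(candidates, key=score)


# Usage: a warm target that is invisible in the (dark) RGB frame.
rgb = [[0.0] * 6 for _ in range(6)]
tir = [[0.0] * 6 for _ in range(6)]
for row in range(3, 6):
    for col in range(3, 6):
        tir[row][col] = 1.0

features = extract_features(rgb, tir)
candidates = [(0.0, 0.0, 2.0, 2.0), (3.0, 3.0, 5.0, 5.0)]
best = locate_target(features, candidates)
print(best)
```

In this toy setting the RGB channels carry no signal, so the selected box is determined entirely by the thermal channels, which mirrors the motivation for fusing the two modalities in darkness.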