Abstract
Visual object tracking is one of the important and actively studied areas in machine vision, with applications in motion analysis, augmented reality, vehicle navigation, and robotics. Although several tracking approaches with successful results have been proposed over the last decade, object tracking is still an open research area. In this thesis, two methods are proposed to address the problem of occlusion in the tracking process. In the first method, a robust visual tracking method is introduced that exploits the relationships of targets in adjacent frames using patchwise joint sparse representation. Two sets of overlapping patches with different sizes are extracted from the target candidates to construct two dictionaries under a joint sparse representation. The structural sparse appearance model built on this representation provides two advantages: it accounts for the correlation of target patches over time, and it generates local features of the target thoroughly. Furthermore, the positions of the candidate patches and their occlusion levels are used simultaneously to obtain the final likelihood of the target candidates. The second proposed method, based on deep learning, decomposes the tracking problem into localization and classification tasks. The localization network exploits the information in the current frame and provides an additional location to enhance object tracking. A Siamese network is employed to find the target among candidates sampled close to the target location in the previous frame, as well as the location estimated by the localization network in the current frame. The results of both proposed methods on recent challenging benchmarks show that the proposed trackers perform favorably against state-of-the-art trackers. On the OS and DP metrics of the OPE criterion, the sparse tracker achieves improvements of 1.3% and 3.3% over the other trackers, respectively, and the second proposed tracker achieves improvements of 0.4% and 0.1%, respectively.
Keywords: visual object tracking, occlusion, sparse representation, deep learning
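As a rough illustration of the patchwise sparse appearance model summarized above, the sketch below scores a single target candidate by reconstructing its overlapping patches over a dictionary of template patches. It is a simplified stand-in rather than the thesis implementation: it applies an ordinary per-patch l1 penalty (scikit-learn's Lasso) instead of the joint sparse formulation, and the patch sizes, function names, and Gaussian-style likelihood are illustrative assumptions.

# Minimal sketch (not the thesis implementation) of scoring one target
# candidate by patchwise sparse reconstruction over template patches.
import numpy as np
from sklearn.linear_model import Lasso


def extract_patches(image, patch_size, stride):
    """Collect overlapping patches as flattened vectors."""
    h, w = image.shape
    patches = []
    for y in range(0, h - patch_size + 1, stride):
        for x in range(0, w - patch_size + 1, stride):
            patches.append(image[y:y + patch_size, x:x + patch_size].ravel())
    return np.array(patches)  # shape: (n_patches, patch_size**2)


def candidate_likelihood(candidate, template_dict, alpha=0.01):
    """Small summed reconstruction error -> high likelihood.

    `template_dict` has shape (n_atoms, patch_dim); each candidate patch is
    coded over it with a plain l1 penalty, a simplification of the joint
    sparse representation described in the abstract.
    """
    patches = extract_patches(candidate, patch_size=8, stride=4)
    total_error = 0.0
    for p in patches:
        lasso = Lasso(alpha=alpha, max_iter=2000)
        lasso.fit(template_dict.T, p)       # min ||p - D^T a||^2 + alpha*||a||_1
        recon = template_dict.T @ lasso.coef_
        total_error += np.sum((p - recon) ** 2)
    return np.exp(-total_error)             # Gaussian-style likelihood


# Usage with random arrays standing in for real image crops.
rng = np.random.default_rng(0)
template = rng.random((32, 32))
candidate = template + 0.05 * rng.standard_normal((32, 32))
D = extract_patches(template, patch_size=8, stride=4)  # template patches as atoms
print(candidate_likelihood(candidate, D))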
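Similarly, a minimal sketch of the Siamese matching step in the second method: a shared embedding network processes the template and the candidate crops (those sampled near the previous target location plus the extra location proposed by the localization network), and the most similar candidate is selected. The architecture, feature sizes, and cosine-similarity scoring are assumptions for illustration, not the network described in the thesis.

# Minimal sketch (not the thesis network) of Siamese candidate selection.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SiameseEmbed(nn.Module):
    """Shared convolutional branch applied to both template and candidates."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )

    def forward(self, x):
        return self.features(x).flatten(1)   # (N, 64) embedding


def select_candidate(template, candidates, net):
    """Return the index of the candidate most similar to the template.

    `candidates` stacks crops sampled around the previous target position
    plus the location proposed by the localization network.
    """
    with torch.no_grad():
        z = F.normalize(net(template), dim=1)      # (1, 64)
        x = F.normalize(net(candidates), dim=1)    # (K, 64)
        scores = (x @ z.T).squeeze(1)              # cosine similarities
    return int(scores.argmax()), scores


# Usage with random tensors standing in for real crops.
net = SiameseEmbed().eval()
template = torch.rand(1, 3, 64, 64)
candidates = torch.rand(8, 3, 64, 64)
best, scores = select_candidate(template, candidates, net)
print(best, scores.shape)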