Distractor-Aware Tracking with Multi-Task and Dynamic Feature Learning

    https://doi.org/10.1142/S0218126621500316
    Cited by: 2 (Source: Crossref)

    Recently, deep trackers based on the siamese network have enjoyed increasing popularity in the tracking community. Generally, these trackers learn a high-level semantic embedding space for feature representation but lose low-level fine-grained details. Moreover, the learned high-level semantic features are not updated during online tracking, which results in tracking drift in the presence of target appearance variation and similar distractors. In this paper, we present a novel end-to-end trainable Convolutional Neural Network (CNN) based on the siamese network for distractor-aware tracking. It enhances target appearance representation in both the offline training stage and the online tracking stage. In the offline training stage, the network learns low-level fine-grained details and high-level coarse-grained semantics simultaneously in a multi-task learning framework. The low-level features, with better spatial resolution, are complementary to the semantic features and able to distinguish the foreground target from background distractors. In the online stage, the learned low-level features are fed into a correlation filter layer and updated in an interpolated manner to adaptively encode target appearance variation. The learned high-level features are fed into a cross-correlation layer without online update. Therefore, the proposed tracker benefits from both the adaptability of the fine-grained correlation filter and the generalization capability of the semantic embedding. Extensive experiments are conducted on the public OTB100 and UAV123 benchmark datasets. Our tracker achieves state-of-the-art performance while running at a real-time frame rate.
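    The sketch below is a minimal PyTorch-style illustration of the dual-branch design the abstract describes: a shared siamese backbone whose shallow features feed an adaptive, online-updated branch and whose deep features feed a fixed semantic cross-correlation branch. It is an assumption-laden sketch, not the authors' implementation; all names (DistractorAwareSiamese, update_rate, etc.) are hypothetical, and the adaptive branch is approximated by a template updated via linear interpolation, since the abstract does not specify the exact correlation filter formulation.

```python
# Illustrative sketch only; module names and hyperparameters are hypothetical,
# and the correlation-filter branch is approximated by an interpolated template.
import torch
import torch.nn as nn
import torch.nn.functional as F

def cross_correlate(template, search):
    # Slide the template feature map over the search-region feature map,
    # treating each sample's template as a convolution kernel (grouped conv).
    b, c, h, w = search.shape
    out = F.conv2d(search.view(1, b * c, h, w), template, groups=b)
    return out.view(b, 1, out.shape[-2], out.shape[-1])

class DistractorAwareSiamese(nn.Module):
    def __init__(self, update_rate=0.01):
        super().__init__()
        # Shallow layers: low-level, fine-grained features (higher resolution).
        self.low_level = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        )
        # Deeper layers: high-level, coarse-grained semantic features.
        self.high_level = nn.Sequential(
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.update_rate = update_rate   # interpolation factor for the online update
        self.low_template = None         # adaptive template, updated every frame
        self.high_template = None        # semantic template, fixed after initialization

    def init(self, target_patch):
        # Extract both templates once from the first-frame target patch.
        low = self.low_level(target_patch)
        self.low_template = low
        self.high_template = self.high_level(low)

    def track(self, search_patch):
        low_s = self.low_level(search_patch)
        high_s = self.high_level(low_s)
        # Fine-grained response: adaptive branch, sensitive to appearance change.
        resp_low = cross_correlate(self.low_template, low_s)
        # Semantic response: fixed branch, provides generalization.
        resp_high = cross_correlate(self.high_template, high_s)
        # Fuse the two response maps at a common resolution.
        resp_high = F.interpolate(resp_high, size=resp_low.shape[-2:],
                                  mode='bilinear', align_corners=False)
        return resp_low + resp_high

    def update(self, new_target_patch):
        # Interpolated (moving-average) update of the fine-grained template only;
        # the high-level semantic template is deliberately left unchanged.
        new_low = self.low_level(new_target_patch)
        self.low_template = ((1 - self.update_rate) * self.low_template
                             + self.update_rate * new_low)
```

    In this reading of the abstract, the fixed semantic template supplies the generalization capability of the embedding, while the interpolated update of the low-level template stands in for the adaptive correlation filter branch that encodes appearance variation online.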

    This paper was recommended by Regional Editor Tongquan Wei.