Residual memory inference network for regression tracking with weighted gradient harmonized loss

Huanlong Zhang; Jiapeng Zhang; Guohao Nie; Jilin Hu; W. J.(Chris) Zhang

doi:10.1016/j.ins.2022.03.047

Corresponding author: Jilin Hu

Publication: Contribution to journal › Journal article › Research › peer-review

6 Citations (Scopus)

Abstract

Recently, memory mechanisms have been widely adopted in target tracking. However, these trackers struggle to balance the stability of long-term memory with the plasticity of short-term memory through an elegant and efficient mechanism. A residual memory inference network (RMIT) is proposed to exploit the history of target states and the most recent visual features. Specifically, RMIT consists of a base layer and a residual memory layer that synergize short- and long-term memories. The base layer can be regarded as a Discriminative Correlation Filter (DCF) reformulation that maintains the short-term memory to accommodate rapid appearance changes. The residual memory layer extends residual learning from the spatial domain to the spatio-temporal domain via ConvLSTM to obtain long-term memory of the target appearance. To avoid model degradation due to sample imbalance, we introduce a weighted gradient harmonized loss to improve the discrimination of the tracker. Response scores then serve as the basis of an adaptive learning strategy to ensure the reliability of memory updates. The proposed method performs favorably against several advanced methods and has been extensively validated on six benchmark datasets: OTB-50/100, TC-128, UAV-123, and VOT-2016/2018.
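The abstract names a weighted gradient harmonized loss but does not give its form. The sketch below is not the authors' loss; it only illustrates the general gradient-harmonizing idea under stated assumptions: each sample's gradient norm is approximated by the absolute regression error, errors are grouped into equal-width bins, and samples in crowded bins are down-weighted. The function names `ghm_weights` and `ghm_l2_loss`, the bin count, and the use of a squared-error base loss are all hypothetical choices made for illustration.

```python
def ghm_weights(preds, labels, bins=10):
    """Per-sample weights inversely proportional to gradient density.

    Assumption: |pred - label|, clipped to [0, 1], stands in for the
    gradient norm of each sample.
    """
    if not preds or len(preds) != len(labels):
        raise ValueError("preds and labels must be non-empty and equal length")
    n = len(preds)
    grad_norms = [min(abs(p - y), 1.0) for p, y in zip(preds, labels)]
    # Assign each sample to an equal-width bin over [0, 1].
    bin_idx = [min(int(g * bins), bins - 1) for g in grad_norms]
    counts = [0] * bins
    for i in bin_idx:
        counts[i] += 1
    nonempty = sum(1 for c in counts if c > 0)
    # Samples in densely populated bins receive smaller weights;
    # dividing by the number of non-empty bins makes the weights sum to n.
    return [n / (counts[i] * nonempty) for i in bin_idx]


def ghm_l2_loss(preds, labels, bins=10):
    """Mean squared error reweighted by the density-based weights above."""
    weights = ghm_weights(preds, labels, bins)
    total = sum(w * (p - y) ** 2 for w, p, y in zip(weights, preds, labels))
    return total / len(preds)
```

For example, with predictions `[0.02, 0.03, 0.01, 0.04, 0.9]` against all-zero labels, the four low-error samples share one bin and each receive weight 0.625, while the single high-error sample receives weight 2.5, so the many easy samples no longer dominate the loss.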

Original language	English
Journal	Information Sciences
Volume	597
Pages (from-to)	105-124
Number of pages	20
ISSN	0020-0255
DOI	https://doi.org/10.1016/j.ins.2022.03.047
Status	Published - Jun 2022

Bibliographical note

Publisher Copyright:
© 2022 Elsevier Inc.

Citation formats

@article{8512be19685c4937814e1d7fb7ea0d5e,

title = "Residual memory inference network for regression tracking with weighted gradient harmonized loss",

abstract = "Recently, the memory mechanism has been widely implemented in target tracking. However, these trackers hardly balance the stability of long-term memory with the plasticity of short-term memory through an elegant and efficient mechanism. A residual memory inference network (RMIT) is proposed to exploit the history of target states and last visual features. Specifically, RMIT consists of a base layer and a residual memory layer by synergizing short-and long-term memories. The base layer can be regarded as Discriminative Correlation Filter (DCF) reformulation that maintains the short-term memory to accommodate rapid appearance changes. The residual memory layer can extend residual learning from the spatial domain to the Spatio-temporal domain via ConvLSTM to obtain long-term memory of the target appearance. To avoid model degradation due to sample imbalance, we introduce a weighted gradient harmonized loss to improve the discrimination of the tracker. Then, response scores can be served as a basis of the adaptive learning strategy to ensure the reliability of memory updates. The proposed method performs favorably and has been extensively validated on six benchmark datasets, including OTB-50/100, TC-128, UAV-123, and VOT-2016/2018 against several advanced methods.",

keywords = "Long-short term memory, Residual network, Visual tracking",

author = "Huanlong Zhang and Jiapeng Zhang and Guohao Nie and Jilin Hu and Zhang, {W. J.(Chris)}",

note = "Funding Information: This work was supported by the National Natural Science Foundation of China under Grant (61873246, 62072416, 62102373, 61806181, 62006213), Program for Science \& Technology Innovation Talents in Universities of Henan Province, China (21HASTIT028), Natural Science Foundation of Henan Province, China (202300410495) and Zhongyuan Science and Technology Innovation Leadership Program, China (214200510026). Funding Information: Huanlong Zhang received the Ph.D. degree from the School of Aeronautics and Astronautics, Shanghai Jiao Tong University, China, in 2015. He is currently an Associate Professor with the College of Electric and Information Engineering, Zhengzhou University of Light Industry, Henan, Zhengzhou, China. His research has been funded by the National Natural Science Foundation of China (NSFC), the Key Science and Technology. Henan Province et al. He has published more than 40 technical articles in refereed journals and conference proceedings. His research interests include pattern recognition, machine learning, image processing, computer vision, and intelligent human-machine systems. Publisher Copyright: {\textcopyright} 2022 Elsevier Inc.",

year = "2022",

month = jun,

doi = "10.1016/j.ins.2022.03.047",

language = "English",

volume = "597",

pages = "105--124",

journal = "Information Sciences",

issn = "0020-0255",

publisher = "Elsevier",

}

TY - JOUR

T1 - Residual memory inference network for regression tracking with weighted gradient harmonized loss

AU - Zhang, Huanlong

AU - Zhang, Jiapeng

AU - Nie, Guohao

AU - Hu, Jilin

AU - Zhang, W. J.(Chris)

N1 - Funding Information: This work was supported by the National Natural Science Foundation of China under Grant (61873246, 62072416, 62102373, 61806181, 62006213), Program for Science & Technology Innovation Talents in Universities of Henan Province, China (21HASTIT028), Natural Science Foundation of Henan Province, China (202300410495) and Zhongyuan Science and Technology Innovation Leadership Program, China (214200510026). Funding Information: Huanlong Zhang received the Ph.D. degree from the School of Aeronautics and Astronautics, Shanghai Jiao Tong University, China, in 2015. He is currently an Associate Professor with the College of Electric and Information Engineering, Zhengzhou University of Light Industry, Henan, Zhengzhou, China. His research has been funded by the National Natural Science Foundation of China (NSFC), the Key Science and Technology. Henan Province et al. He has published more than 40 technical articles in refereed journals and conference proceedings. His research interests include pattern recognition, machine learning, image processing, computer vision, and intelligent human-machine systems. Publisher Copyright: © 2022 Elsevier Inc.

PY - 2022/6

Y1 - 2022/6

N2 - Recently, the memory mechanism has been widely implemented in target tracking. However, these trackers hardly balance the stability of long-term memory with the plasticity of short-term memory through an elegant and efficient mechanism. A residual memory inference network (RMIT) is proposed to exploit the history of target states and last visual features. Specifically, RMIT consists of a base layer and a residual memory layer by synergizing short-and long-term memories. The base layer can be regarded as Discriminative Correlation Filter (DCF) reformulation that maintains the short-term memory to accommodate rapid appearance changes. The residual memory layer can extend residual learning from the spatial domain to the Spatio-temporal domain via ConvLSTM to obtain long-term memory of the target appearance. To avoid model degradation due to sample imbalance, we introduce a weighted gradient harmonized loss to improve the discrimination of the tracker. Then, response scores can be served as a basis of the adaptive learning strategy to ensure the reliability of memory updates. The proposed method performs favorably and has been extensively validated on six benchmark datasets, including OTB-50/100, TC-128, UAV-123, and VOT-2016/2018 against several advanced methods.

AB - Recently, the memory mechanism has been widely implemented in target tracking. However, these trackers hardly balance the stability of long-term memory with the plasticity of short-term memory through an elegant and efficient mechanism. A residual memory inference network (RMIT) is proposed to exploit the history of target states and last visual features. Specifically, RMIT consists of a base layer and a residual memory layer by synergizing short-and long-term memories. The base layer can be regarded as Discriminative Correlation Filter (DCF) reformulation that maintains the short-term memory to accommodate rapid appearance changes. The residual memory layer can extend residual learning from the spatial domain to the Spatio-temporal domain via ConvLSTM to obtain long-term memory of the target appearance. To avoid model degradation due to sample imbalance, we introduce a weighted gradient harmonized loss to improve the discrimination of the tracker. Then, response scores can be served as a basis of the adaptive learning strategy to ensure the reliability of memory updates. The proposed method performs favorably and has been extensively validated on six benchmark datasets, including OTB-50/100, TC-128, UAV-123, and VOT-2016/2018 against several advanced methods.

KW - Long-short term memory

KW - Residual network

KW - Visual tracking

UR - http://www.scopus.com/inward/record.url?scp=85126537045&partnerID=8YFLogxK

U2 - 10.1016/j.ins.2022.03.047

DO - 10.1016/j.ins.2022.03.047

M3 - Journal article

AN - SCOPUS:85126537045

SN - 0020-0255

VL - 597

SP - 105

EP - 124

JO - Information Sciences

JF - Information Sciences

ER -
