TY - JOUR
T1 - OSLNet
T2 - Deep Small-Sample Classification with an Orthogonal Softmax Layer
AU - Li, Xiaoxu
AU - Chang, Dongliang
AU - Ma, Zhanyu
AU - Tan, Zheng-Hua
AU - Xue, Jing-Hao
AU - Cao, Jie
AU - Yu, Jingyi
AU - Guo, Jun
PY - 2020
Y1 - 2020
N2 - A deep neural network of multiple nonlinear layers forms a large function space, which can easily lead to overfitting when it encounters small-sample data. To mitigate overfitting in small-sample classification, learning more discriminative features from small-sample data is becoming a new trend. To this end, this paper aims to find a subspace of neural networks that can facilitate a large decision margin. Specifically, we propose the Orthogonal Softmax Layer (OSL), which makes the weight vectors in the classification layer remain orthogonal during both the training and test processes. The Rademacher complexity of a network using the OSL is only 1/K, where K is the number of classes, of that of a network using the fully connected classification layer, leading to a tighter generalization error bound. Experimental results demonstrate that the proposed OSL has better performance than the methods used for comparison on four small-sample benchmark datasets, as well as its applicability to large-sample datasets. Code is available at: https://github.com/dongliangchang/OSLNet.
AB - A deep neural network of multiple nonlinear layers forms a large function space, which can easily lead to overfitting when it encounters small-sample data. To mitigate overfitting in small-sample classification, learning more discriminative features from small-sample data is becoming a new trend. To this end, this paper aims to find a subspace of neural networks that can facilitate a large decision margin. Specifically, we propose the Orthogonal Softmax Layer (OSL), which makes the weight vectors in the classification layer remain orthogonal during both the training and test processes. The Rademacher complexity of a network using the OSL is only 1/K, where K is the number of classes, of that of a network using the fully connected classification layer, leading to a tighter generalization error bound. Experimental results demonstrate that the proposed OSL has better performance than the methods used for comparison on four small-sample benchmark datasets, as well as its applicability to large-sample datasets. Code is available at: https://github.com/dongliangchang/OSLNet.
KW - Deep neural network
KW - Orthogonal softmax layer
KW - overfitting
KW - small-sample classification
UR - http://www.scopus.com/inward/record.url?scp=85087547443&partnerID=8YFLogxK
U2 - 10.1109/TIP.2020.2990277
DO - 10.1109/TIP.2020.2990277
M3 - Journal article
SN - 1057-7149
VL - 29
SP - 6482
EP - 6495
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
M1 - 9088302
ER -