Real-world super-resolution of face-images from surveillance cameras

Andreas Aakerberg; Kamal Nasrollahi; Thomas B. Moeslund

doi:10.1049/ipr2.12359

Andreas Aakerberg^*, Kamal Nasrollahi, Thomas B. Moeslund

^*Corresponding author for this work

Research output: Contribution to journal › Journal article › Research › peer-review

13 Citations (Scopus)

64 Downloads (Pure)

Abstract

Most existing face image Super-Resolution (SR) methods assume that the Low-Resolution (LR) images were artificially downsampled from High-Resolution (HR) images with bicubic interpolation. This operation changes the natural image characteristics and reduces noise. Hence, SR methods trained on such data most often fail to produce good results when applied to real LR images. To solve this problem, a novel framework for the generation of realistic LR/HR training pairs is proposed. The framework estimates realistic blur kernels, noise distributions, and JPEG compression artifacts to generate LR images with image characteristics similar to those in the source domain. This allows an SR model to be trained using high-quality face images as Ground-Truth (GT). For better perceptual quality, a Generative Adversarial Network (GAN) based SR model is used, where the commonly used VGG-loss [1] is exchanged with LPIPS-loss [2]. Experimental results on both real and artificially corrupted face images show that our method results in more detailed reconstructions with less noise compared to the existing State-of-the-Art (SoTA) methods. In addition, it is shown that the traditional non-reference Image Quality Assessment (IQA) methods fail to capture this improvement, and it is demonstrated that the more recent NIMA metric [3] correlates better with human perception via Mean Opinion Rank (MOR).
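
The degradation idea in the abstract (blur, downsample, add noise) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper estimates blur kernels and noise distributions from real images, whereas the sketch below assumes a fixed Gaussian kernel and zero-mean Gaussian noise as stand-ins, and it omits the JPEG compression step. All function names and parameter values (`gaussian_kernel`, `blur`, `degrade`, `scale=4`, `sigma=1.5`, `noise_std=0.02`) are hypothetical and chosen only for illustration.

```python
import math
import random


def gaussian_kernel(size, sigma):
    """Return a normalized size x size Gaussian kernel (size must be odd).

    Assumption: a fixed Gaussian stands in for a kernel estimated from real data.
    """
    half = size // 2
    kernel = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
               for x in range(-half, half + 1)]
              for y in range(-half, half + 1)]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def blur(image, kernel):
    """Filter a 2D grayscale image (list of rows) with the kernel.

    Pixels outside the image are replaced by the nearest border pixel.
    """
    h, w = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky, row in enumerate(kernel):
                for kx, weight in enumerate(row):
                    sy = min(max(y + ky - half, 0), h - 1)
                    sx = min(max(x + kx - half, 0), w - 1)
                    acc += weight * image[sy][sx]
            out[y][x] = acc
    return out


def degrade(hr, scale=4, kernel_size=5, sigma=1.5, noise_std=0.02, seed=0):
    """Create an LR image from an HR image with values in [0, 1].

    Steps: blur, keep every `scale`-th pixel, add Gaussian noise, clip to [0, 1].
    Assumption: Gaussian noise replaces the noise distribution estimated
    from real LR images; JPEG compression is not simulated here.
    """
    rng = random.Random(seed)
    blurred = blur(hr, gaussian_kernel(kernel_size, sigma))
    lr = [row[::scale] for row in blurred[::scale]]
    return [[min(max(v + rng.gauss(0.0, noise_std), 0.0), 1.0) for v in row]
            for row in lr]


# Usage: a 16x16 checkerboard stands in for an HR face image.
hr = [[float((x // 4 + y // 4) % 2) for x in range(16)] for y in range(16)]
lr = degrade(hr)
print(len(lr), len(lr[0]))
```

Pairing each `hr` image with its `lr` output gives an LR/HR training pair of the kind the abstract describes. How realistic such pairs are depends on how well the kernel and noise model match the source domain, which is why the paper estimates them rather than fixing them as this sketch does.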

Original language	English
Journal	IET Image Processing
Volume	16
Issue number	2
Pages (from-to)	442-452
Number of pages	11
ISSN	1751-9659
DOIs	https://doi.org/10.1049/ipr2.12359
Publication status	Published - Feb 2022

Bibliographical note

Publisher Copyright:
© 2021 The Authors. IET Image Processing published by John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology

Access to Document

10.1049/ipr2.12359 (Licence: CC BY 4.0)

Open Access article: Final published version, 5.56 MB (Licence: CC BY 4.0)

Cite this

@article{31100fdba94b4e59825d16511b7f1a61,

title = "Real-world super-resolution of face-images from surveillance cameras",

abstract = "Most existing face image Super-Resolution (SR) methods assume that the Low-Resolution (LR) images were artificially downsampled from High-Resolution (HR) images with bicubic interpolation. This operation changes the natural image characteristics and reduces noise. Hence, SR methods trained on such data most often fail to produce good results when applied to real LR images. To solve this problem, a novel framework for the generation of realistic LR/HR training pairs is proposed. The framework estimates realistic blur kernels, noise distributions, and JPEG compression artifacts to generate LR images with similar image characteristics as the ones in the source domain. This allows to train an SR model using high-quality face images as Ground-Truth (GT). For better perceptual quality, a Generative Adversarial Network (GAN) based SR model is used, where the commonly used VGG-loss [1] is exchanged with LPIPS-loss [2]. Experimental results on both real and artificially corrupted face images show that our method results in more detailed reconstructions with less noise compared to the existing State-of-the-Art (SoTA) methods. In addition, it is shown that the traditional non-reference Image Quality Assessment (IQA) methods fail to capture this improvement and demonstrate that the more recent NIMA metric [3] correlates better with human perception via Mean Opinion Rank (MOR).",

author = "Andreas Aakerberg and Kamal Nasrollahi and Moeslund, {Thomas B.}",

note = "Publisher Copyright: {\textcopyright} 2021 The Authors. IET Image Processing published by John Wiley \& Sons Ltd on behalf of The Institution of Engineering and Technology",

year = "2022",

month = feb,

doi = "10.1049/ipr2.12359",

language = "English",

volume = "16",

pages = "442--452",

journal = "IET Image Processing",

issn = "1751-9659",

publisher = "Institution of Engineering and Technology",

number = "2",

}

TY - JOUR

T1 - Real-world super-resolution of face-images from surveillance cameras

AU - Aakerberg, Andreas

AU - Nasrollahi, Kamal

AU - Moeslund, Thomas B.

PY - 2022/2

Y1 - 2022/2

N2 - Most existing face image Super-Resolution (SR) methods assume that the Low-Resolution (LR) images were artificially downsampled from High-Resolution (HR) images with bicubic interpolation. This operation changes the natural image characteristics and reduces noise. Hence, SR methods trained on such data most often fail to produce good results when applied to real LR images. To solve this problem, a novel framework for the generation of realistic LR/HR training pairs is proposed. The framework estimates realistic blur kernels, noise distributions, and JPEG compression artifacts to generate LR images with similar image characteristics as the ones in the source domain. This allows to train an SR model using high-quality face images as Ground-Truth (GT). For better perceptual quality, a Generative Adversarial Network (GAN) based SR model is used, where the commonly used VGG-loss [1] is exchanged with LPIPS-loss [2]. Experimental results on both real and artificially corrupted face images show that our method results in more detailed reconstructions with less noise compared to the existing State-of-the-Art (SoTA) methods. In addition, it is shown that the traditional non-reference Image Quality Assessment (IQA) methods fail to capture this improvement and demonstrate that the more recent NIMA metric [3] correlates better with human perception via Mean Opinion Rank (MOR).

AB - Most existing face image Super-Resolution (SR) methods assume that the Low-Resolution (LR) images were artificially downsampled from High-Resolution (HR) images with bicubic interpolation. This operation changes the natural image characteristics and reduces noise. Hence, SR methods trained on such data most often fail to produce good results when applied to real LR images. To solve this problem, a novel framework for the generation of realistic LR/HR training pairs is proposed. The framework estimates realistic blur kernels, noise distributions, and JPEG compression artifacts to generate LR images with similar image characteristics as the ones in the source domain. This allows to train an SR model using high-quality face images as Ground-Truth (GT). For better perceptual quality, a Generative Adversarial Network (GAN) based SR model is used, where the commonly used VGG-loss [1] is exchanged with LPIPS-loss [2]. Experimental results on both real and artificially corrupted face images show that our method results in more detailed reconstructions with less noise compared to the existing State-of-the-Art (SoTA) methods. In addition, it is shown that the traditional non-reference Image Quality Assessment (IQA) methods fail to capture this improvement and demonstrate that the more recent NIMA metric [3] correlates better with human perception via Mean Opinion Rank (MOR).

UR - http://www.scopus.com/inward/record.url?scp=85117888761&partnerID=8YFLogxK

U2 - 10.1049/ipr2.12359

DO - 10.1049/ipr2.12359

M3 - Journal article

AN - SCOPUS:85117888761

SN - 1751-9659

VL - 16

SP - 442

EP - 452

JO - IET Image Processing

JF - IET Image Processing

IS - 2

ER -
