Low Delay Robust Audio Coding by Noise Shaping, Fractional Sampling, and Source Prediction

Jan Østergaard

doi:10.1109/DCC50243.2021.00035

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

4 Citations (Scopus)

Abstract

It was recently shown that the combination of source prediction, two-times oversampling, and noise shaping can be used to obtain a robust (multiple-description) audio coding framework for networks with packet loss probabilities less than 10%. Specifically, it was shown that audio signals could be encoded into two descriptions (packets), which were separately sent over a communication channel. Each description yields a desired performance by itself, and when they are combined, the performance is improved. This paper extends the previous work to an arbitrary number of descriptions (packets) by using fractional oversampling and a new decoding principle. We demonstrate that, due to source aliasing, existing MSE-optimized reconstruction rules from noisy sampled data perform poorly from a perceptual point of view. A simple reconstruction rule is proposed that improves the PEAQ objective difference grades (ODG) by more than 2 points. The proposed audio coder enables low-delay, high-quality audio streaming on networks with late packet arrivals or packet losses. With a coding delay of 2.5 ms and a total bitrate of 300 kbps, it is demonstrated that mean PEAQ ODGs around -0.65 can be obtained for 48 kHz (mono) music (pop & rock) and packet loss probabilities of 20%.

Original language	English
Title of host publication	Proceedings - DCC 2021 : 2021 Data Compression Conference
Editors	Ali Bilgin, Michael W. Marcellin, Joan Serra-Sagrista, James A. Storer
Number of pages	10
Publisher	IEEE Signal Processing Society
Publication date	2021
Pages	273-282
Article number	9418676
ISBN (Print)	978-1-6654-4785-0
ISBN (Electronic)	978-1-6654-0333-7
DOIs	https://doi.org/10.1109/DCC50243.2021.00035
Publication status	Published - 2021
Event	2021 Data Compression Conference (DCC) - Snowbird, United States Duration: 23 Mar 2021 → 26 Mar 2021

Conference

Conference	2021 Data Compression Conference (DCC)
Country/Territory	United States
City	Snowbird
Period	23/03/2021 → 26/03/2021

Series	Data Compression Conference. Proceedings
ISSN	1068-0314

Keywords

Multiple descriptions
audio coding
fractional sampling
low delay
noise shaping
source predictions

Access to Document

10.1109/DCC50243.2021.00035

Cite this

@inproceedings{e5811b10240044a48477303dfb87ca9c,

title = "Low Delay Robust Audio Coding by Noise Shaping, Fractional Sampling, and Source Prediction",

abstract = "It was recently shown that the combination of source prediction, two-times oversampling, and noise shaping can be used to obtain a robust (multiple-description) audio coding framework for networks with packet loss probabilities less than 10%. Specifically, it was shown that audio signals could be encoded into two descriptions (packets), which were separately sent over a communication channel. Each description yields a desired performance by itself, and when they are combined, the performance is improved. This paper extends the previous work to an arbitrary number of descriptions (packets) by using fractional oversampling and a new decoding principle. We demonstrate that, due to source aliasing, existing MSE-optimized reconstruction rules from noisy sampled data perform poorly from a perceptual point of view. A simple reconstruction rule is proposed that improves the PEAQ objective difference grades (ODG) by more than 2 points. The proposed audio coder enables low-delay, high-quality audio streaming on networks with late packet arrivals or packet losses. With a coding delay of 2.5 ms and a total bitrate of 300 kbps, it is demonstrated that mean PEAQ ODGs around -0.65 can be obtained for 48 kHz (mono) music (pop & rock) and packet loss probabilities of 20%.",

keywords = "Multiple descriptions, audio coding, fractional sampling, low delay, noise shaping, source predictions",

author = "Jan {\O}stergaard",

year = "2021",

doi = "10.1109/DCC50243.2021.00035",

language = "English",

isbn = "978-1-6654-4785-0",

series = "Data Compression Conference. Proceedings",

publisher = "IEEE Signal Processing Society",

pages = "273--282",

editor = "Ali Bilgin and Marcellin, {Michael W.} and Joan Serra-Sagrista and Storer, {James A.}",

booktitle = "Proceedings - DCC 2021",

address = "United States",

note = "2021 Data Compression Conference (DCC); Conference date: 23-03-2021 through 26-03-2021",

}

Østergaard, J 2021, Low Delay Robust Audio Coding by Noise Shaping, Fractional Sampling, and Source Prediction. in A Bilgin, MW Marcellin, J Serra-Sagrista & JA Storer (eds), Proceedings - DCC 2021: 2021 Data Compression Conference, 9418676, IEEE Signal Processing Society, Data Compression Conference. Proceedings, pp. 273-282, 2021 Data Compression Conference (DCC), Snowbird, Utah, United States, 23/03/2021. https://doi.org/10.1109/DCC50243.2021.00035

Low Delay Robust Audio Coding by Noise Shaping, Fractional Sampling, and Source Prediction. / Østergaard, Jan.
Proceedings - DCC 2021: 2021 Data Compression Conference. ed. / Ali Bilgin; Michael W. Marcellin; Joan Serra-Sagrista; James A. Storer. IEEE Signal Processing Society, 2021. p. 273-282 9418676 (Data Compression Conference. Proceedings).

TY - GEN

T1 - Low Delay Robust Audio Coding by Noise Shaping, Fractional Sampling, and Source Prediction

AU - Østergaard, Jan

PY - 2021

Y1 - 2021

N2 - It was recently shown that the combination of source prediction, two-times oversampling, and noise shaping can be used to obtain a robust (multiple-description) audio coding framework for networks with packet loss probabilities less than 10%. Specifically, it was shown that audio signals could be encoded into two descriptions (packets), which were separately sent over a communication channel. Each description yields a desired performance by itself, and when they are combined, the performance is improved. This paper extends the previous work to an arbitrary number of descriptions (packets) by using fractional oversampling and a new decoding principle. We demonstrate that, due to source aliasing, existing MSE-optimized reconstruction rules from noisy sampled data perform poorly from a perceptual point of view. A simple reconstruction rule is proposed that improves the PEAQ objective difference grades (ODG) by more than 2 points. The proposed audio coder enables low-delay, high-quality audio streaming on networks with late packet arrivals or packet losses. With a coding delay of 2.5 ms and a total bitrate of 300 kbps, it is demonstrated that mean PEAQ ODGs around -0.65 can be obtained for 48 kHz (mono) music (pop & rock) and packet loss probabilities of 20%.

AB - It was recently shown that the combination of source prediction, two-times oversampling, and noise shaping can be used to obtain a robust (multiple-description) audio coding framework for networks with packet loss probabilities less than 10%. Specifically, it was shown that audio signals could be encoded into two descriptions (packets), which were separately sent over a communication channel. Each description yields a desired performance by itself, and when they are combined, the performance is improved. This paper extends the previous work to an arbitrary number of descriptions (packets) by using fractional oversampling and a new decoding principle. We demonstrate that, due to source aliasing, existing MSE-optimized reconstruction rules from noisy sampled data perform poorly from a perceptual point of view. A simple reconstruction rule is proposed that improves the PEAQ objective difference grades (ODG) by more than 2 points. The proposed audio coder enables low-delay, high-quality audio streaming on networks with late packet arrivals or packet losses. With a coding delay of 2.5 ms and a total bitrate of 300 kbps, it is demonstrated that mean PEAQ ODGs around -0.65 can be obtained for 48 kHz (mono) music (pop & rock) and packet loss probabilities of 20%.

KW - Multiple descriptions

KW - audio coding

KW - fractional sampling

KW - low delay

KW - noise shaping

KW - source predictions

UR - http://www.scopus.com/inward/record.url?scp=85106000668&partnerID=8YFLogxK

U2 - 10.1109/DCC50243.2021.00035

DO - 10.1109/DCC50243.2021.00035

M3 - Article in proceeding

SN - 978-1-6654-4785-0

T3 - Data Compression Conference. Proceedings

SP - 273

EP - 282

BT - Proceedings - DCC 2021

A2 - Bilgin, Ali

A2 - Marcellin, Michael W.

A2 - Serra-Sagrista, Joan

A2 - Storer, James A.

PB - IEEE Signal Processing Society

T2 - 2021 Data Compression Conference (DCC)

Y2 - 23 March 2021 through 26 March 2021

ER -

Datasets

Open source MATLAB implementation of MD_DSQ coder