TY - GEN
T1 - Hydranet
T2 - 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020
AU - Kaspersen, Esbern Torgard
AU - Kounalakis, Tsampikos
AU - Erkut, Cumhur
PY - 2020/5
Y1 - 2020/5
N2 - Real-time source separation has become increasingly important, as a growing number of applications, such as voice recognition and voice commands, require clean audio input in noisy environments. Recent developments in deep learning have allowed models to exploit the audio waveform directly, making real-time separation achievable. In this paper, we propose a 1-D convolutional U-Net structure to separate waveform input. This structure incorporates recurrent layers to exploit longer temporal connections in the audio signal. The proposed network architecture also benefits from an extra output channel that measures the distortion of the other output channels. The proposed method is shown experimentally to yield state-of-the-art results using only 0.76 seconds of input audio.
KW - Audio and Speech Processing
KW - Deep learning
KW - Machine Learning
KW - Sound
KW - Source Separation
UR - http://www.scopus.com/inward/record.url?scp=85089216561&partnerID=8YFLogxK
U2 - 10.1109/ICASSP40776.2020.9053357
DO - 10.1109/ICASSP40776.2020.9053357
M3 - Article in proceeding
AN - SCOPUS:85089216561
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 4327
EP - 4331
BT - 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Proceedings
PB - IEEE (Institute of Electrical and Electronics Engineers)
Y2 - 4 May 2020 through 8 May 2020
ER -