Noise Cancellation in Speech Signals Using PATRAN Technology

  • Hermansen, Kjeld (Project participant)



Digital Communications

The consequences of noise in speech signals range from irritation and discomfort to reduced or even no speech intelligibility. Noise reduction techniques thus aim to increase speech comfort and intelligibility. In both cases it is mandatory that the removal of noise takes place without adding artefacts (random tones) and without affecting the speech component to any appreciable extent. A prerequisite for optimal separation of speech and noise in the single-microphone scenario is exploitation of all a priori knowledge of the two components.

Various methods for noise reduction in a speech signal are known. These include spectral subtraction and other filtering methods, e.g. Wiener filtering. In spectral subtraction, the estimated noise power spectrum is subtracted from the power spectrum of the noisy speech signal in order to obtain a noise reduction. A time-domain speech signal is then reconstructed from the noise-reduced power spectrum and the unmodified phase spectrum, e.g. by use of the inverse Fourier transform.

Even though this method has been found useful, it has the drawback that the noise reduction is based on an estimate of the noise spectrum and therefore depends on stationarity of the noise signal to perform optimally. Because the background noise spectrum is estimated over a long term (2-3 seconds), statistical fluctuations cause differences between this estimate and the actual short-term noise spectrum. These errors in noise estimation tend to affect small spectral regions of the output and result in short-duration random tones in the noise-reduced signal. Even though these random tones are often low in energy compared to the total energy of the speech signal, they tend to be very irritating to listen to due to psycho-acoustic effects.
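The conventional spectral subtraction described above can be illustrated with a minimal Python sketch. It is not the PATRAN method itself, only the baseline technique the text criticises. The function names `estimate_noise_power` and `spectral_subtract_frame`, the use of NumPy, and the clamping of negative power values to zero are assumptions made for illustration; the source does not specify them.

```python
import numpy as np

def estimate_noise_power(noise_frames):
    """Average the power spectrum over noise-only frames (long-term estimate).

    noise_frames: 2-D array, one frame of noise-only samples per row
    (hypothetical input layout chosen for this sketch).
    """
    spectra = np.fft.rfft(noise_frames, axis=1)
    return np.mean(np.abs(spectra) ** 2, axis=0)

def spectral_subtract_frame(noisy_frame, noise_power):
    """Subtract a noise power estimate from one frame of noisy speech."""
    spectrum = np.fft.rfft(noisy_frame)
    power = np.abs(spectrum) ** 2
    phase = np.angle(spectrum)  # phase spectrum is left unmodified

    # Power subtraction; negative results are clamped to zero (an assumption,
    # since the source does not say how negative values are handled).
    reduced_power = np.maximum(power - noise_power, 0.0)

    # Rebuild the spectrum from the reduced magnitude and the original phase,
    # then return to the time domain with the inverse Fourier transform.
    reduced_spectrum = np.sqrt(reduced_power) * np.exp(1j * phase)
    return np.fft.irfft(reduced_spectrum, n=len(noisy_frame))
```

Because the long-term noise estimate rarely matches the short-term noise power in every frequency bin, isolated bins survive the subtraction in some frames and not in others, which is one way to see how the short-duration random tones mentioned above can arise.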
The object of the research is to provide a method that enables noise reduction in a speech signal while avoiding the above-mentioned drawbacks. The research builds on the observation that a model-based representation describing the quasi-stationary part of the speech signal can be generated from a noisy spectrum, which is itself obtained by spectral subtraction of an estimate of the noise power spectrum from a spectrum generated on the basis of the speech signal. The spectral subtraction thus enables the use of a model-based representation, and the representation using fbg-parameters enables improved noise reduction because it exploits a priori knowledge of speech signals. (Kjeld Hermansen)
Effective start/end date: 31/12/2003 to 31/12/2003