Abstract
Keyword spotting (KWS) is often intended to run on smart electronic devices with limited computational resources. To meet these constraints, a range of techniques has been explored in the literature, from quantizing features and acoustic model parameters to reducing the number of model parameters and required multiplications. With the same aim, in this paper we study a straightforward alternative: reducing the spectro/cepstro-temporal resolution of the log-Mel and Mel-frequency cepstral coefficient feature matrices commonly employed in KWS. We show that the feature matrix size has a strong impact on the number of multiplications/energy consumption of a state-of-the-art KWS acoustic model based on a convolutional neural network. Experimental results demonstrate that the number of elements in commonly used speech feature matrices can be reduced by a factor of 8 while essentially maintaining KWS performance. Even more interestingly, this size reduction yields a 9.6× reduction in the number of multiplications/energy consumption, a 4.0× reduction in training time, and a 3.7× reduction in inference time.
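As a rough illustration of why feature matrix size matters, the sketch below counts the elements of a time-by-frequency feature matrix and the multiplications of a single convolutional layer applied to it. Every number here is an assumption chosen for illustration (1 s of 16 kHz audio, a 25 ms window, 10 ms versus 20 ms frame shift, 40 versus 10 Mel bands, one 3×3 layer with 64 output channels); none of them is taken from the paper, and the helper names are hypothetical.

```python
def num_frames(num_samples: int, win_length: int, hop_length: int) -> int:
    """Number of full analysis frames when no padding is applied."""
    if num_samples < win_length:
        return 0
    return 1 + (num_samples - win_length) // hop_length


def conv2d_multiplications(height: int, width: int, in_ch: int,
                           out_ch: int, k_h: int, k_w: int) -> int:
    """Multiplications of one stride-1, 'same'-padded 2-D convolution.

    The output keeps the input's height and width, so the count grows
    linearly with the number of elements in the feature matrix.
    """
    return height * width * in_ch * out_ch * k_h * k_w


# All values below are illustrative assumptions, not the paper's setup.
SAMPLE_RATE = 16_000
NUM_SAMPLES = SAMPLE_RATE          # 1 s of audio
WIN_LENGTH = 400                   # 25 ms window

# Assumed baseline resolution: 10 ms frame shift, 40 Mel bands.
base_frames = num_frames(NUM_SAMPLES, WIN_LENGTH, hop_length=160)
base_bands = 40

# Assumed reduced resolution: 20 ms frame shift, 10 Mel bands.
reduced_frames = num_frames(NUM_SAMPLES, WIN_LENGTH, hop_length=320)
reduced_bands = 10

base_elements = base_frames * base_bands
reduced_elements = reduced_frames * reduced_bands
size_ratio = base_elements / reduced_elements

# One hypothetical first layer: 1 input channel, 64 output channels, 3x3 kernel.
base_mults = conv2d_multiplications(base_frames, base_bands, 1, 64, 3, 3)
reduced_mults = conv2d_multiplications(reduced_frames, reduced_bands, 1, 64, 3, 3)
mult_ratio = base_mults / reduced_mults

print(f"feature matrix: {base_frames}x{base_bands} -> {reduced_frames}x{reduced_bands}")
print(f"element reduction: {size_ratio:.1f}x")
print(f"single-layer multiplication reduction: {mult_ratio:.1f}x")
```

For a layer of this kind the multiplication count scales linearly with the number of feature matrix elements. The overall reduction reported for the full model depends on architectural details (strides, pooling, fully connected layers) that this sketch does not attempt to model.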
Original language | English |
---|---|
Title of host publication | IberSPEECH 2022 |
Publication date | 2022 |
Publication status | Published - 2022 |
Event | IberSPEECH 2022, Granada, Spain (14 Nov 2022 → 16 Nov 2022) |
Conference
Conference | IberSPEECH 2022 |
---|---|
Country/Territory | Spain |
City | Granada |
Period | 14/11/2022 → 16/11/2022 |