An enhanced fuzzy c-means algorithm for audio segmentation and classification

Mohammad A. Haque; Jong Myon Kim

doi:10.1007/s11042-011-0921-z

Mohammad A. Haque, Jong Myon Kim^*

^*Corresponding author

Publication: Contribution to journal › Journal article › Research › peer-review

11 Citations (Scopus)

Abstract

Automated audio segmentation and classification play important roles in multimedia content analysis. In this paper, we propose an enhanced approach, called the correlation intensive fuzzy c-means (CIFCM) algorithm, to audio segmentation and classification that is based on audio content analysis. While conventional methods work by considering the attributes of only the current frame or segment, the proposed CIFCM algorithm efficiently incorporates the influence of neighboring frames or segments in the audio stream. With this method, audio-cuts can be detected efficiently even when the signal contains audio effects such as fade-in, fade-out, and cross-fade. A number of audio features are analyzed in this paper to explore the differences between various types of audio data. The proposed CIFCM algorithm works by detecting the boundaries between different kinds of sounds and classifying them into clusters such as silence, speech, music, speech with music, and speech with noise. Our experimental results indicate that the proposed method outperforms the state-of-the-art FCM approach in terms of audio segmentation and classification.
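For orientation, the baseline the abstract compares against, standard fuzzy c-means (FCM), can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `fuzzy_c_means`, its parameters, and the toy one-dimensional "frame energy" data are assumptions made for the example. The neighbor-correlation step that distinguishes CIFCM is not specified in this record, so it is not reproduced here.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on a list of equal-length feature vectors.

    Returns (centers, memberships); memberships[i][k] is the degree to
    which points[i] belongs to cluster k, and each row sums to 1.
    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    dim = len(points[0])

    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Centre update: mean of the points weighted by membership**m.
        centers = []
        for k in range(n_clusters):
            weights = [u[i][k] ** m for i in range(len(points))]
            total = sum(weights)
            centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ])

        # Membership update from squared distances to every centre.
        max_change = 0.0
        for i, p in enumerate(points):
            d2 = [
                sum((p[d] - c[d]) ** 2 for d in range(dim)) + 1e-12
                for c in centers
            ]
            inv = [dist ** (-1.0 / (m - 1.0)) for dist in d2]
            total = sum(inv)
            new_row = [v / total for v in inv]
            change = max(abs(a - b) for a, b in zip(new_row, u[i]))
            max_change = max(max_change, change)
            u[i] = new_row

        # Stop once memberships have stabilised.
        if max_change < tol:
            break
    return centers, u


# Toy data (hypothetical): one energy-like feature per audio frame,
# three quiet frames followed by three loud ones.
frames = [[0.01], [0.02], [0.03], [0.90], [0.95], [1.00]]
centers, memberships = fuzzy_c_means(frames, n_clusters=2)

# Hard label per frame: the cluster with the highest membership.
labels = [max(range(2), key=lambda k: row[k]) for row in memberships]
```

Each frame is clustered from its own feature vector alone, which is the limitation of conventional methods that the abstract describes; a change in the hard label between consecutive frames would mark a candidate segment boundary.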

Original language	English
Journal	Multimedia Tools and Applications
Volume	63
Issue number	2
Pages (from-to)	485-500
Number of pages	16
ISSN	1380-7501
DOI	https://doi.org/10.1007/s11042-011-0921-z
Status	Published - 1 Mar 2013

Access to document

10.1007/s11042-011-0921-z

Other files and links

http://www.scopus.com/inward/record.url?scp=84874937341&partnerID=8YFLogxK

Citation formats

@article{462bf8b97ad54fa186472d03f93a3907,

title = "An enhanced fuzzy c-means algorithm for audio segmentation and classification",

abstract = "Automated audio segmentation and classification play important roles in multimedia content analysis. In this paper, we propose an enhanced approach, called the correlation intensive fuzzy c-means (CIFCM) algorithm, to audio segmentation and classification that is based on audio content analysis. While conventional methods work by considering the attributes of only the current frame or segment, the proposed CIFCM algorithm efficiently incorporates the influence of neighboring frames or segments in the audio stream. With this method, audio-cuts can be detected efficiently even when the signal contains audio effects such as fade-in, fade-out, and cross-fade. A number of audio features are analyzed in this paper to explore the differences between various types of audio data. The proposed CIFCM algorithm works by detecting the boundaries between different kinds of sounds and classifying them into clusters such as silence, speech, music, speech with music, and speech with noise. Our experimental results indicate that the proposed method outperforms the state-of-the-art FCM approach in terms of audio segmentation and classification.",

keywords = "Audio segmentation and classification, Database retrieval, Fuzzy c-means algorithm, Multimedia",

author = "Haque, {Mohammad A.} and Kim, {Jong Myon}",

year = "2013",

month = mar,

day = "1",

doi = "10.1007/s11042-011-0921-z",

language = "English",

volume = "63",

pages = "485--500",

journal = "Multimedia Tools and Applications",

issn = "1380-7501",

publisher = "Springer",

number = "2",

}

TY - JOUR

T1 - An enhanced fuzzy c-means algorithm for audio segmentation and classification

AU - Haque, Mohammad A.

AU - Kim, Jong Myon

PY - 2013/3/1

Y1 - 2013/3/1

N2 - Automated audio segmentation and classification play important roles in multimedia content analysis. In this paper, we propose an enhanced approach, called the correlation intensive fuzzy c-means (CIFCM) algorithm, to audio segmentation and classification that is based on audio content analysis. While conventional methods work by considering the attributes of only the current frame or segment, the proposed CIFCM algorithm efficiently incorporates the influence of neighboring frames or segments in the audio stream. With this method, audio-cuts can be detected efficiently even when the signal contains audio effects such as fade-in, fade-out, and cross-fade. A number of audio features are analyzed in this paper to explore the differences between various types of audio data. The proposed CIFCM algorithm works by detecting the boundaries between different kinds of sounds and classifying them into clusters such as silence, speech, music, speech with music, and speech with noise. Our experimental results indicate that the proposed method outperforms the state-of-the-art FCM approach in terms of audio segmentation and classification.

KW - Audio segmentation and classification

KW - Database retrieval

KW - Fuzzy c-means algorithm

KW - Multimedia

UR - http://www.scopus.com/inward/record.url?scp=84874937341&partnerID=8YFLogxK

U2 - 10.1007/s11042-011-0921-z

DO - 10.1007/s11042-011-0921-z

M3 - Journal article

AN - SCOPUS:84874937341

SN - 1380-7501

VL - 63

SP - 485

EP - 500

JO - Multimedia Tools and Applications

JF - Multimedia Tools and Applications

IS - 2

ER -
