On High Dimensional Searching Spaces and Learning Methods

Hossein Yazdani*, Daniel Ortiz-Arroyo, Kazimierz Choroś, Halina Kwasnicka

*Corresponding author

Publication: Contribution to book/anthology/report/conference proceeding, Contribution to book/anthology, Research, peer-reviewed

5 Citations (Scopus)

Abstract

In data science, several parameters affect the accuracy of the algorithms used, among them the type of data objects, the membership assignments, and the distance or similarity functions. In this chapter we describe different data types, membership functions, and similarity functions and discuss the pros and cons of each. Conventional similarity functions evaluate objects in the vector space. In contrast, Weighted Feature Distance (WFD) functions compare data objects in both feature and vector spaces, preventing the system from being affected by some dominant features. Traditional membership functions assign membership values to data objects but impose some restrictions. The Bounded Fuzzy Possibilistic Method (BFPM) makes it possible for data objects to participate fully or partially in several clusters or even in all clusters. BFPM introduces intervals for the upper and lower boundaries for data objects with respect to each cluster. BFPM helps algorithms converge and also inherits the abilities of conventional fuzzy and possibilistic methods. In Big Data applications, knowing the exact type of data objects and selecting the most accurate similarity [1] and membership assignments are crucial for decreasing computing costs and obtaining the best performance. This chapter provides data type taxonomies to assist data miners in selecting the right learning method for each selected data set. Examples illustrate how to evaluate the accuracy and performance of the proposed algorithms. Experimental results show why these parameters are important.
Original language: English
Title: Data Science and Big Data: An Environment of Computational Intelligence
Number of pages: 20
Volume: 24
Publisher: Springer
Publication date: 2017
Pages: 29-48
ISBN (Print): 978-3-319-53474-2
ISBN (Electronic): 978-3-319-53474-9
Status: Published - 2017
Name: Studies in Big Data
Volume: 24
ISSN: 2197-6503

Bibliographical note

Publisher Copyright:
© 2017, Springer International Publishing AG.
