On High Dimensional Searching Spaces and Learning Methods

Hossein Yazdani*, Daniel Ortiz-Arroyo, Kazimierz Choroś, Halina Kwasnicka

*Corresponding author for this work

Research output: Contribution to book/anthology/report/conference proceeding › Book chapter › Research › peer-review

5 Citations (Scopus)

Abstract

In data science, several important parameters affect the accuracy of the algorithms used, among them the type of data objects, the membership assignments, and the distance or similarity functions. In this chapter we describe different data types, membership functions, and similarity functions, and discuss the pros and cons of each. Conventional similarity functions evaluate objects in the vector space. In contrast, Weighted Feature Distance (WFD) functions compare data objects in both feature and vector spaces, preventing the system from being affected by some dominant features. Traditional membership functions assign membership values to data objects but impose some restrictions. The Bounded Fuzzy Possibilistic Method (BFPM) makes it possible for data objects to participate fully or partially in several clusters, or even in all clusters. BFPM introduces intervals for the upper and lower boundaries of data objects with respect to each cluster. BFPM helps algorithms converge and also inherits the abilities of conventional fuzzy and possibilistic methods. In Big Data applications, knowing the exact type of data objects and selecting the most accurate similarity [1] and membership assignments is crucial for decreasing computing costs and obtaining the best performance. This chapter provides data type taxonomies to assist data miners in selecting the right learning method for each selected data set. Examples illustrate how to evaluate the accuracy and performance of the proposed algorithms. Experimental results show why these parameters are important.
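To make the contrast between membership schemes concrete, the following is a minimal sketch, not code from the chapter. It assumes that the conventional fuzzy restriction is "memberships of an object sum to 1", and it assumes one possible formalization of the bounded scheme: every membership lies in [0, 1] and the mean membership of an object over all clusters lies in (0, 1]. The function names `is_fuzzy_row` and `is_bounded_row` are hypothetical.

```python
from math import isclose


def in_unit_interval(row):
    """True if every membership value in the row lies in [0, 1]."""
    return all(0.0 <= u <= 1.0 for u in row)


def is_fuzzy_row(row):
    """Conventional fuzzy restriction (assumed): memberships sum to 1."""
    return in_unit_interval(row) and isclose(sum(row), 1.0, abs_tol=1e-9)


def is_bounded_row(row):
    """Assumed bounded restriction: each value is in [0, 1] and the mean
    membership over all clusters is greater than 0 and at most 1."""
    mean = sum(row) / len(row)
    return in_unit_interval(row) and 0.0 < mean <= 1.0


# One data object, three clusters.
shared = [0.5, 0.3, 0.2]       # partial membership in each cluster
full_in_all = [1.0, 1.0, 1.0]  # full membership in every cluster
nowhere = [0.0, 0.0, 0.0]      # belongs to no cluster at all

for name, row in [("shared", shared), ("full_in_all", full_in_all), ("nowhere", nowhere)]:
    print(name, is_fuzzy_row(row), is_bounded_row(row))
```

Under these assumptions, an object that participates fully in all clusters violates the sum-to-one restriction but satisfies the bounded one, which matches the behaviour the abstract attributes to BFPM.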
Original language: English
Title of host publication: Data Science and Big Data: An Environment of Computational Intelligence
Number of pages: 20
Volume: 24
Publisher: Springer
Publication date: 2017
Pages: 29-48
ISBN (Print): 978-3-319-53473-2
ISBN (Electronic): 978-3-319-53474-9
Publication status: Published - 2017
Series: Studies in Big Data
Volume: 24
ISSN: 2197-6503

Bibliographical note

Publisher Copyright:
© 2017, Springer International Publishing AG.

Keywords

  • Bounded fuzzy-possibilistic method
  • Membership function
  • Distance function
  • Supervised learning
  • Unsupervised learning
  • Clustering
  • Data type
  • Critical objects
  • Outstanding objects
  • Weighted feature distance
