Addressing structural and linguistic heterogeneity in the Web

Jacobo Rouces Gonzalez, Gerard de Melo, Katja Hose

Research output: Contribution to journal, Journal article, Research, peer-review


Abstract

An increasing number of structured knowledge bases have become available on the Web, enabling many new forms of analysis and applications. However, because the data is published by different parties using different vocabularies and ontologies, there is a high degree of heterogeneity and no common schema. At the same time, the abundance of different human languages across unstructured data presents a similar problem, because most text mining tools only cater to the English language. This paper presents solutions for these two kinds of heterogeneity. It introduces Klint, a Web-based system that automatically creates mappings to transform knowledge from heterogeneous sources into FrameBase, a broad linked data schema that enables the representation of a wide range of knowledge. With Klint, a user can review and edit the mappings through a streamlined interface, which in turn allows for human-level accuracy with minimal human effort. The paper further describes how FrameBase can be extended to support multilingual labels, which can aid in extending current tools for integrating English text into FrameBase knowledge.
Original language: English
Journal: AI Communications
Volume: 31
Issue number: 1
Pages (from-to): 3-18
ISSN: 0921-7126
Publication status: Published - 2018
