Addressing structural and linguistic heterogeneity in the Web

Jacobo Rouces Gonzalez, Gerard de Melo, Katja Hose

Publication: Contribution to journal; Journal article; Research; peer-reviewed

Abstract

An increasing number of structured knowledge bases have become available on the Web, enabling many new forms of analyses and applications. However, the fact that the data is being published by different parties with different vocabularies and ontologies means that there is a high degree of heterogeneity and no common schema. At the same time, the abundance of different human languages across unstructured data presents a similar problem, because most text mining tools only cater to the English language. This paper presents solutions for these two kinds of heterogeneity. It introduces Klint, a Web-based system that automatically creates mappings to transform knowledge from heterogeneous sources into FrameBase, which is a broad linked data schema that enables the representation of a wide range of knowledge. With Klint, a user can review and edit the mappings with a streamlined interface, which in turn allows for human-level accuracy with minimum human effort. The paper further describes how FrameBase can be extended to support multilingual labels, which can aid in extending current tools for integrating English text into FrameBase knowledge.
Original language: English
Journal: AI Communications
Volume: 31
Issue number: 1
Pages (from-to): 3-18
ISSN: 0921-7126
DOI: 10.3233/AIC-170745
Status: Published - 2018

Fingerprint

Linguistics
Ontology
Labels

Cite this

Gonzalez, Jacobo Rouces ; de Melo, Gerard ; Hose, Katja. / Addressing structural and linguistic heterogeneity in the Web. In: AI Communications. 2018 ; Vol. 31, No. 1. pp. 3-18.
@article{58e9c847d8924487bcd9c5361a24e349,
title = "Addressing structural and linguistic heterogeneity in the Web",
abstract = "An increasing number of structured knowledge bases have become available on the Web, enabling many new forms of analyses and applications. However, the fact that the data is being published by different parties with different vocabularies and ontologies means that there is a high degree of heterogeneity and no common schema. At the same time, the abundance of different human languages across unstructured data presents a similar problem, because most text mining tools only cater to the English language. This paper presents solutions for these two kinds of heterogeneity. It introduces Klint, a Web-based system that automatically creates mappings to transform knowledge from heterogeneous sources into FrameBase, which is a broad linked data schema that enables the representation of a wide range of knowledge. With Klint, a user can review and edit the mappings with a streamlined interface, which in turn allows for human-level accuracy with minimum human effort. The paper further describes how FrameBase can be extended to support multilingual labels, which can aid in extending current tools for integrating English text into FrameBase knowledge.",
author = "Gonzalez, {Jacobo Rouces} and {de Melo}, Gerard and Hose, Katja",
year = "2018",
doi = "10.3233/AIC-170745",
language = "English",
volume = "31",
pages = "3--18",
journal = "AI Communications",
issn = "0921-7126",
publisher = "IOS Press",
number = "1",
}

Addressing structural and linguistic heterogeneity in the Web. / Gonzalez, Jacobo Rouces; de Melo, Gerard; Hose, Katja.

In: AI Communications, Vol. 31, No. 1, 2018, pp. 3-18.

TY - JOUR

T1 - Addressing structural and linguistic heterogeneity in the Web

AU - Gonzalez, Jacobo Rouces

AU - de Melo, Gerard

AU - Hose, Katja

PY - 2018

Y1 - 2018

AB - An increasing number of structured knowledge bases have become available on the Web, enabling many new forms of analyses and applications. However, the fact that the data is being published by different parties with different vocabularies and ontologies means that there is a high degree of heterogeneity and no common schema. At the same time, the abundance of different human languages across unstructured data presents a similar problem, because most text mining tools only cater to the English language. This paper presents solutions for these two kinds of heterogeneity. It introduces Klint, a Web-based system that automatically creates mappings to transform knowledge from heterogeneous sources into FrameBase, which is a broad linked data schema that enables the representation of a wide range of knowledge. With Klint, a user can review and edit the mappings with a streamlined interface, which in turn allows for human-level accuracy with minimum human effort. The paper further describes how FrameBase can be extended to support multilingual labels, which can aid in extending current tools for integrating English text into FrameBase knowledge.

U2 - 10.3233/AIC-170745

DO - 10.3233/AIC-170745

M3 - Journal article

VL - 31

SP - 3

EP - 18

JO - AI Communications

JF - AI Communications

SN - 0921-7126

IS - 1

ER -