A Proposal for a Two-Way Journey on Validating Locations in Unstructured and Structured Data

Ilkcan Keles, Omar Qawasmeh, Tabea Tietz, Ludovica Marinucci, Roberto Reda, Marieke Van Erp

Research output: Contribution to book/anthology/report/conference proceeding > Article in proceeding > Research > peer-review

Abstract

The Web of Data has grown explosively over the past few years and, as with any dataset, it is bound to contain invalid statements as well as gaps. Natural Language Processing (NLP) is gaining interest as a way to fill these gaps by transforming unstructured text into structured data. However, there is currently a fundamental mismatch in approach between Linked Data and NLP: the latter is often based on statistical methods, the former on explicitly modelling knowledge. Nevertheless, the two fields can strengthen each other. In this position paper, we argue that using Linked Data to validate the output of an NLP system, and using textual data to validate statements in the Linked Open Data (LOD) cloud, is a promising research avenue. We illustrate our proposal with a proof of concept on a corpus of historical travel stories.
Original language: English
Title of host publication: 2nd Conference on Language, Data and Knowledge (LDK 2019)
Number of pages: 8
Volume: 70
Publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
Publication date: 2019
Pages: 13:1-13:8
ISBN (Print): 978-3-95977-105-4
DOI: https://doi.org/10.4230/OASIcs.LDK.2019.13
Publication status: Published - 2019
Event: Conference on Language, Data and Knowledge - Leipzig, Germany
Duration: 20 May 2019 to 23 May 2019
Conference number: 2nd
http://2019.ldk-conf.org/

Conference

Conference: Conference on Language, Data and Knowledge
Number: 2nd
Country: Germany
City: Leipzig
Period: 20/05/2019 to 23/05/2019
Internet address: http://2019.ldk-conf.org/

Cite this

Keles, I., Qawasmeh, O., Tietz, T., Marinucci, L., Reda, R., & Van Erp, M. (2019). A Proposal for a Two-Way Journey on Validating Locations in Unstructured and Structured Data. In 2nd Conference on Language, Data and Knowledge (LDK 2019) (Vol. 70, pp. 13:1-13:8). Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing. https://doi.org/10.4230/OASIcs.LDK.2019.13
@inproceedings{f5d82e3279c44323909020a4d20fc94d,
title = "A Proposal for a Two-Way Journey on Validating Locations in Unstructured and Structured Data",
author = "Ilkcan Keles and Omar Qawasmeh and Tabea Tietz and Ludovica Marinucci and Roberto Reda and {Van Erp}, Marieke",
year = "2019",
doi = "10.4230/OASIcs.LDK.2019.13",
language = "English",
isbn = "978-3-95977-105-4",
volume = "70",
pages = "13:1--13:8",
booktitle = "2nd Conference on Language, Data and Knowledge (LDK 2019)",
publisher = "Schloss Dagstuhl - Leibniz-Zentrum f{\"u}r Informatik GmbH, Dagstuhl Publishing",
}

TY - GEN
T1 - A Proposal for a Two-Way Journey on Validating Locations in Unstructured and Structured Data
AU - Keles, Ilkcan
AU - Qawasmeh, Omar
AU - Tietz, Tabea
AU - Marinucci, Ludovica
AU - Reda, Roberto
AU - Van Erp, Marieke
PY - 2019
Y1 - 2019
U2 - 10.4230/OASIcs.LDK.2019.13
DO - 10.4230/OASIcs.LDK.2019.13
M3 - Article in proceeding
SN - 978-3-95977-105-4
VL - 70
SP - 13:1-13:8
BT - 2nd Conference on Language, Data and Knowledge (LDK 2019)
PB - Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
ER -