TABOO: Detecting unstructured sensitive information using recursive neural networks

Jan Neerbek, Ira Assent, Peter Dolog

Publication: Contribution to book/anthology/report/conference proceeding > Conference article in proceedings > Research > Peer-reviewed

11 Citations (Scopus)

Abstract

© 2017 IEEE. Leak of sensitive information from unstructured text documents is a costly problem both for government and for industrial institutions. Traditional approaches for data leak prevention are commonly based on the hypothesis that sensitive information is reflected in the presence of distinct sensitive words. However, for complex sensitive information, this hypothesis may not hold. Our TABOO system detects complex sensitive information in text documents by learning the semantic and syntactic structure of text documents. Our approach is based on natural language processing methods for paraphrase detection, and uses recursive neural networks to assign sensitivity scores to the semantic components of the sentence structure. The demonstration of TABOO focuses on interactive detection of sensitive information with the TABOO system. Users may work with real documents, alter documents or prepare free text, and subject it to information detection. TABOO allows users to work with our TABOO engine or with traditional approaches, and to compare results. Users may verify that single words can change sensitivity according to context, thereby giving hands-on experience with complex cases of sensitive information.
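The abstract's core idea, scoring every component of a sentence's structure rather than matching individual words, can be illustrated with a minimal sketch. This is not the TABOO implementation: the vector size, the random untrained weights, the toy embeddings, the example sentence, and every function name below are assumptions made purely for illustration. The sketch composes child vectors bottom-up over a binary parse tree, as a recursive neural network would, and assigns a sigmoid score to each node. A keyword lookup stands in for the traditional approach.

```python
import math
import random

# Hypothetical toy setup: 4-dimensional vectors and random, untrained weights.
DIM = 4
rng = random.Random(0)

# Composition weights (DIM x 2*DIM), bias, and scoring weights: all assumed.
W = [[rng.uniform(-0.5, 0.5) for _ in range(2 * DIM)] for _ in range(DIM)]
b = [0.0] * DIM
u = [rng.uniform(-0.5, 0.5) for _ in range(DIM)]

embeddings = {}  # word -> toy random vector, filled lazily


def embed(word):
    """Return a toy embedding for a word (random, not learned)."""
    if word not in embeddings:
        embeddings[word] = [rng.uniform(-1, 1) for _ in range(DIM)]
    return embeddings[word]


def compose(left, right):
    """Combine two child vectors into a parent vector: tanh(W [left; right] + b)."""
    x = left + right  # list concatenation
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]


def sensitivity(vec):
    """Map a node vector to a score in (0, 1) with a sigmoid."""
    z = sum(ui * vi for ui, vi in zip(u, vec))
    return 1.0 / (1.0 + math.exp(-z))


def score_tree(tree, out):
    """Score every node of a binary parse tree, bottom-up.

    A tree is either a word (str) or a (left, right) tuple.
    Appends (phrase, score) to `out` in post-order; returns (vector, phrase).
    """
    if isinstance(tree, str):
        vec, phrase = embed(tree), tree
    else:
        left_vec, left_phrase = score_tree(tree[0], out)
        right_vec, right_phrase = score_tree(tree[1], out)
        vec = compose(left_vec, right_vec)
        phrase = left_phrase + " " + right_phrase
    out.append((phrase, sensitivity(vec)))
    return vec, phrase


def keyword_baseline(words, sensitive_words):
    """Traditional approach: flag text if any listed sensitive word appears."""
    return any(w in sensitive_words for w in words)


# Hypothetical example sentence with a hand-written binary parse.
tree = (("the", "merger"), ("closes", "friday"))
scores = []
score_tree(tree, scores)
for phrase, score in scores:
    print(f"{score:.3f}  {phrase}")
```

Because a parent's vector depends on both children, the same word can contribute to different scores in different contexts, which is the behavior the abstract says users can verify interactively. With untrained weights the printed scores carry no meaning. A real system would learn the weights from labeled documents.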
Original language: English
Title: IEEE 33rd International Conference on Data Engineering (ICDE), 2017
Number of pages: 2
Publisher: IEEE
Publication date: 16 May 2017
Pages: 1399-1400
ISBN (Electronic): 978-1-5090-6543-1
DOI
Status: Published - 16 May 2017
Event: 33rd IEEE International Conference on Data Engineering, ICDE 2017 - San Diego, USA
Duration: 19 Apr 2017 to 22 Apr 2017

Conference

Conference: 33rd IEEE International Conference on Data Engineering, ICDE 2017
Country/Territory: USA
City: San Diego
Period: 19/04/2017 to 22/04/2017
Name: Proceedings of the International Conference on Data Engineering
ISSN: 1063-6382
