TABOO: Detecting unstructured sensitive information using recursive neural networks

Jan Neerbek, Ira Assent, Peter Dolog

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

11 Citations (Scopus)

Abstract

© 2017 IEEE. Leak of sensitive information from unstructured text documents is a costly problem both for government and for industrial institutions. Traditional approaches for data leak prevention are commonly based on the hypothesis that sensitive information is reflected in the presence of distinct sensitive words. However, for complex sensitive information, this hypothesis may not hold. Our TABOO system detects complex sensitive information in text documents by learning the semantic and syntactic structure of text documents. Our approach is based on natural language processing methods for paraphrase detection, and uses recursive neural networks to assign sensitivity scores to the semantic components of the sentence structure. The demonstration of TABOO focuses on interactive detection of sensitive information with the TABOO system. Users may work with real documents, alter documents or prepare free text, and subject it to information detection. TABOO allows users to work with our TABOO engine or with traditional approaches, and to compare results. Users may verify that single words can change sensitivity according to context, thereby giving hands-on experience with complex cases of sensitive information.
Original language: English
Title of host publication: IEEE 33rd International Conference on Data Engineering (ICDE), 2017
Number of pages: 2
Publisher: IEEE
Publication date: 16 May 2017
Pages: 1399-1400
ISBN (Electronic): 978-1-5090-6543-1
Publication status: Published - 16 May 2017
Event: 33rd IEEE International Conference on Data Engineering, ICDE 2017 - San Diego, United States
Duration: 19 Apr 2017 to 22 Apr 2017

Conference

Conference: 33rd IEEE International Conference on Data Engineering, ICDE 2017
Country/Territory: United States
City: San Diego
Period: 19/04/2017 to 22/04/2017
Series: Proceedings of the International Conference on Data Engineering
ISSN: 1063-6382
