The use of categorization information in language models for question retrieval

Xin Cao, Gao Cong, Bin Cui, Christian Søndergaard Jensen, Ce Zhang

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

91 Citations (Scopus)

Abstract

Community Question Answering (CQA) has emerged as a popular type of service meeting a wide range of information needs. Such services enable users to ask and answer questions and to access existing question-answer pairs. CQA archives contain very large volumes of valuable user-generated content and have become important information resources on the Web. To make the body of knowledge accumulated in CQA archives accessible, effective and efficient question search is required. Question search in a CQA archive aims to retrieve historical questions that are relevant to new questions posed by users. This paper proposes a category-based framework for search in CQA archives. The framework embodies several new techniques that use language models to exploit categories of questions for improving question-answer search. Experiments conducted on real data from Yahoo! Answers demonstrate that the proposed techniques are effective and efficient and are capable of outperforming baseline methods significantly.
Original language: English
Title of host publication: Proceeding of the 18th ACM conference on Information and knowledge management
Editors: David Wai-Lok Cheung, Il-Yeol Song, Wesley W. Chu, Xiaohua Hu, Jimmy J Lin
Number of pages: 10
Publisher: Association for Computing Machinery
Publication date: 2009
Pages: 265-274
ISBN (Electronic): 978-1-60558-512-3
Publication status: Published - 2009
Event: ACM Conference on Information and Knowledge Management - Hong Kong, China
Duration: 2 Nov 2009 → 6 Nov 2009
Conference number: 18

Conference

Conference: ACM Conference on Information and Knowledge Management
Number: 18
Country/Territory: China
City: Hong Kong
Period: 02/11/2009 → 06/11/2009
Series: Conference on Information and Knowledge Management
