Co-clustering for Weblogs in Semantic Space

Yu Zong, Guandong Xu, Peter Dolog, Yanchun Zhang, Renjin Liu

Research output: Contribution to journal › Conference article in Journal › Research › peer-review

Abstract

Web clustering is an approach for aggregating web objects into groups according to the underlying relationships among them. Finding co-clusters of web objects in semantic space is an interesting topic in web usage mining, since such co-clusters capture users' underlying navigational interests and content preferences simultaneously. In this paper we present a novel web co-clustering algorithm named Co-Clustering in Semantic space (COCS), which simultaneously partitions web users and pages via a latent semantic analysis approach. COCS first trains the latent semantic space of weblog data using the Probabilistic Latent Semantic Analysis (PLSA) model, then projects all weblog data objects into this semantic space as probability distributions to capture the relationships among web pages and web users, and finally applies a clustering algorithm that generates the co-cluster corresponding to each semantic factor in the latent semantic space via probability inference. The proposed approach is evaluated by experiments on real datasets in terms of precision and recall. The experimental results demonstrate that the proposed method can effectively reveal co-aggregates of closely related web users and pages.
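
The three steps named in the abstract (fit PLSA, project users and pages into the latent space, form one co-cluster per semantic factor) can be illustrated with a minimal Python sketch. This is not the authors' COCS implementation: the abstract does not give the clustering rule, so the sketch assumes the simplest one, assigning each user and each page to the factor with the highest posterior probability. The function names `plsa` and `co_cluster`, the toy `counts` matrix, and all parameter values are hypothetical.

```python
import numpy as np

def plsa(counts, n_factors, n_iter=100, seed=0):
    """Fit PLSA by EM on a user-by-page visit-count matrix.

    Returns P(z), P(u|z) and P(p|z) for latent factors z.
    """
    rng = np.random.default_rng(seed)
    n_users, n_pages = counts.shape
    eps = 1e-12  # guards against division by zero

    # Random initialisation; each column is a distribution over users/pages.
    p_z = np.full(n_factors, 1.0 / n_factors)
    p_u_z = rng.random((n_users, n_factors))
    p_u_z /= p_u_z.sum(axis=0, keepdims=True)
    p_p_z = rng.random((n_pages, n_factors))
    p_p_z /= p_p_z.sum(axis=0, keepdims=True)

    for _ in range(n_iter):
        # E-step: posterior P(z | u, p), shape (users, pages, factors).
        joint = p_z[None, None, :] * p_u_z[:, None, :] * p_p_z[None, :, :]
        post = joint / (joint.sum(axis=2, keepdims=True) + eps)

        # M-step: re-estimate parameters from count-weighted posteriors.
        weighted = counts[:, :, None] * post
        mass_u = weighted.sum(axis=1)   # (users, factors)
        mass_p = weighted.sum(axis=0)   # (pages, factors)
        mass_z = mass_u.sum(axis=0)     # (factors,)
        p_u_z = mass_u / (mass_z + eps)
        p_p_z = mass_p / (mass_z + eps)
        p_z = mass_z / (mass_z.sum() + eps)
    return p_z, p_u_z, p_p_z

def co_cluster(p_z, p_u_z, p_p_z):
    """Assumed rule: assign each user/page to its most probable factor."""
    # Bayes' rule: P(z|u) is proportional to P(u|z) P(z); same for pages.
    user_post = p_u_z * p_z
    user_post /= user_post.sum(axis=1, keepdims=True) + 1e-12
    page_post = p_p_z * p_z
    page_post /= page_post.sum(axis=1, keepdims=True) + 1e-12
    user_labels = user_post.argmax(axis=1)
    page_labels = page_post.argmax(axis=1)
    return {
        z: (np.flatnonzero(user_labels == z).tolist(),
            np.flatnonzero(page_labels == z).tolist())
        for z in range(len(p_z))
    }

# Toy weblog: rows are users, columns are pages, entries are visit counts.
counts = np.array([
    [5, 3, 0, 0],
    [4, 4, 0, 0],
    [2, 6, 0, 0],
    [0, 0, 3, 5],
    [0, 0, 6, 2],
    [0, 0, 4, 4],
])
p_z, p_u_z, p_p_z = plsa(counts, n_factors=2)
clusters = co_cluster(p_z, p_u_z, p_p_z)
print(clusters)
```

Each entry of `clusters` pairs the users and the pages assigned to one latent factor, which is the sense in which users and pages are partitioned simultaneously. The dense three-dimensional posterior array is only practical for small matrices; real weblog data would need a sparse formulation.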
Original language: English
Book series: Lecture Notes in Computer Science
Volume: 6488
Pages (from-to): 120-127
ISSN: 0302-9743
DOI: 10.1007/978-3-642-17616-6_12
Publication status: Published - 12 Dec 2010
Event: Web Information Systems Engineering – WISE 2010 - Hong Kong, China
Duration: 12 Dec 2010 to 14 Dec 2010

Conference

Conference: Web Information Systems Engineering – WISE 2010
Country: China
City: Hong Kong
Period: 12/12/2010 to 14/12/2010

Fingerprint

Semantics
Clustering
Latent Semantic Analysis
Clustering Algorithm
Clustering algorithms
Web Usage Mining
Probability Space
World Wide Web
Probability distributions
Probability Distribution
Websites
Partition
Metric
Experimental Results
Experiment
Object
Experiments

Cite this

Zong, Yu ; Xu, Guandong ; Dolog, Peter ; Zhang, Yanchun ; Liu, Renjin. / Co-clustering for Weblogs in Semantic Space. In: Lecture Notes in Computer Science. 2010 ; Vol. 6488. pp. 120-127.
@article{fa9fa57b8d574a128e1af215cd5c37ec,
title = "Co-clustering for Weblogs in Semantic Space",
abstract = "Web clustering is an approach for aggregating web objects into groups according to the underlying relationships among them. Finding co-clusters of web objects in semantic space is an interesting topic in web usage mining, since such co-clusters capture users' underlying navigational interests and content preferences simultaneously. In this paper we present a novel web co-clustering algorithm named Co-Clustering in Semantic space (COCS), which simultaneously partitions web users and pages via a latent semantic analysis approach. COCS first trains the latent semantic space of weblog data using the Probabilistic Latent Semantic Analysis (PLSA) model, then projects all weblog data objects into this semantic space as probability distributions to capture the relationships among web pages and web users, and finally applies a clustering algorithm that generates the co-cluster corresponding to each semantic factor in the latent semantic space via probability inference. The proposed approach is evaluated by experiments on real datasets in terms of precision and recall. The experimental results demonstrate that the proposed method can effectively reveal co-aggregates of closely related web users and pages.",
author = "Yu Zong and Guandong Xu and Peter Dolog and Yanchun Zhang and Renjin Liu",
year = "2010",
month = "12",
day = "12",
doi = "10.1007/978-3-642-17616-6_12",
language = "English",
volume = "6488",
pages = "120--127",
journal = "Lecture Notes in Computer Science",
issn = "0302-9743",
publisher = "Physica-Verlag",
}

Co-clustering for Weblogs in Semantic Space. / Zong, Yu; Xu, Guandong; Dolog, Peter; Zhang, Yanchun; Liu, Renjin.

In: Lecture Notes in Computer Science, Vol. 6488, 12.12.2010, p. 120-127.

TY - GEN
T1 - Co-clustering for Weblogs in Semantic Space
AU - Zong, Yu
AU - Xu, Guandong
AU - Dolog, Peter
AU - Zhang, Yanchun
AU - Liu, Renjin
PY - 2010/12/12
Y1 - 2010/12/12
AB - Web clustering is an approach for aggregating web objects into groups according to the underlying relationships among them. Finding co-clusters of web objects in semantic space is an interesting topic in web usage mining, since such co-clusters capture users' underlying navigational interests and content preferences simultaneously. In this paper we present a novel web co-clustering algorithm named Co-Clustering in Semantic space (COCS), which simultaneously partitions web users and pages via a latent semantic analysis approach. COCS first trains the latent semantic space of weblog data using the Probabilistic Latent Semantic Analysis (PLSA) model, then projects all weblog data objects into this semantic space as probability distributions to capture the relationships among web pages and web users, and finally applies a clustering algorithm that generates the co-cluster corresponding to each semantic factor in the latent semantic space via probability inference. The proposed approach is evaluated by experiments on real datasets in terms of precision and recall. The experimental results demonstrate that the proposed method can effectively reveal co-aggregates of closely related web users and pages.
U2 - 10.1007/978-3-642-17616-6_12
DO - 10.1007/978-3-642-17616-6_12
M3 - Conference article in Journal
VL - 6488
SP - 120
EP - 127
JO - Lecture Notes in Computer Science
JF - Lecture Notes in Computer Science
SN - 0302-9743
ER -