Modeling Chinese Microblogs with Five Ws for Topic Hashtags Extraction

Zhibin Zhao; Jiahong Sun; Lan Yao; Xun Wang; Jiahong Chu; Huan Liu; Ge Yu

doi:10.23919/TST.2017.7889636

Open Access

College of Computer Science and Engineering, Northeastern University, Shenyang 110819, China.

Abstract

Hashtags are important metadata in microblogs and are used to mark topics or index messages. However, statistics show that hashtags are absent from most microblogs. This poses great challenges for the retrieval and analysis of these tagless microblogs. In this paper, we summarize the similarity between microblogs and short-message-style news, and then propose an algorithm, named 5WTAG, for detecting microblog topics based on a model of five Ws (When, Where, Who, What, hoW). As five-W attributes are the core components in event description, it is guaranteed theoretically that 5WTAG can properly extract semantic topics from microblogs. We introduce the detailed procedure of the algorithm in this paper including spam microblog identification, microblog segmentation, and candidate hashtag construction. In addition, we propose a novel recommendation computing method for ranking candidate hashtags, which combines syntax and semantic analysis and observes the distribution of artificial topic hashtags. Finally, we conduct comprehensive experiments to verify the semantic correctness and completeness of the candidate hashtags, as well as the accuracy of the recommendation method using real data from Sina Weibo.

Keywords

hashtag; microblog; topic detection; short-message-style news; five Ws

Tsinghua Science and Technology, Volume 22, Issue 2, April 2017, Pages 135-148

Cite this article:

Zhao Z, Sun J, Yao L, et al. Modeling Chinese Microblogs with Five Ws for Topic Hashtags Extraction. Tsinghua Science and Technology, 2017, 22(2): 135-148. https://doi.org/10.23919/TST.2017.7889636

Received: 31 August 2016

Revised: 24 December 2016

Accepted: 26 December 2016

Published: 06 April 2017