Context-Aware Semantic Type Identification for Relational Attributes

Yue Ding; Yu-He Guo; Wei Lu; Hai-Xiang Li; Mei-Hui Zhang; Hui Li; An-Qun Pan; Xiao-Yong Du

doi:10.1007/s11390-021-1048-y

Regular Paper

Yue Ding^{1,2}, Yu-He Guo^{1,2}, Wei Lu^{1,2}, Hai-Xiang Li^{3}, Mei-Hui Zhang^{4}, Hui Li^{5}, An-Qun Pan^{6}, Xiao-Yong Du^{1,2}

1. Key Laboratory of Data Engineering and Knowledge Engineering of Ministry of Education, Renmin University of China, Beijing 100872, China

2. School of Information, Renmin University of China, Beijing 100872, China

3. Tencent (Beijing) Technology Company Limited, Beijing 100080, China

4. School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China

5. College of Computer Science and Technology, Guizhou University, Guiyang 550025, China

6. Tencent (Shenzhen) Technology Company Limited, Shenzhen 518057, China

Abstract

Identifying semantic types for attributes in relations, known as attribute semantic type (AST) identification, plays an important role in many data analysis tasks, such as data cleaning, schema matching, and keyword search in databases. However, due to the lack of unified naming standards across prevalent information systems (a.k.a. information islands), AST identification remains an open problem. To tackle this problem, we propose a context-aware method to identify the ASTs for relations. We transform AST identification into a multi-class classification problem and propose a schema-context-aware (SCA) model that learns a representation from a collection of relations together with their attribute values and schema context. Based on the learned representation, we predict the AST for a given attribute of an underlying relation, where the predicted AST is mapped to one of the labeled ASTs. To improve the performance of AST identification, especially when the semantic type of an attribute is not included in the labeled ASTs, we then introduce knowledge base embeddings (a.k.a. KBVec) to enhance the above representation and construct a knowledge-base-enhanced schema-context-aware model (SCA-KB) that is stable and robust. Extensive experiments on real datasets demonstrate that our context-aware method outperforms the state-of-the-art approaches by a large margin: by up to 6.14% and 25.17% in terms of macro-average $F_{1}$ score, and by up to 0.28% and 9.56% in terms of weighted $F_{1}$ score, over high-quality and low-quality datasets respectively.

Keywords

attribute semantic type (AST) identification; context-aware; semantic embedding; knowledge base embedding
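The abstract reports gains in both macro-average and weighted $F_{1}$. As a minimal sketch of how these two averages differ for a multi-class problem such as AST identification, the following plain-Python example computes both from predicted and true labels. The function names and the toy labels `city` and `isbn` are illustrative assumptions, not taken from the paper, and this is not the authors' evaluation code.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: (f1, support)} for every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true instances per label
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        f1 = 2 * tp / denom if denom else 0.0
        scores[label] = (f1, support[label])
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = f1_per_class(y_true, y_pred)
    return sum(f1 for f1, _ in scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by the number of true instances."""
    scores = f1_per_class(y_true, y_pred)
    total = sum(s for _, s in scores.values())
    return sum(f1 * s for f1, s in scores.values()) / total


# Hypothetical toy data: a frequent type ("city") and a rare type ("isbn").
y_true = ["city"] * 8 + ["isbn"] * 2
y_pred = ["city"] * 8 + ["city", "isbn"]

print(f"macro F1:    {macro_f1(y_true, y_pred):.4f}")
print(f"weighted F1: {weighted_f1(y_true, y_pred):.4f}")
```

Because the macro average gives rare semantic types the same weight as frequent ones, errors on rare types lower it more than they lower the weighted average, which is one plausible reading of why the two metrics in the abstract show gains of different size.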

Journal of Computer Science and Technology

Volume 38, Issue 4, July 2023

Pages 927-946

DOI: 10.1007/s11390-021-1048-y

Cite this article:

Ding Y, Guo Y-H, Lu W, et al. Context-Aware Semantic Type Identification for Relational Attributes. Journal of Computer Science and Technology, 2023, 38(4): 927-946. https://doi.org/10.1007/s11390-021-1048-y

Received: 05 October 2020

Accepted: 09 June 2021

Published: 06 December 2023