Volume 6, Issue 1

CK-Encoder: Enhanced Language Representation for Sentence Similarity

Tao Jiang¹, Fengjian Kang¹, Wei Guo², Wei He¹, Lei Liu¹, Xudong Lu¹, Yonghui Xu², Lizhen Cui¹,²
¹ School of Software, Shandong University, Jinan 250101, China
² Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250101, China

Abstract

In recent years, neural networks have been widely used in natural language processing, particularly for sentence similarity modeling. Most previous studies model only the input sentence itself and ignore the commonsense knowledge related to it, even though such knowledge can be highly useful for understanding sentence semantics. This paper proposes CK-Encoder, which acquires commonsense knowledge to improve the performance of sentence similarity modeling. Specifically, the model first generates a commonsense knowledge graph for the input sentence and then encodes this graph with a graph convolutional network. In addition, CKER, a framework that combines CK-Encoder with a sentence encoder, is introduced. Experiments on two sentence similarity tasks show that CK-Encoder can effectively acquire commonsense knowledge and improve a model's ability to understand sentences.
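
As a rough illustration of the pipeline the abstract describes, the following sketch runs one graph convolution step over a small knowledge graph, pools the node vectors, and fuses the result with a sentence vector. It assumes the commonly used symmetric-normalized propagation rule ReLU(D^{-1/2}(A + I)D^{-1/2} H W). The function names, the mean pooling, the concatenation-based fusion, and the cosine scoring are all hypothetical choices made for illustration; the abstract does not specify how CK-Encoder or CKER implement these steps.

```python
import math


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(adj_norm, features, weight):
    """One propagation step: ReLU(adj_norm @ features @ weight)."""
    hidden = matmul(matmul(adj_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in hidden]


def mean_pool(rows):
    """Average node vectors into one graph-level vector (assumed pooling)."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def encode(adj, node_features, weight, sentence_vec):
    """Hypothetical fusion: sentence vector concatenated with graph vector."""
    nodes = gcn_layer(normalize_adjacency(adj), node_features, weight)
    return list(sentence_vec) + mean_pool(nodes)


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


# Toy example: a 3-node graph in which node 0 is linked to nodes 1 and 2.
adj = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
node_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, untrained
vec_a = encode(adj, node_features, weight, [0.2, 0.8])
vec_b = encode(adj, node_features, weight, [0.3, 0.7])
score = cosine(vec_a, vec_b)
```

In a trained model the weight matrix would be learned and the node features would come from embeddings of the commonsense concepts; here both are toy values chosen only to make the data flow visible.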

Keywords: CK-Encoder, sentence similarity, commonsense knowledge
Received: 15 September 2021; Accepted: 26 September 2021; Published: 15 April 2022; Issue date: April 2022
Copyright

© The author(s) 2022

Acknowledgements

This work was supported by the Fundamental Research Funds of Shandong University.

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).

Requests for reprints and permissions may be sent directly to the editorial office.
