CK-Encoder: Enhanced Language Representation for Sentence Similarity

Tao Jiang1, Fengjian Kang1, Wei Guo2, Wei He1, Lei Liu1, Xudong Lu1, Yonghui Xu2 (corresponding author), Lizhen Cui1,2 (corresponding author)
1 School of Software, Shandong University, Jinan 250101, China
2 Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250101, China

Abstract

In recent years, neural networks have been widely used in natural language processing, especially in sentence similarity modeling. However, most previous studies focus on the sentence itself and ignore the commonsense knowledge related to it, even though such knowledge can be remarkably useful for understanding sentence semantics. This paper proposes CK-Encoder, an encoder that effectively acquires commonsense knowledge to improve the performance of sentence similarity modeling. Specifically, the model first generates a commonsense knowledge graph for the input sentence and then encodes this graph with a graph convolutional network. In addition, CKER, a framework that combines CK-Encoder with a sentence encoder, is introduced. Experiments on two sentence similarity tasks demonstrate that CK-Encoder effectively acquires commonsense knowledge and improves a model's ability to understand sentences.

Keywords: CK-Encoder, sentence similarity, commonsense knowledge
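
The pipeline the abstract describes (build a commonsense knowledge graph per sentence, encode it with a graph convolutional network, then combine the result with a sentence encoder in CKER) can be made concrete with a short sketch. The PyTorch code below is a minimal illustration under assumed dimensions; the class names (GCNLayer, CKEncoderSketch, CKERSketch), the mean-pooling, the concatenation-based fusion, and the identity adjacencies in the usage example are all illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    # One graph-convolution step (Kipf-Welling style): H' = ReLU(A_hat H W).
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, h, a_hat):
        # h: (num_nodes, in_dim) node features of the commonsense graph
        # a_hat: (num_nodes, num_nodes) normalized adjacency matrix
        return torch.relu(a_hat @ self.linear(h))

class CKEncoderSketch(nn.Module):
    # Hypothetical CK-Encoder: run a GCN over the sentence's commonsense
    # knowledge graph and mean-pool the node states into one vector.
    def __init__(self, dim=128):
        super().__init__()
        self.gcn = GCNLayer(dim, dim)

    def forward(self, node_feats, a_hat):
        return self.gcn(node_feats, a_hat).mean(dim=0)

class CKERSketch(nn.Module):
    # Hypothetical CKER framework: concatenate the sentence encoder's
    # embedding with the commonsense-graph embedding and project.
    def __init__(self, dim=128):
        super().__init__()
        self.ck_encoder = CKEncoderSketch(dim)
        self.fuse = nn.Linear(2 * dim, dim)

    def forward(self, sent_emb, node_feats, a_hat):
        ck_emb = self.ck_encoder(node_feats, a_hat)
        return self.fuse(torch.cat([sent_emb, ck_emb], dim=-1))

# Usage: score two sentences by cosine similarity of their fused
# representations. Random features and identity adjacencies stand in for
# real sentence embeddings (e.g., from BERT) and real commonsense graphs.
model = CKERSketch()
s1 = model(torch.randn(128), torch.randn(5, 128), torch.eye(5))
s2 = model(torch.randn(128), torch.randn(7, 128), torch.eye(7))
print(torch.cosine_similarity(s1, s2, dim=0))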


Publication history

Received: 15 September 2021
Accepted: 26 September 2021
Published: 15 April 2022
Issue date: April 2022

Copyright

© The author(s) 2022

Acknowledgements

This work was supported by the Fundamental Research Funds of Shandong University.

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).