D. Marcheggiani and I. Titov, Encoding sentences with graph convolutional networks for semantic role labeling, in Proc. 2017 Conf. on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, 2017, pp. 1506–1515. https://doi.org/10.18653/v1/D17-1159
J. Mueller and A. Thyagarajan, Siamese recurrent architectures for learning sentence similarity, in Proc. Thirtieth AAAI Conf. on Artificial Intelligence, Phoenix, AZ, USA, 2016, pp. 2786–2792. https://doi.org/10.1609/aaai.v30i1.10350
J. Devlin, M. W. Chang, K. Lee, and K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, in Proc. 2019 Conf. of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA, 2019, pp. 4171–4186.
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, Attention is all you need, in Proc. 31st Int. Conf. on Neural Information Processing Systems, Long Beach, CA, USA, 2017, pp. 6000–6010.
W. J. Liu, P. Zhou, Z. Zhao, Z. R. Wang, Q. Ju, H. T. Deng, and P. Wang, K-BERT: Enabling language representation with knowledge graph, in Proc. AAAI Conf. on Artificial Intelligence, New York, NY, USA, 2020, pp. 2901–2908. https://doi.org/10.1609/aaai.v34i03.5681
T. N. Kipf and M. Welling, Semi-supervised classification with graph convolutional networks, presented at the 5th Int. Conf. on Learning Representations, Toulon, France, 2017.
B. Dolan, C. Quirk, and C. Brockett, Unsupervised construction of large paraphrase corpora: Exploiting massively parallel news sources, in Proc. 20th Int. Conf. on Computational Linguistics, Geneva, Switzerland, 2004, p. 350. https://doi.org/10.3115/1220355.1220406
Z. G. Wang, H. T. Mi, and A. Ittycheriah, Sentence similarity learning by lexical decomposition and composition, in Proc. COLING 2016, the 26th Int. Conf. on Computational Linguistics, Osaka, Japan, 2016, pp. 1340–1349.
M. Heilman and N. A. Smith, Tree edit models for recognizing textual entailments, paraphrases, and answers to questions, in Proc. Human Language Technologies: The 2010 Annu. Conf. of the North American Chapter of the Association for Computational Linguistics, Los Angeles, CA, USA, 2010, pp. 1011–1019.
D. Q. Chen and C. Manning, A fast and accurate dependency parser using neural networks, in Proc. 2014 Conf. on Empirical Methods in Natural Language Processing, Doha, Qatar, 2014, pp. 740–750. https://doi.org/10.3115/v1/D14-1082
Y. Q. Le, Z. J. Wang, Z. Quan, J. W. He, and B. Yao, ACV-tree: A new method for sentence similarity modeling, in Proc. Twenty-Seventh Int. Joint Conf. on Artificial Intelligence, Stockholm, Sweden, 2018, pp. 4137–4143.
H. He, K. Gimpel, and J. J. Lin, Multi-perspective sentence similarity modeling with convolutional neural networks, in Proc. 2015 Conf. on Empirical Methods in Natural Language Processing, Lisbon, Portugal, 2015, pp. 1576–1586. https://doi.org/10.18653/v1/D15-1181
W. P. Yin and H. Schütze, Convolutional neural network for paraphrase identification, in Proc. 2015 Conf. of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, CO, USA, 2015, pp. 901–911. https://doi.org/10.3115/v1/N15-1091
Z. Gan, Y. C. Pu, R. Henao, C. Y. Li, X. D. He, and L. Carin, Learning generic sentence representations using convolutional neural networks, in Proc. 2017 Conf. on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, 2017, pp. 2390–2400. https://doi.org/10.18653/v1/D17-1254
Q. Chen, Q. M. Hu, J. X. Huang, and L. He, CA-RNN: Using context-aligned recurrent neural networks for modeling sentence similarity, in Proc. AAAI Conf. on Artificial Intelligence, New Orleans, LA, USA, 2018, pp. 265–273. https://doi.org/10.1609/aaai.v32i1.11273
K. S. Tai, R. Socher, and C. D. Manning, Improved semantic representations from tree-structured long short-term memory networks, in Proc. 53rd Annu. Meeting of the Association for Computational Linguistics and the 7th Int. Joint Conf. on Natural Language Processing (Volume 1: Long Papers), Beijing, China, 2015, pp. 1556–1566. https://doi.org/10.3115/v1/P15-1150
H. Zhou, T. Young, M. L. Huang, H. Z. Zhao, J. F. Xu, and X. Y. Zhu, Commonsense knowledge aware conversation generation with graph attention, in Proc. Twenty-Seventh Int. Joint Conf. on Artificial Intelligence, Stockholm, Sweden, 2018, pp. 4623–4629. https://doi.org/10.24963/ijcai.2018/643
Z. Y. Zhang, X. Han, Z. Y. Liu, X. Jiang, M. S. Sun, and Q. Liu, ERNIE: Enhanced language representation with informative entities, in Proc. 57th Annu. Meeting of the Association for Computational Linguistics, Florence, Italy, 2019, pp. 1441–1451. https://doi.org/10.18653/v1/P19-1139
S. Chopra, R. Hadsell, and Y. LeCun, Learning a similarity metric discriminatively, with application to face verification, in Proc. 2005 IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, San Diego, CA, USA, 2005, pp. 539–546.
W. T. Yih, K. Toutanova, J. C. Platt, and C. Meek, Learning discriminative projections for text similarity measures, in Proc. Fifteenth Conf. on Computational Natural Language Learning, Portland, OR, USA, 2011, pp. 247–256.
R. Hadsell, S. Chopra, and Y. LeCun, Dimensionality reduction by learning an invariant mapping, in Proc. 2006 IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, New York, NY, USA, 2006, pp. 1735–1742.
J. Pennington, R. Socher, and C. D. Manning, GloVe: Global vectors for word representation, in Proc. 2014 Conf. on Empirical Methods in Natural Language Processing, Doha, Qatar, 2014, pp. 1532–1543. https://doi.org/10.3115/v1/D14-1162
Q. Chen, X. D. Zhu, Z. H. Ling, S. Wei, H. Jiang, and D. Inkpen, Enhanced LSTM for natural language inference, in Proc. 55th Annu. Meeting of the Association for Computational Linguistics, Vancouver, Canada, 2017, pp. 1657–1668. https://doi.org/10.18653/v1/P17-1152
W. P. Yin, H. Schütze, B. Xiang, and B. Zhou, ABCNN: Attention-based convolutional neural network for modeling sentence pairs, Trans. Assoc. Comput. Linguist., vol. 4, pp. 259–272, 2016. https://doi.org/10.1162/tacl_a_00097
A. Severyn and A. Moschitti, Learning to rank short text pairs with convolutional deep neural networks, in Proc. 38th Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, Santiago, Chile, 2015, pp. 373–382. https://doi.org/10.1145/2766462.2767738