
Event Temporal Relation Extraction with Attention Mechanism and Graph Neural Network

Xiaoliang Xu, Tong Gao, Yuxiang Wang, and Xinle Xuan
Department of Computer Science and Engineering, Hangzhou Dianzi University, Hangzhou 310018, China
Hangzhou Sanhui Digital Information Technology Co., Ltd, Hangzhou 310018, China

Abstract

Event temporal relation extraction is an important task in natural language processing, and with the development of deep learning, many models have been applied to it. However, most existing methods cannot accurately capture the degree of association between different tokens and events, nor can they effectively integrate event-related information. In this paper, we propose an event information integration model that integrates event information through a multilayer bidirectional long short-term memory (Bi-LSTM) network and an attention mechanism. Although this scheme improves extraction performance, it can still be optimized. To further improve on it, we propose a novel relational graph attention network that incorporates edge attributes. In this approach, we first build a semantic dependency graph through dependency parsing, then model the graph together with its edge attributes, using a top-k attention mechanism to learn hidden semantic contextual representations, and finally predict event temporal relations. We evaluate the proposed models on the TimeBank-Dense dataset. Compared with previous baselines, the Micro-F1 scores of our two models improve by 3.9% and 14.5%, respectively.
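The abstract outlines two components: a multilayer Bi-LSTM with attention that pools event-related context into a fixed vector, and a relational graph attention layer over a dependency graph whose edges carry attributes, with each node attending only to its top-k highest-scoring neighbors. Since the paper's own code is not reproduced on this page, the following is a minimal PyTorch sketch of both ideas; every module name, dimension, and hyperparameter (emb_dim, hidden, edge_dim, k) is an illustrative assumption, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of the two components named in the
# abstract: a Bi-LSTM encoder with additive attention, and a graph-attention
# layer over a dependency graph with edge attributes and top-k neighbor
# selection. All names, dimensions, and hyperparameters are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BiLSTMAttentionEncoder(nn.Module):
    """Encode a token sequence with a multilayer Bi-LSTM, then pool the
    hidden states with learned attention weights."""
    def __init__(self, emb_dim=300, hidden=128, layers=2):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim, hidden, num_layers=layers,
                            bidirectional=True, batch_first=True)
        self.score = nn.Linear(2 * hidden, 1)  # additive attention scorer

    def forward(self, embeddings):              # (batch, seq_len, emb_dim)
        h, _ = self.lstm(embeddings)             # (batch, seq_len, 2*hidden)
        alpha = torch.softmax(self.score(h).squeeze(-1), dim=-1)
        return (alpha.unsqueeze(-1) * h).sum(dim=1)   # (batch, 2*hidden)

class TopKGraphAttention(nn.Module):
    """One graph-attention layer over a dependency graph whose edges carry
    attribute embeddings (e.g., the dependency label); each node attends
    only to its k highest-scoring neighbors."""
    def __init__(self, dim, edge_dim, k=3):
        super().__init__()
        self.k = k
        self.w = nn.Linear(dim, dim, bias=False)
        self.attn = nn.Linear(2 * dim + edge_dim, 1)

    def forward(self, x, adj, edge_attr):
        # x: (n, dim) node states; adj: (n, n) 0/1 mask; edge_attr: (n, n, edge_dim)
        n = x.size(0)
        h = self.w(x)
        pairs = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                           h.unsqueeze(0).expand(n, n, -1),
                           edge_attr], dim=-1)
        scores = self.attn(pairs).squeeze(-1)
        scores = scores.masked_fill(adj == 0, float('-inf'))
        # keep only each node's top-k neighbors, mask out the rest
        k = min(self.k, n)
        kth = scores.topk(k, dim=-1).values[:, -1:]
        scores = scores.masked_fill(scores < kth, float('-inf'))
        alpha = torch.nan_to_num(torch.softmax(scores, dim=-1))
        return F.relu(alpha @ h)                 # updated node states (n, dim)
```

In a full pipeline, the pooled Bi-LSTM vector (or the graph layer's states at the two event nodes) would presumably be concatenated and passed to a softmax classifier over the TimeBank-Dense temporal relation labels.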

Keywords: neural network, attention mechanism, temporal relation extraction, graph attention network


Publication history

Received: 01 December 2020
Accepted: 16 December 2020
Published: 17 August 2021
Issue date: February 2022

Copyright

© The author(s) 2022

Acknowledgements

This work was supported by the National Key Research and Development Program of China (No. 2017YFC0820503); the National Natural Science Foundation of China (No. 62072149); the National Social Science Foundation of China (No. 19ZDA348); the Primary Research and Development Plan of Zhejiang (No. 2021C03156); and the Public Welfare Research Program of Zhejiang (No. LGG19F020017).

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
