Publications
Open Access Issue
Adversarial Training for Supervised Relation Extraction
Tsinghua Science and Technology 2022, 27 (3): 610-618
Published: 13 November 2021
Downloads:57

Most supervised methods for relation extraction (RE) involve time-consuming human annotation. Distant supervision for RE is an efficient method to obtain large corpora that contain thousands of instances and various relations. However, the existing approaches rely heavily on knowledge bases (e.g., Freebase), thereby introducing data noise. The variety of relations and the noisily labeled instances make the issue difficult to solve. In this study, we propose a model based on a piecewise convolutional neural network with adversarial training. Inspired by generative adversarial networks, we adopt a heuristic algorithm to identify noisy datasets and apply adversarial training to RE. Experiments on the extended dataset of SemEval-2010 Task 8 show that our model can obtain more accurate training data for RE and significantly outperforms several competitive baseline models. Our model achieves an F1 score of 89.61%.

Open Access Issue
Residuals-Based Deep Least Square Support Vector Machine with Redundancy Test Based Model Selection to Predict Time Series
Tsinghua Science and Technology 2019, 24 (6): 706-715
Published: 05 December 2019
Downloads:14

In this paper, we propose a novel Residuals-Based Deep Least Squares Support Vector Machine (RBD-LSSVM). In the RBD-LSSVM, multiple LSSVMs are sequentially connected. The original time series is the input of the first LSSVM. The second LSSVM uses the fitting residuals of the first LSSVM as its input time series, the third LSSVM is trained on the residuals of the second, and so on. Additionally, to obtain the best hyper-parameters for the RBD-LSSVM, we propose a model validation method based on a redundancy test using the Omni-Directional Correlation Function (ODCF). This method is based on the fact that when a model is appropriate for a given time series, there should be no information or correlation left in the residuals. We propose the use of the ODCF as a statistic to detect nonlinear correlation between two random variables. Thus, we can select hyper-parameters without encountering overfitting, which cannot be avoided by cross-validation on the validation set alone. We conducted experiments on two time series: the annual sunspot number series and the monthly Total Column Ozone (TCO) series in New Delhi. Analysis of the prediction results and comparisons with recent and past studies demonstrate the promising performance of the proposed RBD-LSSVM approach with the redundancy-test-based model selection method for modeling and predicting nonlinear time series.
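
The sequential residual fitting described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the kernel, the lag embedding, or any hyper-parameter values, so the RBF kernel, the lag window `lags`, the values of `gamma` and `sigma`, and all function names are assumptions made for illustration. The ODCF-based redundancy test is not reproduced, because its definition is not given here.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    # Pairwise RBF kernel between the rows of A and the rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    # Standard LSSVM regression: solve the linear system
    # [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y].
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return {"b": sol[0], "alpha": sol[1:], "X": X, "sigma": sigma}


def lssvm_predict(model, X):
    K = rbf_kernel(X, model["X"], model["sigma"])
    return K @ model["alpha"] + model["b"]


def embed(series, lags):
    # Assumed input format: each row holds `lags` consecutive values,
    # and the target is the value that follows them.
    n = len(series)
    X = np.stack([series[i:n - lags + i] for i in range(lags)], axis=1)
    return X, series[lags:]


def fit_residual_chain(series, n_stages=3, lags=4, gamma=10.0, sigma=1.0):
    # Stage 1 is fitted on the original series; every later stage is
    # fitted on the fitting residuals of the stage before it.
    models, energies = [], []
    current = np.asarray(series, dtype=float)
    for _ in range(n_stages):
        X, y = embed(current, lags)
        model = lssvm_fit(X, y, gamma, sigma)
        residuals = y - lssvm_predict(model, X)
        models.append(model)
        energies.append(float(np.sum(residuals ** 2)))
        current = residuals  # becomes the next stage's input series
    return models, energies
```

Each stage shortens the series by `lags` samples, since the first `lags` values have no complete input window. The returned `energies` list holds the in-sample residual sum of squares per stage. These are training figures only, so they say nothing about overfitting, which is the problem the abstract's redundancy test is meant to address.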
