Open Access Issue
Inductive Relation Prediction by Disentangled Subgraph Structure
Tsinghua Science and Technology 2024, 29 (5): 1566-1579
Published: 02 May 2024

Most existing inductive relation prediction approaches are based on subgraph structures, with subgraph features extracted using graph neural networks to predict relations. However, subgraphs may contain disconnected regions, which usually represent different semantic ranges. Because not all semantic information about these regions is helpful in relation prediction, we propose a relation prediction model based on a disentangled subgraph structure and implement a feature updating approach based on relevant semantic aggregation. To achieve the disentangled subgraph structure indirectly, from a semantic perspective, the model maps entity features into different semantic spaces and aggregates related semantics within each space when updating features. The disentangled model can focus on features with higher semantic relevance during prediction, thus addressing a problem with existing approaches, which ignore the semantic differences among subgraph structures. Furthermore, using a gated recurrent neural network, the model enhances entity features by sorting entities by distance and extracting the path information in the subgraphs. Experiments show that when the subgraph contains numerous disconnected regions, our model outperforms existing mainstream models in terms of both Area Under the Precision-Recall Curve (AUC-PR) and Hits@10. The results indicate that semantic differences in the knowledge graph can be effectively distinguished and support the effectiveness of this method.
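
For readers unfamiliar with the Hits@10 metric named in the abstract, the following is a minimal sketch of how Hits@k is commonly computed from ranking positions. The function names `rank_of_positive` and `hits_at_k`, and the pessimistic handling of ties, are assumptions made for illustration; the paper's own evaluation code may differ.

```python
from typing import Sequence


def rank_of_positive(pos_score: float, neg_scores: Sequence[float]) -> int:
    """Return the 1-based rank of the true triple among sampled negatives.

    Ties are counted against the positive (pessimistic ranking); this is an
    illustrative choice, not something stated in the abstract.
    """
    return 1 + sum(1 for s in neg_scores if s >= pos_score)


def hits_at_k(ranks: Sequence[int], k: int = 10) -> float:
    """Fraction of test triples whose true answer is ranked within the top k."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks for four test triples.
example_ranks = [1, 3, 11, 50]
example_hits = hits_at_k(example_ranks, k=10)
```

Here two of the four hypothetical ranks fall within the top 10, so `example_hits` is 0.5.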

Open Access Issue
Denoising Graph Inference Network for Document-Level Relation Extraction
Big Data Mining and Analytics 2023, 6 (2): 248-262
Published: 26 January 2023

Relation Extraction (RE) aims to identify the predefined relation type between two entities mentioned in a piece of text, e.g., a sentence-level or a document-level text. Most existing studies suffer from noise in the text, so pruning irrelevant content is of great importance. The conventional sentence-level RE task addresses this issue with a denoising method that uses the shortest dependency path to build a long-range semantic dependency between entity pairs. However, such denoising methods are scarce in document-level RE. In this work, we explicitly model a denoised document-level graph based on linguistic knowledge to capture various long-range semantic dependencies among entities. We first formalize a Syntactic Dependency Tree forest (SDT-forest) by introducing syntax and discourse dependency relations. Then, the Steiner tree algorithm extracts a mention-level denoised graph, the Steiner Graph (SG), removing linguistically irrelevant words from the SDT-forest. We then devise a slide residual attention to highlight word-level evidence on the text and the SG. Finally, classification is performed on the SG to infer the relations of entity pairs. We conduct extensive experiments on three public datasets. The results show that our method helps establish long-range semantic dependencies and can improve classification performance on longer texts.
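
The sentence-level denoising idea mentioned in the abstract, keeping only the shortest dependency path between two entities, can be sketched as a breadth-first search over an undirected view of the dependency tree. This is an illustration of that baseline only, not the paper's Steiner-tree method; the function name `shortest_dependency_path` and the hand-written parse below are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Optional, Tuple


def shortest_dependency_path(
    edges: Iterable[Tuple[str, str]], source: str, target: str
) -> Optional[List[str]]:
    """Return the shortest token path from source to target, or None.

    `edges` are (head, dependent) pairs; direction is ignored so the search
    can move both up and down the dependency tree.
    """
    adjacency: Dict[str, List[str]] = {}
    for head, dependent in edges:
        adjacency.setdefault(head, []).append(dependent)
        adjacency.setdefault(dependent, []).append(head)
    if source not in adjacency or target not in adjacency:
        return None

    parent: Dict[str, Optional[str]] = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent pointers back to the source.
            path: List[str] = []
            current: Optional[str] = node
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        for neighbour in adjacency[node]:
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical, hand-written parse of "Marie Curie was born in Warsaw".
toy_edges = [
    ("born", "Curie"),
    ("Curie", "Marie"),
    ("born", "was"),
    ("born", "Warsaw"),
    ("Warsaw", "in"),
]
toy_path = shortest_dependency_path(toy_edges, "Curie", "Warsaw")
```

On this toy parse the path between the two entity heads is `["Curie", "born", "Warsaw"]`, so the words "Marie", "was", and "in" are pruned. Tokens are identified by their surface strings here for brevity; a real pipeline would more likely use token indices so that repeated words stay distinct.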

Open Access Issue
Disseminating Authorized Content via Data Analysis in Opportunistic Social Networks
Big Data Mining and Analytics 2019, 2 (1): 12-24
Published: 15 October 2018

Authorized content is a type of content that can be generated only by a certain Content Provider (CP). The content copies delivered to a user may bring rewards to the CP if the content is adopted by the user. The overall reward obtained by the CP depends on the user’s degree of interest in the content and the user’s role in disseminating the content copies. Thus, to maximize the reward, the CP is motivated to disseminate the authorized content to the most interested users. In this paper, we study how to effectively disseminate authorized content in Interest-centric Opportunistic Social Networks (IOSNs) such that the reward is maximized. We first derive Social Connection Pattern (SCP) data to handle the challenging opportunistic connections in IOSNs and statistically analyze the interest distribution of the users contacted or connected. The SCP is used to predict the interests of possible contactors and connectors. Then, we propose our SCP-based Dissemination (SCPD) algorithm to calculate the optimum number of content copies to disseminate when two users meet. Our dataset-based simulation shows that the SCPD algorithm disseminates authorized content in IOSNs effectively and efficiently.

Open Access Issue
Location Prediction on Trajectory Data: A Review
Big Data Mining and Analytics 2018, 1 (2): 108-127
Published: 12 April 2018

Location prediction is a key technique in many location-based services, including route navigation, dining location recommendations, and traffic planning and control, to mention a few. This survey provides a comprehensive overview of location prediction, including basic definitions and concepts, algorithms, and applications. First, we introduce the types of trajectory data and related basic concepts. Then, we review existing location-prediction methods, ranging from temporal-pattern-based prediction to spatiotemporal-pattern-based prediction. We also discuss and analyze the advantages and disadvantages of these algorithms and briefly summarize current applications of location prediction in diverse fields. Finally, we identify the potential challenges and future research directions in location prediction.
