The ability to recommend candidate locations for service facility placement is crucial to the success of urban planning. Whether a location is suitable for establishing new facilities is largely determined by its potential popularity. However, it is a non-trivial task to predict the popularity of candidate locations due to three significant challenges: 1) the spatio-temporal behavior correlations of urban dwellers, 2) the spatial correlations between candidate locations and existing facilities, and 3) the temporal auto-correlations of locations themselves. To this end, we propose a novel semi-supervised learning model, Spatio-Temporal Graph Convolutional and Recurrent Networks (STGCRN), aimed at popularity prediction and location recommendation. Specifically, we first partition the urban space into spatial neighborhood regions centered at locations, extract the corresponding features, and construct the location correlation graph. Next, a contextual graph convolution module based on the attention mechanism is introduced to incorporate local and global spatial correlations among locations. A recurrent neural network is employed to capture the temporal dependencies of locations. Furthermore, we adopt a location popularity approximation block to estimate missing popularity from both the spatial and temporal domains. Finally, the overall implicit characteristics are concatenated and fed into the recurrent neural network to obtain the ultimate popularity. Extensive experiments on two real-world datasets demonstrate the superiority of the proposed model over state-of-the-art baselines.
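To make the spatio-temporal modeling concrete, the following is a minimal sketch (in PyTorch, with hypothetical layer names and dimensions) of an attention-based graph convolution over a location correlation graph followed by a GRU over each location's feature sequence; it only illustrates the general pattern, not the actual STGCRN implementation.

```python
import torch
import torch.nn as nn

class AttentiveGraphConv(nn.Module):
    """Attention-weighted aggregation over a location correlation graph (illustrative)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim)
        self.attn = nn.Linear(2 * out_dim, 1)

    def forward(self, x, adj):
        # x: (num_locations, in_dim); adj: (num_locations, num_locations) 0/1 mask,
        # assumed to include self-loops so every row has at least one neighbor.
        h = self.proj(x)
        n = h.size(0)
        pairs = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                           h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        scores = self.attn(pairs).squeeze(-1)
        scores = scores.masked_fill(adj == 0, float("-inf"))
        weights = torch.softmax(scores, dim=-1)       # spatial attention over neighbors
        return torch.relu(weights @ h)

class SpatioTemporalBlock(nn.Module):
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.gconv = AttentiveGraphConv(in_dim, hid_dim)
        self.gru = nn.GRU(hid_dim, hid_dim, batch_first=True)

    def forward(self, x_seq, adj):
        # x_seq: (num_locations, T, in_dim) -- feature sequence of each location region
        spatial = torch.stack([self.gconv(x_seq[:, t], adj)
                               for t in range(x_seq.size(1))], dim=1)
        out, _ = self.gru(spatial)                    # temporal dependencies per location
        return out[:, -1]                             # final hidden state per location
```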
Inductive knowledge graph embedding (KGE) aims to embed unseen entities in emerging knowledge graphs (KGs). Most recent studies of inductive KGE embed unseen entities by aggregating information from their neighboring entities and relations with graph neural networks (GNNs). However, these methods rely on the existing neighbors of unseen entities and suffer from two common problems: data sparsity and feature smoothing. First, the data sparsity problem means that unseen entities usually emerge with only a few triplets, which contain insufficient information. Second, the effectiveness of features extracted from the original KGs degrades when these features are repeatedly propagated to represent unseen entities in emerging KGs, which is termed the feature smoothing problem. To tackle the two problems, we propose a novel model named Meta-Learning Based Memory Graph Convolutional Network (MMGCN), consisting of three components: 1) a two-layer information transforming module (TITM) developed to effectively transform information from original KGs to emerging KGs; 2) a hyper-relation feature initializing module (HFIM) proposed to extract type-level features shared between KGs and obtain a coarse-grained representation for each entity from these features; and 3) a meta-learning training module (MTM) designed to simulate few-shot emerging KGs and train the model in a meta-learning framework. Extensive experiments on the few-shot link prediction task for emerging KGs demonstrate the superiority of the proposed MMGCN over state-of-the-art methods.
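As a rough illustration of how few-shot emerging KGs could be simulated during training, the sketch below uses a Reptile-style episodic loop, assuming a scoring model `kge_model(triplets)` that returns a scalar loss; the function name, task format, and hyperparameters are placeholders, not MMGCN's actual meta-learning training module.

```python
import copy
import random
import torch

def meta_train_step(kge_model, tasks, inner_lr=1e-2, meta_lr=1e-3, inner_steps=3):
    """tasks: list of (support_triplets, query_triplets) simulating few-shot emerging KGs."""
    meta_state = copy.deepcopy(kge_model.state_dict())   # snapshot of meta-parameters
    support, query = random.choice(tasks)

    inner_opt = torch.optim.SGD(kge_model.parameters(), lr=inner_lr)
    for _ in range(inner_steps):                          # adapt on the few-shot support set
        inner_opt.zero_grad()
        kge_model(support).backward()
        inner_opt.step()

    query_loss = kge_model(query).item()                  # generalization to the query set

    # Reptile-style outer update: move meta-parameters toward the adapted parameters.
    adapted_state = kge_model.state_dict()
    new_state = {k: meta_state[k] + meta_lr * (adapted_state[k] - meta_state[k])
                 for k in meta_state}
    kge_model.load_state_dict(new_state)
    return query_loss
```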
Author name disambiguation (AND) is a central task in academic search, and it has received increasing attention recently with the growth in the number of authors and academic publications. To tackle the AND problem, existing studies have proposed various approaches based on different types of information, such as raw document features (e.g., co-authors, titles, and keywords), the fusion feature (e.g., a hybrid publication embedding based on multiple raw document features), the local structural information (e.g., a publication's neighborhood information on a graph), and the global structural information (e.g., interactive information between a node and others on a graph). However, no existing work on the AND problem takes all of the above information into account while fully exploiting the contribution of each raw document feature. To fill this gap, we propose a novel framework named EAND (Towards Effective Author Name Disambiguation by Hybrid Attention). Specifically, we design a novel feature extraction model, consisting of three hybrid attention mechanism layers, to extract key information from the global and local structural information generated from six similarity graphs, which are constructed based on different similarity coefficients, raw document features, and the fusion feature. Each hybrid attention mechanism layer contains three key modules: a local structural perception, a global structural perception, and a feature extractor. Additionally, a mean absolute error term in the joint loss function introduces the structural information loss of the vector space. Experimental results on two real-world datasets demonstrate that EAND achieves superior performance, outperforming state-of-the-art methods by at least +2.74% in micro-F1 score and +3.31% in macro-F1 score.
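A minimal sketch of a joint loss with a mean-absolute-error structural term is shown below; it assumes hypothetical inputs (`pred_logits`, `node_embeddings`, a target similarity matrix) and only illustrates the idea of penalizing vector-space structure with MAE, not EAND's exact formulation.

```python
import torch
import torch.nn.functional as F

def joint_loss(pred_logits, labels, node_embeddings, target_similarity, alpha=0.5):
    # Task loss on the disambiguation prediction.
    task_loss = F.cross_entropy(pred_logits, labels)
    # Structural loss: cosine similarities of node embeddings should follow the
    # similarity-graph structure; MAE (L1) keeps the penalty robust to outliers.
    emb = F.normalize(node_embeddings, dim=-1)
    pred_similarity = emb @ emb.t()
    structural_loss = F.l1_loss(pred_similarity, target_similarity)
    return task_loss + alpha * structural_loss
```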
Entity linking is an emerging technique in recommender systems that links users' interaction behaviors in different domains, with the goal of improving recommendation performance. Linking-based cross-domain recommendation aims to alleviate the data sparsity problem by utilizing domain-sharable knowledge from auxiliary domains. However, existing methods fail to prevent domain-specific features from being transferred, resulting in suboptimal results. In this paper, we address this issue by proposing an adversarial transfer learning based model, ATLRec, which effectively captures domain-sharable features for cross-domain recommendation. In ATLRec, we leverage adversarial learning to generate representations of user-item interactions in both the source and the target domains such that a discriminator cannot identify which domain they belong to, thereby obtaining domain-sharable features. Meanwhile, each domain learns its domain-specific features through a private feature extractor. The recommendation in each domain considers both domain-specific and domain-sharable features. We further adopt an attention mechanism to learn item latent factors of both domains by utilizing the shared users with interaction history, so that the representations of all items can be learned sufficiently in a shared space, even when few or no items are shared across domains. In this way, items from the source and the target domains are represented in a shared space, which better links items in different domains and captures cross-domain item-item relatedness to facilitate the learning of domain-sharable knowledge. The proposed model is evaluated on various real-world datasets and outperforms several state-of-the-art single-domain and cross-domain recommendation methods in terms of recommendation accuracy.
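The adversarial component can be illustrated with a standard gradient-reversal setup, as in the sketch below: a domain discriminator tries to tell source from target representations, while reversed gradients push the shared extractor toward domain-sharable features. Layer sizes and names are illustrative, not ATLRec's actual architecture.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, sign-flipped gradients in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class DomainDiscriminator(nn.Module):
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 2))   # source vs. target domain

    def forward(self, shared_repr, lam=1.0):
        # Reversing gradients makes the shared extractor fool the discriminator,
        # encouraging domain-sharable user-item representations.
        return self.net(GradReverse.apply(shared_repr, lam))
```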
Entity linking (EL) is the task of determining the identity of textual entity mentions given a predefined knowledge base (KB). Many existing efforts address this task using either “local” information (contextual information of the mention in the text) or “global” information (relations among candidate entities). However, either type of information may be insufficient, especially when the given text is short. To obtain richer local and global information for entity linking, we propose to enrich the context information for mentions by retrieving extra contexts from the web through web search engines (WSE). Based on this intuition, we make two novel attempts. The first adds web-searched results to an embedding-based method to expand the mention’s local information, where we try two different ways to generate high-quality web contexts: applying an attention mechanism and using abstract extraction. The second uses the web contexts to extend the global information, i.e., finding and utilizing additional relevant mentions from the web contexts with a graph-based model. Finally, we combine the two proposed models to exploit both the extended local and global information from the extra web contexts. Our empirical study on six real-world datasets shows that using extra web contexts to extend the local and global information can effectively improve the F1 score of entity linking.
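As a rough sketch of the attention-based option for exploiting web contexts, the snippet below performs attention-weighted pooling of web-snippet embeddings around a mention embedding; the encoders, similarity function, and combination actually used in the paper may differ.

```python
import torch

def attend_web_contexts(mention_vec, web_context_vecs):
    """mention_vec: (d,); web_context_vecs: (num_snippets, d) from some text encoder."""
    scores = web_context_vecs @ mention_vec            # relevance of each web snippet
    weights = torch.softmax(scores, dim=0)             # attention over snippets
    expanded_local = weights @ web_context_vecs        # weighted web-context summary
    return torch.cat([mention_vec, expanded_local])    # enriched local representation
```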
Linking user accounts belonging to the same user across different platforms with location data has received significant attention, due to the popularization of GPS-enabled devices and the wide range of applications benefiting from user account linkage (e.g., cross-platform user profiling and recommendation). Different from most existing studies, which only focus on user account linkage across two platforms, we propose a novel model ULMP (user account linkage across multiple platforms), with the goal of effectively and efficiently linking user accounts across multiple platforms with location data. Despite the practical significance of successful user linkage across multiple platforms, this task is far more challenging than its two-platform counterpart. The major challenge lies in the fact that the number of user account combinations grows explosively as the number of platforms increases. To tackle this problem, a novel method, GTkNN, is first proposed to prune the search space by efficiently retrieving top-k candidate user accounts indexed with well-designed spatial and temporal index structures. Then, in the pruned space, a match score based on kernel density estimation, combining both spatial and temporal information, is used to retrieve the linked user accounts. Extensive experiments conducted on four real-world datasets demonstrate the superiority of the proposed model ULMP in terms of both effectiveness and efficiency compared with state-of-the-art methods.
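A generic version of a KDE-based spatio-temporal match score is sketched below, assuming each account's check-ins are represented as (latitude, longitude, hour-of-day) triples; the kernels and weighting used in ULMP may differ.

```python
import numpy as np
from scipy.stats import gaussian_kde

def kde_match_score(records_a, records_b):
    """records_*: arrays of shape (n, 3) with columns (lat, lon, hour_of_day)."""
    kde_a = gaussian_kde(records_a.T)                 # density over space and time for account A
    log_density = np.log(kde_a(records_b.T) + 1e-12)  # evaluate at account B's check-ins
    return log_density.mean()                         # higher score => more likely the same user
```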
With the growing popularity of storing large data graphs in the cloud, subgraph pattern matching on a remote cloud has attracted increasing attention. Typically, subgraph pattern matching is defined in terms of subgraph isomorphism, which is an NP-complete problem and sometimes too strict to find useful matches in certain applications. Moreover, how to protect the privacy of data graphs in subgraph pattern matching without undermining the matching results is an important concern. Thus, we propose a novel framework to achieve privacy-preserving subgraph pattern matching in the cloud. To protect the structural privacy of data graphs, we first develop a k-automorphism model based method. Additionally, we use a cost-model based label generalization method to protect label privacy in both data graphs and pattern graphs. During the generation of the k-automorphic graph, a large number of noise edges or vertices might be introduced into the original data graph. Thus, we answer subgraph pattern matching over an outsourced graph, which is only a subset of the k-automorphic graph, greatly improving the efficiency of the matching process. Extensive experiments on real-world datasets demonstrate the high efficiency of our framework.
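As a toy illustration of label generalization, the sketch below maps concrete vertex labels to coarser group labels from a given (hypothetical) grouping and reports how many concrete labels each group covers; the cost-model-based grouping described in the abstract is considerably more sophisticated.

```python
from collections import defaultdict

def generalize_labels(vertex_labels, label_groups):
    """vertex_labels: {vertex_id: concrete_label}; label_groups: {concrete_label: group_label}."""
    generalized = {v: label_groups.get(lbl, lbl) for v, lbl in vertex_labels.items()}
    coverage = defaultdict(set)
    for lbl, grp in label_groups.items():
        coverage[grp].add(lbl)
    # Larger coverage per group means stronger label privacy but coarser matching information.
    return generalized, dict(coverage)
```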
As a fundamental operation in LBS (location-based services), trajectory similarity for moving objects has been extensively studied in recent years. However, due to the increasing volume of moving object trajectories and the demand for interactive query performance, trajectory similarity queries now need to be processed on massive datasets in real time. Existing work has proposed distributed or parallel solutions to enable large-scale trajectory similarity processing. However, those techniques cannot be directly adapted to the real-time scenario, as they are likely to suffer from poor load balancing when workload variance occurs on the incoming trajectory stream. In this paper, we propose a new workload partitioning framework, ART (Adaptive Framework for Real-Time Trajectory Similarity), which introduces practical algorithms to support dynamic workload assignment for RTTS (real-time trajectory similarity). Our proposal includes a processing model tailored to the RTTS scenario, a load balancing framework that maximizes throughput, and an adaptive data partitioning scheme designed to reduce unnecessary network cost. Based on this, our model can handle large-scale trajectory similarity in an online scenario, achieving scalability, effectiveness, and efficiency in a single shot. Empirical studies on synthetic data and real-world stream applications validate the usefulness of our proposal and demonstrate its significant advantage over state-of-the-art solutions in the literature.
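A minimal sketch of load-aware workload assignment is given below: incoming trajectory workload units are routed to the currently least-loaded worker. This is only a baseline illustration of dynamic assignment, not the actual partitioning algorithms in ART.

```python
import heapq

class LeastLoadedRouter:
    """Route each incoming workload unit to the worker with the smallest current load."""
    def __init__(self, num_workers):
        self.heap = [(0.0, w) for w in range(num_workers)]   # (current_load, worker_id)
        heapq.heapify(self.heap)

    def route(self, estimated_cost):
        load, worker = heapq.heappop(self.heap)               # least-loaded worker
        heapq.heappush(self.heap, (load + estimated_cost, worker))
        return worker
```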
With the development and prevalence of online social networks, people increasingly tend to attend and share group activities with friends or acquaintances. This motivates the study of group recommendation, which aims to meet the needs of a group of users instead of only individual users. However, how to aggregate the different preferences of different group members remains a challenging problem: 1) the choice of a member in a group is influenced by various factors, e.g., personal preference, group topic, and social relationship; and 2) users have different influences in different groups. In this paper, we propose a generative geo-social group recommendation model (GSGR) to recommend points of interest (POIs) for groups. Specifically, GSGR models personal preference as shaped by geographical information, group topics, and social influence for recommendation. Moreover, when making recommendations, GSGR aggregates the preferences of group members with different weights to estimate the preference score of a group for a POI. Experimental results on two datasets show that GSGR is effective in group recommendation and outperforms state-of-the-art methods.
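A toy version of weight-based preference aggregation is sketched below: each member's preference score for a POI is combined according to (estimated) member weights to obtain the group score. GSGR's generative model is considerably richer; this only illustrates the aggregation step.

```python
import numpy as np

def group_preference_score(member_scores, member_weights):
    """member_scores: (num_members,) scores for one POI; member_weights: (num_members,)."""
    weights = np.asarray(member_weights, dtype=float)
    weights = weights / weights.sum()                  # normalize member influence in this group
    return float(np.dot(weights, member_scores))       # weighted group score for the POI
```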
A point of interest (POI) is a specific point location that someone may find useful. With the development of urban modernization, a large number of functional organized POI groups (FOPGs), such as shopping malls, electronic malls, and snack streets, are springing up in cities and have a great influence on people's lives. We aim to discover functional organized POI groups for spatial keyword recommendation, because FOPG-based recommendation is superior to POI-based recommendation in efficiency and flexibility. To discover FOPGs, we design clustering algorithms to obtain organized POI groups (OPGs) and utilize an OPGs-LDA (Latent Dirichlet Allocation) model to reveal the functions of OPGs for further recommendation. To the best of our knowledge, we are the first to study functional organized POI groups, which have important applications in urban planning and social marketing.
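An illustrative pipeline for the two steps, density-based clustering of POIs into groups followed by LDA over each group's category words, is sketched below using scikit-learn; the specific clustering algorithms and parameters in the paper may differ, and the values here are placeholders.

```python
from sklearn.cluster import DBSCAN
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

def discover_opg_functions(poi_coords, poi_categories, eps=0.001, n_topics=5):
    """poi_coords: (n, 2) lat/lon array; poi_categories: list of n category strings."""
    labels = DBSCAN(eps=eps, min_samples=3).fit_predict(poi_coords)
    # Build one pseudo-document of category words per discovered group (label -1 = noise).
    docs = {g: [] for g in set(labels) if g != -1}
    for cat, g in zip(poi_categories, labels):
        if g != -1:
            docs[g].append(cat)
    corpus = [" ".join(words) for words in docs.values()]
    counts = CountVectorizer().fit_transform(corpus)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
    topic_mix = lda.fit_transform(counts)             # per-group functional topic mixture
    return labels, topic_mix
```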