Scholar - SciOpen

Survey Issue

Controllability and Its Applications to Biological Networks

Lin Wu, Min Li, Jian-Xin Wang, Fang-Xiang Wu

Journal of Computer Science and Technology 2019, 34(1): 16-34

Published: 18 January 2019

Abstract

Biological elements usually exert their functions by interacting with one another, forming various types of biological networks. The ability to control the dynamics of biological networks is of enormous benefit to the pharmaceutical and medical industries as well as to scientific research. Although there are many mathematical methods for steering dynamic systems towards desired states, these methods are usually not feasible to apply to complex biological networks. The difficulties stem from the lack of accurate models that capture the dynamics of interactions between biological elements and from the fact that many mathematical methods are computationally intractable for large-scale networks. Recently, controllability, a concept from control theory, has been applied to investigate the dynamics of complex networks. In this article, recent advances on the controllability of complex networks and their applications to biological networks are reviewed. Developing dynamic models is the primary concern when analyzing the dynamics of biological networks. First, we introduce a widely used dynamic model for investigating the controllability of complex networks. Then recent studies of theorems and algorithms for making complex biological networks controllable, in general or in specific application scenarios, are reviewed. Finally, applications to real biological networks demonstrate that investigating the controllability of biological networks can shed light on many critical physiological or medical problems, such as revealing biological mechanisms and identifying drug targets, from a systematic perspective.
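
The abstract does not say which dynamic model the survey uses. As a minimal sketch, assuming the linear time-invariant model dx/dt = Ax + Bu that is standard in control theory, the Kalman rank condition states that the system is controllable exactly when the matrix [B, AB, ..., A^(n-1)B] has rank n. The function names below are illustrative and are not taken from the paper.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def rank(m):
    """Matrix rank by Gaussian elimination over exact fractions."""
    m = [[Fraction(x) for x in row] for row in m]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def is_controllable(a, b):
    """Kalman test (assumed LTI model): rank [B, AB, ..., A^(n-1)B] == n."""
    n = len(a)
    block, blocks = b, []
    for _ in range(n):
        blocks.append(block)
        block = mat_mul(a, block)
    # Concatenate the n blocks side by side into the controllability matrix.
    ctrb = [sum((blk[i] for blk in blocks), []) for i in range(n)]
    return rank(ctrb) == n


# Hypothetical 3-node chain 1 -> 2 -> 3; a[i][j] = 1 means node j+1 drives node i+1.
chain = [[0, 0, 0],
         [1, 0, 0],
         [0, 1, 0]]
drive_first = [[1], [0], [0]]   # input attached to node 1
drive_last = [[0], [0], [1]]    # input attached to node 3
```

In this toy chain, an input on the first node reaches every downstream node, whereas an input on the last node cannot influence the others, so only the first choice passes the test.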

Regular Paper Issue

Decoding the Structural Keywords in Protein Structure Universe

Wessam Elhefnawy, Min Li, Jian-Xin Wang, Yaohang Li

Journal of Computer Science and Technology 2019, 34(1): 3-15

Published: 18 January 2019

Abstract

Although the protein sequence-structure gap continues to widen due to the development of high-throughput sequencing tools, the protein structure universe appears to be nearing completeness, as no proteins with novel structural folds have recently been deposited in the Protein Data Bank (PDB). In this work, we identify a protein structural dictionary (Frag-K) composed of a set of backbone fragments ranging from 4 to 20 residues as the structural “keywords” that can effectively distinguish between major protein folds. We first apply randomized spectral clustering and random forest algorithms to construct representative and sensitive protein fragment libraries from a large set of high-quality, non-homologous protein structures available in PDB. We analyze the impact of clustering cut-offs on the performance of the fragment libraries. Then, the Frag-K fragments are employed as structural features to classify protein structures into the major protein folds defined by SCOP (Structural Classification of Proteins). Our results show that a structural dictionary with ~400 4- to 20-residue Frag-K fragments is capable of classifying major SCOP folds with high accuracy.

Open Access Issue

A Survey of Matrix Completion Methods for Recommendation Systems

Andy Ramlatchan, Mengyun Yang, Quan Liu, Min Li, Jianxin Wang, Yaohang Li

Big Data Mining and Analytics 2018, 1(4): 308-323

Published: 02 July 2018

Abstract

In recent years, recommendation systems have become increasingly popular and have been used in a broad variety of applications. Here, we investigate matrix completion techniques for recommendation systems that are based on collaborative filtering. The collaborative filtering problem can be viewed as predicting how favorably a user will rate new items. When a rating matrix is constructed with users as rows, items as columns, and ratings as entries, the collaborative filtering problem can be modeled as a matrix completion problem: filling in the unknown elements of the rating matrix. This article presents a comprehensive survey of the matrix completion methods used in recommendation systems. We focus on the mathematical models for matrix completion and the corresponding computational algorithms, as well as their characteristics and potential issues. Several applications other than the traditional user-item association prediction are also discussed.
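
To make the rating-matrix formulation concrete, here is a minimal sketch of one common approach: low-rank factorization fitted by stochastic gradient descent on the observed entries only. It is an illustration, not a method taken from the survey, and the function name, hyperparameters, and toy ratings are all assumptions.

```python
import random


def complete_matrix(observed, n_users, n_items, rank=2, lr=0.02,
                    reg=0.01, epochs=2000, seed=0):
    """Fit R ~ P Q^T on the observed ratings and return a predictor.

    observed: dict mapping (user, item) -> rating for the known entries.
    """
    rng = random.Random(seed)
    p = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_users)]
    q = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_items)]
    entries = sorted(observed.items())
    for _ in range(epochs):
        for (u, i), r in entries:
            err = r - sum(a * b for a, b in zip(p[u], q[i]))
            for k in range(rank):
                pu, qi = p[u][k], q[i][k]
                # Gradient step on squared error with L2 regularization.
                p[u][k] += lr * (err * qi - reg * pu)
                q[i][k] += lr * (err * pu - reg * qi)

    def predict(u, i):
        """Estimated rating of item i by user u, observed or not."""
        return sum(a * b for a, b in zip(p[u], q[i]))

    return predict


# Hypothetical 3 x 3 rating matrix with two entries hidden.
truth = [[1.0, 2.0, 2.5],
         [2.0, 4.0, 5.0],
         [1.5, 3.0, 3.75]]
hidden = {(0, 1), (2, 2)}
observed = {(u, i): truth[u][i]
            for u in range(3) for i in range(3) if (u, i) not in hidden}
predict = complete_matrix(observed, n_users=3, n_items=3)
```

Calling predict(2, 2) then gives the model's estimate for a rating it never saw. How close that estimate is to the hidden value depends on the rank, the regularization, and how many entries are observed.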

Open Access Issue

A Feature Selection Method for Prediction Essential Protein

Jiancheng Zhong, Jianxin Wang, Wei Peng, Zhen Zhang, Min Li

Tsinghua Science and Technology 2015, 20(5): 491-499

Published: 13 October 2015

Abstract

Essential proteins are vital to the survival of a cell. Various features are related to the essentiality of proteins, such as biological and topological features. Many computational methods have been developed to identify essential proteins by using these features. However, it is still a big challenge to design an effective method that is able to select suitable features and integrate them to predict essential proteins. In this work, we first collect 26 features and use SVM-RFE to select some of them to create a feature space for predicting essential proteins. We then remove the features that share the same biological meaning as other features in the feature space, according to their Pearson Correlation Coefficients (PCC). The experiments are carried out on S. cerevisiae data. Six features are determined to be the best subset of features. To assess the prediction performance of our method, we further compare it with several machine learning methods, such as SVM, Naive Bayes, Bayes Network, and NBTree, when they are given different numbers of features. The results show that the methods using the six selected features outperform those using other features, which confirms the effectiveness of our feature selection method for essential protein prediction.
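
The correlation-based pruning step can be sketched as follows. The abstract does not give the PCC threshold or the exact removal rule, so the cut-off of 0.9, the "keep the better-ranked feature" policy, and the feature names below are all assumptions. The SVM-RFE ranking itself is not implemented here and is simply passed in as an ordered list.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_redundant(features, ranked_names, threshold=0.9):
    """Walk features from best to worst rank and keep a feature only if its
    |PCC| with every feature kept so far is below the (assumed) threshold."""
    kept = []
    for name in ranked_names:
        if all(abs(pearson(features[name], features[k])) < threshold
               for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature values for five proteins.
features = {
    "degree": [1, 2, 3, 4, 5],
    "degree_scaled": [2, 4, 6, 8, 10],   # perfectly correlated with "degree"
    "expression": [5, 1, 4, 2, 3],
}
ranking = ["degree", "degree_scaled", "expression"]  # assumed SVM-RFE order
selected = drop_redundant(features, ranking)
```

Here "degree_scaled" is discarded because it carries the same information as the better-ranked "degree", while "expression" is only weakly correlated with it and is kept.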

Open Access Issue

Kernelization in Parameterized Computation: A Survey

Qilong Feng, Qian Zhou, Wenjun Li, Jianxin Wang

Tsinghua Science and Technology 2014, 19(4): 338-345

Published: 30 July 2014

Abstract

Parameterized computation is a new approach to dealing with NP-hard problems, and it has attracted considerable attention in theoretical computer science. As a practical preprocessing method for NP-hard problems, kernelization in parameterized computation has recently become an active research area. In this paper, we discuss several kernelization techniques, such as crown decomposition, planar graph vertex partition, randomized methods, and kernel lower bounds, which have been used widely in the kernelization of many hard problems.
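
For readers new to the idea, kernelization can be illustrated with the classical Buss reduction for Vertex Cover. This is a textbook example, not one of the techniques listed in the abstract. The sketch assumes a simple undirected graph, and the function name is illustrative.

```python
def buss_kernel(edges, k):
    """Shrink a Vertex Cover instance (edges, k) with Buss's rule.

    Returns (kernel_edges, k_left, forced), where `forced` holds the vertices
    that every cover of size <= k must contain, or None if no such cover exists.
    """
    edges = {frozenset(e) for e in edges}
    forced = set()
    changed = True
    while changed:
        changed = False
        degree = {}
        for e in edges:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        for v, d in degree.items():
            if d > k - len(forced):
                # Skipping v would require taking all d neighbors: too many.
                forced.add(v)
                edges = {e for e in edges if v not in e}
                changed = True
                break
        if len(forced) > k:
            return None
    k_left = k - len(forced)
    # Every remaining vertex has degree <= k_left, so k_left vertices
    # can cover at most k_left ** 2 edges.
    if len(edges) > k_left * k_left:
        return None
    return edges, k_left, forced


# Hypothetical instance: a star centered at 0 plus one separate edge.
star_plus_edge = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (6, 7)]
result = buss_kernel(star_plus_edge, k=2)
```

After the reduction, the remaining instance has at most k_left squared edges, so its size depends only on the parameter and not on the size of the original graph. That size bound is what makes the reduced instance a kernel.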

Open Access Issue

Distances Between Phylogenetic Trees: A Survey

Feng Shi, Qilong Feng, Jianer Chen, Lusheng Wang, Jianxin Wang

Tsinghua Science and Technology 2013, 18(5): 490-499

Published: 03 October 2013

Abstract

Phylogenetic trees have been widely used in the study of evolutionary biology for representing the tree-like evolution of a collection of species. However, different data sets and different methods often lead to the construction of different phylogenetic trees for the same set of species. Therefore, comparing these trees to determine their similarities or, equivalently, their dissimilarities becomes a fundamental issue. The Tree Bisection and Reconnection (TBR) and Subtree Prune and Regraft (SPR) distances have been proposed to facilitate the comparison of different phylogenetic trees. In this paper, we survey the computational complexity, fixed-parameter algorithms, and approximation algorithms for computing the TBR and SPR distances of phylogenetic trees.

Open Access Issue

Mining Protein Complexes from PPI Networks Using the Minimum Vertex Cut

Xiaojun Ding, Weiping Wang, Xiaoqing Peng, Jianxin Wang

Tsinghua Science and Technology 2012, 17(6): 674-681

Published: 05 December 2012

Abstract

Evidence shows that biological systems are composed of separable functional modules. Identifying protein complexes is essential for understanding the principles of cellular functions. Many methods have been proposed to mine protein complexes from protein-protein interaction networks. However, the performance of these algorithms is limited because the protein-protein interactions detected from experiments are incomplete and noisy. This paper presents an analysis of the topological properties of protein complexes to show that although proteins from the same complex are more highly connected than proteins from different complexes, many protein complexes are not very dense (density ≥ 0.8). A method is then given to mine protein complexes that are relatively dense (density ≥ 0.4). In the first step, a topological property is used to identify proteins that are probably in the same complex. Then, a possible boundary is calculated for the protein complex based on a minimum vertex cut. The final complex is formed by the proteins within the boundary. The method is validated on a yeast protein-protein interaction network. The results show that this method performs better than other methods in terms of sensitivity and specificity. The functional consistency is also good.
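
The two graph quantities mentioned above can be sketched on a toy network. The density formula used here, 2|E| / (|V|(|V| - 1)), is the usual graph-theoretic definition and is assumed to match the paper's. The brute-force vertex cut is for illustration only: it is exponential in the number of vertices and is not the paper's algorithm. The protein names are hypothetical.

```python
from itertools import combinations


def density(nodes, edges):
    """Fraction of all possible pairs among `nodes` that are joined by an edge."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    inside = {frozenset(e) for e in edges
              if set(e) <= nodes and len(set(e)) == 2}
    return 2 * len(inside) / (n * (n - 1))


def min_vertex_cut(edges, s, t):
    """Smallest vertex set whose removal disconnects s from t (brute force).

    Returns None when s and t are adjacent, since no vertex cut exists then.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    others = sorted(set(adj) - {s, t})
    for size in range(len(others) + 1):
        for cut in combinations(others, size):
            removed = set(cut)
            seen, stack = {s}, [s]
            while stack:  # depth-first search that avoids removed vertices
                for w in adj.get(stack.pop(), ()):
                    if w not in removed and w not in seen:
                        seen.add(w)
                        stack.append(w)
            if t not in seen:
                return removed
    return None


# Hypothetical PPI fragment: a triangle a-b-c with a tail c-d-e.
ppi = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]
core_density = density(["a", "b", "c"], ppi)
boundary = min_vertex_cut(ppi, "a", "e")
```

In this toy network the triangle is fully connected, and the single vertex separating it from the tail is the kind of boundary that a minimum vertex cut exposes.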