Volume 7, Issue 3

How to use open-pFind in deep proteomics data analysis?— A protocol for rigorous identification and quantitation of peptides and proteins from mass spectrometry data

Guangcan Shao (1,2,3), Yong Cao (1,2,3), Zhenlin Chen (4,5), Chao Liu (4,6), Shangtong Li (2,3), Hao Chi (4), Meng-Qiu Dong (2,3)
1. School of Life Sciences, Peking University, Beijing 100871, China
2. National Institute of Biological Sciences, Beijing, Beijing 102206, China
3. Tsinghua Institute of Multidisciplinary Biomedical Research, Tsinghua University, Beijing 102206, China
4. Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), University of CAS, Institute of Computing Technology, CAS, Beijing 100190, China
5. University of Chinese Academy of Sciences, Beijing 100049, China
6. Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, School of Medicine and Engineering, Beihang University, Beijing 100191, China

Abstract

High-throughput proteomics based on mass spectrometry (MS) analysis has permeated biomedical science and propelled numerous research projects. pFind 3 is a database search engine for high-speed and in-depth proteomics data analysis. It features a swift open search workflow that is adept at uncovering less obvious information, such as unexpected modifications or mutations, that would go unnoticed in a conventional data analysis pipeline. In this protocol, we provide step-by-step instructions to help users master various types of data analysis using pFind 3 in conjunction with pParse for data pre-processing and, if needed, pQuant for quantitation. This streamlined pParse-pFind-pQuant workflow offers exceptional sensitivity, precision, and speed. It can be easily implemented in any laboratory that needs to identify peptides, proteins, or post-translational modifications, or to perform quantitation based on 15N labeling, SILAC labeling, or TMT/iTRAQ labeling.

Keywords: Mass spectrometry, Protein identification, Search engine, Open-pFind, Quantitation

Electronic supplementary material

2021-004-DMQ-Suppl.pdf (2.3 MB)

Publication history

Received: 07 February 2021
Accepted: 05 April 2021
Published: 07 July 2021
Issue date: June 2021

Copyright

© The Author(s) 2021

Acknowledgements

The authors would like to thank Yong-Hong Yan, Dr. Li Tao, and Yue Zhao of NIBS, Beijing and Dr. Ming Ding of China Pharmaceutical University for providing sample datasets for this protocol. The authors also thank the National Natural Science Foundation of China (grant 21675153), the Beijing Municipal Science and Technology Commission, and the Ministry of Science and Technology of China for research funding.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
