Proteogenomics and its applications in biology and precision medicine


  • Phan Van Chi, Institute of Biotechnology, VAST
  • Le Thi Bich Thao, Institute of Biotechnology, VAST



Proteogenomics, Genomics, Transcriptomics, Proteomics, Next-generation sequencing (NGS), Mass spectrometry (MS)


In this review, we briefly discuss proteogenomics, an emerging field that integrates proteomics with genomics and transcriptomics. Its underlying technologies are next-generation sequencing (NGS) and mass spectrometry (MS), together with processing of the resulting data, and it promises to accelerate fundamental research on transcription and translation as well as its applications. By combining genomic and proteomic information, scientists are obtaining new results that reflect a more complete and unified understanding of complex molecular biological processes. Part of this review introduces results of applying proteogenomics to problems such as gene and genome annotation and re-annotation, including the editing of open reading frames (ORFs), and improved detection of new genes in a number of organisms, including humans. The paper also discusses the potential of proteogenomics in precision medicine through research achievements on the human genome and proteome, especially in projects on phylogenetic and diagnostic research and cancer treatment. The challenges and future of proteogenomics are also discussed.



Author Biographies

Phan Van Chi, Institute of Biotechnology, VAST

Protein Biochemistry

Le Thi Bich Thao, Institute of Biotechnology, VAST

Protein Biochemistry

