Network approaches for identification of human genetic disease genes

Minh-Tan Nguyen, Tien-Dzung Tran
Authors

  • Minh-Tan Nguyen Center of Information - Library, Hanoi University of Industry, 298 Cau Dien Street, Bac Tu Liem District, Ha Noi, Viet Nam
  • Tien-Dzung Tran Department of Software Engineering, Faculty of Information Technology, Hanoi University of Industry, 298 Cau Dien Street, Bac Tu Liem District, Ha Noi, Viet Nam

DOI:

https://doi.org/10.15625/2525-2518/17026

Keywords:

disease gene, (3-Mercaptopropyl) triethoxysilane, biological network, hierarchical closeness

Abstract

Identifying the genes that cause genetic diseases remains an important problem in biomedicine: the list of known disease genes is still incomplete, yet it underpins the early diagnosis and treatment of fatal genetic diseases such as autism, cancer, drug resistance, and secondary hypertension. Genes associated with a particular disease, or with similar diseases, tend to reside in the same region of a biological network, and their location in the network can be predicted. Many network analysis methods have been proposed to solve this problem. This review first introduces the basic concepts of biological networks, disease genes, and their properties. The main part then analyzes and evaluates methods recently used to find disease genes in two types of network: protein-protein interaction (PPI) networks and cellular signaling networks (CSNs). We describe typical problems in identifying the primary genes that cause genetic diseases and the modern techniques widely used to solve them. For each technique, we also present the key algorithms so that readers can implement them in their own experiments. In particular, we evaluate the performance of these algorithms in predicting disease genes and suggest the contexts in which each is appropriate. Finally, we discuss the implications of the methods and propose some future research directions. Taken together, disease genes can often be identified from network data by two approaches, network-based methods and machine learning-based methods, and the network-based approach

Published

31-08-2022

How to Cite

[1]
M.-T. Nguyen and T.-D. Tran, “Network approaches for identification of human genetic disease genes”, Vietnam J. Sci. Technol., vol. 60, no. 4, pp. 700–712, Aug. 2022.

Section

Electronics - Telecommunication