Network approaches for identification of human genetic disease genes

Minh-Tan Nguyen; Tien-Dzung Tran

doi:10.15625/2525-2518/17026

Author affiliations

Minh-Tan Nguyen, Center of Information - Library, Hanoi University of Industry, 298 Cau Dien Street, Bac Tu Liem District, Ha Noi, Viet Nam
Tien-Dzung Tran, Department of Software Engineering, Faculty of Information Technology, Hanoi University of Industry, 298 Cau Dien Street, Bac Tu Liem District, Ha Noi, Viet Nam

Keywords:

disease gene, biological network, hierarchical closeness

Abstract

Identifying the genes that cause a genetic disease remains an important problem in biomedicine: the list of known disease genes is still incomplete, yet it underpins the early diagnosis and treatment of fatal genetic diseases such as autism, cancer, drug resistance, and secondary hypertension. Genes associated with a particular disease, or with similar diseases, tend to reside in the same region of a biological network, so their location in the network can be predicted. Many network analysis methods have been proposed to solve this problem. This review first introduces the basic concepts of biological networks, disease genes, and their properties. The main content then analyzes and evaluates the methods recently used to find disease genes on two kinds of networks: protein-protein interaction (PPI) networks and cellular signaling networks (CSN). We report typical problems in identifying the primary genes that cause genetic diseases and the modern techniques widely used to solve them. For each technique, we also present the key algorithms so that readers can implement them in their own experiments. In particular, we evaluate the performance of these algorithms in predicting disease genes and suggest the contexts in which each is appropriate. Finally, we discuss the implications of the methods and propose some future research directions. Taken together, disease genes can often be identified from network data by two approaches: network-based methods and machine learning-based methods, and the network-based approach
