Comparing receptor binding properties of SARS-CoV-2 and of SARS-CoV virus by using unsupervised machine learning models

Toan The Nguyen, Duc Nguyen-Manh, Ly Nguyen Hai, Cao Cong Phuong, Hien Lai Thi Thu, Anh Phan Duc, Cuong Nguyen Tien, Agata Kranjc

Authors

  • Toan The Nguyen Vietnam National University, Hanoi https://orcid.org/0000-0002-6331-2453
  • Duc Nguyen-Manh CCFE, United Kingdom Atomic Energy Authority, OX14 3DB, Abingdon, UK
  • Ly Nguyen Hai Digital Materials Science laboratory, VinUniversity, Vinhomes Ocean Park, Gia Lam District, Hanoi, Vietnam https://orcid.org/0009-0001-7676-2170
  • Cao Cong Phuong Faculty of Physics, VNU University of Science, 334 Nguyen Trai, Thanh Xuan, Hanoi, Vietnam
  • Hien Lai Thi Thu Key Laboratory for Multiscale Simulation of Complex Systems, VNU University of Science, 334 Nguyen Trai, Thanh Xuan, Hanoi, Vietnam https://orcid.org/0000-0003-0091-6732
  • Anh Phan Duc Faculty of Materials Science and Engineering, Phenikaa Institute for Advanced Study, Phenikaa University, Hanoi, 12116, Vietnam
  • Cuong Nguyen Tien Faculty of Physics, VNU University of Science, 334 Nguyen Trai, Thanh Xuan, Hanoi, Vietnam https://orcid.org/0000-0003-4039-529X
  • Agata Kranjc Laboratoire de Biochimie Théorique, UPR 9080 CNRS, Université de Paris, 13 rue Pierre et Marie Curie, F-75005 Paris, France

DOI:

https://doi.org/10.15625/0868-3166/19607

Keywords:

Coronaviruses, Human ACE2, Unsupervised machine learning, protein-protein interaction, variational autoencoder, Molecular dynamics, enhanced sampling

Abstract

This work continues our recent molecular dynamics investigation of three systems in which the human ACE2 receptor interacts with the viral receptor-binding domain (RBD) of the SARS-CoV virus and of two variants of the SARS-CoV-2 virus. The simulations are extended and analyzed using unsupervised machine learning models to give complementary descriptions of hidden features of the viral binding mechanism. Specifically, principal component analysis (PCA) and a variational autoencoder (VAE) model are employed; both are dimensionality reduction approaches, with different focuses. The results support the molecular dynamics finding that the two variants of SARS-CoV-2 bind more strongly and more stably to the human ACE2 receptor than the SARS-CoV virus does. Moreover, stronger binding affects the structure of the human receptor, making it fluctuate more, a sensitive feature that is hard to detect using standard analyses. Unexpectedly, it is found that the VAE model can learn to arrange randomly shuffled protein structures obtained from molecular dynamics in time order in its latent space representation. This result has promising potential applications in computational studies of biomolecules: one could use this VAE model to jump forward in time during a molecular dynamics simulation and to enhance the sampling of protein configuration space.
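As a rough illustration of the dimensionality reduction idea described in the abstract, the sketch below applies PCA to a synthetic, randomly shuffled "trajectory" and checks whether the leading component still tracks the original frame order. It is only a linear analogue of the VAE time-ordering observation, not the authors' model or data: the trajectory is artificial (a slow drift plus Gaussian noise), and the function name `pca_project` and all array sizes are assumptions made for this example.

```python
import numpy as np


def pca_project(frames, n_components=2):
    """Project frames (n_frames, n_features) onto the leading principal components."""
    centered = frames - frames.mean(axis=0)
    # SVD of the centered data: rows of vt are the principal axes.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)  # fraction of variance per component
    return centered @ vt[:n_components].T, explained[:n_components]


rng = np.random.default_rng(0)
n_frames, n_atoms = 500, 30  # hypothetical sizes, not taken from the paper
time = np.arange(n_frames)

# Synthetic stand-in for flattened atomic coordinates:
# a slow drift along one random direction plus small random fluctuations.
direction = rng.normal(size=3 * n_atoms)
direction /= np.linalg.norm(direction)
frames = 5.0 * np.outer(time / n_frames, direction)
frames += 0.1 * rng.normal(size=(n_frames, 3 * n_atoms))

# Shuffle the frames so that no time information is given to the model.
perm = rng.permutation(n_frames)
shuffled = frames[perm]

proj, explained = pca_project(shuffled)

# Correlate the first component with the original time index of each frame.
# The sign of a principal axis is arbitrary, hence the absolute value.
corr = abs(np.corrcoef(proj[:, 0], time[perm])[0, 1])
print(f"PC1 explained variance: {explained[0]:.2f}, |corr with time|: {corr:.3f}")
```

For a real trajectory one would replace the synthetic `frames` array with aligned atomic coordinates flattened per frame; whether a single latent coordinate then orders the frames in time depends on the system and is not guaranteed by this sketch.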

References

C. Wang, P. Horby, F. Hayden and G. Gao, A novel coronavirus outbreak of global health concern, The Lancet (2020) 470.

S. Belouzard, J. Millet, B. Licitra and G. Whittaker, Mechanisms of coronavirus cell entry mediated by the viral spike protein, Viruses 4 (2012) 1011.

S. Sieczkarski and G. Whittaker, Dissecting virus entry via endocytosis, The Journal of general virology 83 (2002) 1535.

W. Li, M. Moore, N. Vasilieva, J. Sui, S. Wong, M. Berne et al., Angiotensin-converting enzyme 2 is a functional receptor for the sars coronavirus, Nature 426 (2003) 450.

P. Zhou, X. Yang, X.-G. Wang, B. Hu, L. Zhang, W. Zhang et al., A pneumonia outbreak associated with a new coronavirus of probable bat origin, Nature 579 (2020) 270.

J. Lan, J. Ge, J. Yu, S. Shan, H. Zhou, S. Fan et al., Structure of the sars-cov-2 spike receptor-binding domain bound to the ace2 receptor, Nature 581 (2020) 1.

Y. Wan, J. Shang, R. Graham, R. Baric and F. Li, Receptor recognition by novel coronavirus from wuhan: An analysis based on decade-long structural studies of sars, Journal of Virology 94 (2020) e00127.

W. Tai, L. He, X. Zhang, J. Pu, D. Voronin, S. Jiang et al., Characterization of the receptor-binding domain (rbd) of 2019 novel coronavirus: implication for development of rbd protein as a viral attachment inhibitor and vaccine, Cellular & Molecular Immunology 17 (2020) 1.

M. Hoffmann, H. Kleine-Weber, S. Schroeder, N. Krüger, T. Herrler, S. Erichsen et al., Sars-cov-2 cell entry depends on ace2 and tmprss2 and is blocked by a clinically proven protease inhibitor, Cell 181 (2020) 271.

H. T. Lai, L. H. Nguyen, A. D. Phan, A. Kranjc, T. T. Nguyen and D. Nguyen-Manh, A comparative study of receptor interactions between sars-cov and sars-cov-2 from molecular modeling, Journal of molecular modeling 28 (2022) 305.

K. Andersen, A. Rambaut, W. Lipkin, E. Holmes and R. Garry, The proximal origin of sars-cov-2, Nature Medicine 26 (2020) 1.

M. T. Degiacomi, Coupling molecular dynamics and deep learning to mine protein conformational space, Structure 27 (2019) 1034.

J. D. Thompson, T. J. Gibson and D. G. Higgins, Multiple sequence alignment using clustalw and clustalx, Current protocols in bioinformatics (2003) 2.

D. W. Mount, Using blosum in sequence alignments, Cold Spring Harbor Protocols 2008 (2008) pdb.

A. Šali and T. L. Blundell, Comparative protein modelling by satisfaction of spatial restraints, Journal of Molecular Biology 234 (1993) 779.

E. Krieger, S. B. Nabuurs and G. Vriend, Homology modeling, Methods of biochemical analysis 44 (2003) 509.

H. T. Lai, D. M. Nguyen, T. T. Nguyen et al., Homology modeling of mouse nlrp3 nacht protein domain and molecular dynamics simulation of its atp binding properties, International Journal of Modern Physics C (IJMPC) 31 (2020) 1.

H. J. Berendsen, D. van der Spoel and R. van Drunen, Gromacs: a message-passing parallel molecular dynamics implementation, Computer physics communications 91 (1995) 43.

J. Huang and A. D. MacKerell Jr, Charmm36 all-atom additive protein force field: Validation based on comparison to nmr data, Journal of computational chemistry 34 (2013) 2135.

K. N. Kirschner, A. B. Yongye, S. M. Tschampel, J. González-Outeiriño, C. R. Daniels, B. L. Foley et al., Glycam06: a generalizable biomolecular force field. carbohydrates, Journal of computational chemistry 29 (2008) 622.

Y. Sun and P. A. Kollman, Hydrophobic solvation of methane and nonbond parameters of the tip3p water model, Journal of computational chemistry 16 (1995) 1164.

W. G. Hoover, Canonical dynamics: Equilibrium phase-space distributions, Phys. Rev. A 31 (1985) 1695.

S. Nosé, A molecular dynamics method for simulations in the canonical ensemble, Mol. Phys. 52 (1984) 255.

M. Parrinello and A. Rahman, Polymorphic transitions in single crystals: A new molecular dynamics method, J. Appl. Phys. 52 (1981) 7182.

T. Darden, D. York and L. Pedersen, Particle mesh ewald: An nlog (n) method for ewald sums in large systems, The Journal of chemical physics 98 (1993) 10089.

B. Hess, H. Bekker, H. J. Berendsen and J. G. Fraaije, Lincs: a linear constraint solver for molecular simulations, J. Comput. Chem. 18 (1997) 1463.

D. P. Kingma and M. Welling, “Auto-encoding variational bayes.” 2013.

N. Michaud-Agrawal, E. J. Denning, T. B. Woolf and O. Beckstein, Mdanalysis: a toolkit for the analysis of molecular dynamics simulations, Journal of computational chemistry 32 (2011) 2319.

F. Chollet et al., Keras: Deep learning library for theano and tensorflow, URL: https://keras.io/k 7 (2015) T1.

M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro et al., TensorFlow: Large-scale machine learning on heterogeneous systems, 2015.

J. Shang, G. Ye, K. Shi, Y. Wan, C. Luo, H. Aihara et al., Structural basis of receptor recognition by sars-cov-2, Nature 581 (2020) 221.

A. C. Walls, Y.-J. Park, M. A. Tortorici, A. Wall, A. T. McGuire and D. Veesler, Structure, function, and antigenicity of the sars-cov-2 spike glycoprotein, Cell 181 (2020) 281.

D. Wrapp, N. Wang, K. S. Corbett, J. A. Goldsmith, C.-L. Hsieh, O. Abiona et al., Cryo-em structure of the 2019-ncov spike in the prefusion conformation, Science 367 (2020) 1260.

Published

11-06-2024

How to Cite

[1]
T. T. Nguyen, “Comparing receptor binding properties of SARS-CoV-2 and of SARS-CoV virus by using unsupervised machine learning models”, Comm. Phys., vol. 34, no. 2, p. 99, Jun. 2024.

Section

Papers

Received 11-12-2023
Accepted 14-05-2024
Published 11-06-2024