Le Sy Vinh, VNU University of Engineering and Technology



Keywords: amino acid substitution model, whole genomes, maximum likelihood estimation methods, phylogenomics


Modeling the amino acid substitution process is a core task in bioinformatics. Advanced sequencing technologies have generated huge datasets, including whole genomes from various species. Estimating amino acid substitution models from whole-genome datasets provides unprecedented opportunities to investigate relationships among species accurately. In this paper, we review state-of-the-art computational methods for estimating amino acid substitution models from large datasets. We also describe a comprehensive, practical pipeline for estimating amino acid models from whole-genome datasets. Finally, we apply amino acid substitution models to build phylogenomic trees from bird and plant genome datasets. We compare our newly reconstructed phylogenomic trees with published ones and discuss new findings.



