Complete human mtDNA genome sequences revealed haplotype features of the Hmong-Mien language family in Vietnam


  • Dinh Huong Thao, Institute of Genome Research, VAST, Vietnam
  • Nong Van Hai, Institute of Genome Research, VAST, Vietnam
  • Nguyen Thuy Duong, Institute of Genome Research, VAST, Vietnam



Dao, Hmong, Pathen, mtDNA


Vietnam is home to 54 ethnic groups belonging to five major language families: Austroasiatic, Tai-Kadai, Hmong-Mien, Sino-Tibetan and Austronesian. The Hmong-Mien ethnolinguistic family presumably originated in Southern China and later spread into Southeast Asia. In this study, we analyzed mitochondrial DNA sequences from 120 males belonging to three populations of the Hmong-Mien (HM) language family in Vietnam (Dao, Hmong, and Pathen), revealing 352 unique variants. Dao had the largest number of polymorphisms (230 unique SNPs occurring 1469 times), followed by Hmong (181 unique SNPs occurring 1367 times) and Pathen (159 SNPs occurring 1243 times). Genetic variation within each population and among Hmong-Mien speakers was further measured by computing haplotype diversity (H), nucleotide diversity (π) and the fixation index (FST). Nine major haplogroups (A, B, C, D, F, G, M, N9, and R) were detected, with F and B together making up over half of each population (Hmong: 56.09% (23/41), Pathen: 58.33% (21/36), Dao: 62.79% (27/43)). At finer resolution, haplotypes were assigned to 30 haplogroups, 80% of which were specific to a single ethnic group. Dao was the most genetically diverse population (H = 0.957), while Pathen was the most homogeneous (H = 0.900). In terms of genetic distance, Dao and Hmong were the most differentiated from each other, while Hmong and Pathen were the most closely related. The complete mtDNA sequences of Vietnamese HM speakers expand the available mtDNA data and improve the understanding of the genetic structure of this language family.
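The abstract reports haplotype diversity (H) and nucleotide diversity (π) but does not state which estimators were used. As an illustration only, the sketch below assumes the commonly used Nei (1987) forms, H = n/(n-1) · (1 - Σ p_i²) and π as the mean number of pairwise differences per site. The function names and the toy sequences are hypothetical and are not the study's data or pipeline.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site over all sequence pairs."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    total_diffs = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * length)


# Toy aligned sequences (hypothetical, for illustration only).
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACCTACGA"]
h = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
print(f"H = {h:.3f}, pi = {pi:.4f}")
```

On real data the input would be aligned complete mtDNA sequences for one population at a time; gaps and ambiguous bases would need explicit handling, which this sketch omits.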


