Analysis of the genetic structure of the Malay population: Ancestry-informative marker SNPs in the Malay of Peninsular Malaysia


Bibliographic details
Published in: Forensic science international : genetics, 2017-09, Vol. 30, p. 152-159
Main authors: Yahya, Padillah; Sulong, Sarina; Harun, Azian; Wan Isa, Hatin; Ab Rajab, Nur-Shafawati; Wangkumhang, Pongsakorn; Wilantho, Alisa; Ngamphiw, Chumpol; Tongsima, Sissades; Zilfalil, Bin Alwi
Format: Article
Language: English
Online access: Full text
Description
Summary:
• A total of eight Malay sub-ethnic groups were used to study the genetic structure of the Malay population and to construct a panel of AIM SNPs for the Malay population.
• This study utilized the PCA, ipPCA and ADMIXTURE algorithms to explore the genetic structure pattern of the Malay population.
• The AIMs panel for Malays in this study was constructed using the informativeness for assignment (In), and a KNN algorithm was used to train the classification models.

Malay, the main ethnic group in Peninsular Malaysia, is represented by various sub-ethnic groups such as Melayu Banjar, Melayu Bugis, Melayu Champa, Melayu Java, Melayu Kedah, Melayu Kelantan, Melayu Minang and Melayu Patani. Using data retrieved from the MyHVP (Malaysian Human Variome Project) database, a total of 135 individuals from these sub-ethnic groups were profiled using the Affymetrix GeneChip Mapping Xba 50-K single nucleotide polymorphism (SNP) array to identify SNPs that were ancestry-informative markers (AIMs) for Malays of Peninsular Malaysia. Prior to selecting the AIMs, the genetic structure of Malays was explored with reference to 11 other populations obtained from the Pan-Asian SNP Consortium database using principal component analysis (PCA) and ADMIXTURE. Iterative pruning principal component analysis (ipPCA) was further used to identify sub-groups of Malays. Subsequently, we constructed an AIMs panel for Malays using the informativeness for assignment (In) of genetic markers, and the K-nearest neighbor (KNN) classifier was used to train the classification models. A model of 250 SNPs ranked by In correctly classified Malay individuals with an accuracy of up to 90%. The identified panel of SNPs could be utilized as a panel of AIMs to ascertain the specific ancestry of Malays, which may be useful in disease association studies, biomedical research or forensic investigations.
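The selection step described in the summary (rank SNPs by In, then classify with KNN) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: it assumes biallelic SNPs, the commonly cited definition of informativeness for assignment (pooled allele entropy minus the mean within-population entropy), genotypes coded as 0/1/2 allele counts, and squared Euclidean distance for the neighbor search. The function names `informativeness`, `rank_snps` and `knn_predict`, and all toy values, are hypothetical.

```python
import math
from collections import Counter


def _xlogx(p):
    # p * ln(p), with the usual convention that 0 * ln(0) = 0
    return p * math.log(p) if p > 0 else 0.0


def informativeness(freqs):
    """In for one biallelic SNP (assumed definition, see lead-in).

    freqs: frequency of one allele in each of K populations.
    """
    k = len(freqs)
    total = 0.0
    # Sum the contribution of both alleles: p and 1 - p
    for allele_freqs in (freqs, [1.0 - p for p in freqs]):
        mean = sum(allele_freqs) / k
        total += -_xlogx(mean) + sum(_xlogx(p) for p in allele_freqs) / k
    return total


def rank_snps(freq_table, top):
    """Return the `top` SNP ids with the highest In.

    freq_table: dict mapping SNP id -> list of per-population frequencies.
    """
    ranked = sorted(freq_table,
                    key=lambda snp: informativeness(freq_table[snp]),
                    reverse=True)
    return ranked[:top]


def knn_predict(train, labels, query, k=3):
    """Majority vote among the k training genotypes closest to `query`."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(genotype, query)), label)
        for genotype, label in zip(train, labels)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Toy allele frequencies in two populations (invented values)
toy_freqs = {"snpA": [0.9, 0.1], "snpB": [0.5, 0.5], "snpC": [0.7, 0.4]}
panel = rank_snps(toy_freqs, top=2)

# Toy genotypes restricted to the two selected SNPs (invented values)
train = [[2, 2], [2, 1], [0, 0], [0, 1]]
labels = ["Malay", "Malay", "Other", "Other"]
prediction = knn_predict(train, labels, query=[2, 2], k=3)
```

In this toy setting, a SNP with identical frequencies in both populations scores zero and is dropped from the panel, while a SNP fixed for different alleles in the two populations reaches the two-population maximum of ln 2. A real analysis would estimate frequencies from genotype data and assess accuracy by cross-validation.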
ISSN: 1872-4973; 1878-0326
DOI: 10.1016/j.fsigen.2017.07.005