A machine-learning method for biobank-scale genetic prediction of blood group antigens

A key element for successful blood transfusion is compatibility of the patient and donor red blood cell (RBC) antigens. Precise antigen matching reduces the risk for immunization and other adverse transfusion outcomes. RBC antigens are encoded by specific genes, which allows developing computational...

Full description

Bibliographic details
Published in: PLoS computational biology 2024-03, Vol.20 (3), p.e1011977-e1011977
Main authors: Hyvärinen, Kati; Haimila, Katri; Moslemi, Camous; Biobank, Blood Service; Olsson, Martin L; Ostrowski, Sisse R; Pedersen, Ole B; Erikstrup, Christian; Partanen, Jukka; Ritari, Jarmo
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page e1011977
container_issue 3
container_start_page e1011977
container_title PLoS computational biology
container_volume 20
creator Hyvärinen, Kati
Haimila, Katri
Moslemi, Camous
Biobank, Blood Service
Olsson, Martin L
Ostrowski, Sisse R
Pedersen, Ole B
Erikstrup, Christian
Partanen, Jukka
Ritari, Jarmo
description A key element for successful blood transfusion is compatibility of the patient and donor red blood cell (RBC) antigens. Precise antigen matching reduces the risk for immunization and other adverse transfusion outcomes. RBC antigens are encoded by specific genes, which allows developing computational methods for determining antigens from genomic data. We describe here a classification method for determining RBC antigens from genotyping array data. Random forest models for 39 RBC antigens in 14 blood group systems and for human platelet antigen (HPA)-1 were trained and tested using genotype and RBC antigen and HPA-1 typing data available for 1,192 blood donors in the Finnish Blood Service Biobank. The algorithm and models were further evaluated using a validation cohort of 111,667 Danish blood donors. In the Finnish test data set, the median (interquartile range [IQR]) balanced accuracy for 39 models was 99.9 (98.9-100)%. We were able to replicate 34 out of 39 Finnish models in the Danish cohort and the median (IQR) balanced accuracy for classifications was 97.1 (90.1-99.4)%. When applying models trained with the Danish cohort, the median (IQR) balanced accuracy for the 40 Danish models in the Danish test data set was 99.3 (95.1-99.8)%. The RBC antigen and HPA-1 prediction models demonstrated high overall accuracies suitable for probabilistic determination of blood groups and HPA-1 at biobank-scale. Furthermore, population-specific training cohort increased the accuracies of the models. This stand-alone and freely available method is applicable for research and screening for antigen-negative blood donors.
doi_str_mv 10.1371/journal.pcbi.1011977
format Article
fullrecord <record><control><sourceid>gale_plos_</sourceid><recordid>TN_cdi_plos_journals_3069178907</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A788368189</galeid><doaj_id>oai_doaj_org_article_9bf2af96ac444de7992edcf44801af34</doaj_id><sourcerecordid>A788368189</sourcerecordid><originalsourceid>FETCH-LOGICAL-c680t-ae0301b25d3c881af542f902d2b458c491db9cea782fd07377ee25d7cf2834eb3</originalsourceid><addsrcrecordid>eNqVk12P1CAUhhujcdfRf2C0iTd6MSMUWuDKTDZ-TDLRxK9bAvTQYexAF1o__r3Mzuxmx-yNaUgJfd4HOM0piqcYLTBh-PU2TNGrfjEY7RYYYSwYu1ec47omc0Zqfv_W_Kx4lNIWoTwVzcPijPAaV0Kw8-L7stwps3Ee5j2o6J3vyh2Mm9CWNsRSu6CV_zFPRvVQduBhdKYcIrTOjC74MthS9yHTXQzTUCo_ukylx8UDq_oET47vWfHt3duvFx_m60_vVxfL9dw0HI1zBYggrKu6JYZzrGxNKytQ1Vaa1txQgVstDCjGK9siRhgDyDAztuKEgiaz4vnBO_QhyWNJkiSoEZhxkSOzYnUg2qC2cohup-IfGZSTVwshdlLFfKkepNC2UlY0ylBKW2BCVNAaSylH-WiEZtf64Eq_YJj0ia2fhjx0HjKB5AIrLXgtMautpMIoqXHNpTXGIkE5VwJl3Zvj4Se9yzuBH6PqT6ynX7zbyC78lBgJ3ghBsuHl0RDD5QRplDuXDPS98hCmJCvBKEJNxUVGX_yD3l2tI9Xl_y2dtyFvbPZSuWSck4bjK9fiDio_LeycCR6sy-sngVcngcyM8Hvs1JSSXH35_B_sx1OWHlgTQ0oR7E3xMJL7Lrm-pNx3iTx2SY49u134m9B1W5C_96oNjw</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3069178907</pqid></control><display><type>article</type><title>A machine-learning method for biobank-scale genetic prediction of blood group antigens</title><source>DOAJ Directory of Open Access Journals</source><source>SWEPUB Freely available online</source><source>Public Library of Science (PLoS) Journals Open Access</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><creator>Hyvärinen, Kati ; Haimila, Katri ; Moslemi, Camous ; Biobank, Blood Service ; Olsson, Martin L ; Ostrowski, Sisse R ; Pedersen, Ole B ; Erikstrup, Christian ; Partanen, Jukka ; Ritari, Jarmo</creator><contributor>Lu, Yang</contributor><creatorcontrib>Hyvärinen, Kati ; Haimila, Katri ; Moslemi, Camous ; Biobank, Blood Service ; 
Olsson, Martin L ; Ostrowski, Sisse R ; Pedersen, Ole B ; Erikstrup, Christian ; Partanen, Jukka ; Ritari, Jarmo ; Lu, Yang</creatorcontrib><description>A key element for successful blood transfusion is compatibility of the patient and donor red blood cell (RBC) antigens. Precise antigen matching reduces the risk for immunization and other adverse transfusion outcomes. RBC antigens are encoded by specific genes, which allows developing computational methods for determining antigens from genomic data. We describe here a classification method for determining RBC antigens from genotyping array data. Random forest models for 39 RBC antigens in 14 blood group systems and for human platelet antigen (HPA)-1 were trained and tested using genotype and RBC antigen and HPA-1 typing data available for 1,192 blood donors in the Finnish Blood Service Biobank. The algorithm and models were further evaluated using a validation cohort of 111,667 Danish blood donors. In the Finnish test data set, the median (interquartile range [IQR]) balanced accuracy for 39 models was 99.9 (98.9-100)%. We were able to replicate 34 out of 39 Finnish models in the Danish cohort and the median (IQR) balanced accuracy for classifications was 97.1 (90.1-99.4)%. When applying models trained with the Danish cohort, the median (IQR) balanced accuracy for the 40 Danish models in the Danish test data set was 99.3 (95.1-99.8)%. The RBC antigen and HPA-1 prediction models demonstrated high overall accuracies suitable for probabilistic determination of blood groups and HPA-1 at biobank-scale. Furthermore, population-specific training cohort increased the accuracies of the models. 
This stand-alone and freely available method is applicable for research and screening for antigen-negative blood donors.</description><identifier>ISSN: 1553-7358</identifier><identifier>ISSN: 1553-734X</identifier><identifier>EISSN: 1553-7358</identifier><identifier>DOI: 10.1371/journal.pcbi.1011977</identifier><identifier>PMID: 38512997</identifier><language>eng</language><publisher>United States: Public Library of Science</publisher><subject>Accuracy ; Algorithms ; Antigens ; Biobanks ; Biology and Life Sciences ; Blood &amp; organ donations ; Blood banks ; Blood donors ; Blood groups ; Blood transfusion ; Classification ; Clinical Medicine ; Computer and Information Sciences ; Datasets ; Erythrocytes ; Forecasts and trends ; Genotype &amp; phenotype ; Genotypes ; Genotyping ; Hematologi ; Hematology ; Immunization ; Klinisk medicin ; Machine learning ; Medical and Health Sciences ; Medicin och hälsovetenskap ; Medicine and Health Sciences ; Methods ; People and Places ; Prediction models ; Research and Analysis Methods ; Risk reduction ; Technology application ; Transfusion</subject><ispartof>PLoS computational biology, 2024-03, Vol.20 (3), p.e1011977-e1011977</ispartof><rights>Copyright: © 2024 Hyvärinen et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.</rights><rights>COPYRIGHT 2024 Public Library of Science</rights><rights>2024 Hyvärinen et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>2024 Hyvärinen et al 2024 Hyvärinen et al</rights><rights>2024 Hyvärinen et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c680t-ae0301b25d3c881af542f902d2b458c491db9cea782fd07377ee25d7cf2834eb3</cites><orcidid>0000-0001-5288-3851 ; 0000-0001-7905-9774 ; 0000-0001-6681-4734 ; 0000-0003-4605-2837</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10986993/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10986993/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,552,727,780,784,864,885,2102,2928,23866,27924,27925,53791,53793,79600,79601</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/38512997$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink><backlink>$$Uhttps://lup.lub.lu.se/record/891ab985-175f-49ca-b158-fccf09488a90$$DView record from Swedish Publication Index$$Hfree_for_read</backlink></links><search><contributor>Lu, Yang</contributor><creatorcontrib>Hyvärinen, Kati</creatorcontrib><creatorcontrib>Haimila, Katri</creatorcontrib><creatorcontrib>Moslemi, Camous</creatorcontrib><creatorcontrib>Biobank, Blood 
Service</creatorcontrib><creatorcontrib>Olsson, Martin L</creatorcontrib><creatorcontrib>Ostrowski, Sisse R</creatorcontrib><creatorcontrib>Pedersen, Ole B</creatorcontrib><creatorcontrib>Erikstrup, Christian</creatorcontrib><creatorcontrib>Partanen, Jukka</creatorcontrib><creatorcontrib>Ritari, Jarmo</creatorcontrib><title>A machine-learning method for biobank-scale genetic prediction of blood group antigens</title><title>PLoS computational biology</title><addtitle>PLoS Comput Biol</addtitle><description>A key element for successful blood transfusion is compatibility of the patient and donor red blood cell (RBC) antigens. Precise antigen matching reduces the risk for immunization and other adverse transfusion outcomes. RBC antigens are encoded by specific genes, which allows developing computational methods for determining antigens from genomic data. We describe here a classification method for determining RBC antigens from genotyping array data. Random forest models for 39 RBC antigens in 14 blood group systems and for human platelet antigen (HPA)-1 were trained and tested using genotype and RBC antigen and HPA-1 typing data available for 1,192 blood donors in the Finnish Blood Service Biobank. The algorithm and models were further evaluated using a validation cohort of 111,667 Danish blood donors. In the Finnish test data set, the median (interquartile range [IQR]) balanced accuracy for 39 models was 99.9 (98.9-100)%. We were able to replicate 34 out of 39 Finnish models in the Danish cohort and the median (IQR) balanced accuracy for classifications was 97.1 (90.1-99.4)%. When applying models trained with the Danish cohort, the median (IQR) balanced accuracy for the 40 Danish models in the Danish test data set was 99.3 (95.1-99.8)%. The RBC antigen and HPA-1 prediction models demonstrated high overall accuracies suitable for probabilistic determination of blood groups and HPA-1 at biobank-scale. 
Furthermore, population-specific training cohort increased the accuracies of the models. This stand-alone and freely available method is applicable for research and screening for antigen-negative blood donors.</description><subject>Accuracy</subject><subject>Algorithms</subject><subject>Antigens</subject><subject>Biobanks</subject><subject>Biology and Life Sciences</subject><subject>Blood &amp; organ donations</subject><subject>Blood banks</subject><subject>Blood donors</subject><subject>Blood groups</subject><subject>Blood transfusion</subject><subject>Classification</subject><subject>Clinical Medicine</subject><subject>Computer and Information Sciences</subject><subject>Datasets</subject><subject>Erythrocytes</subject><subject>Forecasts and trends</subject><subject>Genotype &amp; phenotype</subject><subject>Genotypes</subject><subject>Genotyping</subject><subject>Hematologi</subject><subject>Hematology</subject><subject>Immunization</subject><subject>Klinisk medicin</subject><subject>Machine learning</subject><subject>Medical and Health Sciences</subject><subject>Medicin och hälsovetenskap</subject><subject>Medicine and Health Sciences</subject><subject>Methods</subject><subject>People and Places</subject><subject>Prediction models</subject><subject>Research and Analysis Methods</subject><subject>Risk reduction</subject><subject>Technology 
application</subject><subject>Transfusion</subject><issn>1553-7358</issn><issn>1553-734X</issn><issn>1553-7358</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><sourceid>D8T</sourceid><sourceid>DOA</sourceid><recordid>eNqVk12P1CAUhhujcdfRf2C0iTd6MSMUWuDKTDZ-TDLRxK9bAvTQYexAF1o__r3Mzuxmx-yNaUgJfd4HOM0piqcYLTBh-PU2TNGrfjEY7RYYYSwYu1ec47omc0Zqfv_W_Kx4lNIWoTwVzcPijPAaV0Kw8-L7stwps3Ee5j2o6J3vyh2Mm9CWNsRSu6CV_zFPRvVQduBhdKYcIrTOjC74MthS9yHTXQzTUCo_ukylx8UDq_oET47vWfHt3duvFx_m60_vVxfL9dw0HI1zBYggrKu6JYZzrGxNKytQ1Vaa1txQgVstDCjGK9siRhgDyDAztuKEgiaz4vnBO_QhyWNJkiSoEZhxkSOzYnUg2qC2cohup-IfGZSTVwshdlLFfKkepNC2UlY0ylBKW2BCVNAaSylH-WiEZtf64Eq_YJj0ia2fhjx0HjKB5AIrLXgtMautpMIoqXHNpTXGIkE5VwJl3Zvj4Se9yzuBH6PqT6ynX7zbyC78lBgJ3ghBsuHl0RDD5QRplDuXDPS98hCmJCvBKEJNxUVGX_yD3l2tI9Xl_y2dtyFvbPZSuWSck4bjK9fiDio_LeycCR6sy-sngVcngcyM8Hvs1JSSXH35_B_sx1OWHlgTQ0oR7E3xMJL7Lrm-pNx3iTx2SY49u134m9B1W5C_96oNjw</recordid><startdate>20240301</startdate><enddate>20240301</enddate><creator>Hyvärinen, Kati</creator><creator>Haimila, Katri</creator><creator>Moslemi, Camous</creator><creator>Biobank, Blood Service</creator><creator>Olsson, Martin L</creator><creator>Ostrowski, Sisse R</creator><creator>Pedersen, Ole B</creator><creator>Erikstrup, Christian</creator><creator>Partanen, Jukka</creator><creator>Ritari, Jarmo</creator><general>Public Library of Science</general><general>Public Library of Science 
(PLoS)</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>ISN</scope><scope>ISR</scope><scope>3V.</scope><scope>7QO</scope><scope>7QP</scope><scope>7TK</scope><scope>7TM</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8AL</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>K9.</scope><scope>LK8</scope><scope>M0N</scope><scope>M0S</scope><scope>M1P</scope><scope>M7P</scope><scope>P5Z</scope><scope>P62</scope><scope>P64</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope><scope>ADTPV</scope><scope>AGCHP</scope><scope>AOWAS</scope><scope>D8T</scope><scope>D95</scope><scope>ZZAVC</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0001-5288-3851</orcidid><orcidid>https://orcid.org/0000-0001-7905-9774</orcidid><orcidid>https://orcid.org/0000-0001-6681-4734</orcidid><orcidid>https://orcid.org/0000-0003-4605-2837</orcidid></search><sort><creationdate>20240301</creationdate><title>A machine-learning method for biobank-scale genetic prediction of blood group antigens</title><author>Hyvärinen, Kati ; Haimila, Katri ; Moslemi, Camous ; Biobank, Blood Service ; Olsson, Martin L ; Ostrowski, Sisse R ; Pedersen, Ole B ; Erikstrup, Christian ; Partanen, Jukka ; Ritari, 
Jarmo</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c680t-ae0301b25d3c881af542f902d2b458c491db9cea782fd07377ee25d7cf2834eb3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Accuracy</topic><topic>Algorithms</topic><topic>Antigens</topic><topic>Biobanks</topic><topic>Biology and Life Sciences</topic><topic>Blood &amp; organ donations</topic><topic>Blood banks</topic><topic>Blood donors</topic><topic>Blood groups</topic><topic>Blood transfusion</topic><topic>Classification</topic><topic>Clinical Medicine</topic><topic>Computer and Information Sciences</topic><topic>Datasets</topic><topic>Erythrocytes</topic><topic>Forecasts and trends</topic><topic>Genotype &amp; phenotype</topic><topic>Genotypes</topic><topic>Genotyping</topic><topic>Hematologi</topic><topic>Hematology</topic><topic>Immunization</topic><topic>Klinisk medicin</topic><topic>Machine learning</topic><topic>Medical and Health Sciences</topic><topic>Medicin och hälsovetenskap</topic><topic>Medicine and Health Sciences</topic><topic>Methods</topic><topic>People and Places</topic><topic>Prediction models</topic><topic>Research and Analysis Methods</topic><topic>Risk reduction</topic><topic>Technology application</topic><topic>Transfusion</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Hyvärinen, Kati</creatorcontrib><creatorcontrib>Haimila, Katri</creatorcontrib><creatorcontrib>Moslemi, Camous</creatorcontrib><creatorcontrib>Biobank, Blood Service</creatorcontrib><creatorcontrib>Olsson, Martin L</creatorcontrib><creatorcontrib>Ostrowski, Sisse R</creatorcontrib><creatorcontrib>Pedersen, Ole B</creatorcontrib><creatorcontrib>Erikstrup, Christian</creatorcontrib><creatorcontrib>Partanen, Jukka</creatorcontrib><creatorcontrib>Ritari, Jarmo</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>Gale In Context: 
Canada</collection><collection>Gale In Context: Science</collection><collection>ProQuest Central (Corporate)</collection><collection>Biotechnology Research Abstracts</collection><collection>Calcium &amp; Calcified Tissue Abstracts</collection><collection>Neurosciences Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Health &amp; Medical Complete 
(Alumni)</collection><collection>ProQuest Biological Science Collection</collection><collection>Computing Database</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Biological Science Database</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>SwePub</collection><collection>SWEPUB Lunds universitet full text</collection><collection>SwePub Articles</collection><collection>SWEPUB Freely available online</collection><collection>SWEPUB Lunds universitet</collection><collection>SwePub Articles full text</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>PLoS computational biology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Hyvärinen, Kati</au><au>Haimila, Katri</au><au>Moslemi, Camous</au><au>Biobank, Blood Service</au><au>Olsson, Martin L</au><au>Ostrowski, Sisse R</au><au>Pedersen, Ole B</au><au>Erikstrup, Christian</au><au>Partanen, Jukka</au><au>Ritari, Jarmo</au><au>Lu, Yang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A machine-learning method for biobank-scale genetic prediction of blood group antigens</atitle><jtitle>PLoS computational 
biology</jtitle><addtitle>PLoS Comput Biol</addtitle><date>2024-03-01</date><risdate>2024</risdate><volume>20</volume><issue>3</issue><spage>e1011977</spage><epage>e1011977</epage><pages>e1011977-e1011977</pages><issn>1553-7358</issn><issn>1553-734X</issn><eissn>1553-7358</eissn><abstract>A key element for successful blood transfusion is compatibility of the patient and donor red blood cell (RBC) antigens. Precise antigen matching reduces the risk for immunization and other adverse transfusion outcomes. RBC antigens are encoded by specific genes, which allows developing computational methods for determining antigens from genomic data. We describe here a classification method for determining RBC antigens from genotyping array data. Random forest models for 39 RBC antigens in 14 blood group systems and for human platelet antigen (HPA)-1 were trained and tested using genotype and RBC antigen and HPA-1 typing data available for 1,192 blood donors in the Finnish Blood Service Biobank. The algorithm and models were further evaluated using a validation cohort of 111,667 Danish blood donors. In the Finnish test data set, the median (interquartile range [IQR]) balanced accuracy for 39 models was 99.9 (98.9-100)%. We were able to replicate 34 out of 39 Finnish models in the Danish cohort and the median (IQR) balanced accuracy for classifications was 97.1 (90.1-99.4)%. When applying models trained with the Danish cohort, the median (IQR) balanced accuracy for the 40 Danish models in the Danish test data set was 99.3 (95.1-99.8)%. The RBC antigen and HPA-1 prediction models demonstrated high overall accuracies suitable for probabilistic determination of blood groups and HPA-1 at biobank-scale. Furthermore, population-specific training cohort increased the accuracies of the models. 
This stand-alone and freely available method is applicable for research and screening for antigen-negative blood donors.</abstract><cop>United States</cop><pub>Public Library of Science</pub><pmid>38512997</pmid><doi>10.1371/journal.pcbi.1011977</doi><tpages>e1011977</tpages><orcidid>https://orcid.org/0000-0001-5288-3851</orcidid><orcidid>https://orcid.org/0000-0001-7905-9774</orcidid><orcidid>https://orcid.org/0000-0001-6681-4734</orcidid><orcidid>https://orcid.org/0000-0003-4605-2837</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1553-7358
ispartof PLoS computational biology, 2024-03, Vol.20 (3), p.e1011977-e1011977
issn 1553-7358
1553-734X
1553-7358
language eng
recordid cdi_plos_journals_3069178907
source DOAJ Directory of Open Access Journals; SWEPUB Freely available online; Public Library of Science (PLoS) Journals Open Access; EZB-FREE-00999 freely available EZB journals; PubMed Central
subjects Accuracy
Algorithms
Antigens
Biobanks
Biology and Life Sciences
Blood & organ donations
Blood banks
Blood donors
Blood groups
Blood transfusion
Classification
Clinical Medicine
Computer and Information Sciences
Datasets
Erythrocytes
Forecasts and trends
Genotype & phenotype
Genotypes
Genotyping
Hematologi
Hematology
Immunization
Klinisk medicin
Machine learning
Medical and Health Sciences
Medicin och hälsovetenskap
Medicine and Health Sciences
Methods
People and Places
Prediction models
Research and Analysis Methods
Risk reduction
Technology application
Transfusion
title A machine-learning method for biobank-scale genetic prediction of blood group antigens
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T14%3A10%3A03IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_plos_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20machine-learning%20method%20for%20biobank-scale%20genetic%20prediction%20of%20blood%20group%20antigens&rft.jtitle=PLoS%20computational%20biology&rft.au=Hyv%C3%A4rinen,%20Kati&rft.date=2024-03-01&rft.volume=20&rft.issue=3&rft.spage=e1011977&rft.epage=e1011977&rft.pages=e1011977-e1011977&rft.issn=1553-7358&rft.eissn=1553-7358&rft_id=info:doi/10.1371/journal.pcbi.1011977&rft_dat=%3Cgale_plos_%3EA788368189%3C/gale_plos_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3069178907&rft_id=info:pmid/38512997&rft_galeid=A788368189&rft_doaj_id=oai_doaj_org_article_9bf2af96ac444de7992edcf44801af34&rfr_iscdi=true