GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics

We seek to transform how new and emergent variants of pandemic-causing viruses, specifically SARS-CoV-2, are identified and classified. By adapting large language models (LLMs) for genomic data, we build genome-scale language models (GenSLMs) which can learn the evolutionary landscape of SARS-CoV-2...

Bibliographic details
Published in: The international journal of high performance computing applications 2023-11, Vol.37 (6), p.683-705
Main authors: Zvyagin, Maxim, Brace, Alexander, Hippe, Kyle, Deng, Yuntian, Zhang, Bin, Bohorquez, Cindy Orozco, Clyde, Austin, Kale, Bharat, Perez-Rivera, Danilo, Ma, Heng, Mann, Carla M., Irvin, Michael, Ozgulbas, Defne G., Vassilieva, Natalia, Pauloski, James Gregory, Ward, Logan, Hayot-Sasson, Valerie, Emani, Murali, Foreman, Sam, Xie, Zhen, Lin, Diangen, Shukla, Maulik, Nie, Weili, Romero, Josh, Dallago, Christian, Vahdat, Arash, Xiao, Chaowei, Gibbs, Thomas, Foster, Ian, Davis, James J., Papka, Michael E., Brettin, Thomas, Stevens, Rick, Anandkumar, Anima, Vishwanath, Venkatram, Ramanathan, Arvind
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 705
container_issue 6
container_start_page 683
container_title The international journal of high performance computing applications
container_volume 37
creator Zvyagin, Maxim
Brace, Alexander
Hippe, Kyle
Deng, Yuntian
Zhang, Bin
Bohorquez, Cindy Orozco
Clyde, Austin
Kale, Bharat
Perez-Rivera, Danilo
Ma, Heng
Mann, Carla M.
Irvin, Michael
Ozgulbas, Defne G.
Vassilieva, Natalia
Pauloski, James Gregory
Ward, Logan
Hayot-Sasson, Valerie
Emani, Murali
Foreman, Sam
Xie, Zhen
Lin, Diangen
Shukla, Maulik
Nie, Weili
Romero, Josh
Dallago, Christian
Vahdat, Arash
Xiao, Chaowei
Gibbs, Thomas
Foster, Ian
Davis, James J.
Papka, Michael E.
Brettin, Thomas
Stevens, Rick
Anandkumar, Anima
Vishwanath, Venkatram
Ramanathan, Arvind
description We seek to transform how new and emergent variants of pandemic-causing viruses, specifically SARS-CoV-2, are identified and classified. By adapting large language models (LLMs) for genomic data, we build genome-scale language models (GenSLMs) which can learn the evolutionary landscape of SARS-CoV-2 genomes. By pre-training on over 110 million prokaryotic gene sequences and fine-tuning a SARS-CoV-2-specific model on 1.5 million genomes, we show that GenSLMs can accurately and rapidly identify variants of concern. Thus, to our knowledge, GenSLMs represents one of the first whole-genome scale foundation models which can generalize to other prediction tasks. We demonstrate scaling of GenSLMs on GPU-based supercomputers and AI-hardware accelerators utilizing 1.63 Zettaflops in training runs with a sustained performance of 121 PFLOPS in mixed precision and peak of 850 PFLOPS. We present initial scientific insights from examining GenSLMs in tracking evolutionary dynamics of SARS-CoV-2, paving the path to realizing this on large biological data.
doi_str_mv 10.1177/10943420231201154
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2888591595</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sage_id>10.1177_10943420231201154</sage_id><sourcerecordid>2888591595</sourcerecordid><originalsourceid>FETCH-LOGICAL-c312t-5a6a2166e2085c3a8080dea1e7fd93c91c2caccf3cfc05529e21f08453f9048f3</originalsourceid><addsrcrecordid>eNp1UE1LAzEUDKJgrf4AbwueU_PysZt4K0Wr0CK46nUJ2ZfSsrupm26h_96UCh7E0xt4M_PmDSG3wCYARXEPzEghOeMCOANQ8oyMoJBAuZb5ecJpT4-ES3IV44YxlkuhRmQ5x65cLONDlkBokUZnG8wa260Gu8KsDTU2Metxj7bJyulbSWfhk_IM96EZduvQ2f6Q1YfOtmsXr8mFt03Em585Jh9Pj--zZ7p4nb_MpgvqUrwdVTa3HPIcOdPKCauZZjVawMLXRjgDjjvrnBfOO6YUN8jBMy2V8IZJ7cWY3J18t334GjDuqk0Y-i6drLjWWhlQRiUWnFiuDzH26Kttv25T3gpYdWyt-tNa0kxOmpi-_3X9X_ANvsNqqw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2888591595</pqid></control><display><type>article</type><title>GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics</title><source>SAGE Complete</source><source>Alma/SFX Local Collection</source><creator>Zvyagin, Maxim ; Brace, Alexander ; Hippe, Kyle ; Deng, Yuntian ; Zhang, Bin ; Bohorquez, Cindy Orozco ; Clyde, Austin ; Kale, Bharat ; Perez-Rivera, Danilo ; Ma, Heng ; Mann, Carla M. ; Irvin, Michael ; Ozgulbas, Defne G. ; Vassilieva, Natalia ; Pauloski, James Gregory ; Ward, Logan ; Hayot-Sasson, Valerie ; Emani, Murali ; Foreman, Sam ; Xie, Zhen ; Lin, Diangen ; Shukla, Maulik ; Nie, Weili ; Romero, Josh ; Dallago, Christian ; Vahdat, Arash ; Xiao, Chaowei ; Gibbs, Thomas ; Foster, Ian ; Davis, James J. ; Papka, Michael E. ; Brettin, Thomas ; Stevens, Rick ; Anandkumar, Anima ; Vishwanath, Venkatram ; Ramanathan, Arvind</creator><creatorcontrib>Zvyagin, Maxim ; Brace, Alexander ; Hippe, Kyle ; Deng, Yuntian ; Zhang, Bin ; Bohorquez, Cindy Orozco ; Clyde, Austin ; Kale, Bharat ; Perez-Rivera, Danilo ; Ma, Heng ; Mann, Carla M. 
; Irvin, Michael ; Ozgulbas, Defne G. ; Vassilieva, Natalia ; Pauloski, James Gregory ; Ward, Logan ; Hayot-Sasson, Valerie ; Emani, Murali ; Foreman, Sam ; Xie, Zhen ; Lin, Diangen ; Shukla, Maulik ; Nie, Weili ; Romero, Josh ; Dallago, Christian ; Vahdat, Arash ; Xiao, Chaowei ; Gibbs, Thomas ; Foster, Ian ; Davis, James J. ; Papka, Michael E. ; Brettin, Thomas ; Stevens, Rick ; Anandkumar, Anima ; Vishwanath, Venkatram ; Ramanathan, Arvind</creatorcontrib><description>We seek to transform how new and emergent variants of pandemic-causing viruses, specifically SARS-CoV-2, are identified and classified. By adapting large language models (LLMs) for genomic data, we build genome-scale language models (GenSLMs) which can learn the evolutionary landscape of SARS-CoV-2 genomes. By pre-training on over 110 million prokaryotic gene sequences and fine-tuning a SARS-CoV-2-specific model on 1.5 million genomes, we show that GenSLMs can accurately and rapidly identify variants of concern. Thus, to our knowledge, GenSLMs represents one of the first whole-genome scale foundation models which can generalize to other prediction tasks. We demonstrate scaling of GenSLMs on GPU-based supercomputers and AI-hardware accelerators utilizing 1.63 Zettaflops in training runs with a sustained performance of 121 PFLOPS in mixed precision and peak of 850 PFLOPS. 
We present initial scientific insights from examining GenSLMs in tracking evolutionary dynamics of SARS-CoV-2, paving the path to realizing this on large biological data.</description><identifier>ISSN: 1094-3420</identifier><identifier>EISSN: 1741-2846</identifier><identifier>DOI: 10.1177/10943420231201154</identifier><language>eng</language><publisher>London, England: SAGE Publications</publisher><subject>Artificial intelligence ; Evolution ; Gene sequencing ; Genomes ; Graphics processing units ; Large language models ; Severe acute respiratory syndrome coronavirus 2 ; Training ; Viral diseases</subject><ispartof>The international journal of high performance computing applications, 2023-11, Vol.37 (6), p.683-705</ispartof><rights>The Author(s) 2023</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c312t-5a6a2166e2085c3a8080dea1e7fd93c91c2caccf3cfc05529e21f08453f9048f3</citedby><cites>FETCH-LOGICAL-c312t-5a6a2166e2085c3a8080dea1e7fd93c91c2caccf3cfc05529e21f08453f9048f3</cites><orcidid>0000-0002-6547-6902 ; 0000-0002-1622-5488 ; 0000-0002-7316-3922 ; 0000-0002-6778-8563 ; 0000-0003-2129-5269</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://journals.sagepub.com/doi/pdf/10.1177/10943420231201154$$EPDF$$P50$$Gsage$$H</linktopdf><linktohtml>$$Uhttps://journals.sagepub.com/doi/10.1177/10943420231201154$$EHTML$$P50$$Gsage$$H</linktohtml><link.rule.ids>314,776,780,21798,27901,27902,43597,43598</link.rule.ids></links><search><creatorcontrib>Zvyagin, Maxim</creatorcontrib><creatorcontrib>Brace, Alexander</creatorcontrib><creatorcontrib>Hippe, Kyle</creatorcontrib><creatorcontrib>Deng, Yuntian</creatorcontrib><creatorcontrib>Zhang, Bin</creatorcontrib><creatorcontrib>Bohorquez, Cindy Orozco</creatorcontrib><creatorcontrib>Clyde, 
Austin</creatorcontrib><creatorcontrib>Kale, Bharat</creatorcontrib><creatorcontrib>Perez-Rivera, Danilo</creatorcontrib><creatorcontrib>Ma, Heng</creatorcontrib><creatorcontrib>Mann, Carla M.</creatorcontrib><creatorcontrib>Irvin, Michael</creatorcontrib><creatorcontrib>Ozgulbas, Defne G.</creatorcontrib><creatorcontrib>Vassilieva, Natalia</creatorcontrib><creatorcontrib>Pauloski, James Gregory</creatorcontrib><creatorcontrib>Ward, Logan</creatorcontrib><creatorcontrib>Hayot-Sasson, Valerie</creatorcontrib><creatorcontrib>Emani, Murali</creatorcontrib><creatorcontrib>Foreman, Sam</creatorcontrib><creatorcontrib>Xie, Zhen</creatorcontrib><creatorcontrib>Lin, Diangen</creatorcontrib><creatorcontrib>Shukla, Maulik</creatorcontrib><creatorcontrib>Nie, Weili</creatorcontrib><creatorcontrib>Romero, Josh</creatorcontrib><creatorcontrib>Dallago, Christian</creatorcontrib><creatorcontrib>Vahdat, Arash</creatorcontrib><creatorcontrib>Xiao, Chaowei</creatorcontrib><creatorcontrib>Gibbs, Thomas</creatorcontrib><creatorcontrib>Foster, Ian</creatorcontrib><creatorcontrib>Davis, James J.</creatorcontrib><creatorcontrib>Papka, Michael E.</creatorcontrib><creatorcontrib>Brettin, Thomas</creatorcontrib><creatorcontrib>Stevens, Rick</creatorcontrib><creatorcontrib>Anandkumar, Anima</creatorcontrib><creatorcontrib>Vishwanath, Venkatram</creatorcontrib><creatorcontrib>Ramanathan, Arvind</creatorcontrib><title>GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics</title><title>The international journal of high performance computing applications</title><description>We seek to transform how new and emergent variants of pandemic-causing viruses, specifically SARS-CoV-2, are identified and classified. By adapting large language models (LLMs) for genomic data, we build genome-scale language models (GenSLMs) which can learn the evolutionary landscape of SARS-CoV-2 genomes. 
By pre-training on over 110 million prokaryotic gene sequences and fine-tuning a SARS-CoV-2-specific model on 1.5 million genomes, we show that GenSLMs can accurately and rapidly identify variants of concern. Thus, to our knowledge, GenSLMs represents one of the first whole-genome scale foundation models which can generalize to other prediction tasks. We demonstrate scaling of GenSLMs on GPU-based supercomputers and AI-hardware accelerators utilizing 1.63 Zettaflops in training runs with a sustained performance of 121 PFLOPS in mixed precision and peak of 850 PFLOPS. We present initial scientific insights from examining GenSLMs in tracking evolutionary dynamics of SARS-CoV-2, paving the path to realizing this on large biological data.</description><subject>Artificial intelligence</subject><subject>Evolution</subject><subject>Gene sequencing</subject><subject>Genomes</subject><subject>Graphics processing units</subject><subject>Large language models</subject><subject>Severe acute respiratory syndrome coronavirus 2</subject><subject>Training</subject><subject>Viral diseases</subject><issn>1094-3420</issn><issn>1741-2846</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><recordid>eNp1UE1LAzEUDKJgrf4AbwueU_PysZt4K0Wr0CK46nUJ2ZfSsrupm26h_96UCh7E0xt4M_PmDSG3wCYARXEPzEghOeMCOANQ8oyMoJBAuZb5ecJpT4-ES3IV44YxlkuhRmQ5x65cLONDlkBokUZnG8wa260Gu8KsDTU2Metxj7bJyulbSWfhk_IM96EZduvQ2f6Q1YfOtmsXr8mFt03Em585Jh9Pj--zZ7p4nb_MpgvqUrwdVTa3HPIcOdPKCauZZjVawMLXRjgDjjvrnBfOO6YUN8jBMy2V8IZJ7cWY3J18t334GjDuqk0Y-i6drLjWWhlQRiUWnFiuDzH26Kttv25T3gpYdWyt-tNa0kxOmpi-_3X9X_ANvsNqqw</recordid><startdate>202311</startdate><enddate>202311</enddate><creator>Zvyagin, Maxim</creator><creator>Brace, Alexander</creator><creator>Hippe, Kyle</creator><creator>Deng, Yuntian</creator><creator>Zhang, Bin</creator><creator>Bohorquez, Cindy Orozco</creator><creator>Clyde, Austin</creator><creator>Kale, Bharat</creator><creator>Perez-Rivera, 
Danilo</creator><creator>Ma, Heng</creator><creator>Mann, Carla M.</creator><creator>Irvin, Michael</creator><creator>Ozgulbas, Defne G.</creator><creator>Vassilieva, Natalia</creator><creator>Pauloski, James Gregory</creator><creator>Ward, Logan</creator><creator>Hayot-Sasson, Valerie</creator><creator>Emani, Murali</creator><creator>Foreman, Sam</creator><creator>Xie, Zhen</creator><creator>Lin, Diangen</creator><creator>Shukla, Maulik</creator><creator>Nie, Weili</creator><creator>Romero, Josh</creator><creator>Dallago, Christian</creator><creator>Vahdat, Arash</creator><creator>Xiao, Chaowei</creator><creator>Gibbs, Thomas</creator><creator>Foster, Ian</creator><creator>Davis, James J.</creator><creator>Papka, Michael E.</creator><creator>Brettin, Thomas</creator><creator>Stevens, Rick</creator><creator>Anandkumar, Anima</creator><creator>Vishwanath, Venkatram</creator><creator>Ramanathan, Arvind</creator><general>SAGE Publications</general><general>SAGE PUBLICATIONS, INC</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-6547-6902</orcidid><orcidid>https://orcid.org/0000-0002-1622-5488</orcidid><orcidid>https://orcid.org/0000-0002-7316-3922</orcidid><orcidid>https://orcid.org/0000-0002-6778-8563</orcidid><orcidid>https://orcid.org/0000-0003-2129-5269</orcidid></search><sort><creationdate>202311</creationdate><title>GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics</title><author>Zvyagin, Maxim ; Brace, Alexander ; Hippe, Kyle ; Deng, Yuntian ; Zhang, Bin ; Bohorquez, Cindy Orozco ; Clyde, Austin ; Kale, Bharat ; Perez-Rivera, Danilo ; Ma, Heng ; Mann, Carla M. ; Irvin, Michael ; Ozgulbas, Defne G. 
; Vassilieva, Natalia ; Pauloski, James Gregory ; Ward, Logan ; Hayot-Sasson, Valerie ; Emani, Murali ; Foreman, Sam ; Xie, Zhen ; Lin, Diangen ; Shukla, Maulik ; Nie, Weili ; Romero, Josh ; Dallago, Christian ; Vahdat, Arash ; Xiao, Chaowei ; Gibbs, Thomas ; Foster, Ian ; Davis, James J. ; Papka, Michael E. ; Brettin, Thomas ; Stevens, Rick ; Anandkumar, Anima ; Vishwanath, Venkatram ; Ramanathan, Arvind</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c312t-5a6a2166e2085c3a8080dea1e7fd93c91c2caccf3cfc05529e21f08453f9048f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Artificial intelligence</topic><topic>Evolution</topic><topic>Gene sequencing</topic><topic>Genomes</topic><topic>Graphics processing units</topic><topic>Large language models</topic><topic>Severe acute respiratory syndrome coronavirus 2</topic><topic>Training</topic><topic>Viral diseases</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zvyagin, Maxim</creatorcontrib><creatorcontrib>Brace, Alexander</creatorcontrib><creatorcontrib>Hippe, Kyle</creatorcontrib><creatorcontrib>Deng, Yuntian</creatorcontrib><creatorcontrib>Zhang, Bin</creatorcontrib><creatorcontrib>Bohorquez, Cindy Orozco</creatorcontrib><creatorcontrib>Clyde, Austin</creatorcontrib><creatorcontrib>Kale, Bharat</creatorcontrib><creatorcontrib>Perez-Rivera, Danilo</creatorcontrib><creatorcontrib>Ma, Heng</creatorcontrib><creatorcontrib>Mann, Carla M.</creatorcontrib><creatorcontrib>Irvin, Michael</creatorcontrib><creatorcontrib>Ozgulbas, Defne G.</creatorcontrib><creatorcontrib>Vassilieva, Natalia</creatorcontrib><creatorcontrib>Pauloski, James Gregory</creatorcontrib><creatorcontrib>Ward, Logan</creatorcontrib><creatorcontrib>Hayot-Sasson, Valerie</creatorcontrib><creatorcontrib>Emani, Murali</creatorcontrib><creatorcontrib>Foreman, Sam</creatorcontrib><creatorcontrib>Xie, 
Zhen</creatorcontrib><creatorcontrib>Lin, Diangen</creatorcontrib><creatorcontrib>Shukla, Maulik</creatorcontrib><creatorcontrib>Nie, Weili</creatorcontrib><creatorcontrib>Romero, Josh</creatorcontrib><creatorcontrib>Dallago, Christian</creatorcontrib><creatorcontrib>Vahdat, Arash</creatorcontrib><creatorcontrib>Xiao, Chaowei</creatorcontrib><creatorcontrib>Gibbs, Thomas</creatorcontrib><creatorcontrib>Foster, Ian</creatorcontrib><creatorcontrib>Davis, James J.</creatorcontrib><creatorcontrib>Papka, Michael E.</creatorcontrib><creatorcontrib>Brettin, Thomas</creatorcontrib><creatorcontrib>Stevens, Rick</creatorcontrib><creatorcontrib>Anandkumar, Anima</creatorcontrib><creatorcontrib>Vishwanath, Venkatram</creatorcontrib><creatorcontrib>Ramanathan, Arvind</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>The international journal of high performance computing applications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zvyagin, Maxim</au><au>Brace, Alexander</au><au>Hippe, Kyle</au><au>Deng, Yuntian</au><au>Zhang, Bin</au><au>Bohorquez, Cindy Orozco</au><au>Clyde, Austin</au><au>Kale, Bharat</au><au>Perez-Rivera, Danilo</au><au>Ma, Heng</au><au>Mann, Carla M.</au><au>Irvin, Michael</au><au>Ozgulbas, Defne G.</au><au>Vassilieva, Natalia</au><au>Pauloski, James Gregory</au><au>Ward, Logan</au><au>Hayot-Sasson, Valerie</au><au>Emani, Murali</au><au>Foreman, Sam</au><au>Xie, Zhen</au><au>Lin, Diangen</au><au>Shukla, Maulik</au><au>Nie, Weili</au><au>Romero, Josh</au><au>Dallago, 
Christian</au><au>Vahdat, Arash</au><au>Xiao, Chaowei</au><au>Gibbs, Thomas</au><au>Foster, Ian</au><au>Davis, James J.</au><au>Papka, Michael E.</au><au>Brettin, Thomas</au><au>Stevens, Rick</au><au>Anandkumar, Anima</au><au>Vishwanath, Venkatram</au><au>Ramanathan, Arvind</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics</atitle><jtitle>The international journal of high performance computing applications</jtitle><date>2023-11</date><risdate>2023</risdate><volume>37</volume><issue>6</issue><spage>683</spage><epage>705</epage><pages>683-705</pages><issn>1094-3420</issn><eissn>1741-2846</eissn><abstract>We seek to transform how new and emergent variants of pandemic-causing viruses, specifically SARS-CoV-2, are identified and classified. By adapting large language models (LLMs) for genomic data, we build genome-scale language models (GenSLMs) which can learn the evolutionary landscape of SARS-CoV-2 genomes. By pre-training on over 110 million prokaryotic gene sequences and fine-tuning a SARS-CoV-2-specific model on 1.5 million genomes, we show that GenSLMs can accurately and rapidly identify variants of concern. Thus, to our knowledge, GenSLMs represents one of the first whole-genome scale foundation models which can generalize to other prediction tasks. We demonstrate scaling of GenSLMs on GPU-based supercomputers and AI-hardware accelerators utilizing 1.63 Zettaflops in training runs with a sustained performance of 121 PFLOPS in mixed precision and peak of 850 PFLOPS. 
We present initial scientific insights from examining GenSLMs in tracking evolutionary dynamics of SARS-CoV-2, paving the path to realizing this on large biological data.</abstract><cop>London, England</cop><pub>SAGE Publications</pub><doi>10.1177/10943420231201154</doi><tpages>23</tpages><orcidid>https://orcid.org/0000-0002-6547-6902</orcidid><orcidid>https://orcid.org/0000-0002-1622-5488</orcidid><orcidid>https://orcid.org/0000-0002-7316-3922</orcidid><orcidid>https://orcid.org/0000-0002-6778-8563</orcidid><orcidid>https://orcid.org/0000-0003-2129-5269</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1094-3420
ispartof The international journal of high performance computing applications, 2023-11, Vol.37 (6), p.683-705
issn 1094-3420
1741-2846
language eng
recordid cdi_proquest_journals_2888591595
source SAGE Complete; Alma/SFX Local Collection
subjects Artificial intelligence
Evolution
Gene sequencing
Genomes
Graphics processing units
Large language models
Severe acute respiratory syndrome coronavirus 2
Training
Viral diseases
title GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T11%3A34%3A31IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=GenSLMs:%20Genome-scale%20language%20models%20reveal%20SARS-CoV-2%20evolutionary%20dynamics&rft.jtitle=The%20international%20journal%20of%20high%20performance%20computing%20applications&rft.au=Zvyagin,%20Maxim&rft.date=2023-11&rft.volume=37&rft.issue=6&rft.spage=683&rft.epage=705&rft.pages=683-705&rft.issn=1094-3420&rft.eissn=1741-2846&rft_id=info:doi/10.1177/10943420231201154&rft_dat=%3Cproquest_cross%3E2888591595%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2888591595&rft_id=info:pmid/&rft_sage_id=10.1177_10943420231201154&rfr_iscdi=true