Amino Acid Classification in 2D NMR Spectra via Acoustic Signal Embeddings

Nuclear Magnetic Resonance (NMR) is used in structural biology to experimentally determine the structure of proteins, which is used in many areas of biology and is an important part of drug development. Unfortunately, NMR data can cost thousands of dollars per sample to collect and it can take a spe...
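The abstract of this record frames each amino acid as a "speaker" whose spectrum is mapped to an embedding. A minimal sketch of that idea follows. It assumes that embeddings are compared by cosine similarity against one enrolled centroid per class, which is a common convention in speaker verification and not necessarily the authors' pipeline. The three-dimensional vectors, the three class labels, and the helper names `cosine`, `centroid`, and `classify` are all hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def classify(embedding, enrolled):
    """Return the enrolled label whose centroid is most similar to the embedding."""
    return max(enrolled, key=lambda label: cosine(embedding, enrolled[label]))


# Hypothetical toy embeddings; a real model would produce these from spectra.
enrolled = {
    "ALA": centroid([[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]),
    "GLY": centroid([[0.1, 0.9, 0.1], [0.0, 0.8, 0.2]]),
    "SER": centroid([[0.1, 0.1, 0.9], [0.2, 0.0, 0.8]]),
}

prediction = classify([0.7, 0.3, 0.1], enrolled)
print(prediction)
```

In the setting described by the abstract there would be 20 classes, and the embeddings would come from a speaker verification model fed by a trainable convolutional encoder in place of filter banks. This sketch covers only the final comparison step.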

Main authors: Yip, Jia Qi; Ng, Dianwen; Ma, Bin; Pervushin, Konstantin; Chng, Eng Siong
Format: Article
Language: eng
creator Yip, Jia Qi
Ng, Dianwen
Ma, Bin
Pervushin, Konstantin
Chng, Eng Siong
description Nuclear Magnetic Resonance (NMR) is used in structural biology to experimentally determine the structure of proteins. Protein structures are used in many areas of biology and are an important part of drug development. Unfortunately, NMR data can cost thousands of dollars per sample to collect, and it can take a specialist weeks to assign the observed resonances to specific chemical groups. There has thus been growing interest in the NMR community in using deep learning to automate NMR data annotation. Due to similarities between NMR and audio data, we propose that methods used in acoustic signal processing can be applied to NMR as well. Using a simulated amino acid dataset, we show that by swapping out filter banks with a trainable convolutional encoder, acoustic signal embeddings from speaker verification models can be used for amino acid classification in 2D NMR spectra by treating each amino acid as a unique speaker. On an NMR dataset comparable in size to 46 hours of audio, we achieve a classification performance of 97.7% on a 20-class problem. We also achieve a 23% relative improvement by using an acoustic embedding model compared to an existing NMR-based model.
doi_str_mv 10.48550/arxiv.2208.00935
format Article
creationdate 2022-08-01
rights http://creativecommons.org/licenses/by-nc-sa/4.0
oa free_for_read
sourcetype Open Access Repository
linktorsrc https://arxiv.org/abs/2208.00935
backlink https://doi.org/10.48550/arXiv.2208.00935
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2208.00935
language eng
recordid cdi_arxiv_primary_2208_00935
source arXiv.org
subjects Quantitative Biology - Quantitative Methods
title Amino Acid Classification in 2D NMR Spectra via Acoustic Signal Embeddings
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T03%3A03%3A41IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Amino%20Acid%20Classification%20in%202D%20NMR%20Spectra%20via%20Acoustic%20Signal%20Embeddings&rft.au=Yip,%20Jia%20Qi&rft.date=2022-08-01&rft_id=info:doi/10.48550/arxiv.2208.00935&rft_dat=%3Carxiv_GOX%3E2208_00935%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
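
The url field above is an OpenURL (ctx_ver=Z39.88-2004) that carries the citation as percent-encoded key/value pairs in its query string. As a minimal sketch, the Python standard library can decode it. The `OPENURL` string below is an abbreviated copy of the url field that keeps only a few of its parameters, and the variable names are illustrative assumptions.

```python
from urllib.parse import parse_qs, urlsplit

# Abbreviated copy of the record's url field (only some parameters kept).
OPENURL = (
    "https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004"
    "&rft.genre=article"
    "&rft.atitle=Amino%20Acid%20Classification%20in%202D%20NMR%20Spectra"
    "%20via%20Acoustic%20Signal%20Embeddings"
    "&rft.au=Yip,%20Jia%20Qi"
    "&rft.date=2022-08-01"
    "&rft_id=info:doi/10.48550/arxiv.2208.00935"
)

# parse_qs percent-decodes values and returns a list per key,
# because a key such as rft_id may be repeated.
params = parse_qs(urlsplit(OPENURL).query)

title = params["rft.atitle"][0]
first_author = params["rft.au"][0]
date = params["rft.date"][0]

# The DOI is carried as an "info:doi/" identifier; strip that prefix.
doi = next(
    value.split("info:doi/", 1)[1]
    for value in params["rft_id"]
    if value.startswith("info:doi/")
)

print(title)
print(first_author, date, doi)
```

Because `parse_qs` returns lists, repeated keys such as `rft_id` in the full url field are all preserved, and the generator picks out only the one that holds a DOI.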