Amino Acid Classification in 2D NMR Spectra via Acoustic Signal Embeddings

Nuclear Magnetic Resonance (NMR) is used in structural biology to experimentally determine the structure of proteins, which is used in many areas of biology and is an important part of drug development. Unfortunately, NMR data can cost thousands of dollars per sample to collect and it can take a spe...

Bibliographic details
Main authors:	Yip, Jia Qi; Ng, Dianwen; Ma, Bin; Pervushin, Konstantin; Chng, Eng Siong
Format:	Article
Language:	English
Subjects:	Quantitative Biology - Quantitative Methods

creator	Yip, Jia Qi; Ng, Dianwen; Ma, Bin; Pervushin, Konstantin; Chng, Eng Siong
description	Nuclear Magnetic Resonance (NMR) is used in structural biology to experimentally determine the structure of proteins, which is used in many areas of biology and is an important part of drug development. Unfortunately, NMR data can cost thousands of dollars per sample to collect, and it can take a specialist weeks to assign the observed resonances to specific chemical groups. There has thus been growing interest in the NMR community in using deep learning to automate NMR data annotation. Due to similarities between NMR and audio data, we propose that methods used in acoustic signal processing can be applied to NMR as well. Using a simulated amino acid dataset, we show that by swapping out filter banks with a trainable convolutional encoder, acoustic signal embeddings from speaker verification models can be used for amino acid classification in 2D NMR spectra by treating each amino acid as a unique speaker. On an NMR dataset comparable in size to 46 hours of audio, we achieve a classification performance of 97.7% on a 20-class problem. We also achieve a 23% relative improvement by using an acoustic embedding model compared to an existing NMR-based model.
doi_str_mv	10.48550/arxiv.2208.00935
format	Article
creationdate	2022-08-01
genre	article
linktorsrc	https://arxiv.org/abs/2208.00935
oa	free_for_read
rights	http://creativecommons.org/licenses/by-nc-sa/4.0
sourcetype	Open Access Repository
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2208.00935
language	eng
recordid	cdi_arxiv_primary_2208_00935
source	arXiv.org
subjects	Quantitative Biology - Quantitative Methods
title	Amino Acid Classification in 2D NMR Spectra via Acoustic Signal Embeddings
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T03%3A03%3A41IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Amino%20Acid%20Classification%20in%202D%20NMR%20Spectra%20via%20Acoustic%20Signal%20Embeddings&rft.au=Yip,%20Jia%20Qi&rft.date=2022-08-01&rft_id=info:doi/10.48550/arxiv.2208.00935&rft_dat=%3Carxiv_GOX%3E2208_00935%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true