Fast GPU 3D diffeomorphic image registration

3D image registration is one of the most fundamental and computationally expensive operations in medical image analysis. Here, we present a mixed-precision, Gauss–Newton–Krylov solver for diffeomorphic registration of two images. Our work extends the publicly available CLAIRE library to GPU architec...

Bibliographic details
Published in: Journal of parallel and distributed computing, 2021-03, Vol. 149 (C), p. 149-162
Main authors: Brunn, Malte, Himthani, Naveen, Biros, George, Mehl, Miriam, Mang, Andreas
Format: Article
Language: eng
Online access: Full text
container_end_page 162
container_issue C
container_start_page 149
container_title Journal of parallel and distributed computing
container_volume 149
creator Brunn, Malte
Himthani, Naveen
Biros, George
Mehl, Miriam
Mang, Andreas
description 3D image registration is one of the most fundamental and computationally expensive operations in medical image analysis. Here, we present a mixed-precision Gauss–Newton–Krylov solver for diffeomorphic registration of two images. Our work extends the publicly available CLAIRE library to GPU architectures. Despite the importance of image registration, only a few implementations of large deformation diffeomorphic registration packages support GPUs. Our contributions are new algorithms that significantly reduce the run time of the two main computational kernels in CLAIRE: calculation of derivatives and scattered-data interpolation. We (i) deploy highly optimized, mixed-precision GPU kernels for the evaluation of scattered-data interpolation, (ii) replace Fast Fourier Transform (FFT)-based first-order derivatives with optimized 8th-order finite differences, and (iii) compare with state-of-the-art CPU and GPU implementations. As a highlight, we demonstrate that we can register 256³ clinical images in less than 6 s on a single NVIDIA Tesla V100. This amounts to over 20× speed-up over the current version of CLAIRE and over 30× speed-up over existing GPU implementations.
• The LDDMM software CLAIRE is ported to GPU.
• Compute-intensive kernels are optimized.
• A mixed-precision approach with Fast Fourier Transforms and finite differences is used.
• Hardware acceleration is used for linear and cubic interpolations.
• Clinical images can be registered in less than 6 seconds.
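Step (ii) of the abstract, replacing FFT-based first derivatives with 8th-order finite differences, can be illustrated with a small sketch. The code below is not taken from CLAIRE: it is a plain-Python, one-dimensional illustration that uses the standard textbook 8th-order central-difference weights on a periodic grid, and the names `FD8_WEIGHTS`, `fd8_periodic`, and `max_error` are hypothetical.

```python
import math

# Standard 8th-order central-difference weights for the first derivative
# (offsets 1..4). Textbook values; an assumption, not read from CLAIRE.
FD8_WEIGHTS = (4.0 / 5.0, -1.0 / 5.0, 4.0 / 105.0, -1.0 / 280.0)


def fd8_periodic(values, spacing):
    """Approximate df/dx on a uniform periodic 1D grid with an 8th-order stencil."""
    n = len(values)
    derivative = []
    for i in range(n):
        acc = 0.0
        for k, weight in enumerate(FD8_WEIGHTS, start=1):
            # Wrap indices around the ends: the grid is treated as periodic,
            # the same setting in which FFT-based derivatives apply.
            acc += weight * (values[(i + k) % n] - values[(i - k) % n])
        derivative.append(acc / spacing)
    return derivative


def max_error(n):
    """Largest pointwise error for f(x) = sin(x) sampled at n points on [0, 2*pi)."""
    h = 2.0 * math.pi / n
    samples = [math.sin(i * h) for i in range(n)]
    approx = fd8_periodic(samples, h)
    return max(abs(a - math.cos(i * h)) for i, a in enumerate(approx))


if __name__ == "__main__":
    for n in (16, 32, 64):
        print(n, max_error(n))
```

Doubling the grid resolution should shrink the error by roughly a factor of 2^8 until rounding error dominates, which is one way to check the order of the stencil. The GPU kernels described in the record are three-dimensional and mixed precision; this sketch does not attempt to reproduce them.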
doi_str_mv 10.1016/j.jpdc.2020.11.006
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_7769216</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S074373152030407X</els_id><sourcerecordid>2474467774</sourcerecordid><originalsourceid>FETCH-LOGICAL-c482t-d92b516cf1fb79bd87ce737048d8e7528e5730c40e19ed7bf21cb6036dad549f3</originalsourceid><addsrcrecordid>eNp9kUFv1DAQhS0EotvCH-CAIk49kO2M7diOhJBQaUulSnCgZyuxJ7te7caLna3Ev6-jLRVcOFkaf_PmzTzG3iEsEVBdbJabvXdLDrwUcAmgXrAFQqtqMNK8ZAvQUtRaYHPCTnPeACA22rxmJ0IIA1q1C_bxustTdfPjvhJfKx-GgeIupv06uCrsuhVViVYhT6mbQhzfsFdDt8309uk9Y_fXVz8vv9V3329uL7_c1U4aPtW-5X2Dyg049LrtvdGOtNAgjTekG26o0QKcBMKWvO4Hjq5XIJTvfCPbQZyxz0fd_aHfkXc0FgNbu0_FUvptYxfsvz9jWNtVfLC67MRRFYEPR4GYp2CzCxO5tYvjSG6yaBqQQhbo_GlKir8OlCe7C9nRdtuNFA_ZcqmlVFrrGeVH1KWYc6Lh2QuCnbOwGztnYecsLKItWZSm939v8dzy5_gF-HQEqNzyIVCandLoyIc0G_Ux_E__EcVrmc4</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2474467774</pqid></control><display><type>article</type><title>Fast GPU 3D diffeomorphic image registration</title><source>Elsevier ScienceDirect Journals Complete</source><creator>Brunn, Malte ; Himthani, Naveen ; Biros, George ; Mehl, Miriam ; Mang, Andreas</creator><creatorcontrib>Brunn, Malte ; Himthani, Naveen ; Biros, George ; Mehl, Miriam ; Mang, Andreas ; Univ. of Texas, Austin, TX (United States) ; Duke Univ., Durham, NC (United States)</creatorcontrib><description>3D image registration is one of the most fundamental and computationally expensive operations in medical image analysis. Here, we present a mixed-precision, Gauss–Newton–Krylov solver for diffeomorphic registration of two images. Our work extends the publicly available CLAIRE library to GPU architectures. Despite the importance of image registration, only a few implementations of large deformation diffeomorphic registration packages support GPUs. 
Our contributions are new algorithms to significantly reduce the run time of the two main computational kernels in CLAIRE: calculation of derivatives and scattered-data interpolation. We deploy (i) highly-optimized, mixed-precision GPU-kernels for the evaluation of scattered-data interpolation, (ii) replace Fast-Fourier-Transform (FFT)-based first-order derivatives with optimized 8th-order finite differences, and (iii) compare with state-of-the-art CPU and GPU implementations. As a highlight, we demonstrate that we can register 256³ clinical images in less than 6 s on a single NVIDIA Tesla V100. This amounts to over 20× speed-up over the current version of CLAIRE and over 30× speed-up over existing GPU implementations. •The LDDMM software CLAIRE is ported to GPU.•Compute intensive kernels are optimized.•A mixed-precision approach with Fast-Fourier-Transforms and finite differences is used.•Hardware acceleration is used for linear and cubic interpolations.•Clinical images can be registered in less than 6 seconds.</description><identifier>ISSN: 0743-7315</identifier><identifier>EISSN: 1096-0848</identifier><identifier>DOI: 10.1016/j.jpdc.2020.11.006</identifier><identifier>PMID: 33380769</identifier><language>eng</language><publisher>United States: Elsevier Inc</publisher><subject>Computer Science ; Diffeomorphic image registration ; Gauss–Newton–Krylov method ; GPU computing ; MATHEMATICS AND COMPUTING ; Mixed-precision solver ; Parallel optimization</subject><ispartof>Journal of parallel and distributed computing, 2021-03, Vol.149 (C), p.149-162</ispartof><rights>2020 Elsevier 
Inc.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c482t-d92b516cf1fb79bd87ce737048d8e7528e5730c40e19ed7bf21cb6036dad549f3</citedby><cites>FETCH-LOGICAL-c482t-d92b516cf1fb79bd87ce737048d8e7528e5730c40e19ed7bf21cb6036dad549f3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.jpdc.2020.11.006$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>230,314,780,784,885,3550,27924,27925,45995</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/33380769$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink><backlink>$$Uhttps://www.osti.gov/servlets/purl/1850434$$D View this record in Osti.gov$$Hfree_for_read</backlink></links><search><creatorcontrib>Brunn, Malte</creatorcontrib><creatorcontrib>Himthani, Naveen</creatorcontrib><creatorcontrib>Biros, George</creatorcontrib><creatorcontrib>Mehl, Miriam</creatorcontrib><creatorcontrib>Mang, Andreas</creatorcontrib><creatorcontrib>Univ. of Texas, Austin, TX (United States)</creatorcontrib><creatorcontrib>Duke Univ., Durham, NC (United States)</creatorcontrib><title>Fast GPU 3D diffeomorphic image registration</title><title>Journal of parallel and distributed computing</title><addtitle>J Parallel Distrib Comput</addtitle><description>3D image registration is one of the most fundamental and computationally expensive operations in medical image analysis. Here, we present a mixed-precision, Gauss–Newton–Krylov solver for diffeomorphic registration of two images. Our work extends the publicly available CLAIRE library to GPU architectures. Despite the importance of image registration, only a few implementations of large deformation diffeomorphic registration packages support GPUs. 
Our contributions are new algorithms to significantly reduce the run time of the two main computational kernels in CLAIRE: calculation of derivatives and scattered-data interpolation. We deploy (i) highly-optimized, mixed-precision GPU-kernels for the evaluation of scattered-data interpolation, (ii) replace Fast-Fourier-Transform (FFT)-based first-order derivatives with optimized 8th-order finite differences, and (iii) compare with state-of-the-art CPU and GPU implementations. As a highlight, we demonstrate that we can register 256³ clinical images in less than 6 s on a single NVIDIA Tesla V100. This amounts to over 20× speed-up over the current version of CLAIRE and over 30× speed-up over existing GPU implementations. •The LDDMM software CLAIRE is ported to GPU.•Compute intensive kernels are optimized.•A mixed-precision approach with Fast-Fourier-Transforms and finite differences is used.•Hardware acceleration is used for linear and cubic interpolations.•Clinical images can be registered in less than 6 seconds.</description><subject>Computer Science</subject><subject>Diffeomorphic image registration</subject><subject>Gauss–Newton–Krylov method</subject><subject>GPU computing</subject><subject>MATHEMATICS AND COMPUTING</subject><subject>Mixed-precision solver</subject><subject>Parallel 
optimization</subject><issn>0743-7315</issn><issn>1096-0848</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNp9kUFv1DAQhS0EotvCH-CAIk49kO2M7diOhJBQaUulSnCgZyuxJ7te7caLna3Ev6-jLRVcOFkaf_PmzTzG3iEsEVBdbJabvXdLDrwUcAmgXrAFQqtqMNK8ZAvQUtRaYHPCTnPeACA22rxmJ0IIA1q1C_bxustTdfPjvhJfKx-GgeIupv06uCrsuhVViVYhT6mbQhzfsFdDt8309uk9Y_fXVz8vv9V3329uL7_c1U4aPtW-5X2Dyg049LrtvdGOtNAgjTekG26o0QKcBMKWvO4Hjq5XIJTvfCPbQZyxz0fd_aHfkXc0FgNbu0_FUvptYxfsvz9jWNtVfLC67MRRFYEPR4GYp2CzCxO5tYvjSG6yaBqQQhbo_GlKir8OlCe7C9nRdtuNFA_ZcqmlVFrrGeVH1KWYc6Lh2QuCnbOwGztnYecsLKItWZSm939v8dzy5_gF-HQEqNzyIVCandLoyIc0G_Ux_E__EcVrmc4</recordid><startdate>20210301</startdate><enddate>20210301</enddate><creator>Brunn, Malte</creator><creator>Himthani, Naveen</creator><creator>Biros, George</creator><creator>Mehl, Miriam</creator><creator>Mang, Andreas</creator><general>Elsevier Inc</general><general>Elsevier</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>OIOZB</scope><scope>OTOTI</scope><scope>5PM</scope></search><sort><creationdate>20210301</creationdate><title>Fast GPU 3D diffeomorphic image registration</title><author>Brunn, Malte ; Himthani, Naveen ; Biros, George ; Mehl, Miriam ; Mang, Andreas</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c482t-d92b516cf1fb79bd87ce737048d8e7528e5730c40e19ed7bf21cb6036dad549f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Computer Science</topic><topic>Diffeomorphic image registration</topic><topic>Gauss–Newton–Krylov method</topic><topic>GPU computing</topic><topic>MATHEMATICS AND COMPUTING</topic><topic>Mixed-precision solver</topic><topic>Parallel optimization</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Brunn, Malte</creatorcontrib><creatorcontrib>Himthani, 
Naveen</creatorcontrib><creatorcontrib>Biros, George</creatorcontrib><creatorcontrib>Mehl, Miriam</creatorcontrib><creatorcontrib>Mang, Andreas</creatorcontrib><creatorcontrib>Univ. of Texas, Austin, TX (United States)</creatorcontrib><creatorcontrib>Duke Univ., Durham, NC (United States)</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>OSTI.GOV - Hybrid</collection><collection>OSTI.GOV</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Journal of parallel and distributed computing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Brunn, Malte</au><au>Himthani, Naveen</au><au>Biros, George</au><au>Mehl, Miriam</au><au>Mang, Andreas</au><aucorp>Univ. of Texas, Austin, TX (United States)</aucorp><aucorp>Duke Univ., Durham, NC (United States)</aucorp><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Fast GPU 3D diffeomorphic image registration</atitle><jtitle>Journal of parallel and distributed computing</jtitle><addtitle>J Parallel Distrib Comput</addtitle><date>2021-03-01</date><risdate>2021</risdate><volume>149</volume><issue>C</issue><spage>149</spage><epage>162</epage><pages>149-162</pages><issn>0743-7315</issn><eissn>1096-0848</eissn><abstract>3D image registration is one of the most fundamental and computationally expensive operations in medical image analysis. Here, we present a mixed-precision, Gauss–Newton–Krylov solver for diffeomorphic registration of two images. Our work extends the publicly available CLAIRE library to GPU architectures. Despite the importance of image registration, only a few implementations of large deformation diffeomorphic registration packages support GPUs. 
Our contributions are new algorithms to significantly reduce the run time of the two main computational kernels in CLAIRE: calculation of derivatives and scattered-data interpolation. We deploy (i) highly-optimized, mixed-precision GPU-kernels for the evaluation of scattered-data interpolation, (ii) replace Fast-Fourier-Transform (FFT)-based first-order derivatives with optimized 8th-order finite differences, and (iii) compare with state-of-the-art CPU and GPU implementations. As a highlight, we demonstrate that we can register 256³ clinical images in less than 6 s on a single NVIDIA Tesla V100. This amounts to over 20× speed-up over the current version of CLAIRE and over 30× speed-up over existing GPU implementations. •The LDDMM software CLAIRE is ported to GPU.•Compute intensive kernels are optimized.•A mixed-precision approach with Fast-Fourier-Transforms and finite differences is used.•Hardware acceleration is used for linear and cubic interpolations.•Clinical images can be registered in less than 6 seconds.</abstract><cop>United States</cop><pub>Elsevier Inc</pub><pmid>33380769</pmid><doi>10.1016/j.jpdc.2020.11.006</doi><tpages>14</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0743-7315
ispartof Journal of parallel and distributed computing, 2021-03, Vol.149 (C), p.149-162
issn 0743-7315
1096-0848
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_7769216
source Elsevier ScienceDirect Journals Complete
subjects Computer Science
Diffeomorphic image registration
Gauss–Newton–Krylov method
GPU computing
MATHEMATICS AND COMPUTING
Mixed-precision solver
Parallel optimization
title Fast GPU 3D diffeomorphic image registration
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T06%3A59%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Fast%20GPU%203D%20diffeomorphic%20image%20registration&rft.jtitle=Journal%20of%20parallel%20and%20distributed%20computing&rft.au=Brunn,%20Malte&rft.aucorp=Univ.%20of%20Texas,%20Austin,%20TX%20(United%20States)&rft.date=2021-03-01&rft.volume=149&rft.issue=C&rft.spage=149&rft.epage=162&rft.pages=149-162&rft.issn=0743-7315&rft.eissn=1096-0848&rft_id=info:doi/10.1016/j.jpdc.2020.11.006&rft_dat=%3Cproquest_pubme%3E2474467774%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2474467774&rft_id=info:pmid/33380769&rft_els_id=S074373152030407X&rfr_iscdi=true