MaReIA: a cloud MapReduce based high performance whole slide image analysis framework
Recent advancements in the systematic analysis of high-resolution whole slide images have increased the efficiency of diagnosis, prognosis and prediction of cancer and other important diseases. Due to the enormous sizes and dimensions of whole slide images, the analysis requires extensive computing resources which...
Published in: | Distributed and parallel databases : an international journal 2019-06, Vol.37 (2), p.251-272 |
---|---|
Main authors: | Vo, Hoang ; Kong, Jun ; Teng, Dejun ; Liang, Yanhui ; Aji, Ablimit ; Teodoro, George ; Wang, Fusheng |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 272 |
---|---|
container_issue | 2 |
container_start_page | 251 |
container_title | Distributed and parallel databases : an international journal |
container_volume | 37 |
creator | Vo, Hoang ; Kong, Jun ; Teng, Dejun ; Liang, Yanhui ; Aji, Ablimit ; Teodoro, George ; Wang, Fusheng |
description | Recent advancements in the systematic analysis of high-resolution whole slide images have increased the efficiency of diagnosis, prognosis and prediction of cancer and other important diseases. Due to the enormous sizes and dimensions of whole slide images, the analysis requires extensive computing resources that are not commonly available. Images have to be tiled for processing because of computer memory limitations, which leads to inaccurate results because objects crossing tile boundaries are ignored. Thus, we propose a generic and highly scalable cloud-based image analysis framework for whole slide images. The framework enables parallelized integration of image analysis steps, such as segmentation and aggregation of micro-structures, in a single pipeline, and generation of final objects manageable by databases. The core concept relies on the abstraction of objects in whole slide images as different classes of spatial geometries, which in turn can be handled as text-based records in MapReduce. The framework applies an overlapping partitioning scheme to images and parallelizes tiling and image segmentation on the MapReduce architecture. It further provides robust object normalization and graceful handling of boundary objects, using an efficient matching method based on spatial indexing to generate accurate results. Our benchmark experiments on Amazon EMR show that MaReIA is highly scalable, generic and extremely cost-effective. |
doi_str_mv | 10.1007/s10619-018-7237-1 |
format | Article |
eissn | 1573-7578 |
pmid | 31217669 |
publisher | New York: Springer US |
rights | Springer Science+Business Media, LLC, part of Springer Nature 2018 ; Copyright Springer Nature B.V. 2019 |
oa | free_for_read |
tpages | 22 |
fulltext | fulltext |
identifier | ISSN: 0926-8782 |
ispartof | Distributed and parallel databases : an international journal, 2019-06, Vol.37 (2), p.251-272 |
issn | 0926-8782 1573-7578 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6583906 |
source | Springer Nature - Complete Springer Journals |
subjects | Computer memory ; Computer Science ; Data Structures ; Database Management ; Image analysis ; Image resolution ; Image segmentation ; Information Systems Applications (incl.Internet) ; Medical imaging ; Memory Structures ; Operating Systems ; Parallel processing ; Special Issue on Data Management and Analytics for Healthcare ; Tiling |
title | MaReIA: a cloud MapReduce based high performance whole slide image analysis framework |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-15T19%3A57%3A11IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=MaReIA:%20a%20cloud%20MapReduce%20based%20high%20performance%20whole%20slide%20image%20analysis%20framework&rft.jtitle=Distributed%20and%20parallel%20databases%20:%20an%20international%20journal&rft.au=Vo,%20Hoang&rft.date=2019-06-01&rft.volume=37&rft.issue=2&rft.spage=251&rft.epage=272&rft.pages=251-272&rft.issn=0926-8782&rft.eissn=1573-7578&rft_id=info:doi/10.1007/s10619-018-7237-1&rft_dat=%3Cproquest_pubme%3E2244132000%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2227108119&rft_id=info:pmid/31217669&rfr_iscdi=true |