Applying a random projection algorithm to optimize machine learning model for breast lesion classification

Machine learning is widely used in developing computer-aided diagnosis (CAD) schemes of medical images. However, CAD usually computes a large number of image features from the targeted regions, which creates a challenge of how to identify a small and optimal feature vector to build robust machine lear...

Detailed description

Bibliographic details
Published in:	arXiv.org 2020-09
Main authors:	Heidari, Morteza; Sivaramakrishnan Lakshmivarahan; Mirniaharikandehei, Seyedehnafiseh; Danala, Gopichandh; Sai Kiran R Maryada; Liu, Hong; Zheng, Bin
Format:	Article
Language:	eng
Subjects:	Algorithms; Classification; Computer Science - Computer Vision and Pattern Recognition; Computer Science - Learning; Feasibility studies; Forecasting; Lesions; Machine learning; Medical imaging; Optimization; Performance enhancement; Support vector machines
Online access:	Full text

container_end_page
container_issue
container_start_page
container_title	arXiv.org
container_volume
creator	Heidari, Morteza; Sivaramakrishnan Lakshmivarahan; Mirniaharikandehei, Seyedehnafiseh; Danala, Gopichandh; Sai Kiran R Maryada; Liu, Hong; Zheng, Bin
description	Machine learning is widely used in developing computer-aided diagnosis (CAD) schemes of medical images. However, CAD usually computes a large number of image features from the targeted regions, which creates a challenge of how to identify a small and optimal feature vector to build robust machine learning models. In this study, we investigate the feasibility of applying a random projection algorithm to build an optimal feature vector from the initially CAD-generated large feature pool and improve the performance of the machine learning model. We assemble a retrospective dataset involving 1,487 cases of mammograms in which 644 cases have confirmed malignant mass lesions and 843 have benign lesions. A CAD scheme is first applied to segment mass regions and initially compute 181 features. Then, support vector machine (SVM) models embedded with several feature dimensionality reduction methods are built to predict the likelihood of lesions being malignant. All SVM models are trained and tested using a leave-one-case-out cross-validation method. The SVM generates a likelihood score for each segmented mass region depicted on a one-view mammogram. By fusing the two scores of the same mass depicted on two-view mammograms, a case-based likelihood score is also evaluated. Compared with principal component analysis, nonnegative matrix factorization, and Chi-squared methods, the SVM embedded with the random projection algorithm yielded a significantly higher case-based lesion classification performance, with an area under the ROC curve of 0.84±0.01 (p
doi_str_mv	10.48550/arxiv.2009.09937
format	Article
fullrecord	<record><control><sourceid>proquest_arxiv</sourceid><recordid>TN_cdi_arxiv_primary_2009_09937</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2444749306</sourcerecordid><originalsourceid>FETCH-LOGICAL-a526-af81a9fa7cb09f01a29d12852b129e2de9901c3d0330e7b491fc58d0b31f7fb63</originalsourceid><addsrcrecordid>eNotkEtrwzAQhEWh0JDmB_RUQc9OVy_bOobQFwR6yd2sbSmRsS1XckrTX1876WkOOzPsfIQ8MFjLXCl4xvDjvtccQK9Ba5HdkAUXgiW55PyOrGJsAICnGVdKLEizGYb27PoDRRqwr31Hh-AbU43O9xTbgw9uPHZ09NQPo-vcr6EdVkfXG9oaDP0c7XxtWmp9oGUwGMfpEud41WKMzroK57Z7cmuxjWb1r0uyf33Zb9-T3efbx3azS1DxNEGbM9QWs6oEbYEh1zXjueIl49rw2mgNrBI1CAEmK6VmtlJ5DaVgNrNlKpbk8Vp7AVEMwXUYzsUMpLgAmRxPV8e09Otk4lg0_hT66aeCSykzqQWk4g-74WV4</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2444749306</pqid></control><display><type>article</type><title>Applying a random projection algorithm to optimize machine learning model for breast lesion classification</title><source>arXiv.org</source><source>Free E- Journals</source><creator>Heidari, Morteza ; Sivaramakrishnan Lakshmivarahan ; Mirniaharikandehei, Seyedehnafiseh ; Danala, Gopichandh ; Sai Kiran R Maryada ; Liu, Hong ; Zheng, Bin</creator><creatorcontrib>Heidari, Morteza ; Sivaramakrishnan Lakshmivarahan ; Mirniaharikandehei, Seyedehnafiseh ; Danala, Gopichandh ; Sai Kiran R Maryada ; Liu, Hong ; Zheng, Bin</creatorcontrib><description>Machine learning is widely used in developing computer-aided diagnosis (CAD) schemes of medical images. However, CAD usually computes large number of image features from the targeted regions, which creates a challenge of how to identify a small and optimal feature vector to build robust machine learning models. In this study, we investigate feasibility of applying a random projection algorithm to build an optimal feature vector from the initially CAD-generated large feature pool and improve performance of machine learning model. 
We assemble a retrospective dataset involving 1,487 cases of mammograms in which 644 cases have confirmed malignant mass lesions and 843 have benign lesions. A CAD scheme is first applied to segment mass regions and initially compute 181 features. Then, support vector machine (SVM) models embedded with several feature dimensionality reduction methods are built to predict likelihood of lesions being malignant. All SVM models are trained and tested using a leave-one-case-out cross-validation method. SVM generates a likelihood score of each segmented mass region depicting on one-view mammogram. By fusion of two scores of the same mass depicting on two-view mammograms, a case-based likelihood score is also evaluated. Comparing with the principle component analyses, nonnegative matrix factorization, and Chi-squared methods, SVM embedded with the random projection algorithm yielded a significantly higher case-based lesion classification performance with the area under ROC curve of 0.84+0.01 (p<0.02). The study demonstrates that the random project algorithm is a promising method to generate optimal feature vectors to help improve performance of machine learning models of medical images.</description><identifier>EISSN: 2331-8422</identifier><identifier>DOI: 10.48550/arxiv.2009.09937</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Algorithms ; Classification ; Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Learning ; Feasibility studies ; Forecasting ; Lesions ; Machine learning ; Medical imaging ; Optimization ; Performance enhancement ; Support vector machines</subject><ispartof>arXiv.org, 2020-09</ispartof><rights>2020. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,784,885,27923</link.rule.ids><backlink>$$Uhttps://doi.org/10.48550/arXiv.2009.09937$$DView paper in arXiv$$Hfree_for_read</backlink><backlink>$$Uhttps://doi.org/10.1109/TBME.2021.3054248$$DView published paper (Access to full text may be restricted)$$Hfree_for_read</backlink></links><search><creatorcontrib>Heidari, Morteza</creatorcontrib><creatorcontrib>Sivaramakrishnan Lakshmivarahan</creatorcontrib><creatorcontrib>Mirniaharikandehei, Seyedehnafiseh</creatorcontrib><creatorcontrib>Danala, Gopichandh</creatorcontrib><creatorcontrib>Sai Kiran R Maryada</creatorcontrib><creatorcontrib>Liu, Hong</creatorcontrib><creatorcontrib>Zheng, Bin</creatorcontrib><title>Applying a random projection algorithm to optimize machine learning model for breast lesion classification</title><title>arXiv.org</title><description>Machine learning is widely used in developing computer-aided diagnosis (CAD) schemes of medical images. However, CAD usually computes large number of image features from the targeted regions, which creates a challenge of how to identify a small and optimal feature vector to build robust machine learning models. In this study, we investigate feasibility of applying a random projection algorithm to build an optimal feature vector from the initially CAD-generated large feature pool and improve performance of machine learning model. We assemble a retrospective dataset involving 1,487 cases of mammograms in which 644 cases have confirmed malignant mass lesions and 843 have benign lesions. 
A CAD scheme is first applied to segment mass regions and initially compute 181 features. Then, support vector machine (SVM) models embedded with several feature dimensionality reduction methods are built to predict likelihood of lesions being malignant. All SVM models are trained and tested using a leave-one-case-out cross-validation method. SVM generates a likelihood score of each segmented mass region depicting on one-view mammogram. By fusion of two scores of the same mass depicting on two-view mammograms, a case-based likelihood score is also evaluated. Comparing with the principle component analyses, nonnegative matrix factorization, and Chi-squared methods, SVM embedded with the random projection algorithm yielded a significantly higher case-based lesion classification performance with the area under ROC curve of 0.84+0.01 (p<0.02). The study demonstrates that the random project algorithm is a promising method to generate optimal feature vectors to help improve performance of machine learning models of medical images.</description><subject>Algorithms</subject><subject>Classification</subject><subject>Computer Science - Computer Vision and Pattern Recognition</subject><subject>Computer Science - Learning</subject><subject>Feasibility studies</subject><subject>Forecasting</subject><subject>Lesions</subject><subject>Machine learning</subject><subject>Medical imaging</subject><subject>Optimization</subject><subject>Performance enhancement</subject><subject>Support vector 
machines</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GOX</sourceid><recordid>eNotkEtrwzAQhEWh0JDmB_RUQc9OVy_bOobQFwR6yd2sbSmRsS1XckrTX1876WkOOzPsfIQ8MFjLXCl4xvDjvtccQK9Ba5HdkAUXgiW55PyOrGJsAICnGVdKLEizGYb27PoDRRqwr31Hh-AbU43O9xTbgw9uPHZ09NQPo-vcr6EdVkfXG9oaDP0c7XxtWmp9oGUwGMfpEud41WKMzroK57Z7cmuxjWb1r0uyf33Zb9-T3efbx3azS1DxNEGbM9QWs6oEbYEh1zXjueIl49rw2mgNrBI1CAEmK6VmtlJ5DaVgNrNlKpbk8Vp7AVEMwXUYzsUMpLgAmRxPV8e09Otk4lg0_hT66aeCSykzqQWk4g-74WV4</recordid><startdate>20200909</startdate><enddate>20200909</enddate><creator>Heidari, Morteza</creator><creator>Sivaramakrishnan Lakshmivarahan</creator><creator>Mirniaharikandehei, Seyedehnafiseh</creator><creator>Danala, Gopichandh</creator><creator>Sai Kiran R Maryada</creator><creator>Liu, Hong</creator><creator>Zheng, Bin</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20200909</creationdate><title>Applying a random projection algorithm to optimize machine learning model for breast lesion classification</title><author>Heidari, Morteza ; Sivaramakrishnan Lakshmivarahan ; Mirniaharikandehei, Seyedehnafiseh ; Danala, Gopichandh ; Sai Kiran R Maryada ; Liu, Hong ; Zheng, 
Bin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a526-af81a9fa7cb09f01a29d12852b129e2de9901c3d0330e7b491fc58d0b31f7fb63</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Algorithms</topic><topic>Classification</topic><topic>Computer Science - Computer Vision and Pattern Recognition</topic><topic>Computer Science - Learning</topic><topic>Feasibility studies</topic><topic>Forecasting</topic><topic>Lesions</topic><topic>Machine learning</topic><topic>Medical imaging</topic><topic>Optimization</topic><topic>Performance enhancement</topic><topic>Support vector machines</topic><toplevel>online_resources</toplevel><creatorcontrib>Heidari, Morteza</creatorcontrib><creatorcontrib>Sivaramakrishnan Lakshmivarahan</creatorcontrib><creatorcontrib>Mirniaharikandehei, Seyedehnafiseh</creatorcontrib><creatorcontrib>Danala, Gopichandh</creatorcontrib><creatorcontrib>Sai Kiran R Maryada</creatorcontrib><creatorcontrib>Liu, Hong</creatorcontrib><creatorcontrib>Zheng, Bin</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI 
Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection><collection>arXiv Computer Science</collection><collection>arXiv.org</collection><jtitle>arXiv.org</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Heidari, Morteza</au><au>Sivaramakrishnan Lakshmivarahan</au><au>Mirniaharikandehei, Seyedehnafiseh</au><au>Danala, Gopichandh</au><au>Sai Kiran R Maryada</au><au>Liu, Hong</au><au>Zheng, Bin</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Applying a random projection algorithm to optimize machine learning model for breast lesion classification</atitle><jtitle>arXiv.org</jtitle><date>2020-09-09</date><risdate>2020</risdate><eissn>2331-8422</eissn><abstract>Machine learning is widely used in developing computer-aided diagnosis (CAD) schemes of medical images. However, CAD usually computes large number of image features from the targeted regions, which creates a challenge of how to identify a small and optimal feature vector to build robust machine learning models. In this study, we investigate feasibility of applying a random projection algorithm to build an optimal feature vector from the initially CAD-generated large feature pool and improve performance of machine learning model. We assemble a retrospective dataset involving 1,487 cases of mammograms in which 644 cases have confirmed malignant mass lesions and 843 have benign lesions. A CAD scheme is first applied to segment mass regions and initially compute 181 features. Then, support vector machine (SVM) models embedded with several feature dimensionality reduction methods are built to predict likelihood of lesions being malignant. All SVM models are trained and tested using a leave-one-case-out cross-validation method. SVM generates a likelihood score of each segmented mass region depicting on one-view mammogram. 
By fusion of two scores of the same mass depicting on two-view mammograms, a case-based likelihood score is also evaluated. Comparing with the principle component analyses, nonnegative matrix factorization, and Chi-squared methods, SVM embedded with the random projection algorithm yielded a significantly higher case-based lesion classification performance with the area under ROC curve of 0.84+0.01 (p<0.02). The study demonstrates that the random project algorithm is a promising method to generate optimal feature vectors to help improve performance of machine learning models of medical images.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><doi>10.48550/arxiv.2009.09937</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2020-09
issn	2331-8422
language	eng
recordid	cdi_arxiv_primary_2009_09937
source	arXiv.org; Free E- Journals
subjects	Algorithms; Classification; Computer Science - Computer Vision and Pattern Recognition; Computer Science - Learning; Feasibility studies; Forecasting; Lesions; Machine learning; Medical imaging; Optimization; Performance enhancement; Support vector machines
title	Applying a random projection algorithm to optimize machine learning model for breast lesion classification
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-13T21%3A40%3A31IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_arxiv&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Applying%20a%20random%20projection%20algorithm%20to%20optimize%20machine%20learning%20model%20for%20breast%20lesion%20classification&rft.jtitle=arXiv.org&rft.au=Heidari,%20Morteza&rft.date=2020-09-09&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.2009.09937&rft_dat=%3Cproquest_arxiv%3E2444749306%3C/proquest_arxiv%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2444749306&rft_id=info:pmid/&rfr_iscdi=true
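
Illustrative note on the method named in the description field: the record does not say which random projection variant, target dimension, or score fusion rule the authors used, so the following is only a minimal sketch under stated assumptions. It uses a dense Gaussian projection matrix, an arbitrary target dimension of 40, synthetic vectors in place of the 181 CAD features, and a hypothetical averaging rule for combining the two per-view scores. All function names are made up for illustration.

```python
import math
import random


def make_projection_matrix(n_features, n_components, seed=0):
    """Dense Gaussian random matrix, entries ~ N(0, 1/n_components).

    Assumption: the record does not state which projection variant was used.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_components)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n_features)]
            for _ in range(n_components)]


def project(matrix, vector):
    """Map one feature vector into the lower-dimensional space."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fuse_scores(score_view_1, score_view_2):
    """Hypothetical fusion rule: plain average of the two per-view scores."""
    return 0.5 * (score_view_1 + score_view_2)


# Synthetic stand-ins for two 181-feature vectors (not real mammogram data).
data_rng = random.Random(1)
a = [data_rng.random() for _ in range(181)]
b = [data_rng.random() for _ in range(181)]

# 40 output dimensions is an arbitrary choice made for this sketch.
R = make_projection_matrix(n_features=181, n_components=40, seed=0)
a_low, b_low = project(R, a), project(R, b)

# Random projection roughly preserves pairwise distances, so this ratio
# should be close to 1 rather than exactly 1.
ratio = euclidean(a_low, b_low) / euclidean(a, b)
print(len(a_low), round(ratio, 2))
```

In a full pipeline the projected vectors would feed a classifier such as an SVM, evaluated with leave-one-case-out cross-validation as the abstract describes; that part is omitted here to keep the sketch dependency-free.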