A novel machine learning based approach for iPS progenitor cell identification
Identifying induced pluripotent stem (iPS) progenitor cells, the cells that go on to form iPS cells during the early stage of reprogramming, could provide valuable information for studying the origin and underlying mechanism of iPS cells. However, these cells are very difficult to identify experimentally since there are no biom...
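The full abstract in this record's description field reports a minimum precision above 52% against a progenitor-cell ratio below 5% in the early reprogramming period. The sketch below only illustrates how those two figures relate; the counts are hypothetical and are not taken from the paper, and the names `precision`, `tp`, `fp`, `baseline` and `enrichment` are introduced here for illustration.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of predicted positives that are actual positives."""
    predicted = true_positives + false_positives
    if predicted == 0:
        raise ValueError("no positive predictions, precision is undefined")
    return true_positives / predicted


# Hypothetical counts, chosen only so that the result matches the 52%
# lower bound quoted in the abstract; they are not data from the study.
tp, fp = 13, 12
p = precision(tp, fp)

# The abstract gives "below 5%" as the progenitor ratio among all cells,
# so 0.05 is used here as an upper bound on the base rate.
baseline = 0.05
enrichment = p / baseline
```

Read this way, a precision above 52% with a base rate below 5% means that cells flagged by the model are more than about ten times as likely to be progenitors as cells picked at random.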
Published in: | PLoS computational biology 2019-12, Vol.15 (12), p.e1007351-e1007351 |
---|---|
Main authors: | Zhang, Haishan; Shao, Ximing; Peng, Yin; Teng, Yanning; Saravanan, Konda Mani; Zhang, Huiling; Li, Hongchang; Wei, Yanjie |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | e1007351 |
---|---|
container_issue | 12 |
container_start_page | e1007351 |
container_title | PLoS computational biology |
container_volume | 15 |
creator | Zhang, Haishan; Shao, Ximing; Peng, Yin; Teng, Yanning; Saravanan, Konda Mani; Zhang, Huiling; Li, Hongchang; Wei, Yanjie |
description | Identifying induced pluripotent stem (iPS) progenitor cells, the cells that go on to form iPS cells during the early stage of reprogramming, could provide valuable information for studying the origin and underlying mechanism of iPS cells. However, these cells are very difficult to identify experimentally: no biomarkers are known for early progenitor cells, and iPS cells can be determined experimentally via fluorescent probes only about 6 days after reprogramming initiation. Moreover, the ratio of progenitor cells during the early reprogramming period is below 5%, which is too low to capture experimentally at that stage. In this paper, we propose a novel computational approach for identifying iPS progenitor cells based on machine learning and microscopic image analysis. First, we record the reprogramming process using a live-cell imaging system after 48 hours of infection with retroviruses expressing Oct4, Sox2 and Klf4; iPS progenitor cells and normal murine embryonic fibroblasts (MEFs) within 3 to 5 days after infection are then labeled by retrospectively tracing the time-lapse microscopic images. We then calculate 11 types of cell morphological and motion features, such as area and speed, select the best time windows for modeling, and perform feature selection. Finally, a prediction model is built with XGBoost from the six selected types of features and the best time windows. The model tolerates several missing values or frames in the sample datasets, so it is applicable to a wide range of scenarios. Cross-validation, holdout validation and independent test experiments show that the minimum precision is above 52%; that is, more than 52% of the cells predicted to be progenitor cells within 3 to 5 days after viral infection are progenitor cells. The results also confirm that the morphology and motion patterns of iPS progenitor cells differ from those of normal MEFs, which helps machine learning methods identify iPS progenitor cells. |
doi_str_mv | 10.1371/journal.pcbi.1007351 |
format | Article |
fullrecord | <record><control><sourceid>proquest_plos_</sourceid><recordid>TN_cdi_plos_journals_2339843606</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><doaj_id>oai_doaj_org_article_82dcaa32c6a14e979f24cd2db4a2f804</doaj_id><sourcerecordid>2331251749</sourcerecordid><originalsourceid>FETCH-LOGICAL-c592t-860774c1641647fac3c5249b71fb91ba626c212324440987d10c52479192b9203</originalsourceid><addsrcrecordid>eNptUl1rFDEUDaLY2voPRAN98WXXfE0yeRFKabVQVGj7HO5kMtss2WRMZgv-e7PdaWlFCOTe3HNO7r0chD5QsqRc0S_rtM0RwnK0nV9SQhRv6Ct0SJuGL2rcvn4WH6B3pawJqaGWb9EBp61SlLWH6McpjuneBbwBe-ejw8FBjj6ucAfF9RjGMadawkPK2P-6xjVdueinmloXAva9i5MfvIXJp3iM3gwQins_30fo9uL85uz74urnt8uz06uFbTSbFq0kSglLpahHDWC5bZjQnaJDp2kHkknLKONMCEF0q3pKdgClqWadZoQfoU973TGkYuZVFMM4163gksiKuNwj-gRrM2a_gfzHJPDm4SHllYE8eRucaVlvATizEqhwWumBCduzvhPAhpaIqvV1_m3bbVxv68QZwgvRl5Xo78wq3RupOVNCV4HPs0BOv7euTGbjy259EF3aPvRNWUP30JN_oP-fTuxRNqdSshuemqHE7OzxyDI7e5jZHpX28fkgT6RHP_C_E3C27g</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2339843606</pqid></control><display><type>article</type><title>A novel machine learning based approach for iPS progenitor cell identification</title><source>MEDLINE</source><source>DOAJ Directory of Open Access Journals</source><source>Public Library of Science (PLoS)</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><creator>Zhang, Haishan ; Shao, Ximing ; Peng, Yin ; Teng, Yanning ; Saravanan, Konda Mani ; Zhang, Huiling ; Li, Hongchang ; Wei, Yanjie</creator><contributor>Zou, Quan</contributor><creatorcontrib>Zhang, Haishan ; Shao, Ximing ; Peng, Yin ; Teng, Yanning ; Saravanan, Konda Mani ; Zhang, Huiling ; Li, Hongchang ; Wei, Yanjie ; Zou, Quan</creatorcontrib><description>Identification of induced pluripotent stem (iPS) progenitor cells, the iPS forming cells in early stage of reprogramming, could provide valuable information for studying the origin and underlying 
mechanism of iPS cells. However, it is very difficult to identify experimentally since there are no biomarkers known for early progenitor cells, and only about 6 days after reprogramming initiation, iPS cells can be experimentally determined via fluorescent probes. What is more, the ratio of progenitor cells during early reprograming period is below 5%, which is too low to capture experimentally in the early stage. In this paper, we propose a novel computational approach for the identification of iPS progenitor cells based on machine learning and microscopic image analysis. Firstly, we record the reprogramming process using a live cell imaging system after 48 hours of infection with retroviruses expressing Oct4, Sox2 and Klf4, later iPS progenitor cells and normal murine embryonic fibroblasts (MEFs) within 3 to 5 days after infection are labeled by retrospectively tracing the time-lapse microscopic image. We then calculate 11 types of cell morphological and motion features such as area, speed, etc., and select best time windows for modeling and perform feature selection. Finally, a prediction model using XGBoost is built based on the selected six types of features and best time windows. Our model allows several missing values/frames in the sample datasets, thus it is applicable to a wide range of scenarios. Cross-validation, holdout validation and independent test experiments show that the minimum precision is above 52%, that is, the ratio of predicted progenitor cells within 3 to 5 days after viral infection is above 52%. 
The results also confirm that the morphology and motion pattern of iPS progenitor cells is different from that of normal MEFs, which helps with the machine learning methods for iPS progenitor cell identification.</description><identifier>ISSN: 1553-7358</identifier><identifier>ISSN: 1553-734X</identifier><identifier>EISSN: 1553-7358</identifier><identifier>DOI: 10.1371/journal.pcbi.1007351</identifier><identifier>PMID: 31877128</identifier><language>eng</language><publisher>United States: Public Library of Science</publisher><subject>Animals ; Big Data ; Biology and Life Sciences ; Biomarkers ; Cell division ; Cells (biology) ; Cells, Cultured ; Cellular Reprogramming ; Computational Biology ; Computer and Information Sciences ; Computer applications ; Cytology ; Deep learning ; Embryo fibroblasts ; Engineering research ; Fibroblasts ; Fibroblasts - cytology ; Fibroblasts - metabolism ; Fluorescent indicators ; Humans ; Identification ; Identification methods ; Image analysis ; Image processing ; Induced Pluripotent Stem Cells - cytology ; Induced Pluripotent Stem Cells - metabolism ; Infections ; Integer programming ; KLF4 protein ; Laboratories ; Learning algorithms ; Machine Learning ; Mice ; Microscopy ; Models, Biological ; Molecular biology ; Morphology ; Mouse Embryonic Stem Cells - cytology ; Mouse Embryonic Stem Cells - metabolism ; Neural networks ; Oct-4 protein ; Pluripotency ; Prediction models ; Principal components analysis ; Progenitor cells ; Research and Analysis Methods ; Software ; Stem cells ; Teaching methods ; Time-Lapse Imaging ; Windows (intervals)</subject><ispartof>PLoS computational biology, 2019-12, Vol.15 (12), p.e1007351-e1007351</ispartof><rights>2019 Zhang et al. 
This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>2019 Zhang et al 2019 Zhang et al</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c592t-860774c1641647fac3c5249b71fb91ba626c212324440987d10c52479192b9203</citedby><cites>FETCH-LOGICAL-c592t-860774c1641647fac3c5249b71fb91ba626c212324440987d10c52479192b9203</cites><orcidid>0000-0001-9649-249X ; 0000-0001-6740-5596 ; 0000-0002-4791-7540 ; 0000-0003-0072-4706 ; 0000-0002-5541-234X</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC6932749/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC6932749/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,864,885,2102,2928,23866,27924,27925,53791,53793,79600,79601</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/31877128$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>Zou, Quan</contributor><creatorcontrib>Zhang, Haishan</creatorcontrib><creatorcontrib>Shao, Ximing</creatorcontrib><creatorcontrib>Peng, Yin</creatorcontrib><creatorcontrib>Teng, Yanning</creatorcontrib><creatorcontrib>Saravanan, Konda Mani</creatorcontrib><creatorcontrib>Zhang, Huiling</creatorcontrib><creatorcontrib>Li, Hongchang</creatorcontrib><creatorcontrib>Wei, Yanjie</creatorcontrib><title>A novel 
machine learning based approach for iPS progenitor cell identification</title><title>PLoS computational biology</title><addtitle>PLoS Comput Biol</addtitle><description>Identification of induced pluripotent stem (iPS) progenitor cells, the iPS forming cells in early stage of reprogramming, could provide valuable information for studying the origin and underlying mechanism of iPS cells. However, it is very difficult to identify experimentally since there are no biomarkers known for early progenitor cells, and only about 6 days after reprogramming initiation, iPS cells can be experimentally determined via fluorescent probes. What is more, the ratio of progenitor cells during early reprograming period is below 5%, which is too low to capture experimentally in the early stage. In this paper, we propose a novel computational approach for the identification of iPS progenitor cells based on machine learning and microscopic image analysis. Firstly, we record the reprogramming process using a live cell imaging system after 48 hours of infection with retroviruses expressing Oct4, Sox2 and Klf4, later iPS progenitor cells and normal murine embryonic fibroblasts (MEFs) within 3 to 5 days after infection are labeled by retrospectively tracing the time-lapse microscopic image. We then calculate 11 types of cell morphological and motion features such as area, speed, etc., and select best time windows for modeling and perform feature selection. Finally, a prediction model using XGBoost is built based on the selected six types of features and best time windows. Our model allows several missing values/frames in the sample datasets, thus it is applicable to a wide range of scenarios. Cross-validation, holdout validation and independent test experiments show that the minimum precision is above 52%, that is, the ratio of predicted progenitor cells within 3 to 5 days after viral infection is above 52%. 
The results also confirm that the morphology and motion pattern of iPS progenitor cells is different from that of normal MEFs, which helps with the machine learning methods for iPS progenitor cell identification.</description><subject>Animals</subject><subject>Big Data</subject><subject>Biology and Life Sciences</subject><subject>Biomarkers</subject><subject>Cell division</subject><subject>Cells (biology)</subject><subject>Cells, Cultured</subject><subject>Cellular Reprogramming</subject><subject>Computational Biology</subject><subject>Computer and Information Sciences</subject><subject>Computer applications</subject><subject>Cytology</subject><subject>Deep learning</subject><subject>Embryo fibroblasts</subject><subject>Engineering research</subject><subject>Fibroblasts</subject><subject>Fibroblasts - cytology</subject><subject>Fibroblasts - metabolism</subject><subject>Fluorescent indicators</subject><subject>Humans</subject><subject>Identification</subject><subject>Identification methods</subject><subject>Image analysis</subject><subject>Image processing</subject><subject>Induced Pluripotent Stem Cells - cytology</subject><subject>Induced Pluripotent Stem Cells - metabolism</subject><subject>Infections</subject><subject>Integer programming</subject><subject>KLF4 protein</subject><subject>Laboratories</subject><subject>Learning algorithms</subject><subject>Machine Learning</subject><subject>Mice</subject><subject>Microscopy</subject><subject>Models, Biological</subject><subject>Molecular biology</subject><subject>Morphology</subject><subject>Mouse Embryonic Stem Cells - cytology</subject><subject>Mouse Embryonic Stem Cells - metabolism</subject><subject>Neural networks</subject><subject>Oct-4 protein</subject><subject>Pluripotency</subject><subject>Prediction models</subject><subject>Principal components analysis</subject><subject>Progenitor cells</subject><subject>Research and Analysis Methods</subject><subject>Software</subject><subject>Stem 
cells</subject><subject>Teaching methods</subject><subject>Time-Lapse Imaging</subject><subject>Windows (intervals)</subject><issn>1553-7358</issn><issn>1553-734X</issn><issn>1553-7358</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><sourceid>DOA</sourceid><recordid>eNptUl1rFDEUDaLY2voPRAN98WXXfE0yeRFKabVQVGj7HO5kMtss2WRMZgv-e7PdaWlFCOTe3HNO7r0chD5QsqRc0S_rtM0RwnK0nV9SQhRv6Ct0SJuGL2rcvn4WH6B3pawJqaGWb9EBp61SlLWH6McpjuneBbwBe-ejw8FBjj6ucAfF9RjGMadawkPK2P-6xjVdueinmloXAva9i5MfvIXJp3iM3gwQins_30fo9uL85uz74urnt8uz06uFbTSbFq0kSglLpahHDWC5bZjQnaJDp2kHkknLKONMCEF0q3pKdgClqWadZoQfoU973TGkYuZVFMM4163gksiKuNwj-gRrM2a_gfzHJPDm4SHllYE8eRucaVlvATizEqhwWumBCduzvhPAhpaIqvV1_m3bbVxv68QZwgvRl5Xo78wq3RupOVNCV4HPs0BOv7euTGbjy259EF3aPvRNWUP30JN_oP-fTuxRNqdSshuemqHE7OzxyDI7e5jZHpX28fkgT6RHP_C_E3C27g</recordid><startdate>20191201</startdate><enddate>20191201</enddate><creator>Zhang, Haishan</creator><creator>Shao, Ximing</creator><creator>Peng, Yin</creator><creator>Teng, Yanning</creator><creator>Saravanan, Konda Mani</creator><creator>Zhang, Huiling</creator><creator>Li, Hongchang</creator><creator>Wei, Yanjie</creator><general>Public Library of Science</general><general>Public Library of Science 
(PLoS)</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7QO</scope><scope>7QP</scope><scope>7TK</scope><scope>7TM</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8AL</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>K9.</scope><scope>LK8</scope><scope>M0N</scope><scope>M0S</scope><scope>M1P</scope><scope>M7P</scope><scope>P5Z</scope><scope>P62</scope><scope>P64</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>Q9U</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0001-9649-249X</orcidid><orcidid>https://orcid.org/0000-0001-6740-5596</orcidid><orcidid>https://orcid.org/0000-0002-4791-7540</orcidid><orcidid>https://orcid.org/0000-0003-0072-4706</orcidid><orcidid>https://orcid.org/0000-0002-5541-234X</orcidid></search><sort><creationdate>20191201</creationdate><title>A novel machine learning based approach for iPS progenitor cell identification</title><author>Zhang, Haishan ; Shao, Ximing ; Peng, Yin ; Teng, Yanning ; Saravanan, Konda Mani ; Zhang, Huiling ; Li, Hongchang ; Wei, Yanjie</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c592t-860774c1641647fac3c5249b71fb91ba626c212324440987d10c52479192b9203</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Animals</topic><topic>Big 
Data</topic><topic>Biology and Life Sciences</topic><topic>Biomarkers</topic><topic>Cell division</topic><topic>Cells (biology)</topic><topic>Cells, Cultured</topic><topic>Cellular Reprogramming</topic><topic>Computational Biology</topic><topic>Computer and Information Sciences</topic><topic>Computer applications</topic><topic>Cytology</topic><topic>Deep learning</topic><topic>Embryo fibroblasts</topic><topic>Engineering research</topic><topic>Fibroblasts</topic><topic>Fibroblasts - cytology</topic><topic>Fibroblasts - metabolism</topic><topic>Fluorescent indicators</topic><topic>Humans</topic><topic>Identification</topic><topic>Identification methods</topic><topic>Image analysis</topic><topic>Image processing</topic><topic>Induced Pluripotent Stem Cells - cytology</topic><topic>Induced Pluripotent Stem Cells - metabolism</topic><topic>Infections</topic><topic>Integer programming</topic><topic>KLF4 protein</topic><topic>Laboratories</topic><topic>Learning algorithms</topic><topic>Machine Learning</topic><topic>Mice</topic><topic>Microscopy</topic><topic>Models, Biological</topic><topic>Molecular biology</topic><topic>Morphology</topic><topic>Mouse Embryonic Stem Cells - cytology</topic><topic>Mouse Embryonic Stem Cells - metabolism</topic><topic>Neural networks</topic><topic>Oct-4 protein</topic><topic>Pluripotency</topic><topic>Prediction models</topic><topic>Principal components analysis</topic><topic>Progenitor cells</topic><topic>Research and Analysis Methods</topic><topic>Software</topic><topic>Stem cells</topic><topic>Teaching methods</topic><topic>Time-Lapse Imaging</topic><topic>Windows (intervals)</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhang, Haishan</creatorcontrib><creatorcontrib>Shao, Ximing</creatorcontrib><creatorcontrib>Peng, Yin</creatorcontrib><creatorcontrib>Teng, Yanning</creatorcontrib><creatorcontrib>Saravanan, Konda Mani</creatorcontrib><creatorcontrib>Zhang, 
Huiling</creatorcontrib><creatorcontrib>Li, Hongchang</creatorcontrib><creatorcontrib>Wei, Yanjie</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Biotechnology Research Abstracts</collection><collection>Calcium & Calcified Tissue Abstracts</collection><collection>Neurosciences Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Health & Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest 
Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>ProQuest Biological Science Collection</collection><collection>Computing Database</collection><collection>Health & Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Biological Science Database</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central Basic</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>PLoS computational biology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zhang, Haishan</au><au>Shao, Ximing</au><au>Peng, Yin</au><au>Teng, Yanning</au><au>Saravanan, Konda Mani</au><au>Zhang, Huiling</au><au>Li, Hongchang</au><au>Wei, Yanjie</au><au>Zou, Quan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A novel machine learning based approach for iPS progenitor cell identification</atitle><jtitle>PLoS computational biology</jtitle><addtitle>PLoS Comput 
Biol</addtitle><date>2019-12-01</date><risdate>2019</risdate><volume>15</volume><issue>12</issue><spage>e1007351</spage><epage>e1007351</epage><pages>e1007351-e1007351</pages><issn>1553-7358</issn><issn>1553-734X</issn><eissn>1553-7358</eissn><abstract>Identification of induced pluripotent stem (iPS) progenitor cells, the iPS forming cells in early stage of reprogramming, could provide valuable information for studying the origin and underlying mechanism of iPS cells. However, it is very difficult to identify experimentally since there are no biomarkers known for early progenitor cells, and only about 6 days after reprogramming initiation, iPS cells can be experimentally determined via fluorescent probes. What is more, the ratio of progenitor cells during early reprograming period is below 5%, which is too low to capture experimentally in the early stage. In this paper, we propose a novel computational approach for the identification of iPS progenitor cells based on machine learning and microscopic image analysis. Firstly, we record the reprogramming process using a live cell imaging system after 48 hours of infection with retroviruses expressing Oct4, Sox2 and Klf4, later iPS progenitor cells and normal murine embryonic fibroblasts (MEFs) within 3 to 5 days after infection are labeled by retrospectively tracing the time-lapse microscopic image. We then calculate 11 types of cell morphological and motion features such as area, speed, etc., and select best time windows for modeling and perform feature selection. Finally, a prediction model using XGBoost is built based on the selected six types of features and best time windows. Our model allows several missing values/frames in the sample datasets, thus it is applicable to a wide range of scenarios. Cross-validation, holdout validation and independent test experiments show that the minimum precision is above 52%, that is, the ratio of predicted progenitor cells within 3 to 5 days after viral infection is above 52%. 
The results also confirm that the morphology and motion pattern of iPS progenitor cells is different from that of normal MEFs, which helps with the machine learning methods for iPS progenitor cell identification.</abstract><cop>United States</cop><pub>Public Library of Science</pub><pmid>31877128</pmid><doi>10.1371/journal.pcbi.1007351</doi><orcidid>https://orcid.org/0000-0001-9649-249X</orcidid><orcidid>https://orcid.org/0000-0001-6740-5596</orcidid><orcidid>https://orcid.org/0000-0002-4791-7540</orcidid><orcidid>https://orcid.org/0000-0003-0072-4706</orcidid><orcidid>https://orcid.org/0000-0002-5541-234X</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1553-7358 |
ispartof | PLoS computational biology, 2019-12, Vol.15 (12), p.e1007351-e1007351 |
issn | 1553-7358; 1553-734X |
language | eng |
recordid | cdi_plos_journals_2339843606 |
source | MEDLINE; DOAJ Directory of Open Access Journals; Public Library of Science (PLoS); EZB-FREE-00999 freely available EZB journals; PubMed Central |
subjects | Animals; Big Data; Biology and Life Sciences; Biomarkers; Cell division; Cells (biology); Cells, Cultured; Cellular Reprogramming; Computational Biology; Computer and Information Sciences; Computer applications; Cytology; Deep learning; Embryo fibroblasts; Engineering research; Fibroblasts; Fibroblasts - cytology; Fibroblasts - metabolism; Fluorescent indicators; Humans; Identification; Identification methods; Image analysis; Image processing; Induced Pluripotent Stem Cells - cytology; Induced Pluripotent Stem Cells - metabolism; Infections; Integer programming; KLF4 protein; Laboratories; Learning algorithms; Machine Learning; Mice; Microscopy; Models, Biological; Molecular biology; Morphology; Mouse Embryonic Stem Cells - cytology; Mouse Embryonic Stem Cells - metabolism; Neural networks; Oct-4 protein; Pluripotency; Prediction models; Principal components analysis; Progenitor cells; Research and Analysis Methods; Software; Stem cells; Teaching methods; Time-Lapse Imaging; Windows (intervals) |
title | A novel machine learning based approach for iPS progenitor cell identification |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T06%3A26%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_plos_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20novel%20machine%20learning%20based%20approach%20for%20iPS%20progenitor%20cell%20identification&rft.jtitle=PLoS%20computational%20biology&rft.au=Zhang,%20Haishan&rft.date=2019-12-01&rft.volume=15&rft.issue=12&rft.spage=e1007351&rft.epage=e1007351&rft.pages=e1007351-e1007351&rft.issn=1553-7358&rft.eissn=1553-7358&rft_id=info:doi/10.1371/journal.pcbi.1007351&rft_dat=%3Cproquest_plos_%3E2331251749%3C/proquest_plos_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2339843606&rft_id=info:pmid/31877128&rft_doaj_id=oai_doaj_org_article_82dcaa32c6a14e979f24cd2db4a2f804&rfr_iscdi=true |
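The url field above is an OpenURL resolver link whose query string carries the citation as percent-encoded key/value pairs (the `rft.*` and `rft_id` keys). As a minimal sketch, assuming only Python's standard `urllib.parse` module, the citation fields can be recovered as follows. The constant `OPENURL` is a shortened copy of the link with most keys omitted, and the helper name `referent_fields` is introduced here for illustration.

```python
from typing import Dict, List
from urllib.parse import parse_qs, urlsplit

# Shortened copy of the resolver link from the url field; most keys are
# left out for brevity, the remaining ones are copied from the record.
OPENURL = (
    "https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004"
    "&rft.genre=article"
    "&rft.jtitle=PLoS%20computational%20biology"
    "&rft.au=Zhang,%20Haishan"
    "&rft.date=2019-12-01&rft.volume=15&rft.issue=12"
    "&rft.spage=e1007351"
    "&rft_id=info:doi/10.1371/journal.pcbi.1007351"
    "&rft_id=info:pmid/31877128"
)


def referent_fields(url: str) -> Dict[str, List[str]]:
    """Return the percent-decoded rft.* and rft_id keys of an OpenURL query."""
    query = parse_qs(urlsplit(url).query)
    # Keys may repeat (rft_id does), so every value is kept as a list.
    return {key: values for key, values in query.items() if key.startswith("rft")}


fields = referent_fields(OPENURL)

# Pick the DOI out of the repeated rft_id values.
DOI_PREFIX = "info:doi/"
doi = next(
    value[len(DOI_PREFIX):]
    for value in fields["rft_id"]
    if value.startswith(DOI_PREFIX)
)
```

Keeping each value as a list is deliberate: `rft_id` occurs more than once in the link, and collapsing it to a single string would silently drop either the DOI or the PubMed identifier.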