Prediction of pKa Using Machine Learning Methods with Rooted Topological Torsion Fingerprints: Application to Aliphatic Amines

The acid–base dissociation constant, pKa, is a key parameter to define the ionization state of a compound and directly affects its biopharmaceutical profile. In this study, we developed a novel approach for pKa prediction using rooted topological torsion fingerprints in combination with five machine...
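
As a brief illustration of how pKa defines the ionization state, the sketch below applies the Henderson–Hasselbalch relation to a monoprotic base such as an aliphatic amine. The function name `fraction_protonated` and the example pKa values are hypothetical and chosen only for illustration; they are not taken from the article.

```python
import math


def fraction_protonated(pka: float, ph: float) -> float:
    """Fraction of a monoprotic base present as its conjugate acid (BH+).

    Uses the Henderson-Hasselbalch relation:
        [BH+] / ([B] + [BH+]) = 1 / (1 + 10 ** (pH - pKa))
    """
    if not (math.isfinite(pka) and math.isfinite(ph)):
        raise ValueError("pka and ph must be finite numbers")
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    # Illustrative pKa values (assumed, not from the article), evaluated at pH 7.4.
    for pka in (6.4, 7.4, 9.4):
        share = fraction_protonated(pka, ph=7.4)
        print(f"pKa {pka:.1f}: {share:.1%} protonated at pH 7.4")
```

Because the protonated fraction depends on 10 raised to the difference between pH and pKa, a prediction error of about half a pKa unit can shift the estimated ionized fraction noticeably when the pKa lies close to the pH of interest.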

Bibliographic Details
Published in: Journal of chemical information and modeling 2019-11, Vol.59 (11), p.4706-4719
Main authors: Lu, Yipin, Anand, Shankara, Shirley, William, Gedeck, Peter, Kelley, Brian P, Skolnik, Suzanne, Rodde, Stephane, Nguyen, Mai, Lindvall, Mika, Jia, Weiping
Format: Article
Language: English
Subjects:
Online access: Full text
container_end_page 4719
container_issue 11
container_start_page 4706
container_title Journal of chemical information and modeling
container_volume 59
creator Lu, Yipin
Anand, Shankara
Shirley, William
Gedeck, Peter
Kelley, Brian P
Skolnik, Suzanne
Rodde, Stephane
Nguyen, Mai
Lindvall, Mika
Jia, Weiping
description The acid–base dissociation constant, pKa, is a key parameter to define the ionization state of a compound and directly affects its biopharmaceutical profile. In this study, we developed a novel approach for pKa prediction using rooted topological torsion fingerprints in combination with five machine learning (ML) methods: random forest, partial least squares, extreme gradient boosting, lasso regression, and support vector regression. With a large and diverse set of 14 499 experimental pKa values, pKa models were developed for aliphatic amines. The models demonstrated consistently good prediction statistics and were able to generate accurate prospective predictions as validated with an external test set of 726 pKa values (RMSE 0.45, MAE 0.33, and R2 0.84 by the top model). The factors that may affect prediction accuracy and model applicability were carefully assessed. The results demonstrated that rooted topological torsion fingerprints coupled with ML methods provide a promising approach for developing accurate pKa prediction models.
doi_str_mv 10.1021/acs.jcim.9b00498
format Article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_miscellaneous_2308521764</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2318648210</sourcerecordid><originalsourceid>FETCH-LOGICAL-g984-98f3f00872bcc12b16420b5af0698d24358941e2e5ed7abe31c534b28a27a8303</originalsourceid><addsrcrecordid>eNpdjj1PwzAURS0EEqWwM1piYUnxZ2KzRRUFRBEIFYmtchyncZXawU7Exm_HFFiY3n33HR09AM4xmmFE8JXScbbVdjeTFUJMigMwwZzJTObo7fAvc5kfg5MYtwhRKnMyAZ_PwdRWD9Y76BvYPyj4Gq3bwEelW-sMXBoV3L4wQ-vrCD_s0MIX7wdTw5Xvfec3Vqsu5RC_LYsEm9AH64Z4Dcu-79J57x88LDvbt2nTsNwlezwFR43qojn7nVOwWtys5nfZ8un2fl4us40ULJOioQ1CoiCV1phUOGcEVVw1KJeiJoxyIRk2xHBTF6oyFGtOWUWEIoUSFNEpuPzR9sG_jyYO652N2nSdcsaPcU0oEpzgImcJvfiHbv0YXHouUVjkTBCM6Bdb4nBl</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2318648210</pqid></control><display><type>article</type><title>Prediction of pKa Using Machine Learning Methods with Rooted Topological Torsion Fingerprints: Application to Aliphatic Amines</title><source>American Chemical Society Journals</source><creator>Lu, Yipin ; Anand, Shankara ; Shirley, William ; Gedeck, Peter ; Kelley, Brian P ; Skolnik, Suzanne ; Rodde, Stephane ; Nguyen, Mai ; Lindvall, Mika ; Jia, Weiping</creator><creatorcontrib>Lu, Yipin ; Anand, Shankara ; Shirley, William ; Gedeck, Peter ; Kelley, Brian P ; Skolnik, Suzanne ; Rodde, Stephane ; Nguyen, Mai ; Lindvall, Mika ; Jia, Weiping</creatorcontrib><description>The acid–base dissociation constant, pKa, is a key parameter to define the ionization state of a compound and directly affects its biopharmaceutical profile. In this study, we developed a novel approach for pKa prediction using rooted topological torsion fingerprints in combination with five machine learning (ML) methods: random forest, partial least squares, extreme gradient boosting, lasso regression, and support vector regression. 
With a large and diverse set of 14 499 experimental pKa values, pKa models were developed for aliphatic amines. The models demonstrated consistently good prediction statistics and were able to generate accurate prospective predictions as validated with an external test set of 726 pKa values (RMSE 0.45, MAE 0.33, and R2 0.84 by the top model). The factors that may affect prediction accuracy and model applicability were carefully assessed. The results demonstrated that rooted topological torsion fingerprints coupled with ML methods provide a promising approach for developing accurate pKa prediction models.</description><identifier>ISSN: 1549-9596</identifier><identifier>EISSN: 1549-960X</identifier><identifier>DOI: 10.1021/acs.jcim.9b00498</identifier><language>eng</language><publisher>Washington: American Chemical Society</publisher><subject>Aliphatic amines ; Artificial intelligence ; Fingerprints ; Machine learning ; Model accuracy ; Support vector machines ; Topology ; Torsion</subject><ispartof>Journal of chemical information and modeling, 2019-11, Vol.59 (11), p.4706-4719</ispartof><rights>Copyright American Chemical Society Nov 25, 2019</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,27901,27902</link.rule.ids></links><search><creatorcontrib>Lu, Yipin</creatorcontrib><creatorcontrib>Anand, Shankara</creatorcontrib><creatorcontrib>Shirley, William</creatorcontrib><creatorcontrib>Gedeck, Peter</creatorcontrib><creatorcontrib>Kelley, Brian P</creatorcontrib><creatorcontrib>Skolnik, Suzanne</creatorcontrib><creatorcontrib>Rodde, Stephane</creatorcontrib><creatorcontrib>Nguyen, Mai</creatorcontrib><creatorcontrib>Lindvall, Mika</creatorcontrib><creatorcontrib>Jia, Weiping</creatorcontrib><title>Prediction of pKa Using Machine 
Learning Methods with Rooted Topological Torsion Fingerprints: Application to Aliphatic Amines</title><title>Journal of chemical information and modeling</title><description>The acid–base dissociation constant, pKa, is a key parameter to define the ionization state of a compound and directly affects its biopharmaceutical profile. In this study, we developed a novel approach for pKa prediction using rooted topological torsion fingerprints in combination with five machine learning (ML) methods: random forest, partial least squares, extreme gradient boosting, lasso regression, and support vector regression. With a large and diverse set of 14 499 experimental pKa values, pKa models were developed for aliphatic amines. The models demonstrated consistently good prediction statistics and were able to generate accurate prospective predictions as validated with an external test set of 726 pKa values (RMSE 0.45, MAE 0.33, and R2 0.84 by the top model). The factors that may affect prediction accuracy and model applicability were carefully assessed. 
The results demonstrated that rooted topological torsion fingerprints coupled with ML methods provide a promising approach for developing accurate pKa prediction models.</description><subject>Aliphatic amines</subject><subject>Artificial intelligence</subject><subject>Fingerprints</subject><subject>Machine learning</subject><subject>Model accuracy</subject><subject>Support vector machines</subject><subject>Topology</subject><subject>Torsion</subject><issn>1549-9596</issn><issn>1549-960X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><recordid>eNpdjj1PwzAURS0EEqWwM1piYUnxZ2KzRRUFRBEIFYmtchyncZXawU7Exm_HFFiY3n33HR09AM4xmmFE8JXScbbVdjeTFUJMigMwwZzJTObo7fAvc5kfg5MYtwhRKnMyAZ_PwdRWD9Y76BvYPyj4Gq3bwEelW-sMXBoV3L4wQ-vrCD_s0MIX7wdTw5Xvfec3Vqsu5RC_LYsEm9AH64Z4Dcu-79J57x88LDvbt2nTsNwlezwFR43qojn7nVOwWtys5nfZ8un2fl4us40ULJOioQ1CoiCV1phUOGcEVVw1KJeiJoxyIRk2xHBTF6oyFGtOWUWEIoUSFNEpuPzR9sG_jyYO652N2nSdcsaPcU0oEpzgImcJvfiHbv0YXHouUVjkTBCM6Bdb4nBl</recordid><startdate>20191125</startdate><enddate>20191125</enddate><creator>Lu, Yipin</creator><creator>Anand, Shankara</creator><creator>Shirley, William</creator><creator>Gedeck, Peter</creator><creator>Kelley, Brian P</creator><creator>Skolnik, Suzanne</creator><creator>Rodde, Stephane</creator><creator>Nguyen, Mai</creator><creator>Lindvall, Mika</creator><creator>Jia, Weiping</creator><general>American Chemical Society</general><scope>7SC</scope><scope>7SR</scope><scope>7U5</scope><scope>8BQ</scope><scope>8FD</scope><scope>JG9</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope></search><sort><creationdate>20191125</creationdate><title>Prediction of pKa Using Machine Learning Methods with Rooted Topological Torsion Fingerprints: Application to Aliphatic Amines</title><author>Lu, Yipin ; Anand, Shankara ; Shirley, William ; Gedeck, Peter ; Kelley, Brian P ; Skolnik, Suzanne ; Rodde, Stephane ; Nguyen, Mai ; 
Lindvall, Mika ; Jia, Weiping</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-g984-98f3f00872bcc12b16420b5af0698d24358941e2e5ed7abe31c534b28a27a8303</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Aliphatic amines</topic><topic>Artificial intelligence</topic><topic>Fingerprints</topic><topic>Machine learning</topic><topic>Model accuracy</topic><topic>Support vector machines</topic><topic>Topology</topic><topic>Torsion</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Lu, Yipin</creatorcontrib><creatorcontrib>Anand, Shankara</creatorcontrib><creatorcontrib>Shirley, William</creatorcontrib><creatorcontrib>Gedeck, Peter</creatorcontrib><creatorcontrib>Kelley, Brian P</creatorcontrib><creatorcontrib>Skolnik, Suzanne</creatorcontrib><creatorcontrib>Rodde, Stephane</creatorcontrib><creatorcontrib>Nguyen, Mai</creatorcontrib><creatorcontrib>Lindvall, Mika</creatorcontrib><creatorcontrib>Jia, Weiping</creatorcontrib><collection>Computer and Information Systems Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><jtitle>Journal of chemical information and modeling</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Lu, Yipin</au><au>Anand, Shankara</au><au>Shirley, William</au><au>Gedeck, 
Peter</au><au>Kelley, Brian P</au><au>Skolnik, Suzanne</au><au>Rodde, Stephane</au><au>Nguyen, Mai</au><au>Lindvall, Mika</au><au>Jia, Weiping</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Prediction of pKa Using Machine Learning Methods with Rooted Topological Torsion Fingerprints: Application to Aliphatic Amines</atitle><jtitle>Journal of chemical information and modeling</jtitle><date>2019-11-25</date><risdate>2019</risdate><volume>59</volume><issue>11</issue><spage>4706</spage><epage>4719</epage><pages>4706-4719</pages><issn>1549-9596</issn><eissn>1549-960X</eissn><abstract>The acid–base dissociation constant, pKa, is a key parameter to define the ionization state of a compound and directly affects its biopharmaceutical profile. In this study, we developed a novel approach for pKa prediction using rooted topological torsion fingerprints in combination with five machine learning (ML) methods: random forest, partial least squares, extreme gradient boosting, lasso regression, and support vector regression. With a large and diverse set of 14 499 experimental pKa values, pKa models were developed for aliphatic amines. The models demonstrated consistently good prediction statistics and were able to generate accurate prospective predictions as validated with an external test set of 726 pKa values (RMSE 0.45, MAE 0.33, and R2 0.84 by the top model). The factors that may affect prediction accuracy and model applicability were carefully assessed. The results demonstrated that rooted topological torsion fingerprints coupled with ML methods provide a promising approach for developing accurate pKa prediction models.</abstract><cop>Washington</cop><pub>American Chemical Society</pub><doi>10.1021/acs.jcim.9b00498</doi><tpages>14</tpages></addata></record>
fulltext fulltext
identifier ISSN: 1549-9596
ispartof Journal of chemical information and modeling, 2019-11, Vol.59 (11), p.4706-4719
issn 1549-9596
1549-960X
language eng
recordid cdi_proquest_miscellaneous_2308521764
source American Chemical Society Journals
subjects Aliphatic amines
Artificial intelligence
Fingerprints
Machine learning
Model accuracy
Support vector machines
Topology
Torsion
title Prediction of pKa Using Machine Learning Methods with Rooted Topological Torsion Fingerprints: Application to Aliphatic Amines
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-04T06%3A03%3A02IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Prediction%20of%20pKa%20Using%20Machine%20Learning%20Methods%20with%20Rooted%20Topological%20Torsion%20Fingerprints:%20Application%20to%20Aliphatic%20Amines&rft.jtitle=Journal%20of%20chemical%20information%20and%20modeling&rft.au=Lu,%20Yipin&rft.date=2019-11-25&rft.volume=59&rft.issue=11&rft.spage=4706&rft.epage=4719&rft.pages=4706-4719&rft.issn=1549-9596&rft.eissn=1549-960X&rft_id=info:doi/10.1021/acs.jcim.9b00498&rft_dat=%3Cproquest%3E2318648210%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2318648210&rft_id=info:pmid/&rfr_iscdi=true
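
The description field above quotes three error statistics for the external test set (RMSE, MAE, and R2). The following minimal sketch shows how such statistics are commonly computed. It assumes that R2 means the coefficient of determination, which the record does not state; the function name `regression_metrics` and the sample values are hypothetical and are not data from the article.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Return RMSE, MAE, and R^2 (coefficient of determination, an assumed definition)."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)          # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0.0:
        raise ValueError("R^2 is undefined when all observed values are identical")

    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


if __name__ == "__main__":
    # Made-up experimental and predicted pKa values, for illustration only.
    observed = [8.0, 9.0, 10.0]
    predicted = [8.5, 9.0, 9.5]
    for name, value in regression_metrics(observed, predicted).items():
        print(f"{name}: {value:.3f}")
```

RMSE penalizes large individual errors more heavily than MAE does, which is one reason both are usually reported together.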