Geometric Representation Condition Improves Equivariant Molecule Generation

Recent advancements in molecular generative models have demonstrated substantial potential in accelerating scientific discovery, particularly in drug design. However, these models often face challenges in generating high-quality molecules, especially in conditional scenarios where specific molecular...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Li, Zian, Zhou, Cai, Wang, Xiyuan, Peng, Xingang, Zhang, Muhan
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Li, Zian
Zhou, Cai
Wang, Xiyuan
Peng, Xingang
Zhang, Muhan
description Recent advancements in molecular generative models have demonstrated substantial potential in accelerating scientific discovery, particularly in drug design. However, these models often face challenges in generating high-quality molecules, especially in conditional scenarios where specific molecular properties must be satisfied. In this work, we introduce GeoRCG, a general framework to enhance the performance of molecular generative models by integrating geometric representation conditions. We decompose the molecule generation process into two stages: first, generating an informative geometric representation; second, generating a molecule conditioned on the representation. Compared to directly generating a molecule, the relatively easy-to-generate representation in the first-stage guides the second-stage generation to reach a high-quality molecule in a more goal-oriented and much faster way. Leveraging EDM as the base generator, we observe significant quality improvements in unconditional molecule generation on the widely-used QM9 and GEOM-DRUG datasets. More notably, in the challenging conditional molecular generation task, our framework achieves an average 31\% performance improvement over state-of-the-art approaches, highlighting the superiority of conditioning on semantically rich geometric representations over conditioning on individual property values as in previous approaches. Furthermore, we show that, with such representation guidance, the number of diffusion steps can be reduced to as small as 100 while maintaining superior generation quality than that achieved with 1,000 steps, thereby significantly accelerating the generation process.
doi_str_mv 10.48550/arxiv.2410.03655
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2410_03655</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2410_03655</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2410_036553</originalsourceid><addsrcrecordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMgEKGBibmZpyMni7p-bnppYUZSYrBKUWFKUWp-aVJJZk5ucpOOfnpWSCWZ65BUX5ZanFCq6FpZlliUWZiXklCr75OanJpTmpCu6pealFYC08DKxpiTnFqbxQmptB3s01xNlDF2xtfEFRZm5iUWU8yPp4sPXGhFUAAC_PO8Y</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Geometric Representation Condition Improves Equivariant Molecule Generation</title><source>arXiv.org</source><creator>Li, Zian ; Zhou, Cai ; Wang, Xiyuan ; Peng, Xingang ; Zhang, Muhan</creator><creatorcontrib>Li, Zian ; Zhou, Cai ; Wang, Xiyuan ; Peng, Xingang ; Zhang, Muhan</creatorcontrib><description>Recent advancements in molecular generative models have demonstrated substantial potential in accelerating scientific discovery, particularly in drug design. However, these models often face challenges in generating high-quality molecules, especially in conditional scenarios where specific molecular properties must be satisfied. In this work, we introduce GeoRCG, a general framework to enhance the performance of molecular generative models by integrating geometric representation conditions. We decompose the molecule generation process into two stages: first, generating an informative geometric representation; second, generating a molecule conditioned on the representation. Compared to directly generating a molecule, the relatively easy-to-generate representation in the first-stage guides the second-stage generation to reach a high-quality molecule in a more goal-oriented and much faster way. Leveraging EDM as the base generator, we observe significant quality improvements in unconditional molecule generation on the widely-used QM9 and GEOM-DRUG datasets. More notably, in the challenging conditional molecular generation task, our framework achieves an average 31\% performance improvement over state-of-the-art approaches, highlighting the superiority of conditioning on semantically rich geometric representations over conditioning on individual property values as in previous approaches. Furthermore, we show that, with such representation guidance, the number of diffusion steps can be reduced to as small as 100 while maintaining superior generation quality than that achieved with 1,000 steps, thereby significantly accelerating the generation process.</description><identifier>DOI: 10.48550/arxiv.2410.03655</identifier><language>eng</language><subject>Computer Science - Artificial Intelligence ; Computer Science - Learning</subject><creationdate>2024-10</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2410.03655$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2410.03655$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Li, Zian</creatorcontrib><creatorcontrib>Zhou, Cai</creatorcontrib><creatorcontrib>Wang, Xiyuan</creatorcontrib><creatorcontrib>Peng, Xingang</creatorcontrib><creatorcontrib>Zhang, Muhan</creatorcontrib><title>Geometric Representation Condition Improves Equivariant Molecule Generation</title><description>Recent advancements in molecular generative models have demonstrated substantial potential in accelerating scientific discovery, particularly in drug design. However, these models often face challenges in generating high-quality molecules, especially in conditional scenarios where specific molecular properties must be satisfied. In this work, we introduce GeoRCG, a general framework to enhance the performance of molecular generative models by integrating geometric representation conditions. We decompose the molecule generation process into two stages: first, generating an informative geometric representation; second, generating a molecule conditioned on the representation. Compared to directly generating a molecule, the relatively easy-to-generate representation in the first-stage guides the second-stage generation to reach a high-quality molecule in a more goal-oriented and much faster way. Leveraging EDM as the base generator, we observe significant quality improvements in unconditional molecule generation on the widely-used QM9 and GEOM-DRUG datasets. More notably, in the challenging conditional molecular generation task, our framework achieves an average 31\% performance improvement over state-of-the-art approaches, highlighting the superiority of conditioning on semantically rich geometric representations over conditioning on individual property values as in previous approaches. Furthermore, we show that, with such representation guidance, the number of diffusion steps can be reduced to as small as 100 while maintaining superior generation quality than that achieved with 1,000 steps, thereby significantly accelerating the generation process.</description><subject>Computer Science - Artificial Intelligence</subject><subject>Computer Science - Learning</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMgEKGBibmZpyMni7p-bnppYUZSYrBKUWFKUWp-aVJJZk5ucpOOfnpWSCWZ65BUX5ZanFCq6FpZlliUWZiXklCr75OanJpTmpCu6pealFYC08DKxpiTnFqbxQmptB3s01xNlDF2xtfEFRZm5iUWU8yPp4sPXGhFUAAC_PO8Y</recordid><startdate>20241004</startdate><enddate>20241004</enddate><creator>Li, Zian</creator><creator>Zhou, Cai</creator><creator>Wang, Xiyuan</creator><creator>Peng, Xingang</creator><creator>Zhang, Muhan</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20241004</creationdate><title>Geometric Representation Condition Improves Equivariant Molecule Generation</title><author>Li, Zian ; Zhou, Cai ; Wang, Xiyuan ; Peng, Xingang ; Zhang, Muhan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2410_036553</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Artificial Intelligence</topic><topic>Computer Science - Learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Li, Zian</creatorcontrib><creatorcontrib>Zhou, Cai</creatorcontrib><creatorcontrib>Wang, Xiyuan</creatorcontrib><creatorcontrib>Peng, Xingang</creatorcontrib><creatorcontrib>Zhang, Muhan</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Li, Zian</au><au>Zhou, Cai</au><au>Wang, Xiyuan</au><au>Peng, Xingang</au><au>Zhang, Muhan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Geometric Representation Condition Improves Equivariant Molecule Generation</atitle><date>2024-10-04</date><risdate>2024</risdate><abstract>Recent advancements in molecular generative models have demonstrated substantial potential in accelerating scientific discovery, particularly in drug design. However, these models often face challenges in generating high-quality molecules, especially in conditional scenarios where specific molecular properties must be satisfied. In this work, we introduce GeoRCG, a general framework to enhance the performance of molecular generative models by integrating geometric representation conditions. We decompose the molecule generation process into two stages: first, generating an informative geometric representation; second, generating a molecule conditioned on the representation. Compared to directly generating a molecule, the relatively easy-to-generate representation in the first-stage guides the second-stage generation to reach a high-quality molecule in a more goal-oriented and much faster way. Leveraging EDM as the base generator, we observe significant quality improvements in unconditional molecule generation on the widely-used QM9 and GEOM-DRUG datasets. More notably, in the challenging conditional molecular generation task, our framework achieves an average 31\% performance improvement over state-of-the-art approaches, highlighting the superiority of conditioning on semantically rich geometric representations over conditioning on individual property values as in previous approaches. Furthermore, we show that, with such representation guidance, the number of diffusion steps can be reduced to as small as 100 while maintaining superior generation quality than that achieved with 1,000 steps, thereby significantly accelerating the generation process.</abstract><doi>10.48550/arxiv.2410.03655</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2410.03655
ispartof
issn
language eng
recordid cdi_arxiv_primary_2410_03655
source arXiv.org
subjects Computer Science - Artificial Intelligence
Computer Science - Learning
title Geometric Representation Condition Improves Equivariant Molecule Generation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-14T04%3A42%3A07IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Geometric%20Representation%20Condition%20Improves%20Equivariant%20Molecule%20Generation&rft.au=Li,%20Zian&rft.date=2024-10-04&rft_id=info:doi/10.48550/arxiv.2410.03655&rft_dat=%3Carxiv_GOX%3E2410_03655%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true