Representations of Materials for Machine Learning

High-throughput data generation methods and machine learning (ML) algorithms have given rise to a new era of computational materials science by learning relationships among composition, structure, and properties and by exploiting such relations for design. However, to build these connections, materi...

Bibliographic details
Main authors: Damewood, James; Karaguesian, Jessica; Lunger, Jaclyn R; Tan, Aik Rui; Xie, Mingrou; Peng, Jiayu; Gómez-Bombarelli, Rafael
Format: Article
Language: eng
creator Damewood, James
Karaguesian, Jessica
Lunger, Jaclyn R
Tan, Aik Rui
Xie, Mingrou
Peng, Jiayu
Gómez-Bombarelli, Rafael
description High-throughput data generation methods and machine learning (ML) algorithms have given rise to a new era of computational materials science by learning relationships among composition, structure, and properties and by exploiting such relations for design. However, to build these connections, materials data must be translated into a numerical form, called a representation, that can be processed by a machine learning model. Datasets in materials science vary in format (ranging from images to spectra), size, and fidelity. Predictive models vary in scope and properties of interest. Here, we review context-dependent strategies for constructing representations that enable the use of materials as inputs or outputs of machine learning models. Furthermore, we discuss how modern ML techniques can learn representations from data and transfer chemical and physical information between tasks. Finally, we outline high-impact questions that have not been fully resolved and thus require further investigation.
doi_str_mv 10.48550/arxiv.2301.08813
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2301_08813</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2301_08813</sourcerecordid><originalsourceid>FETCH-LOGICAL-a673-3ee28acaebddecdb59c1f11f3a67d0306e89d5ca825d65af5db2fc9c72edbe063</originalsourceid><addsrcrecordid>eNotjskKwjAURbNxIeoHuLI_0JrBtOlSxAkqgrgvr8mLBjQtaRH9e-uwulwOHA4hU0aThZKSziE83SPhgrKEKsXEkLATNgFb9B10rvZtVNvoAB0GB7c2snXon746j1GBELzzlzEZ2J7h5L8jct6sz6tdXBy3-9WyiCHNRCwQuQINWBmD2lQy18wyZkVPDRU0RZUbqUFxaVIJVpqKW53rjKOpkKZiRGY_7be5bIK7Q3iVn_by2y7eEU9AEg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Representations of Materials for Machine Learning</title><source>arXiv.org</source><creator>Damewood, James ; Karaguesian, Jessica ; Lunger, Jaclyn R ; Tan, Aik Rui ; Xie, Mingrou ; Peng, Jiayu ; Gómez-Bombarelli, Rafael</creator><creatorcontrib>Damewood, James ; Karaguesian, Jessica ; Lunger, Jaclyn R ; Tan, Aik Rui ; Xie, Mingrou ; Peng, Jiayu ; Gómez-Bombarelli, Rafael</creatorcontrib><description>High-throughput data generation methods and machine learning (ML) algorithms have given rise to a new era of computational materials science by learning relationships among composition, structure, and properties and by exploiting such relations for design. However, to build these connections, materials data must be translated into a numerical form, called a representation, that can be processed by a machine learning model. Datasets in materials science vary in format (ranging from images to spectra), size, and fidelity. Predictive models vary in scope and property of interests. Here, we review context-dependent strategies for constructing representations that enable the use of materials as inputs or outputs of machine learning models. 
Furthermore, we discuss how modern ML techniques can learn representations from data and transfer chemical and physical information between tasks. Finally, we outline high-impact questions that have not been fully resolved and thus, require further investigation.</description><identifier>DOI: 10.48550/arxiv.2301.08813</identifier><language>eng</language><subject>Physics - Materials Science</subject><creationdate>2023-01</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2301.08813$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2301.08813$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Damewood, James</creatorcontrib><creatorcontrib>Karaguesian, Jessica</creatorcontrib><creatorcontrib>Lunger, Jaclyn R</creatorcontrib><creatorcontrib>Tan, Aik Rui</creatorcontrib><creatorcontrib>Xie, Mingrou</creatorcontrib><creatorcontrib>Peng, Jiayu</creatorcontrib><creatorcontrib>Gómez-Bombarelli, Rafael</creatorcontrib><title>Representations of Materials for Machine Learning</title><description>High-throughput data generation methods and machine learning (ML) algorithms have given rise to a new era of computational materials science by learning relationships among composition, structure, and properties and by exploiting such relations for design. However, to build these connections, materials data must be translated into a numerical form, called a representation, that can be processed by a machine learning model. 
Datasets in materials science vary in format (ranging from images to spectra), size, and fidelity. Predictive models vary in scope and property of interests. Here, we review context-dependent strategies for constructing representations that enable the use of materials as inputs or outputs of machine learning models. Furthermore, we discuss how modern ML techniques can learn representations from data and transfer chemical and physical information between tasks. Finally, we outline high-impact questions that have not been fully resolved and thus, require further investigation.</description><subject>Physics - Materials Science</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotjskKwjAURbNxIeoHuLI_0JrBtOlSxAkqgrgvr8mLBjQtaRH9e-uwulwOHA4hU0aThZKSziE83SPhgrKEKsXEkLATNgFb9B10rvZtVNvoAB0GB7c2snXon746j1GBELzzlzEZ2J7h5L8jct6sz6tdXBy3-9WyiCHNRCwQuQINWBmD2lQy18wyZkVPDRU0RZUbqUFxaVIJVpqKW53rjKOpkKZiRGY_7be5bIK7Q3iVn_by2y7eEU9AEg</recordid><startdate>20230120</startdate><enddate>20230120</enddate><creator>Damewood, James</creator><creator>Karaguesian, Jessica</creator><creator>Lunger, Jaclyn R</creator><creator>Tan, Aik Rui</creator><creator>Xie, Mingrou</creator><creator>Peng, Jiayu</creator><creator>Gómez-Bombarelli, Rafael</creator><scope>GOX</scope></search><sort><creationdate>20230120</creationdate><title>Representations of Materials for Machine Learning</title><author>Damewood, James ; Karaguesian, Jessica ; Lunger, Jaclyn R ; Tan, Aik Rui ; Xie, Mingrou ; Peng, Jiayu ; Gómez-Bombarelli, Rafael</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a673-3ee28acaebddecdb59c1f11f3a67d0306e89d5ca825d65af5db2fc9c72edbe063</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Physics - Materials 
Science</topic><toplevel>online_resources</toplevel><creatorcontrib>Damewood, James</creatorcontrib><creatorcontrib>Karaguesian, Jessica</creatorcontrib><creatorcontrib>Lunger, Jaclyn R</creatorcontrib><creatorcontrib>Tan, Aik Rui</creatorcontrib><creatorcontrib>Xie, Mingrou</creatorcontrib><creatorcontrib>Peng, Jiayu</creatorcontrib><creatorcontrib>Gómez-Bombarelli, Rafael</creatorcontrib><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Damewood, James</au><au>Karaguesian, Jessica</au><au>Lunger, Jaclyn R</au><au>Tan, Aik Rui</au><au>Xie, Mingrou</au><au>Peng, Jiayu</au><au>Gómez-Bombarelli, Rafael</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Representations of Materials for Machine Learning</atitle><date>2023-01-20</date><risdate>2023</risdate><abstract>High-throughput data generation methods and machine learning (ML) algorithms have given rise to a new era of computational materials science by learning relationships among composition, structure, and properties and by exploiting such relations for design. However, to build these connections, materials data must be translated into a numerical form, called a representation, that can be processed by a machine learning model. Datasets in materials science vary in format (ranging from images to spectra), size, and fidelity. Predictive models vary in scope and property of interests. Here, we review context-dependent strategies for constructing representations that enable the use of materials as inputs or outputs of machine learning models. Furthermore, we discuss how modern ML techniques can learn representations from data and transfer chemical and physical information between tasks. 
Finally, we outline high-impact questions that have not been fully resolved and thus, require further investigation.</abstract><doi>10.48550/arxiv.2301.08813</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2301.08813
language eng
recordid cdi_arxiv_primary_2301_08813
source arXiv.org
subjects Physics - Materials Science
title Representations of Materials for Machine Learning
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T08%3A31%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Representations%20of%20Materials%20for%20Machine%20Learning&rft.au=Damewood,%20James&rft.date=2023-01-20&rft_id=info:doi/10.48550/arxiv.2301.08813&rft_dat=%3Carxiv_GOX%3E2301_08813%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true