Image and data mining in reticular chemistry powered by GPT-4V
The integration of artificial intelligence into scientific research opens new avenues with the advent of GPT-4V, a large language model equipped with vision capabilities. In this study, we demonstrate that GPT-4V, accessible through the ChatGPT web user interface or an API, offers promising possibil...
Gespeichert in:
Veröffentlicht in: | Digital discovery 2024-03, Vol.3 (3), p.491-51 |
---|---|
Hauptverfasser: | , , , , , , , |
Format: | Artikel |
Sprache: | eng |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | The integration of artificial intelligence into scientific research opens new avenues with the advent of GPT-4V, a large language model equipped with vision capabilities. In this study, we demonstrate that GPT-4V, accessible through the ChatGPT web user interface or an API, offers promising possibilities in navigating and mining complex data for metal-organic frameworks (MOFs) especially from graphical sources (
e.g.
sorption isotherms, powder X-ray diffraction patterns, thermogravimetric analysis graphs,
etc.
). Our approach involved an automated process of converting 346 scholarly articles into 6240 images, which represents a benchmark dataset in this task, followed by deploying GPT-4V to categorize and analyze these images using natural language prompts, which can be written by chemists or materials scientists with minimal prior coding knowledge. This methodology enabled GPT-4V to accurately identify and interpret key plots integral to MOF characterization, such as nitrogen isotherms, PXRD patterns, and TGA curves, among others, with accuracy and recall above 93%. The model's proficiency in extracting critical information from these plots not only underscores its capability in data mining but also highlights its potential to aid in the digitalization of experimental data and the creation of datasets for reticular chemistry. In addition, the trends and values of nitrogen isotherm data from the selected literature allowed for a comparison between theoretical and experimental porosity values for over 200 compounds, highlighting certain discrepancies and underscoring the importance of integrating computational and experimental data. This work highlights the potential of AI in accelerating scientific discovery by bridging the gap between computational tools and experimental research.
The integration of artificial intelligence into scientific research opens new avenues with the advent of GPT-4V, a large language model equipped with vision capabilities. |
---|---|
ISSN: | 2635-098X 2635-098X |
DOI: | 10.1039/d3dd00239j |