Prediction of diffusion coefficients in aqueous systems by machine learning models
Published in: | Journal of molecular liquids 2024-07, Vol.405, p.125009, Article 125009 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Summary: | Currently there are no accurate models for predicting diffusion coefficients at infinite dilution in aqueous systems. Models that work well for polar solvents often perform worse for water. At the same time, experimental tracer diffusion coefficients are scarce and can be impractical to measure when this important transport property is needed. In this work, machine learning models were developed to predict the tracer diffusion coefficient of any solute in water at atmospheric pressure. Several approaches to constructing the model were tested, using different types of input parameters: pure component properties and theoretical molecular descriptors, such as atom counts, structural fragments and fingerprints, computed using different sources. A database of 126 systems (1192 data points) was used for training, and the best model showed a global average absolute relative deviation (AARD) of 3.92%, with a maximum deviation of 24.27% on the test set. This model takes as inputs the temperature and 195 molecular descriptors computed with the RDKit cheminformatics package; these can be calculated automatically from a molecular identifier, which makes the model very simple to use. In comparison, the well-known Wilke-Chang equation gave an AARD of 13.03% on the same test set, demonstrating the improved accuracy of the proposed solution. The models developed in this work are provided at github.com/EgiChem/ml-D12-water-app.
• Machine learning models were developed to predict the binary diffusion coefficients of solutes in water.
• Models were trained on a database of experimental data of 126 systems (1192 data points).
• Different types of molecular descriptors were tested to construct the models.
• All new models performed significantly better than the classic Wilke-Chang equation.
• The best machine learning model presented an average deviation for the test set of 3.92%. |
---|---|
ISSN: | 0167-7322 1873-3166 |
DOI: | 10.1016/j.molliq.2024.125009 |
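
The summary compares models by AARD and uses the Wilke-Chang equation as a baseline, but this record gives neither formula. The minimal Python sketch below uses the standard textbook definitions and is not the authors' implementation. The function names `aard` and `wilke_chang_water` are hypothetical. The association factor of 2.6 for water is the value usually quoted for the Wilke-Chang correlation. The viscosity, solute molar volume, and diffusivity values in the usage lines are purely illustrative assumptions, not data from the article.

```python
import math


def aard(predicted, experimental):
    """Average absolute relative deviation (AARD), in percent.

    AARD = (100 / N) * sum(|D_pred - D_exp| / D_exp)
    """
    if len(predicted) != len(experimental) or len(experimental) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - e) / abs(e) for p, e in zip(predicted, experimental))
    return 100.0 * total / len(experimental)


def wilke_chang_water(temperature_k, solute_molar_volume_cm3_mol, water_viscosity_cp):
    """Wilke-Chang estimate of the tracer diffusion coefficient in water (cm^2/s).

    D12 = 7.4e-8 * sqrt(phi * M_solvent) * T / (mu_solvent * V_solute**0.6)

    temperature_k               -- temperature in kelvin
    solute_molar_volume_cm3_mol -- solute molar volume at its normal boiling point
    water_viscosity_cp          -- viscosity of water at temperature_k, in centipoise
    """
    phi = 2.6          # association factor commonly used for water
    m_water = 18.015   # molar mass of water, g/mol
    return (7.4e-8 * math.sqrt(phi * m_water) * temperature_k
            / (water_viscosity_cp * solute_molar_volume_cm3_mol ** 0.6))


# Illustrative usage with made-up inputs (assumptions, not values from the article).
d12_estimate = wilke_chang_water(
    temperature_k=298.15,
    solute_molar_volume_cm3_mol=100.0,  # hypothetical solute
    water_viscosity_cp=0.89,            # approximate viscosity of water near 25 °C
)
print(f"Wilke-Chang estimate: {d12_estimate:.3e} cm^2/s")

# AARD of two hypothetical predictions against two hypothetical measurements.
print(f"AARD: {aard([1.1e-5, 1.8e-5], [1.0e-5, 2.0e-5]):.2f} %")
```

Evaluating a machine learning model and the Wilke-Chang baseline on the same test set with one `aard` function keeps the two percentages directly comparable, which is how the summary reports them.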