Robust Topological Descriptors for Machine Learning Prediction of Guest Adsorption in Nanoporous Materials
In recent years, machine learning (ML) has emerged as a faster alternative to experimental and simulation-based investigations for predicting material properties. Standard ML approaches rely on domain knowledge to design their feature inputs, so each property predictor requires its own set of tailored structural features. This has two common drawbacks: the number of implemented features is small, and the features transfer poorly between predictors. The latter has been observed empirically for guest uptake predictors for nanoporous materials, where local and global porosity features become the dominant descriptors at low and high pressures, respectively. Here, we provide a more holistic feature representation for material structures, using tools from topological data analysis and persistent homology to describe the geometry and topology of nanoporous materials at various scales. We demonstrate an application of these topology-based feature representations to predicting methane uptake in zeolite structures over the range 1-200 bar. These predictions show a root-mean-square deviation decrease of up to 50 pressures in comparison to a model based on commonly used local and global features. Similarly, the topology-based model shows an increase of 0.2-0.3 in R² score in comparison to the commonly used porosity descriptors. Notably, unlike the standard porosity features, the topology-based features remain accurate across multiple pressures. Furthermore, we report feature importance in terms of the different topological features, elucidating which channel and pore sizes correlate best with adsorption properties. Finally, we demonstrate that ML models relying on a combination of topological and commonly employed descriptors provide even better guest uptake regressors.
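To make the idea of a persistence-based descriptor concrete, the following is a minimal sketch and not the pipeline used in this work. It assumes a toy 2D point cloud in place of a zeolite structure and tracks only connected components (0-dimensional homology) as a distance threshold grows; channels and pores would instead appear as higher-dimensional classes, which typically call for a dedicated TDA library. The names `h0_persistence`, `summary_features`, and `toy_atoms` are hypothetical and introduced purely for illustration.

```python
import itertools
import math


def h0_persistence(points):
    """Return (birth, death) pairs for connected components of a point cloud.

    Every point is born at scale 0. A component dies at the length of the
    edge that merges it into another component (single-linkage clustering
    implemented with a union-find structure).
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, processed from shortest to longest.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )

    pairs = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j       # merge the two components
            pairs.append((0.0, length))   # one component dies at this scale
    pairs.append((0.0, math.inf))         # the last component never dies
    return pairs


def summary_features(pairs):
    """Turn finite persistence pairs into a fixed-length vector (min, mean, max)."""
    lifetimes = [death - birth for birth, death in pairs if math.isfinite(death)]
    if not lifetimes:
        return [0.0, 0.0, 0.0]
    return [min(lifetimes), sum(lifetimes) / len(lifetimes), max(lifetimes)]


# Hypothetical toy "structure": three nearby points and one distant point.
toy_atoms = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
pairs = h0_persistence(toy_atoms)
features = summary_features(pairs)
print(pairs)
print(features)
```

A fixed-length vector of this kind could then be passed to any standard regressor alongside conventional porosity descriptors; the choice of summary statistics here is arbitrary and serves only to show how a persistence diagram becomes an ML feature.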