SPLTRAK Abstract Submission
Bandgap Model using Symbolic Regression for Environmentally Compatible Lead-Free Inorganic Double Perovskites  
Ahmer A.B. Baloch, Omar Albadwawi, Badreyya AlShehhi, Vivian Alberts
Research and Development Center, Dubai Electricity and Water Authority, Dubai, United Arab Emirates

Data-driven modeling has become an essential part of perovskite research, alongside theory and experiment. Materials informatics has emerged as a viable approach for exploring and formulating novel perovskite compounds using descriptors. Herein, we develop a method that combines feature augmentation with symbolic regression to rapidly estimate the bandgap of, and screen, non-toxic lead-free inorganic double perovskites (A2BB'X6) using machine learning. Predictive models were created by identifying physico-chemically relevant descriptors from an extensive pool of augmented features. Starting from primary atomic and molecular features, a high-dimensional descriptor space (containing ≈ 3×10^5 features) was constructed using mathematical operators. Increasing the descriptor complexity from 1-D to 5-D raised the correlation coefficient from 81.6% to 92.4%. These accurate and interpretable models can then be employed to screen lead-free perovskites for appropriate band gaps and stability.
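
The workflow described above (augment primary features with mathematical operators, then pick a low-dimensional descriptor that correlates with the target) can be illustrated with a minimal sketch. This is not the authors' implementation: the operator set, the greedy selection rule, the feature names p0 to p3, and the synthetic target are all assumptions made for illustration, and no real perovskite data is used.

```python
import itertools
import numpy as np


def augment(X, names):
    """Build candidate features by applying simple operators to primary features.

    The operator set here (square, sqrt|x|, +, -, *) is an illustrative assumption.
    """
    feats, labels = [], []
    for j, n in enumerate(names):
        x = X[:, j]
        feats += [x, x ** 2, np.sqrt(np.abs(x))]
        labels += [n, f"({n})^2", f"sqrt|{n}|"]
    for (i, a), (j, b) in itertools.combinations(enumerate(names), 2):
        xi, xj = X[:, i], X[:, j]
        feats += [xi + xj, xi - xj, xi * xj]
        labels += [f"({a}+{b})", f"({a}-{b})", f"({a}*{b})"]
    return np.column_stack(feats), labels


def fit_r(F, y, idx):
    """Least-squares fit on the selected columns plus an intercept.

    Returns the Pearson correlation between fitted and target values.
    """
    A = np.column_stack([F[:, idx], np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.corrcoef(A @ coef, y)[0, 1]


def greedy_descriptor(F, y, max_dim):
    """Grow the descriptor one feature at a time (greedy forward selection).

    This is a simple stand-in for a sparsifying selection step, chosen for brevity.
    """
    chosen, history = [], []
    for _ in range(max_dim):
        remaining = [j for j in range(F.shape[1]) if j not in chosen]
        best = max(remaining, key=lambda j: fit_r(F, y, chosen + [j]))
        chosen.append(best)
        history.append(fit_r(F, y, chosen))
    return chosen, history


# Synthetic stand-in data: four hypothetical primary features and a made-up target.
rng = np.random.default_rng(0)
names = ["p0", "p1", "p2", "p3"]
X = rng.uniform(0.5, 2.0, size=(200, 4))
y = 1.5 * X[:, 0] * X[:, 1] - 0.8 * X[:, 2] ** 2 + 0.1 * rng.normal(size=200)

F, labels = augment(X, names)
chosen, history = greedy_descriptor(F, y, max_dim=3)
for dim, (j, r) in enumerate(zip(chosen, history), start=1):
    print(f"{dim}-D descriptor adds {labels[j]}, r = {r:.3f}")
```

Because every added feature enters a least-squares fit with an intercept, the in-sample correlation cannot decrease as the descriptor dimension grows, which mirrors the 1-D to 5-D trend reported above; a held-out set would be needed to judge whether the extra dimensions generalize.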