Abstract: Photoactive iridium complexes are of broad interest due to their applications ranging from lighting to photocatalysis. However, predicting the excited state properties of these complexes challenges ab initio methods such as time-dependent density functional theory (TDDFT) in both accuracy and computational cost, complicating high-throughput virtual screening (HTVS). We instead leverage low-cost machine learning (ML) models to predict the excited state properties of photoactive iridium complexes. We use experimental data on 1,380 iridium complexes to train and evaluate the ML models and find that the best-performing and most transferable models are those trained on electronic structure features from low-cost density functional tight binding calculations. Using these models, we predict the three excited state properties considered (mean emission energy of phosphorescence, excited state lifetime, and emission spectral integral) with accuracy competitive with or exceeding that of TDDFT. We conduct feature importance analysis to identify which iridium complex attributes govern excited state properties, and we validate these trends with explicit examples. As a demonstration of how our ML models can be used for HTVS and the acceleration of chemical discovery, we curate a set of novel hypothetical iridium complexes and identify promising ligands for the design of new phosphors.
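The feature importance analysis mentioned above can be illustrated with a generic permutation-importance sketch. This is not necessarily the procedure used in the work: the data are synthetic, the linear least-squares model stands in for the trained ML models, and the function name `permutation_importance` and the three placeholder features are assumptions made only for illustration.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in MSE when each feature column is shuffled."""
    rng = np.random.default_rng(seed)
    base = np.mean((predict(X) - y) ** 2)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            Xp = X.copy()
            # Break the link between feature j and the target.
            Xp[:, j] = rng.permutation(Xp[:, j])
            scores[j] += np.mean((predict(Xp) - y) ** 2) - base
    return scores / n_repeats

# Synthetic stand-in data (NOT from the paper): feature 0 dominates,
# feature 1 matters weakly, feature 2 is irrelevant.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Placeholder model: ordinary least squares.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
importance = permutation_importance(lambda A: A @ coef, X, y)
ranking = np.argsort(importance)[::-1]  # most important feature first
```

Under these assumptions, features whose shuffling degrades the prediction most are ranked first, which is the kind of ranking one would then check against explicit chemical examples.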
Abstract: Two outstanding challenges for machine learning (ML) accelerated chemical discovery are the synthesizability of candidate molecules or materials and the fidelity of the data used in ML model training. To address the first challenge, we construct a hypothetical design space of 32.5M transition metal complexes (TMCs), in which all of the constituent fragments (i.e., metals and ligands) and ligand symmetries are synthetically accessible. To address the second challenge, we search for consensus in predictions among 23 density functional approximations across multiple rungs of Jacob's ladder. To accelerate the screening of these 32.5M TMCs, we use efficient global optimization to sample candidate low-spin chromophores that simultaneously have low absorption energies and low static correlation. Despite the scarcity (i.e., $<$ 0.01\%) of potential chromophores in this large chemical space, we identify transition metal chromophores with high likelihood (i.e., $>$ 10\%) as the ML models improve during active learning. This represents a 1,000-fold acceleration in discovery, corresponding to discoveries in days instead of years. Analyses of candidate chromophores reveal a preference for Co(III) and large, strong-field ligands with more bond saturation. We compute the absorption spectra of promising chromophores on the Pareto front by time-dependent density functional theory calculations and verify that two thirds of them have the desired excited state properties. Although these complexes have never been experimentally explored, their constituent ligands have demonstrated interesting optical properties in the literature, exemplifying the effectiveness of our construction of a realistic TMC design space and of our active learning approach.
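Selecting candidates on the Pareto front of two objectives that are both minimized (here, absorption energy and static correlation) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `pareto_front` and the five candidate values are hypothetical and are not data from the study.

```python
import numpy as np

def pareto_front(points):
    """Indices of non-dominated rows when every column is minimized."""
    pts = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(pts):
        # q dominates p if q <= p in all objectives and q < p in at least one.
        dominated = np.any(np.all(pts <= p, axis=1) & np.any(pts < p, axis=1))
        if not dominated:
            keep.append(i)
    return keep

# Hypothetical candidates: column 0 is an absorption energy,
# column 1 is a static-correlation diagnostic (values are made up).
candidates = np.array([
    [1.8, 0.40],
    [2.0, 0.20],
    [2.5, 0.10],
    [2.2, 0.30],  # dominated by [2.0, 0.20]
    [3.0, 0.35],  # dominated by [2.0, 0.20]
])
front = pareto_front(candidates)
```

The pairwise comparison is quadratic in the number of candidates, which is adequate for a shortlist of sampled complexes but would need a faster algorithm if applied to a full design space.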