Physical and theoretical chemists appear to have a remarkable ability to defy the exponential growth of available computer power by increasing the complexity of their models at an even faster pace.
On the bright side, this process is making predictive modelling of materials and molecules a reality. At the same time, the sheer amount of data produced by simulations, as well as their intrinsic complexity, makes extracting physical insight from the models a challenge of its own.
In this talk I will discuss how computers can help us face this challenge, leveraging machine-learning algorithms that are at the core of modern data science. The crucial ingredient in applying these algorithms to chemistry and materials lies in providing a robust mathematical translation of the notion of chemical similarity. I will focus in particular on how one can introduce an effective metric to compare structures. I will then demonstrate how such an (al)chemical similarity measure can be used as the basis for building intuitive maps of chemical landscapes, as well as for predicting physicochemical properties without relying on expensive quantum chemical calculations.
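
To make the idea of predicting properties from a similarity measure concrete, here is a minimal sketch in Python. It is not the similarity measure discussed in the talk. It assumes that each structure has already been reduced to a fixed-length descriptor vector (a hypothetical input), uses a Gaussian kernel on the Euclidean distance between descriptors as a stand-in similarity, and uses kernel ridge regression as one possible way to turn similarities into predictions. All function names and parameters (`sigma`, `reg`) are illustrative choices.

```python
import numpy as np

def similarity_kernel(X, Y, sigma=1.0):
    """Gaussian similarity between rows of X and rows of Y.

    Rows are hypothetical descriptor vectors, one per structure.
    Identical descriptors give 1; distant ones decay towards 0.
    """
    diff = X[:, None, :] - Y[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def fit_krr(X_train, y_train, sigma=1.0, reg=1e-6):
    """Kernel ridge regression: solve (K + reg * I) w = y for the weights."""
    K = similarity_kernel(X_train, X_train, sigma)
    return np.linalg.solve(K + reg * np.eye(len(X_train)), y_train)

def predict_krr(X_new, X_train, weights, sigma=1.0):
    """Predict a property as a similarity-weighted sum over training structures."""
    return similarity_kernel(X_new, X_train, sigma) @ weights

# Toy data only: random "descriptors" and a smooth synthetic "property".
rng = np.random.default_rng(0)
X_train = rng.normal(size=(40, 3))
y_train = np.sin(X_train).sum(axis=1)

K_train = similarity_kernel(X_train, X_train)
weights = fit_krr(X_train, y_train)
y_new = predict_krr(rng.normal(size=(5, 3)), X_train, weights)
```

In this sketch the same matrix `K_train` that drives the regression could also be passed to a dimensionality-reduction method to draw a two-dimensional map of the data set, which is the sense in which one similarity measure can serve both purposes.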