The name Sherwood Rowland is unfamiliar to most of us. Yet we owe a debt of gratitude to the American atmospheric chemist: in the early 1970s he was the first to realize that a single chlorine atom released into the air could destroy up to 100,000 ozone molecules.

Fifty years later, Rowland’s colleagues face similar challenges. The move to a new energy model, free of the shackles of fossil fuels, cannot be delayed. But making it a reality requires scientific leaps: new materials, new molecules and new mechanical devices must be created in order to turn the page on energy.

In this race, organic semiconductors are considered one of the most promising technologies for building far more efficient solar cells. This is where the difficulties begin. Such applications require the discovery of improved organic molecules, yet the number of possible small organic molecules is estimated at around 10^33. The number is daunting, which is why researchers are increasingly turning to machine learning to find out which molecules in this vast space are really useful.

Scientists from the Fritz Haber Institute in Berlin and the Technical University of Munich have taken a different path. In a recent paper in Nature Communications, they describe a solution to this problem through so-called “active learning”.

Instead of learning from existing data, the machine learning algorithm decides for itself which data it really needs in order to learn and “discards” the rest. This is a critical advantage, as it dramatically reduces the amount of data the algorithm has to process.

The scientists first ran simulations on a few smaller molecules to obtain data on their electrical conductivity, a key property when assessing a potential solar cell material. Based on this data, the algorithm decides whether small modifications to those molecules could yield useful properties or not. Either way, it automatically requests new simulations, improves itself with the new data, examines new molecules and repeats the process.
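The loop described above can be illustrated with a toy sketch. It is not the researchers' actual method: the two-number molecule descriptor, the `simulate` scoring function and the nearest-neighbour predictor are all hypothetical stand-ins chosen only to show the cycle of requesting a simulation, learning from it and moving on to new candidates.

```python
def simulate(mol):
    """Hypothetical stand-in for an expensive conductivity simulation.

    A molecule is a made-up pair of integer descriptors; the toy score
    peaks at (7, -3).
    """
    x, y = mol
    return -((x - 7) ** 2 + (y + 3) ** 2)


def neighbours(mol):
    """Small modifications: change one descriptor by a single step."""
    x, y = mol
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]


def predict(mol, data):
    """Toy surrogate: best score among already simulated neighbours."""
    return max(data[n] for n in neighbours(mol) if n in data)


def active_learning(seeds, rounds=40):
    # Start from simulations of a few seed molecules.
    data = {m: simulate(m) for m in seeds}
    for _ in range(rounds):
        # Candidates are unsimulated one-step modifications of known molecules.
        candidates = {n for m in data for n in neighbours(m)} - data.keys()
        # The algorithm itself chooses which simulation to request next.
        best = max(candidates, key=lambda m: predict(m, data))
        data[best] = simulate(best)  # new data improves later predictions
    return data


if __name__ == "__main__":
    results = active_learning([(0, 0), (2, 1)])
    top = max(results, key=results.get)
    print(f"simulated {len(results)} molecules, best candidate: {top}")
```

In this toy setting the loop simulates only a few dozen candidates out of an unbounded grid, which is the point of the approach: the data set stays small because the algorithm asks only for the simulations it expects to be informative.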

In their paper, the scientists report that new and promising molecules can be identified efficiently in this way, while the algorithm continues to explore the vast molecular space, even now, as these lines are read. Each week it proposes new molecules that could lead to the next generation of solar cells, and it keeps becoming more effective as it goes.