Date:
Author: Ben Woodcroft, Associate Professor of Microbial Informatics, Queensland University of Technology
Original article: https://theconversation.com/1-trillion-species-3-billion-years-how-we-used-ai-to-trace-the-evolution-of-bacteria-on-earth-253720

There are roughly a trillion species of microorganisms on Earth – the vast majority of which are bacteria.
Bacteria consist of a single cell. Unlike large animals, they have no bones and leave few clear traces in the geological record for palaeontologists to study many millions of years later.
This has made it very hard for scientists to establish a timeline of their early evolution. But with the help of machine learning, we have been able to fill in many of the details. Our new research, published today in Science, also reveals some bacteria developed the ability to use oxygen long before Earth became saturated with it roughly 2.4 billion years ago.
A monumental event in Earth’s history
About 4.5 billion years ago, the Moon formed. Violently. A Mars-sized object collided with Earth, turning its surface into molten rock. If life existed before this cataclysm, it was probably destroyed.
After that, the ancestors of all current living beings appeared: single-celled microbes. For the first 80% of life’s history, Earth was inhabited solely by these microbes.
Nothing in biology makes sense except in the light of evolution, as evolutionary biologist Theodosius Dobzhansky famously said in 1973. But how did the evolution of life proceed through the early history of Earth?
Comparing DNA sequences from the wonderful diversity of life we see today can tell us how different groups relate to each other. For instance, we humans are more closely related to mushrooms than we are to apple trees. Likewise, such comparisons can tell us how different groups of bacteria are related to each other.
But comparison of DNA sequences can only take us so far. DNA comparisons do not say when in Earth’s history evolutionary events took place. At one point in time, an organism produced two offspring. One of them gave rise to mushrooms, the other to humans (and lots of other species too). DNA alone cannot tell us how long ago that organism lived.
Geology tells us about another monumental event in the history of Earth, around 2.4 billion years ago. At that time, the atmosphere of the Earth changed dramatically. A group of bacteria called the cyanobacteria invented a trick that would alter the story of life forever: photosynthesis.
Harvesting energy from the sun powered their cells. But it also generated an inconvenient waste product, oxygen gas.
Over the course of millions of years, oxygen in the atmosphere slowly accumulated. Before this “Great Oxidation Event”, Earth’s atmosphere contained almost no free oxygen, so life was not ready for it. In fact, to uninitiated bacteria, oxygen is a poisonous gas, and so its release into the atmosphere probably caused a mass extinction. The surviving bacteria either evolved to use oxygen, or retreated into the recesses of the planet where it doesn’t penetrate.
The bacterial tree of life
The Great Oxidation Event is especially interesting for us not only because of its impact on the history of life, but also because it can be given a clear date. We know it happened around 2.4 billion years ago – and we also know most bacteria that adapted to oxygen must have lived after this event. We used this information to add dates to the bacterial tree of life.
We started by training an artificial intelligence (AI) model to predict whether a bacterium lives with oxygen or not from the genes it has. Many bacteria we see today use oxygen, such as cyanobacteria and others that live in the ocean. But many do not, such as the bacteria that live in our gut.
As far as machine learning tasks go, this one was quite straightforward. The chemical power of oxygen markedly changes a bacterium’s genome because a cell’s metabolism becomes organised around oxygen use, and so there are many clues in the data.
We then applied our machine learning models to predict which bacteria used oxygen in the past. This was possible because modern techniques allow us to estimate not only how the species we see today are related, but also which genes each ancestor carried in its genome.
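To make the two steps above concrete, here is a minimal sketch in Python. It is not the model used in the study: it swaps in a simple Bernoulli naive Bayes classifier, and the gene families, the training table and the ancestral gene profiles are all invented for illustration. The idea it shows is only the general one: learn from gene presence or absence in modern bacteria, then apply the same model to the estimated gene content of an ancestor.

```python
import math

# Hypothetical feature set: the choice of gene families is illustrative only.
GENES = ["cytochrome_oxidase", "catalase", "superoxide_dismutase",
         "nitrogenase", "hydrogenase"]

# Invented toy training data: (gene presence/absence, uses_oxygen).
TRAIN = [
    ((1, 1, 1, 0, 0), True),
    ((1, 1, 0, 0, 0), True),
    ((1, 0, 1, 1, 0), True),
    ((0, 0, 0, 1, 1), False),
    ((0, 0, 1, 1, 1), False),
    ((0, 1, 0, 0, 1), False),
]

def train(rows):
    """Fit a Bernoulli naive Bayes model with add-one smoothing."""
    model = {}
    for label in (True, False):
        members = [genes for genes, y in rows if y == label]
        prior = len(members) / len(rows)
        # P(gene present | label), smoothed so no probability is 0 or 1.
        probs = [(sum(col) + 1) / (len(members) + 2) for col in zip(*members)]
        model[label] = (prior, probs)
    return model

def prob_aerobic(model, genes):
    """Return the posterior probability that a gene profile is oxygen-using."""
    log_scores = {}
    for label, (prior, probs) in model.items():
        score = math.log(prior)
        for present, p in zip(genes, probs):
            score += math.log(p if present else 1 - p)
        log_scores[label] = score
    # Normalise in a numerically stable way.
    top = max(log_scores.values())
    weights = {k: math.exp(v - top) for k, v in log_scores.items()}
    return weights[True] / (weights[True] + weights[False])

model = train(TRAIN)

# Hypothetical reconstructed gene content for two ancestral nodes in the tree.
ANCESTORS = {
    "ancestor_A": (1, 1, 1, 0, 0),
    "ancestor_B": (0, 0, 0, 1, 1),
}
predictions = {name: prob_aerobic(model, genes)
               for name, genes in ANCESTORS.items()}

for name, p in predictions.items():
    print(f"{name}: probability of oxygen use = {p:.2f}")
```

In this toy setting, ancestor_A carries the oxygen-associated genes and is scored as a likely oxygen user, while ancestor_B is not. A real analysis would involve thousands of genomes and gene families, and uncertainty in the reconstructed ancestral genomes.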

A surprising twist
By using the Great Oxidation Event, a planet-wide geological event, effectively as a “fossil” calibration point, our approach produced a detailed timeline of bacterial evolution.
Combining results from geology, palaeontology, phylogenetics and machine learning, we were able to refine the timing of bacterial evolution significantly.
Our results also revealed a surprising twist: some bacterial lineages capable of using oxygen existed roughly 900 million years before the Great Oxidation Event. This suggests these bacteria evolved the ability to use oxygen even when atmospheric oxygen was scarce.
Remarkably, our findings indicated that cyanobacteria actually evolved the ability to use oxygen before they developed photosynthesis.
This framework not only reshapes our understanding of bacterial evolutionary history but also illustrates how life’s capabilities evolved in response to Earth’s changing environments.
Ben Woodcroft receives funding from the ARC.
Adrián A. Davín does not work for, consult, own shares in or receive funding from any company or organisation that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.