
Massive Genomic Study Maps 167,000 New Variants
Researchers have significantly expanded the catalogue of known human genetic variation. The resulting datasets, shared in two back-to-back publications in the journal Nature, constitute what may be the most complete overview of the human genome to date.
The first paper, jointly led by the European Molecular Biology Laboratory (EMBL), Heinrich Heine University Düsseldorf (HHU) and the Centre for Genomic Regulation (CRG) in Barcelona, analysed the genomes of 1,019 people drawn from 26 populations on five continents.
The researchers specifically looked for structural variants in the human genome. These are large chunks of DNA that have been deleted, duplicated, inserted, inverted or shuffled. Differences in structural variants between individuals can mean changes to thousands of DNA letters at once, often knocking out genes and driving many rare diseases and cancers.
The team found and categorised more than 167,000 structural variants across the 1,019 individuals, doubling the known amount of structural variation in the human pangenome, a reference that stitches together DNA from many people instead of relying on a single genome. Each person carried a median of 7.5 million letters’ worth of structural changes, underscoring how much genome editing nature performs on its own.
“We found a treasure trove of hidden genetic variation in these populations, many of which were underrepresented in earlier reference sets. For example, 50.9% of insertions and 14.5% of deletions we found have not been reported in previous variation catalogues. It’s an important step to map blind spots in the human genome and reduce the bias that has long favoured genomes of European descent and paves the way for therapies and tests that work just as well for people everywhere,” says Dr. Bernardo Rodríguez-Martín, co-corresponding author of the study.
Around three in five (59%) of the variants uncovered occurred in fewer than one per cent of individuals, a level of rarity crucial for diagnosing genetic disease because it can help filter out harmless variations more effectively. In tests, the new reference set reduces the list of suspect mutations from tens of thousands to just a few hundred, accelerating the path to the diagnosis of rare genetic syndromes and other types of diseases like cancer.
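The filtering idea described above can be illustrated with a minimal sketch. It assumes a simplified setting in which each candidate variant is annotated with the number of reference-panel individuals who carry it. The `Variant` class, the `filter_rare` function and the example identifiers are hypothetical and are not taken from the study or from any of its tools. Only the panel size of 1,019 and the one per cent threshold come from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical record: a candidate variant seen in a patient."""
    variant_id: str
    carriers: int  # reference-panel individuals carrying this variant


def filter_rare(candidates, panel_size, max_fraction=0.01):
    """Keep candidates carried by fewer than max_fraction of the panel.

    Variants that are common in a diverse reference panel are unlikely
    to explain a rare disease, so they are dropped from the suspect list.
    """
    return [v for v in candidates if v.carriers / panel_size < max_fraction]


PANEL_SIZE = 1019  # cohort size reported in the article

# Illustrative, made-up candidates
candidates = [
    Variant("sv_common", 400),  # carried by about 39% of the panel: dropped
    Variant("sv_rare", 3),      # carried by about 0.3% of the panel: kept
    Variant("sv_unseen", 0),    # absent from the panel: kept
]

rare = filter_rare(candidates, PANEL_SIZE)
print([v.variant_id for v in rare])
```

A larger and more diverse panel makes this filter more selective, because more harmless variants are seen often enough to be recognised as common and discarded.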
Bernardo Rodríguez-Martín began working on the project in Jan Korbel’s lab at EMBL and completed it after moving to the CRG to start his own group. He developed SVAN, a software tool that categorises every DNA change, such as “extra piece copied” or “chunk deleted”, helping the team sift through the genetic data to discern new patterns.
SVAN revealed that more than half of the newly mapped diversity in the human genome was found to lie in highly repetitive stretches of DNA, parts of the genome once dismissed as junk or too hard to study. “Repetitive elements represent a rich and previously overlooked reservoir of genetic diversity. They are key protagonists in human diversity, disease and evolution,” says Emiliano Sotelo-Fonseca, PhD student at the CRG and co-author of the first study.
These repetitive segments of DNA include mobile elements, also known as ‘jumping genes’ due to their ability to copy and paste themselves around the genome. The researchers found that, among the thousands of mobile elements in the human genome, most germline mutagenesis derives from the activity of a few dozen highly active elements.
For example, one particularly hyperactive LINE-1 element was found to hijack a powerful regulatory switch to make far more copies of itself than usual, scattering extra genetic material across many people’s DNA. The researchers saw a similar trick with another class of jumping genes called SVAs.
“Our work shows how mobile elements boost their activity by hijacking our genome regulation buttons, an underappreciated strategy that could help drive diseases like cancer and which merits further research,” says Dr. Rodríguez-Martín.
The second paper, jointly led by the European Molecular Biology Laboratory (EMBL) and Heinrich Heine University Düsseldorf (HHU), used a much smaller sample set of only 65 individuals but combined several powerful sequencing methods to piece together human genomes in unprecedented detail.
The approach helped researchers decode the hardest-to-read stretches, including centromeres. The near-complete, gap-free assemblies of every chromosome for these individuals helped researchers detect large genetic variants within those regions that the first paper and other studies had missed.
The findings show that combining the two approaches, many genomes sequenced at modest depth as in the first paper and a few genomes sequenced in high detail as in the second, is the fastest path to a complete, inclusive map of human genetic diversity.
“One study uses less sequencing power, but a much larger cohort. The other uses a smaller cohort, but much more sequencing power per sample. This led to complementary conclusions,” says Dr. Jan Korbel, Group Leader and Interim Head at EMBL Heidelberg, and co-senior author of both studies.
Both papers re-sequenced individuals from the 1000 Genomes Project, the landmark effort that mapped global genetic diversity in 2015. The project relied on “short-read” sequencing technology, which could only read very small bits of DNA at a time. These reads were too short to reveal big chunks of DNA that are missing or copied, long stretches that flip direction or repeats that look almost identical in many places.
The advances made by the new studies were possible thanks to “long-read” sequencing, a recent technology that reads thousands to tens of thousands of DNA letters in one go, helping researchers find large amounts of hidden variation undetectable with previous methods.
The two papers also make important inroads towards the construction of a human pangenome reference. For the last twenty years, scientists have used one person’s DNA sequence as the “standard” human genome. A pangenome would be better suited for personalised medicine, reflecting global diversity.
By developing innovative algorithms that can analyse 1,019 diverse genomes for breadth and 65 ultra-complete genomes for depth, the researchers provide a roadmap that makes assembling a true human pangenome practical rather than aspirational, particularly as long-read sequencing costs fall.
“Through these studies, we have created a comprehensive and medically-relevant resource that can now be used by researchers everywhere to better understand the origins of human genomic variation, and see how it is affected by a plethora of different factors,” says Tobias Marschall, Professor at Heinrich Heine University Düsseldorf and co-senior author of both studies. “This is a great example of collaborative research opening up new vistas in genomic science and a step towards a more complete human pangenome.”
Reference: Schloissnig S, Pani S, Ebler J, et al. Structural variation in 1,019 diverse humans based on long-read sequencing. Nature. 2025. doi: 10.1038/s41586-025-09290-7