(CNN)In 2003, the Human Genome Project made history when it sequenced 92% of the human genome. But for nearly two decades since, scientists have struggled to decipher the remaining 8%. Now, a team of nearly 100 scientists from the Telomere-to-Telomere (T2T) Consortium has unveiled the complete human genome -- the first time it's been sequenced in its entirety, the researchers say.
"Having this complete information will allow us to better understand how we form as an individual organism and how we vary not just between other humans but other species," Evan Eichler, a Howard Hughes Medical Institute investigator at the University of Washington and the research leader, said Thursday.
The new research introduces 400 million letters to the previously sequenced DNA -- an entire chromosome's worth. The full genome will allow scientists to analyze how DNA differs between people and whether these genetic variations play a role in disease.
The research, published in the journal Science on Thursday, had previously been available as a preprint, allowing other teams to use the sequence in their own studies.
Until now, it was unclear what the genes in these previously unsequenced regions coded for.
"It turns out that these genes are incredibly important for adaptation," Eichler said. "They contain immune response genes that help us to adapt and survive infections and plagues and viruses. They contain genes that are ... very important in terms of predicting drug response."
Eichler also said that some of the recently uncovered genes are even responsible for making human brains larger than those of other primates, providing insight into what makes humans unique.
This remaining 8% of the human genome had stumped scientists for years because of its complexity. For one thing, it contains highly repetitive stretches of DNA, which made it challenging to string the DNA together in the correct order using previous sequencing methods.
The researchers relied on two DNA sequencing technologies that emerged over the past decade to bring this project to fruition: the Oxford Nanopore DNA sequencing method, which can sequence up to 1 million DNA letters at once but with some mistakes, and the PacBio HiFi DNA sequencing method, which can read 20,000 letters with 99.9% accuracy.
Sequencing DNA is like solving a jigsaw puzzle, Eichler said. Scientists must first break the DNA into smaller parts and then use sequencing machines to piece it together in the correct order. Previous sequencing tools could sequence only small sections of DNA at once.
With a 10,000-piece puzzle, it's hard to correctly arrange small pieces that look alike, just as it is hard to order short sections of repetitive DNA. But with a 500-piece puzzle, it's much easier to arrange larger pieces -- or, in this case, longer segments of DNA.
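The puzzle analogy can be made concrete with a toy example. The sketch below is only an illustration, not the consortium's assembly method: the two "genomes" and the repeat unit are invented, and reads are idealized as error-free substrings. It shows that two sequences differing only in how many times a repeat occurs produce exactly the same set of short reads, while reads long enough to span the repeat tell them apart.

```python
def reads(genome, length):
    """Return every substring of the given length (an idealized, error-free read set)."""
    return {genome[i:i + length] for i in range(len(genome) - length + 1)}

# Hypothetical sequences: identical flanks, different numbers of repeat copies.
UNIT = "ACGT"
genome_a = "TTGCA" + UNIT * 5 + "GGATC"   # 5 copies of the repeat
genome_b = "TTGCA" + UNIT * 8 + "GGATC"   # 8 copies of the repeat

# Short reads (6 letters) look the same for both genomes,
# so the number of repeat copies cannot be recovered from them.
short_same = reads(genome_a, 6) == reads(genome_b, 6)

# Long reads (30 letters) span the whole repeat in genome_a,
# so the two genomes no longer produce the same reads.
long_same = reads(genome_a, 30) == reads(genome_b, 30)

print("short reads identical:", short_same)
print("long reads identical:", long_same)
```

In this toy case the short reads are identical for both sequences and the long reads are not, which is the same reason longer reads made the repetitive regions tractable.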
A second challenge was finding cells that contained only one genome.
Standard human cells contain two sets of DNA, a maternal copy and a paternal copy, but this team used DNA from a group of cells called a complete hydatidiform mole, which contains two identical copies of the paternal set of DNA. A complete hydatidiform mole is a rare complication of pregnancy caused by the abnormal growth of cells that originate from the placenta. This approach simplifies the genome so that scientists need to sequence only one set of DNA rather than two.
Because the cells the research team used carried a duplicated set of paternal DNA that did not include a Y chromosome, the scientists were initially unable to sequence the Y chromosome. According to lead study author Adam Phillippy, the team has since sequenced the Y chromosome using a different set of cells.
A complete set of 24 sequenced chromosomes is available on the University of California, Santa Cruz, Genome Browser.