First Fully Complete Human Genome Has Been Published After 20 Years
The first fully complete human genome with no gaps is now
available to view for scientists and the public, marking a huge moment for
human genetics. First announced in a preprint in June 2021, the work has
now been published as six papers in the journal Science.
They describe the painstaking work that goes into sequencing a genome of more than 6 billion
base pairs, with 200 million added in this new research. The new genome
adds 99 genes likely to code for proteins and 2,000 candidate
genes that were previously unknown.
Many will be asking:
“Wait, didn’t we already sequence the human genome?” In part, yes –
in 2000, the Human Genome Sequencing Consortium published their first
drafts of the human genome, results that subsequently paved the way for almost
every facet of human genetics available today.
The most recent draft
of the human genome has been used as a reference since 2013. But limited
by the sequencing techniques of the time, these drafts left out the most complex
regions of our DNA, which make up around 8 percent of the total
genome. This is because these sequences are highly repetitive and contain
many duplicated regions – attempting to put them together in the
right places is like trying to complete a jigsaw puzzle where all the pieces
are the same shape and have no image on the front. With long gaps remaining and
large, repeating sequences underrepresented, this genetic
material has been missing from the reference for the past 20 years. Scientists had to come up
with more accurate methods of sequencing to illuminate the darkest corners of
the genome.
“These parts of the human genome that we haven’t been able to
study for 20-plus years are important to our understanding of how the genome
works, genetic diseases, and human diversity and evolution,” said Karen Miga,
assistant professor of biomolecular engineering at UC Santa Cruz, in a statement.
Like the original drafts from the Human Genome Sequencing Consortium, the new
reference genome (called T2T-CHM13) was produced by a large collaboration: the Telomere-to-Telomere (T2T)
Consortium, a group of researchers dedicated to finally mapping each chromosome
from one telomere to the other. T2T-CHM13 will now be available on the UCSC Genome Browser for
everyone to enjoy, complementing the standard human reference genome,
GRCh38.
In case you don't believe it, this is the HGSC reference genome.
Each number is a chromosome, and the font is size 4.5, which is almost
illegible. Image Credit: widdowquinn/Flickr CC BY-NC-SA 2.0
The new reference genome was created using two modern
sequencing techniques, Oxford Nanopore ultra-long read sequencing and PacBio HiFi sequencing, which
massively increase the length of DNA that can be read while also improving
accuracy. With these, the researchers could sequence stretches of DNA previously
unreadable by more rudimentary techniques, while also correcting some structural
errors that existed in previous reference genomes.
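The jigsaw problem posed by repetitive DNA, and the way longer reads get around it, can be illustrated with a toy example. The sketch below is not the consortium's assembly method: the sequence, the read lengths, and the function name `ambiguous_fraction` are all made up for illustration. It simply counts how many possible reads of a given length match more than one place in a sequence that contains a repeated block.

```python
from collections import Counter


def ambiguous_fraction(genome: str, read_length: int) -> float:
    """Return the fraction of possible reads that occur at more than one position."""
    # Every substring of the given length stands in for one error-free read.
    reads = [
        genome[start:start + read_length]
        for start in range(len(genome) - read_length + 1)
    ]
    counts = Counter(reads)
    # A read is ambiguous if the identical sequence appears elsewhere too.
    ambiguous = sum(1 for read in reads if counts[read] > 1)
    return ambiguous / len(reads)


# Hypothetical toy "genome": two unique flanks around a block of six identical repeats.
left_flank = "ACGTTGCA"
repeat_block = "GATTACA" * 6
right_flank = "TTGGCCAA"
toy_genome = left_flank + repeat_block + right_flank

short_frac = ambiguous_fraction(toy_genome, 5)   # reads shorter than the repeat block
long_frac = ambiguous_fraction(toy_genome, 50)   # reads longer than the repeat block

print(f"5-base reads with more than one possible placement: {short_frac:.0%}")
print(f"50-base reads with more than one possible placement: {long_frac:.0%}")
```

In this toy sequence, most 5-base reads match several positions inside the repeat, while every 50-base read is longer than the repeat block, reaches into a unique flanking region, and so fits in exactly one place.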
Looking to the future,
the consortium hopes to add even more reference genomes as part of the Human
Pangenome Reference Consortium to improve diversity in human genetics,
something sorely lacking at present.
“We’re adding a second complete genome, and then there will be
more,” said David Haussler, director of the UC Santa Cruz Genomics Institute,
in a statement.
“The next phase is to
think about the reference for humanity’s genome as not being a single genome
sequence. This is a profound transition, the harbinger of a new era in which we
will eventually capture human diversity in an unbiased way.”