MIT Study: SARS-CoV-2 Integrates into the Human Genome
17. May 2021
In the early months of the COVID-19 pandemic, healthcare workers analyzing test results began noticing something strange: patients who had already recovered from COVID-19 would sometimes inexplicably test positive on a PCR test weeks or even months later.
Although people can catch COVID-19 for a second time, this did not appear to be the case for these patients; no live viruses were isolated from their samples, and some studies found these false positive results even while holding participants in quarantine. Also, RNAs generally have a short life—most only stick around for a few minutes—so it was unlikely for positive tests to be the result of residual RNAs.
Now, a new paper from the lab of Whitehead Institute Member and MIT professor of biology Rudolf Jaenisch may offer an answer to why some patients continue to test positive after recovery from COVID-19. In the paper, published online in PNAS, Jaenisch and collaborators show that genetic sequences from the RNA virus SARS-CoV-2 can integrate into the genome of the host cell through a process called reverse transcription. These sections of the genome can then be “read” into RNAs, which could potentially be picked up by a PCR test.
SARS-CoV-2 is not the only virus that integrates into the human genome. Around 8 percent of our DNA consists of the remnants of ancient viruses. Some viruses, called retroviruses, rely on integration into human DNA in order to replicate themselves.
“SARS-CoV-2 is not a retrovirus, which means it doesn't need reverse transcription for its replication,” says Whitehead Institute postdoc and first author Liguo Zhang. “However, non-retroviral RNA virus sequences have been detected in the genomes of many vertebrate species, including humans.”
With this in mind, Zhang and Jaenisch began to design experiments to test whether this viral integration could be happening with the novel coronavirus. With the help of Jaenisch lab postdoc Alexsia Richards, the researchers infected human cells with coronavirus in the lab and then sequenced the DNA from infected cells two days later to see whether it contained traces of the virus’ genetic material.
To ensure that their results could be confirmed with different methodology, they used three different DNA sequencing techniques. In all samples, they found fragments of viral genetic material (though the researchers emphasize that none of the inserted fragments was enough to recreate a live virus).
Zhang, Jaenisch and colleagues then examined the DNA flanking the small viral sequences for clues to the mechanism by which they got there. In these surrounding sequences, the researchers found the hallmark of a genetic feature called a retrotransposon.
Sometimes called “jumping genes,” transposons are sections of DNA that can move from one region of the genome to another. They are often activated to “jump” in conditions of high stress or during cancer or aging, and are powerful agents of genetic change.
One common transposon in the human genome is called the LINE1 retrotransposon, which is made up of a powerhouse combination of DNA-cutting machinery and reverse transcriptase, an enzyme that creates DNA molecules from an RNA template (like the RNA of SARS-CoV-2).
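As a rough illustration of what reverse transcriptase does at the sequence level, the following minimal sketch builds the complementary DNA (cDNA) strand for an RNA template. The function name and the toy sequence are hypothetical and are not taken from the study.

```python
# Base pairing from an RNA template base to the DNA base that pairs with it.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna: str) -> str:
    """Return the complementary DNA (cDNA) strand, written 5' to 3'."""
    rna = rna.upper()
    # The new strand is antiparallel to the template, so walk the
    # template backwards while complementing each base.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna))


toy_rna = "AUGGCUUAA"  # hypothetical toy sequence, not real viral RNA
print(reverse_transcribe(toy_rna))  # TTAAGCCAT
```

In a cell the enzyme works base by base on a primed template; the sketch only captures the pairing rule, not the biochemistry.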
“There’s a very clear footprint for LINE1 integration,” Jaenisch says. “At the junction of the viral sequence to the cellular DNA, it makes a 20 base pair duplication.”
Besides the duplication, another piece of evidence for LINE1-mediated integration is a LINE1 endonuclease recognition sequence. The researchers identified these features in nearly 70 percent of the DNAs that contained viral sequences, but not all, suggesting that the viral RNA may be integrating into cellular DNA via multiple mechanisms.
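The duplication footprint described above can be sketched in a few lines of Python: if the host sequence immediately before the insert ends with the same bases that begin the host sequence immediately after it, that shared stretch is a candidate target site duplication. This is only an illustrative check under simplified assumptions (exact matches, no sequencing errors); the function name, the `min_len` threshold, and the toy sequences are hypothetical and do not reproduce the researchers' pipeline.

```python
def target_site_duplication(left_flank: str, right_flank: str, min_len: int = 5) -> str:
    """Return the longest sequence that both ends the left flank and
    starts the right flank, or "" if none is at least min_len long."""
    min_len = max(min_len, 1)  # guard: a zero-length overlap is meaningless
    longest = min(len(left_flank), len(right_flank))
    # Try the longest possible overlap first and shrink until one matches.
    for k in range(longest, min_len - 1, -1):
        if left_flank[-k:] == right_flank[:k]:
            return right_flank[:k]
    return ""


# Hypothetical toy flanks: a 20-base site repeated on both sides of an insert.
tsd = "AAGTTCTAGGCATTAAGCCT"
left = "GGCTA" + tsd    # host DNA just upstream of the viral insert
right = tsd + "CCGTA"   # host DNA just downstream of the viral insert

found = target_site_duplication(left, right)
print(found, len(found))  # AAGTTCTAGGCATTAAGCCT 20
```

A real analysis would also need to tolerate mismatches and look for the endonuclease recognition sequence near the junction, which this sketch leaves out.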
To screen for viral integration outside of the lab, the researchers analyzed published datasets of RNA transcripts from different types of samples, including COVID-19 patient samples. With these datasets, Zhang and Jaenisch were able to calculate the fraction of genes transcribed in these patients’ cells that contained viral sequences that could be derived from integrated viral copies. The percentage varied from sample to sample, but for some, a relatively large fraction of viral transcripts seems to have been transcribed from viral genetic material integrated into the genome.
A previous draft of the paper with this finding was published online on the preprint server bioRxiv. However, recent research revealed that at least some of the viral-cellular reads could be the product of misleading artefacts of the RNA sequencing method. In the present paper, the researchers were able to eliminate these artefacts that could have been obscuring the results.
Instead of simply tallying transcripts that contained viral material, the researchers looked at which direction the transcripts had been read. If the viral reads were the result of live viruses or existing viral RNAs in the cell, the researchers would expect that most of the viral transcripts would have been read in the correct orientation for the sequences in question; in acutely infected cells in culture, more than 99 percent are in the correct orientation. If the transcripts were the product of random viral integration into the genome, however, there would be a near 50-50 split—half the transcripts would have been read forwards, the other half backwards, relative to the host genes.
“This is what we saw in some patient samples,” says Zhang. “It suggests that much of the viral RNA in some samples could be transcribed from integrated sequences.”
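The orientation argument comes down to a simple proportion. The sketch below, with made-up read counts, computes the share of viral reads in each orientation; "+" stands for the orientation expected from replicating virus and "-" for the opposite one. The labels, function name, and counts are assumptions for illustration, not data from the paper.

```python
from collections import Counter


def orientation_fractions(strands):
    """Return (fraction '+', fraction '-') for a list of read orientations."""
    counts = Counter(strands)
    total = counts["+"] + counts["-"]
    if total == 0:
        raise ValueError("no stranded reads supplied")
    return counts["+"] / total, counts["-"] / total


# Hypothetical counts: an acutely infected culture versus a sample in which
# reads come from copies inserted in random orientation.
acute_culture = ["+"] * 995 + ["-"] * 5
random_insertions = ["+"] * 52 + ["-"] * 48

print(orientation_fractions(acute_culture))      # (0.995, 0.005)
print(orientation_fractions(random_insertions))  # (0.52, 0.48)
```

A sample dominated by "+" reads looks like ordinary viral RNA, while a split approaching half and half is the pattern expected if transcripts arise from randomly oriented integrated copies.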
Because the dataset they used was quite small, Jaenisch emphasizes that more information is needed to establish exactly how common this phenomenon is in real life and what it might mean for human health.
It is possible that only a very few human cells experience any kind of viral integration at all. In the case of another RNA virus that integrates into the host cell genome, only a fraction of a percent of infected cells (between 0.001 and 0.01) contained integrated viral DNA. For SARS-CoV-2, the frequency of integration in humans is still unknown.
“The fraction of cells which have the integration could be very small,” says Jaenisch. “But even if it's rare, there are more than 140 million people who have been infected already, right?”
In the future, Jaenisch and Zhang plan to investigate whether the fragments of SARS-CoV-2 genetic material could be made into proteins by the cell.
“If they do, and trigger immune responses, it may provide continuous protection against the virus,” Zhang says.
They also hope to investigate whether these integrated sections of DNA could be partly to blame for some of the long-term autoimmune consequences that some COVID-19 patients experience.
“At this point, we can only speculate,” says Jaenisch. “But one thing we do think we can explain is why some patients are long-term PCR positive.”
Republished courtesy of MIT. Photo: An image of lung cancer cells infected with the SARS-CoV-2 virus. Blue represents DNA, green shows the SARS-CoV-2 nucleocapsid protein, and red represents double-stranded RNA, which occurs when the virus replicates its genome. A new study from the Jaenisch lab suggests that some virus RNA can be reverse transcribed and inserted into the human genome, which may explain why some patients continue to test positive for COVID-19 even after recovery. Credit: Alexsia Richards/Whitehead Institute
Researchers show SARS-CoV-2 genes can be integrated into the human genome
By Sally Robertson - 10. May 2021
Researchers in the United States have shown that genes from severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) – the causative agent of coronavirus disease 2019 (COVID-19) – can be integrated into the genome of infected human cells.
The team says the viral RNA can be expressed as chimeric transcripts with fused cellular and viral sequences.
“Importantly, such chimeric transcripts are detected in patient-derived tissues,” writes the team from the Whitehead Institute for Biomedical Research in Cambridge, Massachusetts and the National Cancer Institute in Frederick, Maryland.
Rudolf Jaenisch and colleagues say the findings may help to explain why some patients who have recovered from SARS-CoV-2 infection still test positive for the virus months later.
Patients remaining positive for viral RNA is an unresolved issue
Continuous or recurrent SARS-CoV-2-positive tests by polymerase chain reaction (PCR) have been reported in patients weeks or months after they have recovered from COVID-19. However, no infectious virus was isolated or shed from these patients, and the cause of the continued viral RNA production remains unknown.
Like other beta coronaviruses, SARS-CoV-2 uses an RNA-dependent RNA polymerase to replicate its genomic RNA and to transcribe subgenomic RNAs.
One potential explanation for the recurrent detection of viral RNA in the absence of viral replication is that DNA copies of viral subgenomic RNAs may become integrated into the DNA of the host cell via reverse transcription.
“Transcription of the integrated DNA copies could be responsible for positive PCR tests long after the initial infection was cleared,” write Jaenisch and colleagues.
Indeed, nonretroviral RNA virus sequences have been detected in the genomes of many animals. Several integrations in these sequences exhibit signals that are consistent with the integration of DNA copies of viral mRNAs into the germline via long interspersed nuclear element (LINE) retrotransposons.
Furthermore, nonretroviral RNA viruses such as lymphocytic choriomeningitis virus (LCMV) can be reverse transcribed into DNA copies by an endogenous reverse transcriptase. Studies have also shown that DNA copies of the viral sequences can integrate into the DNA of host cells.
“Moreover, expression of endogenous LINE1 and other retrotransposons in host cells is commonly up-regulated upon viral infection, including SARS-CoV-2 infection,” say Jaenisch and the team.
What did the researchers do?
The team used three different approaches to investigate whether SARS-CoV-2 RNA can be reverse transcribed and integrated into the genome of infected human cells in culture. The three approaches used to detect integrated viral sequences were nanopore long-read sequencing, Illumina paired-end whole-genome sequencing, and Tn5 tagmentation-based DNA integration site enrichment sequencing.
As reported in the Proceedings of the National Academy of Sciences, all three approaches provided evidence that SARS-CoV-2 RNA can be integrated into the genome of the host cell.
DNA copies of SARS-CoV-2 sequences were present in the genome and were shown to be integrated via a LINE1-mediated retroposition mechanism.
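Conceptually, a sequencing read supports integration when part of it aligns to the human genome and part to the viral genome. The sketch below shows that idea on a single long read, assuming the alignment step has already been done by some aligner and reduced to simple segments; the `Segment` type, its field names, and the coordinates are hypothetical.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    reference: str   # which genome this part of the read aligns to
    read_start: int  # 0-based start position on the read
    read_end: int    # end position on the read (exclusive)


def chimeric_junctions(segments):
    """Return read positions where the alignment switches between genomes."""
    ordered = sorted(segments, key=lambda s: s.read_start)
    junctions = []
    for prev, curr in zip(ordered, ordered[1:]):
        if prev.reference != curr.reference:
            junctions.append(curr.read_start)
    return junctions


# Hypothetical long read: viral sequence flanked by human sequence on both sides.
read = [
    Segment("human", 0, 400),
    Segment("virus", 400, 2000),
    Segment("human", 2000, 2500),
]
print(chimeric_junctions(read))  # [400, 2000]
```

Two junctions on one read, with human sequence on both sides of the viral stretch, correspond to the fully flanked insert that long-read sequencing can capture; short paired-end reads typically reveal only one junction at a time.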
Figure: SARS-CoV-2 RNA can be reverse transcribed and integrated into the host cell genome. (A) Experimental workflow. (B) Chimeric sequence from a Nanopore sequencing read showing integration of a full-length SARS-CoV-2 NC subgenomic RNA sequence (magenta) and human genomic sequences (blue) flanking both sides of the integrated viral sequence. Features indicative of LINE1-mediated “target-primed reverse transcription” include the target site duplication (yellow highlight) and the LINE1 endonuclease recognition sequence (underlined). Sequences that could be mapped to both genomes are shown in purple with mismatches to the human genomic sequences in italics. The arrows indicate sequence orientation with regard to the human and SARS-CoV-2 genomes as shown in C and D. (C) Alignment of the Nanopore read in B with the human genome (chromosome X) showing the integration site. The human sequences at the junction region show the target site, which was duplicated when the SARS-CoV-2 cDNA was integrated (yellow highlight) and the LINE1 endonuclease recognition sequence (underlined). (D) Alignment of the Nanopore read in B with the SARS-CoV-2 genome showing the integrated viral DNA is a copy of the full-length NC subgenomic RNA. The light blue highlighted regions are enlarged to show TRS-L (I) and TRS-B (II) sequences (underlined, these are the sequences where the viral polymerase jumps to generate the subgenomic RNA) and the end of the viral sequence at the poly(A) tail (III). These viral sequence features (I–III) show that a DNA copy of the full-length NC subgenomic RNA was retro-integrated. (E) A human–viral chimeric read pair from Illumina paired-end whole-genome sequencing. The read pair is shown with alignment to the human (blue) and SARS-CoV-2 (magenta) genomes. The arrows indicate the read orientations relative to the human and SARS-CoV-2 genomes. The highlighted (light blue) region of the human read mapping is enlarged to show the LINE1 recognition sequence (underlined). (F) Distributions of human–CoV2 chimeric junctions from Nanopore (Left) and Illumina (Right) sequencing with regard to features of the human genome.
In some tissue samples taken from patients, the team also found evidence suggesting that a large proportion of the viral sequences were transcribed from integrated DNA copies of viral sequences, generating viral–host chimeric transcripts.
“These and other data are consistent with a target primed reverse transcription and retroposition integration mechanism and suggest that endogenous LINE1 reverse transcriptase can be involved in the reverse transcription and integration of SARS-CoV-2 sequences in the genomes of infected cells,” writes the team.
However, approximately 30% of the viral integrants lacked a recognizable nearby LINE1 endonuclease recognition site, thereby indicating that the integration could also occur via another mechanism.
What are the implications of the findings?
Jaenisch and colleagues say the findings raise several questions that require further investigation.
For example, the researchers ask whether integrated SARS-CoV-2 sequences express viral antigens in patients and whether these might influence the clinical course of disease.
“If a cell with an integrated and expressed SARS-CoV-2 sequence survives and presents a viral- or neoantigen after the infection is cleared, this might engender continuous stimulation of immunity without producing infectious virus and could trigger a protective response or conditions such as autoimmunity as has been observed in some patients,” they write.
More generally, the integration of viral DNA in somatic cells may represent a consequence of natural infection that could play a role in the effects of other common disease-causing RNA viruses such as dengue and influenza virus, says the team.
The results may also be relevant for clinical trials of antiviral therapies.
“If integration and expression of viral RNA are fairly common, reliance on extremely sensitive PCR tests to determine the effect of treatments on viral replication and viral load may not always reflect the ability of the treatment to fully suppress viral replication because the PCR assays may detect viral transcripts that derive from viral DNA sequences that have been stably integrated into the genome rather than infectious virus,” say Jaenisch and colleagues.
- Cohen J. Further evidence supports controversial claim that SARS-CoV-2 genes can integrate with human DNA. Science, 2021. https://www.sciencemag.org/news/2021/05/further-evidence-offered-claim-genes-pandemic-coronavirus-can-integrate-human-dna
- Zhang L, et al. Reverse-transcribed SARS-CoV-2 RNA can integrate into the genome of cultured human cells and can be expressed in patient-derived tissues. PNAS, 2021; 118 (21): e2105968118. https://doi.org/10.1073/pnas.2105968118, https://www.pnas.org/content/118/21/e2105968118