Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the etiological agent of 2019
coronavirus disease (COVID-19), continuously circulates and has caused over one
100,000/month since its original emergence into the human population1. Based on official
statistics on laboratory-confirmed reports, the case fatality ratio of COVID-19 ranges from
1.5% to 10% in developed and developing countries, respectively1. Unlike other
highly pathogenic coronaviruses of the 21st century, such as SARS-CoV and Middle
East respiratory syndrome coronavirus (MERS-CoV), SARS-CoV-2 is shed from the
pre-symptomatic period until a few weeks after symptom onset2. Longer viral replication
favors tissue damage, as shown by the positive correlation between high lactate
dehydrogenase (LDH) activity, a marker of cell death, and COVID-19 progression3. While
type II pneumocytes are targeted and destroyed by the infection and the respiratory
parenchyma is harmed, innate and adaptive immunological responses are not always able
to prevent further progression to poor clinical outcomes, and may even worsen the tissue
lesions4,5. In fact, severe COVID-19 has been associated with increased and uncontrolled
release of pro-inflammatory mediators (cytokine storm), so that the resolutive mechanisms
are overcome by a marked upregulation of IL-6, TNF-alpha and IL-1-beta4,5. In addition,
immune cells that orchestrate the innate and adaptive response, such as monocytes and
neutrophils, undergo pyroptosis and NETosis6,7. Consistently, leukopenia and an
uncontrolled coagulopathy, marked by platelet activation and high D-dimer levels, correlate
with COVID-19 severity8,9. Altogether, SARS-CoV-2-triggered inflammation and
hypercoagulability have rapidly been defined as main features of the natural history of
disease progression from mild to severe COVID-19 clinical presentations4.
Up to now, the factors described above have been associated with disease
progression from mild to severe, but they do not fully explain the mortality of critically ill
COVID-19 patients. Further investigation is thus necessary to search for overlooked factors
associated with these high mortality rates. Although a stay of several weeks in the
ICU predisposes COVID-19 patients to nosocomial infections, mortality is high
even for patients negative for bacterial infections10-14. Despite best clinical practice of
routine surveillance for bacterial infection in the ICU, viruses that are unculturable or
detectable only by unbiased methods are neglected in daily practice. A systematic analysis
of the virome of critically ill COVID-19 patients is thus necessary, especially in samples
from the lower respiratory tract. We therefore analyzed a cohort of critically ill COVID-19
patients under invasive mechanical ventilation (IMV) with
sustained SARS-CoV-2 loads, inflammation and coagulopathy to determine whether their
lower respiratory tract virome, beyond coronavirus, could help explain patient outcomes.
From March to December 2020, we prospectively included 25 critically ill COVID-19
patients requiring IMV, with a median age of 57 years and presenting the most common
infection symptoms and comorbidities (Extended Table 1). Patients displayed high SARS-CoV-2
RNA levels (median of 10^6 copies/mL), laboratory markers of systemic inflammation
and coagulopathy (elevated plasma levels of C-reactive protein [CRP] and D-dimer,
respectively), and a case fatality ratio of 58% (Extended Table 1). Because patients were
under IMV, the tracheal aspirate (TA) was the source of samples for SARS-CoV-2 RNA
quantification and virome analysis. Unexpectedly, the TA of over 70% of the
patients had higher SARS-CoV-2 RNA loads than other samples from the lower respiratory
tract15 (Extended Figure 1A). RNA from TAs was subjected to unbiased sequencing, which
yielded an average of 2 × 10^7 sequencing reads, of which 10% and 90% were virus- and
human-related, respectively (Figure 1A). Approximately 95% of the virus-associated reads
in the transcriptome were linked to SARS-CoV-2 (Figure 1A). Nevertheless, we enriched
the new coronavirus sequences using the ATOPlex kit (MGI, China) (Supplementary Table 1)
to phylogenetically classify them into the emerging clades 19A, 20A and 20B, in
proportions of 16%, 12% and 72%, respectively (Extended Figure 1B and C),
reconfirming that the entire cohort was composed of COVID-19 patients. The SARS-CoV-2
emerging clades identified here were representative of the virus circulating in Brazil during
the year of 2020 (
In addition to SARS-CoV-2, human endogenous retrovirus K (HERV-K; also known
as HML-2) sequences were consistently detected in the TA from these COVID-19 patients,
at a proportion of 5 ± 2% (mean ± SEM) of the virome (Figure 1A and Supplementary
Table 2). Of lesser relevance, random phage types were detected in some samples at low
coverage (Figure 1A). Among all HERVs, HERV-K is the most recently acquired in the human
genome, having been incorporated during human-chimpanzee speciation16. It is thus noteworthy
to find a human-specific marker associated with critically ill COVID-19 patients, as
non-human primates are less likely to die from SARS-CoV-2 infection17, which draws attention
to a possible role of HERV-K in the dichotomy of SARS-CoV-2 severity between humans and
non-human primates.
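As a rough illustration of what these proportions mean in read counts, the sketch below splits an average library into human, SARS-CoV-2 and other viral reads. It assumes the rounded averages quoted above (about 2 × 10^7 reads per sample, 10% viral, about 95% of viral reads from SARS-CoV-2); the function name `read_budget` and its parameters are hypothetical, and per-sample values will differ.

```python
# Back-of-envelope read budget for one tracheal aspirate library.
# Inputs are the rounded averages quoted in the text, not per-sample data.

def read_budget(total_reads: int, viral_pct: int, sars_pct_of_viral: int) -> dict:
    """Split a read count into human, SARS-CoV-2 and other viral reads.

    Percentages are integers so the arithmetic stays exact.
    """
    viral = total_reads * viral_pct // 100
    sars = viral * sars_pct_of_viral // 100
    return {
        "human": total_reads - viral,
        "viral": viral,
        "sars_cov_2": sars,
        "other_viral": viral - sars,  # pool containing HERV-K and phage reads
    }


budget = read_budget(total_reads=20_000_000, viral_pct=10, sars_pct_of_viral=95)
for label, count in budget.items():
    print(f"{label:>12}: {count:,}")
```

Under these assumptions, on the order of 100,000 reads per sample fall outside SARS-CoV-2, which is the pool that contains the HERV-K and phage reads.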
Moreover, HERV-K was 5-fold more abundant in the virome of TA from
COVID-19 patients under IMV than in nasopharyngeal swabs (NS) from mild cases
previously studied by us18 (Figure 1B and Supplementary Table 2). HERV-K levels in the
TA did not differ between deceased and discharged COVID-19 patients, but these values were
significantly higher than in TA from non-COVID-19 patients, randomly retrieved from the
Sequence Read Archive (SRA) (Figure 1B). The SRA data indicate that HERV-K may
be found in the lower respiratory tract of some patients with other illnesses (Figure 1B).
Indeed, HERV-K detection in the respiratory tract has been associated with lung
adenocarcinoma19, as well as with other types of cancer, neurological disorders, multiple
sclerosis and arthritis20. Here, higher levels of HERV-K were found in
COVID-19 patients who died earlier (Figure 1C), reinforcing its association with
disease severity. Of note, no statistically significant association was found between HERV-K
and other features, such as days from COVID-19 onset, age, gender or SARS-CoV-2
RNA levels (Extended Figure 2).
Because thousands of loci in the human genome are associated with HERV-K-related genetic
elements, and some of them actively express retroviral structural genes21, we searched
for correlations between the HERV-K transcript consensus described here and active HERV-K
loci in the human genome. Most often, sequences from HERV-K structural genes were
expressed from different chromosomal regions, suggesting the activation of otherwise silent
genes (Extended Table 2 and Supplementary Figure 1). Indeed, critically ill COVID-19
patients differentially expressed HERV-K-associated structural genes, doubling and tripling
the gag-pro-pol and env transcripts compared to NS from mild cases and non-COVID TA
SRAs (Figure 1D). Expression of gag in prostate cancer and multiple sclerosis has been
associated with immune dysregulation22,23. The pol gene encodes a reverse
transcriptase that may jeopardize the cell cycle of lymphocytes, given its association
with leukemia24. The pro gene encodes both a protease, which can directly cleave cellular
proteins with major functional impact on cell biology25, and a deoxyuridine triphosphate
nucleotidohydrolase (dUTPase), whose activity leads to inflammation, immune
dysregulation, and progressive obliterative vascular remodeling in the respiratory tract26.
HERV-K env may trigger cell-cell fusion, leading to epithelial-to-mesenchymal transition and
various types of cancer, including in the respiratory tract19,27.
For further confirmation of HERV-K presence, we performed orthogonal assays. TA samples
from 14 patients yielded enough material for examination via shotgun proteomics (Figure 2A
and B, and Supplementary File 1), which revealed more than 20,000 peptides linked to
2,249 human proteins with a high degree of confidence (FDR <1 at the level of PSM,
peptide and protein identification). To trace similarities with the HERV-K proteome, we
compared the peptides from the human tracheal aspirate proteome with the HERV-K proteins Gag,
Pro, Pol, Env and Rec (UniProt IDs P62684, P63121, P63132, Q902F9 and P61574,
respectively) using BlastP (NCBI/BLAST). A total of 167 alignments were detected with
≥50% coverage and sequence identity, assuming a minimum of 10 aligned residues and
peptides ranging in size from 20 to 47 amino acids, under default BlastP
parameters (Figure 2A and B, and Supplementary File 2). In addition to protein-level detection,
gag transcripts were quantified by delta-cycle threshold (delta-Ct), calculated as the gag Ct
minus the Ct of ribosomal protein L (RPL), the housekeeping transcript. The gag transcript
was chosen because of its specificity to HERV-K23. Higher HERV-K levels in the TA virome
were associated with lower delta-Ct values (that is, gag expression closer to that of the
housekeeping RPL) (Figure 2C). After this validation, we evaluated whether HERV-K transcripts
were detectable in the plasma of these patients. Indeed, gag was detected, with Ct values < 50,
and was more likely to be detected in deceased patients than in discharged individuals or
healthy donors (HD) (Figure 2D). These data reconfirm HERV-K in our samples at the RNA and
peptide levels.
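The delta-Ct readout used above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the Ct values are invented, the function names are hypothetical, and the conversion to a relative expression ratio assumes roughly 100% amplification efficiency (one doubling per cycle), which the text does not state.

```python
def delta_ct(ct_target: float, ct_reference: float) -> float:
    """Return delta-Ct = Ct(target) - Ct(reference).

    Here the target would be gag and the reference the housekeeping RPL.
    A lower delta-Ct means target expression closer to the reference.
    """
    return ct_target - ct_reference


def relative_expression(d_ct: float) -> float:
    """Approximate target/reference ratio as 2^(-delta-Ct).

    Assumes ~100% PCR efficiency; this is an assumption, not a reported value.
    """
    return 2.0 ** (-d_ct)


# Hypothetical (gag Ct, RPL Ct) pairs, for illustration only.
samples = {"sample_A": (28.0, 22.0), "sample_B": (25.0, 22.0)}
for name, (ct_gag, ct_rpl) in samples.items():
    d = delta_ct(ct_gag, ct_rpl)
    print(f"{name}: delta-Ct = {d:.1f}, relative expression = {relative_expression(d):.4f}")
```

In this toy example, sample_B has the lower delta-Ct and therefore the higher gag expression relative to RPL.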
For further evidence of a causal relationship between SARS-CoV-2 and the expression
of endogenous retrovirus, we experimentally infected either human primary monocytes
obtained from healthy donors or Calu-3 cells, a cell line that resembles type II pneumocytes,
with SARS-CoV-2 and quantified HERV-K. Upon SARS-CoV-2 infection, HERV-K was
upregulated in the monocytes, but not in Calu-3 cells (Figure 2E), in line with other non-viral
stimuli28 and with human ontogeny29. Monocytes, like other immune cells, are important
during the natural history of COVID-19, either orchestrating the immune response or
succumbing to pyroptosis and the cytokine storm4,6.
We next examined a possible correlation between HERV-K in the TA and immune
modulation and/or coagulopathy. For this purpose, Spearman correlations between HERV-K
levels and levels of cytokines, coagulation factors and immune cell counts were computed
for deceased and discharged patients (Extended Figure 3). To be conservative when assuming
statistical significance, we additionally performed regression analysis for those markers
that passed Spearman correlation, evaluating differences in slopes (angular coefficients)
and/or intercepts (linear coefficients) (Figure 3). As a general tendency for the endogenous
mediators, higher HERV-K levels were associated with lower mediator levels in
the TA (Extended Figure 3A) and with inflammation in the peripheral plasma
(Extended Figure 3B). HERV-K was negatively associated with two survival/growth factors for
immune cells in the blood, granulocyte colony-stimulating factor (G-CSF)30 and nerve
growth factor (NGF)31 (Figure 3A). Consistently, surviving patients presented higher NGF
levels than those who died (Figure 3A). Surprisingly, HERV-K and other HERVs had been
reported as inducers of both G-CSF32 (in immune cells) and NGF33 (in neurons). However,
HERV activation in the CNS is associated mainly with the onset and progression of
neurological diseases34, and the induction of G-CSF was evaluated only with a domain of
the HERV-K TM protein32. As a function of HERV-K levels, other regulatory/anti-inflammatory
signals were also decreased in the plasma of deceased patients, such as IL-1Ra and
IL-13 (Figure 3A), which respectively antagonize IL-1-dependent stimuli and favor an
allergic-like/TH2 response35,36. Interestingly, reduced IL-13 production has also been reported
for a HERV-H-LTR-derived protein, together with the inhibition of CD4 and CD8 T cell
responses37. In deceased patients, higher HERV-K levels were accompanied by increased IL-17
(Figure 3A), a further pro-inflammatory mediator that may upregulate IL-6, CRP and airway
remodeling38, and that has also been described as upregulated by HERV-K and other HERVs in
autoimmune diseases39,40.
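The two-step screen described above (Spearman correlation, then a regression comparing slopes and intercepts between outcome groups) can be sketched in plain Python. This is an illustration under stated assumptions: the data are invented, the helper names are hypothetical, and no significance test is shown because the text does not specify the exact procedure or thresholds that were used.

```python
from statistics import mean


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def fit_line(x, y):
    """Ordinary least squares; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx


# Hypothetical HERV-K levels and one plasma marker per outcome group.
herv_k = [1.0, 2.0, 3.0, 4.0, 5.0]
groups = {
    "deceased": [2.1, 3.9, 6.2, 8.0, 9.8],
    "discharged": [5.0, 4.9, 5.2, 5.1, 5.0],
}
for outcome, marker in groups.items():
    rho = spearman(herv_k, marker)
    slope, intercept = fit_line(herv_k, marker)
    print(f"{outcome}: rho = {rho:.2f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
```

A marked difference in slope between the two groups, as in this toy data, is the kind of pattern the regression step is meant to expose; a formal comparison would additionally require a test of the difference between coefficients.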
During severe COVID-19, clotting factors are intensely consumed9, and HERV-K
levels were associated with specific modulations in survivors and deceased patients (Extended
Figure 3C). As a function of HERV-K levels, an apparently higher consumption of factor V
occurs, being more intense in patients who died (Figure 3B). As clotting
cascades are activated, the fibrinolysis product D-dimer positively correlates with HERV-K
levels in patients who died (Figure 3B). Interestingly, modulation of coagulation cascade
genes by HERV-K has also been described in immune cells32. To assess correlation with
cell-mediated immunity, specific populations were quantified by flow cytometry (Supplementary
Figure 2) and plotted as a function of HERV-K levels (Extended Figure 3D-H). HERV-K negatively
correlates with natural killer cells (Figure 3C), suggesting a possible contribution to the
impairment of an adequate innate antiviral response41. Monocyte activation positively correlates
with HERV-K (Figure 3C), which agrees with experimental SARS-CoV-2 infection of monocytes,
pyroptosis of these cells and release of pro-inflammatory factors6,42. These data indicate
that HERV-K levels helped distinguish discharged from deceased critically ill COVID-19
patients under IMV, reconfirming classical markers of COVID-19 severity and immune
activation, and adding new information on other potential targets for intervention.
Although HERV-W was shown to correlate with activation and inflammatory markers
in blood cells from COVID-19 patients43, it is not clear whether the expression of HERV-W
was due to SARS-CoV-2 or to broad inflammation. We were able to show increased
expression of HERV-K due to SARS-CoV-2 infection in monocytes and further correlations
with immune activation, coagulopathy and death. HERV-K has been associated with
immune activation in HIV-1-infected individuals. In HIV-1-infected individuals suppressed
by antiretroviral therapy, augmentation of HERV-K levels precedes immune activation and
HIV-1 rebound44. There are different possible links between HERV-K and immune
activation, such as Toll-like receptor engagement45 and direct cell-cell fusion during viral
infection46. Our data add SARS-CoV-2 as a trigger of HERV-K expression, with a possible
correlation with mortality in the ICU. There are also hypotheses linking transposons and
retrotransposons to COVID-19 pathogenesis through immune activation and the
coagulation/fibrinolysis cascade47,48; recently, the presence of SARS-CoV-2
sequences integrated in the host genome, possibly mediated by LINE-1
retrotransposons, was also reported49. Together, these data reinforce our findings, as the
pathways for activation of transposable elements are similar, and HERV-K could be activated
concomitantly with these elements, or its activation could be a trigger or a consequence of
SARS-CoV-2 integration.
To the best of our knowledge, our work is the first evidence of the presence of
HERV-K in the respiratory tract and in the plasma of critically ill COVID-19 patients. We
also connected HERV-K levels with the pro-inflammatory and regulatory events that
contribute to patient outcome. Our data also suggest that HERV-K expression is the
result of a broader gene expression event during COVID-19, as judged by the different
activated loci, and that its expression may further contribute to epigenetic remodeling and
long-term consequences. In conclusion, our findings provide original evidence that SARS-CoV-2
triggers increased HERV-K expression, and that high levels of HERV-K are associated with
disease severity and early mortality.
Ethics and Patients
From March to December 2020, inpatients from the D’or Institute (ID’or) and Instituto
Estadual do Cérebro Paulo Niemeyer (IECPN) admitted to the ICU were included upon
signed informed consent by their responsible relative. Both TA and acid-citrate-dextrose
(ACD)-anticoagulated blood samples were collected. All patients already had a positive
SARS-CoV-2 RT-PCR result on admission to this ward. Nevertheless, we reconfirmed the COVID-19
laboratory diagnosis, and summary data from the patients are presented in Extended
Table 1. The National Review Board of Brazil approved the study protocol (Comissão
Nacional de Ética em Pesquisa [CONEP] 30650420.4.1001.0008).
RNA extraction and RT-PCR
Total RNA from TA was extracted using QIAamp Viral RNA (Qiagen, Germany),
according to the manufacturer’s instructions. Quantitative RT-PCR was performed using GoTaq
Probe qPCR and RT-qPCR Systems (Promega, USA) in a StepOne Real-Time PCR
System (Thermo Fisher Scientific, CA, USA). Primers, probes, and cycling conditions used
to detect the SARS-CoV-2 RNA have been described elsewhere15. A standard curve was
employed for virus quantification, using synthetic RNA for gene N (Microbiologics, MN,
USA). Amplifications were carried out in 25 µL reaction mixtures containing 2× reaction mix
buffer, 50 µM of each primer, 10 µM of probe, and 5 µL of RNA template.
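Quantification against a standard curve, as mentioned above, typically means fitting Ct against log10 of the known copy number of the standards and inverting that line for unknown samples. The sketch below illustrates this under stated assumptions: the dilution series and Ct values are invented, the function names are hypothetical, and the actual curve parameters used in the study are not given in the text.

```python
import math


def fit_standard_curve(copies, cts):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    logs = [math.log10(c) for c in copies]
    n = len(logs)
    mx = sum(logs) / n
    my = sum(cts) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(logs, cts)) / sum(
        (x - mx) ** 2 for x in logs
    )
    return slope, my - slope * mx


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copy number from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


def efficiency(slope):
    """Amplification efficiency implied by the slope (1.0 means 100%)."""
    return 10 ** (-1 / slope) - 1


# Hypothetical ten-fold dilution series of synthetic RNA (copies per reaction).
standard_copies = [1e2, 1e3, 1e4, 1e5, 1e6]
standard_cts = [33.0, 29.7, 26.4, 23.1, 19.8]

slope, intercept = fit_standard_curve(standard_copies, standard_cts)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, efficiency = {efficiency(slope):.2f}")
print(f"Ct 25.0 corresponds to about {copies_from_ct(25.0, slope, intercept):.3g} copies")
```

Converting copies per reaction to copies/mL would additionally require the extraction and elution volumes, which are not detailed here.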
HERV-K was amplified as described elsewhere23. Total RNA from the plasma or
culture supernatant was extracted with QIAamp Viral RNA (Qiagen, Germany). Total RNA
concentration was determined by spectrophotometry (NanoDrop 2000, ThermoFisher
Scientific, CA, USA) and 10 μg of RNA was subjected to first-strand cDNA
synthesis. cDNA was synthesized using 0.5 μl of oligo(dT)20, 0.5 μl of random hexamer
primers, 10 mM dNTPs, First-Strand Buffer, 0.1 M DTT and 200 U SuperScript III First-Strand
Synthesis System (Invitrogen, ThermoFisher Scientific, CA, USA) according to the
manufacturer’s instructions. The total cDNA concentrations were determined by
spectrophotometry (NanoDrop 2000, ThermoFisher Scientific, CA, USA). Then, 40-cycle



