Exploring DNA methylation, telomere length, mitochondrial DNA, and immune function in patients with Long-COVID

Andrea Polli, Lode Godderis, Dries S. Martens, et al., BMC Medicine volume 23, Article number: 60 (2025)

Abstract

Background

Long-COVID is defined as the persistency or development of new symptoms 3 months after the initial SARS-CoV-2 infection, with these symptoms lasting for at least 2 months with no other explanation. Common persistent symptoms are fatigue, sleep disturbances, post-exertional malaise (PEM), pain, and cognitive problems. Long-COVID is estimated to be present in about 65 million people. We aimed to explore clinical and biological factors that might contribute to Long-COVID.

Methods

Prospective longitudinal cohort study including patients infected with SARS-CoV-2 between March 2020 and March 2022. Patients were assessed between 4 and 12 months after infection at the COVID follow-up clinic at UZ Leuven. We performed a comprehensive clinical assessment (including questionnaires and the 6-min walking test) and biological measures (global DNA methylation, telomere length, mitochondrial DNA copy number, inflammatory cytokines, and serological markers such as C-reactive protein, D-dimer, troponin T).

Results

Of the 358 participants, 328 were hospitalised, of whom 130 had severe symptoms requiring intensive care admission; 30 patients were ambulatory referrals. Based on their clinical presentation, we could identify 6 main clusters. One hundred and twenty-seven patients (35.5%) belonged to at least one cluster. The largest cluster included PEM, fatigue, sleep disturbances, and pain (n = 57). Troponin T and telomere shortening were the two main markers predicting Long-COVID and PEM-fatigue symptoms.

Conclusions

Long-COVID is not just one entity. Different clinical presentations can be identified. Cardiac involvement (as measured by troponin T levels) and telomere shortening might be relevant risk factors for developing PEM-fatigue symptoms and deserve further exploration.


Background

The COVID-19 pandemic has affected millions of lives around the globe. COVID-19 is a viral respiratory disease caused by SARS-CoV-2 [1]. From the beginning of the pandemic to date (May 2024), the world has counted over 750 million cases and 7 million deaths. Over 85% of confirmed COVID-19 cases are mild [1,2]. However, 10–15% develop more severe symptoms such as acute respiratory distress or multisystem organ failure requiring hospitalisation and, in some cases, intensive care [3].

Around 10–15% of patients report a variety of persistent symptoms including pain, fatigue, post-exertional malaise, cognitive problems, and sleep disturbances following SARS-CoV-2 infection [1,3]. This condition has been termed Long-COVID. Long-COVID is defined by the WHO as the persistence or development of new symptoms 3 months after the initial SARS-CoV-2 infection, with these symptoms lasting for at least 2 months with no other explanation [4,5]. A recent meta-analysis of over 735,000 subjects with COVID showed that about 45% of them report at least one unresolved symptom at 4 months [6]. Other epidemiological studies estimate Long-COVID to be present in about 65 million people [4]. Importantly, over 80% of people experiencing symptoms 4 months after infection will likely have persistent symptoms at the 2-year follow-up [7]. Long-COVID can develop after very mild symptoms at onset [8]. A study on 545 patients found that over 70% of people with persistent symptoms started with mild or moderate initial infection [9]. About 1–1.5% of people with COVID-19 experienced over 12 weeks of sick leave [10].

A better understanding of pathophysiological mechanisms is warranted. Currently, research exploring biological mechanisms related to Long-COVID is scarce and inconclusive [11]. However, a number of biological mechanisms are emerging and hold promise to explain Long-COVID. Such biological factors are DNA methylation, telomere length, mitochondrial dysfunction, and immune system alteration. Together, these processes influence several basic cellular functions such as gene expression, cell replication, and energy metabolism. Patients with severe COVID-19 indeed seem to show a widespread hypomethylation, biological age acceleration, telomere shortening, and mitochondrial dysfunction [12,13,14,15]. However, it is unclear whether these mechanisms are relevant in Long-COVID [16,17,18].

The present manuscript aims to target this knowledge gap by exploring several biological factors that might contribute to Long-COVID development and maintenance. We explored telomere length, mitochondrial DNA content, and global DNA methylation in peripheral mononuclear cells in a large cohort of patients infected with SARS-CoV-2 who did or did not recover 4 to 24 months after the initial infection. We also explored the expression of 14 different cytokines in plasma, as a surrogate of immune cell functions, as well as serological factors (D-dimers, glycated haemoglobin, platelets and white blood cell count, C-reactive protein, creatinine, troponin T). We hypothesised people developing Long-COVID will show shorter telomere length, lower mitochondrial DNA content, lower DNA methylation, and a dysregulated immune response.

Methods

Participants and procedure

We conducted a prospective observational cohort study of consecutive adult patients (≥ 18 years) diagnosed with COVID-19 between March 2020 and March 2022 at the University Hospital Leuven (UZ Leuven). Hospitalised patients who agreed to participate returned to the COVID follow-up clinic at the Department of Respiratory Diseases of the UZ Leuven 4 to 24 months after infection. In addition, ambulatory patients referred to our clinic because of ongoing symptoms following a confirmed SARS-CoV-2 infection in the study period were invited to participate. Participants who gave consent for the study were sent an email with a reminder of the appointment and a list of clinical questionnaires to complete 3 days before the medical assessment. We gathered information from the patient’s electronic hospital records on demographics, comorbidities, severity of illness, length of stay in hospital, and, if applicable, whether they required treatment in the intensive care unit (ICU). On the day of the assessment, they underwent a clinical assessment with detailed history and physical examination. Functional assessment was performed using the 6-min walking test (6MWT), following European Respiratory Society guidelines [19]. A complete description of the medical and functional assessments is provided elsewhere [20]. Then, 10 mL of blood was withdrawn for routine serological assessment. An additional 10 mL of blood was collected in EDTA tubes to study cytokine expression in plasma, and telomere length, mitochondrial DNA content, and DNA methylation in peripheral blood mononuclear cells (PBMCs). The study was approved by the institutional ethics committees of KU/UZ Leuven (S64081 and amendment S65411). Written informed consent was obtained from all participants.

Clinical assessment

Clinical questionnaires were distributed online 3 days before the hospital visit to patients who agreed to participate. Questionnaires included the Short Form-36 health survey (SF-36) [21], the Hospital Anxiety and Depression Scale (HADS) [22], and the DePaul Symptom Questionnaire (DSQ). The DSQ was initially developed to assess symptoms associated with myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) [23,24,25]. Given the similarities between ME/CFS and Long-COVID, both in the proposed pathophysiology and in the clinical presentation, the DSQ is well suited to assess the most prevalent symptoms reported by this population. We used a modified short form of the DSQ and selected 26 items assessing frequency and severity of symptoms such as fatigue, unrefreshing sleep, myalgia, headache, visceral symptoms, cognitive disturbances, and post-exertional malaise. Each item score is the product of frequency and severity and ranges between 0 and 16. The questionnaire can be found in Additional file 1: Appendix 1. We used the DSQ to differentiate between people with Long-COVID and people who recovered from COVID with no consequences, as well as to assess the type of symptoms reported by each subject.
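
As an illustration of the scoring rule described above, the following Python sketch computes an item score as the product of frequency and severity. It assumes that both ratings are integers on a 0 to 4 scale, which is what a 0 to 16 product range implies; the function name is hypothetical and is not part of the DSQ.

```python
def dsq_item_score(frequency: int, severity: int) -> int:
    """Return the item score as frequency x severity.

    Assumption: both ratings are integers from 0 to 4, so the
    product ranges from 0 to 16.
    """
    for name, value in (("frequency", frequency), ("severity", severity)):
        if not 0 <= value <= 4:
            raise ValueError(f"{name} must be between 0 and 4, got {value}")
    return frequency * severity


# A rating of 2 on both scales gives an item score of 4.
print(dsq_item_score(2, 2))  # 4
```

Under this assumption, the maximum rating on both scales gives the upper bound of 16, and a zero on either scale gives an item score of 0.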

Defining Long-COVID

To date, no clear criteria have been validated for the diagnosis of Long-COVID. Previous research has shown that symptoms, though heterogeneous, seem to group in relatively independent clusters including two or more predominant symptoms [5]. Such clusters might be useful to explore common pathophysiological mechanisms [2]. We used principal component analyses to cluster the 26 symptoms that we assessed via the DSQ into the main components. Principal component analyses (PCA) allow for the identification of the main uncorrelated clusters that are able to explain the variability of the clinical presentation. Then, K-means cluster analysis was used to separate patients based on the mean value of each component, and thus define who belongs to the Long-COVID group, and what are the predominant symptoms. In line with a recent consensus statement including patients, clinicians, and researchers [5], we defined Long-COVID as a syndrome characterised by at least one cluster of persistent symptoms, lasting for at least 2 months, that were a consequence of SARS-CoV-2 infection.
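
The clustering step can be sketched in a simplified form. The Python example below runs a two-cluster K-means on a single list of scores, whereas the study applied the method to the values of six principal components. The function name, the deterministic starting centroids, and the scores are assumptions made for illustration and are not study data.

```python
from statistics import mean


def two_means_1d(scores, max_iter=100):
    """Split 1-D scores into a low and a high cluster (k-means, k = 2).

    Centroids start at the minimum and maximum score, an illustrative
    choice that makes the result deterministic.
    Returns (labels, centroids) with label 0 = low and 1 = high.
    """
    low, high = min(scores), max(scores)
    if low == high:
        raise ValueError("scores must contain at least two distinct values")
    labels = []
    for _ in range(max_iter):
        # Assignment step: each score joins its nearest centroid.
        labels = [0 if abs(s - low) <= abs(s - high) else 1 for s in scores]
        # Update step: each centroid moves to the mean of its members.
        new_low = mean(s for s, lab in zip(scores, labels) if lab == 0)
        new_high = mean(s for s, lab in zip(scores, labels) if lab == 1)
        if (new_low, new_high) == (low, high):
            break  # assignments are stable, so the algorithm has converged
        low, high = new_low, new_high
    return labels, (low, high)


# Made-up component scores for six hypothetical patients (not study data).
scores = [0.5, 1.0, 1.5, 9.0, 10.0, 12.0]
labels, centroids = two_means_1d(scores)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In this toy example, label 1 would correspond to patients with high scores on the component and label 0 to those with low scores.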

Biological factor measurements

Blood was collected and transferred to the Clinical Biology Laboratory of the UZ Leuven within 3 h from collection. The first 10 mL tubes were used to isolate serum and perform routine serological measurements. The EDTA tube was centrifuged at 1500 rpm for 10 min to isolate 3 aliquots of 800 µL of plasma, which were stored at −80 °C. The remaining blood was processed via Ficoll centrifugation to isolate PBMCs.

DNA methylation of LINE-1 transposable elements

LINE-1 transposable elements comprise almost half of the human genome and are commonly used as a robust surrogate marker of global DNA methylation [26]. We extracted DNA from PBMCs and measured DNA methylation of LINE-1 following a protocol developed in our lab and described elsewhere (see Additional file 1: Box 1) [27].

Telomere length and mitochondrial DNA copy number (mtDNAcn)

DNA from PBMCs was used to measure telomere length and mtDNAcn. Average relative telomere length was measured using a modified singleplex quantitative PCR (qPCR) method adapted from Cawthon [28,29]. Average relative mtDNA content was measured using a modified singleplex qPCR procedure as described by Janssen et al. [30]. The formulas used to calculate relative quantities (RQ), normalised relative quantities (NRQ), and calibrated normalised relative quantities (CNRQ) are provided by Hellemans et al. (see Additional file 1: Box 1) [31].
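
The exact formulas are given in Additional file 1 and in Hellemans et al. The sketch below shows only the general form in which these quantities are commonly written, and it may differ in detail from the implementation used in the study. It assumes an amplification efficiency E, a calibrator Cq, and an inter-run calibration factor. All function names and Cq values are invented for illustration.

```python
from math import prod


def relative_quantity(efficiency, cq_calibrator, cq_sample):
    """RQ = E ** (Cq_calibrator - Cq_sample); E = 2 means perfect doubling."""
    return efficiency ** (cq_calibrator - cq_sample)


def normalised_relative_quantity(rq_target, rq_references):
    """NRQ = target RQ divided by the geometric mean of the reference RQs."""
    geometric_mean = prod(rq_references) ** (1 / len(rq_references))
    return rq_target / geometric_mean


def calibrated_nrq(nrq, calibration_factor):
    """CNRQ = NRQ divided by the inter-run calibration factor."""
    return nrq / calibration_factor


# Invented Cq values for illustration only (not study data).
rq_telomere = relative_quantity(2.0, cq_calibrator=20.0, cq_sample=19.0)
rq_single_copy = relative_quantity(2.0, cq_calibrator=24.0, cq_sample=24.0)
ts_ratio = normalised_relative_quantity(rq_telomere, [rq_single_copy])
print(ts_ratio)  # 2.0
```

In this toy case the telomere target amplifies one cycle earlier than the calibrator while the single-copy gene does not, so the normalised ratio is 2.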

Serological markers

Serological markers included D-dimers, glycated haemoglobin (HbA1c), aspartate transaminases (AST), alanine transaminases (ALT), estimated glomerular filtration rate (eGFR), C-reactive protein (CRP), troponin T HS, platelets, and white blood cell (WBC) counts. Serum markers were analysed in the Clinical Laboratory of the UZ Leuven, according to routine procedures and analysed with the Cobas 8000 analyser (Roche, Switzerland). Data were retrieved from each patient’s medical record.

Cytokine measurements

In total, 14 cytokines were measured in plasma using the U-PLEX T-cell combo kit from Meso Scale Diagnostics (MSD, MD, USA). The kit allows for quantification of 14 cytokines at once. These cytokines were selected as they are thought to reflect broad lymphocyte T-helper cell immune function, recently classified into Th1 (IFN-γ, TNF-α, IL-2), Th2 (IL-4, IL-10, IL-13, GM-CSF), Th9 (IL-9), Th17 (IL-17A, IL-17E/IL-25, IL-17F, IL-21, MIP-3α), and Th22 (IL-22) [32]. Data were analysed using the MSD Discovery Workbench Software. Values below the fit curve or the lower limit of detection were excluded from analyses. Technical errors based on visual inspection of data (i.e. clear outliers when plotting the data distribution) were also excluded.

Power calculation and statistics

Well-conducted studies investigating the role of mitochondrial metabolism, oxidative stress, telomere length, and DNA methylation in patients with post-viral fatigue and chronic fatigue syndrome show medium effect sizes [8,10,15,17]. We used G*Power to calculate the sample size needed to detect at least a fixed medium effect (f = 0.25) for Long-COVID with 80% power at the 5% significance level using a general linear model. As sex, age, BMI, severity of the initial infection, and total white blood cell count (WBC) can influence metabolic and immune functions, we corrected the calculation by including two dichotomous variables (sex and ICU stay) and three continuous variables (age, BMI, WBC) in our models. The sample size needed to perform our analyses was 158. Different multivariate general linear models (mGLMs) were employed to assess whether symptoms, physical function, or biological markers were significantly different between groups. Models were adjusted for sex, ICU stay, age, and BMI (and WBC when exploring biological factors). Finally, linear regression models were employed to explore whether the main symptom cluster or physical function (6MWT distance) could be predicted by the biological factors collected. Sex and ICU stay were used as fixed factors, and age and BMI as covariates.
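
To give a sense of what a medium effect of f = 0.25 means, the short Python sketch below converts Cohen's f into the proportion of variance explained, using the standard relation f² = η² / (1 − η²). It does not reproduce the G*Power sample size calculation, and the function name is hypothetical.

```python
def eta_squared_from_f(f: float) -> float:
    """Convert Cohen's f to eta squared (proportion of variance explained)."""
    return f ** 2 / (1 + f ** 2)


print(round(eta_squared_from_f(0.25), 3))  # 0.059
```

A medium effect of this size therefore corresponds to roughly 6% of the outcome variance being attributable to the factor of interest.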

Results

In total, 358 patients agreed to participate in the study. Demographic and clinical characteristics are summarised in Table 1. They all were infected with SARS-CoV-2 between March 2020 and March 2022. Of these, 328 (91.6%) were hospitalised at UZ Leuven because of symptoms caused by the infection, of whom 130 (36.3%) had severe symptoms requiring intensive care admission. Thirty (8.4%) patients were ambulatory referrals to the Respiratory Diseases outpatient COVID follow-up clinic of the UZ Leuven by their treating primary physician. Mean follow-up time-point was 306 days from infection (± 149 days). In total, 207 patients gave their consent for the amendment study and additional biological analyses were performed. A flowchart of the study can be found in Fig. 1.

Table 1 Socio-demographic and clinical characteristics of the explored cohort. Data presented as raw means and standard deviations (SD). BMI, body-mass index; HADS-D, depression as assessed by the Hospital Anxiety and Depression Scale; DSQ, DePaul Symptom Questionnaire; 6MWT, 6-min walking test; mGLM, multivariate general linear models including age and BMI as covariates, gender and ICU as fixed factors with p < 0.05. Bold values represent statistically significant differences between the cluster and the asymptomatic controls, after correcting for multiple testing. †Significant between-group differences at the chi-square test. Superscript letters: p, comparison between controls and the PEM-fatigue subgroup; c, comparison between controls and the cognitive subgroup; v, comparison between controls and the visceral subgroup; a, comparison between controls and the autonomic subgroup; I, comparison between controls and the immune subgroup; n, comparison between controls and the neurological subgroup

Fig. 1

Case definition and symptom prevalence

The most common symptoms reported by patients in the present study were fatigue (33.2%), unrefreshing sleep (29.5%), PEM (24.4%), and muscle pain (22.3%). Symptoms were reported to be of at least moderate intensity and felt at least half of the time in an average week. PCA revealed that symptoms clustered in 6 main independent components. The first component includes PEM, fatigue, unrefreshing sleep, muscle weakness, and pain. We called this component PEM-fatigue. This component alone explained 40.27% of the symptom variance. The second component, Cognitive dysfunction, includes problems with memory, attention, and expression. The third component, Autonomic symptoms, includes unsteadiness when standing, cold hands and feet, and unexplainable hot–cold feelings. The fourth component, Visceral problems, includes bloating, stomach-ache, and irritable bowel symptoms. The fifth component, named Neurological symptoms, includes headache and sensitivity to light, noise, and smells. The sixth component, called Immune disturbances, includes sore throat and flu-like symptoms. A full description of the PCA can be found in the additional files (Additional file 2: Tables S1–S3, Figs. S1–S4). Then, we used K-means cluster analyses to separate the population into two non-overlapping clusters based on symptom severity for each symptom component. Two hundred and thirty-one subjects (64.5%) scored low on all six components and belonged to no cluster. These subjects showed no persistent symptoms and served as asymptomatic controls. One hundred and twenty-seven participants (35.5%) had moderate or severe symptoms in at least one component and were defined as Long-COVID patients. The most prevalent set of symptoms was PEM-fatigue, present in 57 participants (15.9%). The second most prevalent set of symptoms was related to cognitive dysfunction, present in 41 patients (11.4%). Thirty-seven patients (10.3%) showed significant immune symptoms, 30 reported visceral problems (8.3%), 19 autonomic symptoms (5.3%), and 15 neurological symptoms (4.1%) (Fig. 1 and Additional file 2: Fig. S2). We also attempted to identify a scoring system that would predict whether a patient belongs to a certain cluster. Using receiver operating characteristic (ROC) analyses, we showed that a score of 4 or higher in most items of each cluster predicts whether a patient belongs to the cluster of interest with excellent accuracy (Additional file 3: Tables S4–S7 and Figs. S5–S8). Our results are in line with previous studies using the DSQ in patients with ME/CFS.
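
The scoring rule can be expressed as a simple check. The Python sketch below reads "most items" as a strict majority of the items in a cluster, which is an assumption; the exact rule and its accuracy are reported in Additional file 3. The function name and the item scores are hypothetical.

```python
def meets_cluster_rule(item_scores, cutoff=4):
    """True if more than half of the cluster's items score >= cutoff.

    Assumption: "most items" is read as a strict majority.
    """
    if not item_scores:
        raise ValueError("item_scores must not be empty")
    n_high = sum(score >= cutoff for score in item_scores)
    return n_high > len(item_scores) / 2


# Made-up item scores (0 to 16) for two hypothetical patients.
print(meets_cluster_rule([6, 4, 9, 2, 4]))  # True: 4 of 5 items reach 4
print(meets_cluster_rule([0, 2, 4, 1, 3]))  # False: 1 of 5 items reaches 4
```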

Associations between patients’ characteristics and clinical symptoms

An overview of the socio-demographic and clinical characteristics of the explored cohort can be found in Table 1. Significantly more women were in the Long-COVID group (44% vs. 32% in the control group). Time from infection did not influence long-term symptoms. Acute infection severity (i.e. need for ICU at admission) did not influence the prevalence of persistent symptoms: 14.9% of participants who were admitted to the ICU reported persistent symptoms vs. 20.4% of participants who were not. Similarly, acute infection severity did not influence physical function: need for ICU at admission was not associated with the distance covered at the 6MWT.

Relevance of biological factors in Long-COVID

A complete overview of the biological factors assessed, including raw and corrected values, can be found in Table 2. No between-group differences were found in any biological marker between asymptomatic controls and the patient group, when including all patients with at least one symptom cluster. However, when comparing asymptomatic controls to patients in the PEM-fatigue cluster, troponin T, telomere length, and mtDNAcn were significantly different between groups (Table 2 and Fig. 2).

Table 2 Biological factors assessed. Data presented as raw means and standard deviations (SD). HbA1c, glycated haemoglobin; ALT, alanine aminotransferase; AST, aspartate aminotransferase; CRP, C-reactive protein; eGFR, estimated glomerular filtration rate; WBC, white blood cell; T/S, telomere copy number to single-copy gene number; mtDNAcn, mitochondrial DNA copy number; mtDNA/S, ratio of ND1/Hmito3 copy number to single-copy gene number; mGLM, multivariate general linear models including age and BMI as covariates, gender and ICU as fixed factors with p < 0.05. Bold values represent statistically significant differences between the cluster and the asymptomatic controls, after correcting for multiple testing. *Statistically significant between-group differences

Fig. 2

This suggested that different biological mechanisms underlie different clinical subgroups. We thus decided to continue the analyses focussing on the PEM-fatigue subgroup, as it was the largest subgroup of patients and alone explained over 40% of the symptom variance. Linear regression showed that PEM-fatigue symptoms were predicted by telomere length, troponin T, and age. No cytokine was able to predict PEM-fatigue symptoms (Table 3). Troponin T was significantly higher in women (corrected mean diff. = 6.49 [SE = 2.68], p = 0.016) and older participants (β = 0.28 [SE = 0.07], p = 0.001). Age also predicted telomere shortening (β = −0.007 [SE = 0.002], p = 0.001) [33].

Table 3 Results from regression models for PEM-fatigue symptoms. TL, telomere length; mtDNAcn, mitochondrial DNA copy number; DNAm, DNA methylation. #Combined model refers to the regression models combining all biological markers: serological, cytokines, and markers from PBMCs. All models were adjusted for age, BMI, gender, ICU, time from infection, and total white blood cell count


Discussion

The present study aimed to investigate persistent symptoms in a large cohort of patients who were infected with SARS-CoV-2 and to explore possible biological mechanisms that can contribute to the persistence of symptoms. We analysed serological markers, inflammatory markers in plasma, telomere length as a proxy of general cellular health and biological ageing, mtDNAcn as a proxy of cell metabolism, and LINE-1 DNA methylation as a proxy of global DNA methylation. To the best of our knowledge, our cohort is one of the largest to date with such a comprehensive clinical and biological assessment in patients with Long-COVID.

Over 35% of patients infected with SARS-CoV-2 reported at least 2 persistent symptoms between 4 and 24 months after infection. Based on patients’ clinical presentation, we were able to identify 6 different clusters. The main cluster, present in almost half of those with persistent symptoms (and 16% of the total population), included fatigue, PEM, pain, and sleep disturbances. Other clusters were defined by cognitive dysfunction, visceral problems, autonomic symptoms, immune disturbances, and neurological symptoms. Similar findings were published in over 2000 patients with ME/CFS [34]. Several large cohorts have been exploring Long-COVID in the past few months, but very few attempted to cluster patients based on their symptom presentation [35,36,37,38].

Patient stratification based on the clinical presentation is a promising approach to study underlying biological mechanisms in more homogeneous groups. We found that PEM-fatigue symptoms are associated with higher levels of troponin T, shorter telomere length, and lower mtDNAcn. Higher troponin T and shorter telomere length also predicted PEM-fatigue symptoms in linear regression models. Troponin T is therefore the result that is most consistent in our analyses, as it predicts both symptoms and physical function. Troponin increase is used as a sign of cardiac involvement and it has been thoroughly investigated in patients with COVID-19 [39,40,41]. Though it is not clear whether troponin levels are a consequence of SARS-CoV-2 infection or are associated with comorbidities [39], previous research showed that they remain elevated for up to 14 months [40,41]. Here, we show that troponin T (together with D-dimer, another marker of vascular dysfunction) is significantly elevated in patients who show persistent symptoms and decreased physical function after 12 months. This suggests a possible cardiac contribution to symptom persistence, which should be considered when proposing rehabilitation programmes.

Telomere length is known to be associated with chronological ageing and cellular senescence [33,42]. Together with other markers, telomere length represents an informative marker to assess biological age and general health [43]. Telomere shortening is influenced by several factors through the lifespan, such as chronic stress, oxidative stress, inflammation, and metabolic mechanisms [43]. Telomere shortening has been associated with fatigue [44] and has been observed in patients with ME/CFS [45]. Similarly, telomere shortening and accelerated biological ageing were found in people with COVID-19 and have been suggested as predictors of symptom persistence [12,46]. Telomeres can only shorten during cell division. In total, cells divide between 36 and 120 times during an 85-year lifespan, with half of these divisions happening in the first 24 years of age [47]. Thus, it is more likely that telomere shortening happened before the infection, rather than as a consequence of it. This calls for more research focussing on how to maintain longer telomeres, especially during early life, as a form of prevention.

mtDNAcn decline is often found during ageing and in metabolic diseases, and it has been proposed as a biomarker for poorer health [48]. Our results would be in line with these observations, as one of the main mechanisms associated with Long-COVID, especially when fatigue and PEM are present, is impaired metabolism [49]. Although mtDNAcn was lower in patients with PEM-fatigue symptoms, it did not significantly predict symptoms in our regression models. More research is needed to elucidate the actual role of mitochondrial biogenesis in Long-COVID.

A final relevant result from our biological analyses is the lack of associations between Long-COVID and immune markers such as CRP and cytokine expression. Immune dysregulation has been consistently found after SARS-CoV-2 infection, and accumulating evidence shows that this might be the case for Long-COVID as well [50,51]. The most significant cytokines identified in previous studies were GM-CSF, TNF-α, IFN-γ, IL-2, IL-4, IL-6, IL-10, and IL-17 [50,51]. We included those cytokines in our study but found neither between-group differences nor associations with clinical symptoms. This might be explained by the specific design of each study, such as the eligibility criteria for the patient and control groups, the case definition for Long-COVID, and the confounding factors and covariates included in the statistical models (i.e. adjusted for age, BMI, sex, need for ICU, and total WBC). A systematic review on the topic included 15 studies, showing significant heterogeneity in the methodology applied [50]. For instance, some define Long-COVID as symptoms persisting for 6 weeks, though current consensus agrees on defining Long-COVID only when symptoms persist for at least 4 months [51].

Strengths and limitations

The results from the present study should be interpreted with caution. First, our study might suffer from selection bias leading to an overestimation of persistent symptoms, given that the majority of cases had been admitted to hospital. In addition, it was not possible for us to control for every possible comorbidity or medication intake. Patients were enrolled over a period of 2 years, which means that they were likely infected by different virus variants and with different viral loads. We were unfortunately unable to track this in their medical records. Secondly, ICU stay was used as a surrogate marker for acute disease severity, which is not ideal given that ICU referral/admission varied over the course of the pandemic (i.e. a lower threshold for ICU admission was set during the first two waves as compared to the third wave, where more semi-intensive care on the ward was applied). Some data are missing in our dataset, as a consequence of declined participation or incomplete clinical questionnaires. When biological data were missing, it was mostly because patients did not agree to physically visit the department or because the serological analyses were not reported in the medical record. In total, only about half of the participants had a complete assessment including both clinical and biological measurements. However, the number of patients included in the present study was sufficient to perform adequately powered statistical analyses, according to the sample size calculation. Finally, the present study is cross-sectional in nature, and we thus cannot give a definite answer on the cause of Long-COVID. We indeed call for further research exploring biological mechanisms in patients with Long-COVID, to confirm our results. However, all our models are random effect models adjusted for age, BMI, gender, ICU admission, and total white blood cell count. These are solid models. In addition, the fact that two out of three markers (telomere length and troponin expression) were also found significant in regression models (again, using model adjustments as above) strengthens our between-group results.

Conclusions

We confirm previous research showing that about a third of patients with COVID-19 reported persistent symptoms at 4 to 24 months after infection. Long-COVID should not be regarded as a single entity. We clustered patients according to their clinical presentation and found 6 independent subgroups that should be considered in future research. Our results suggest that cardiac involvement (as measured by troponin T levels) and telomere shortening might be relevant risk factors for developing PEM-fatigue symptoms, and possibly other post-viral syndromes, which deserve further exploration. There is an urgent need for a better understanding of biological mechanisms in Long-COVID and similar syndromes like ME/CFS. Future research should employ patient clustering to better subgroup patients, use prospective designs to unravel symptom fluctuation and causal mechanisms, and identify disease-specific targets to develop tailored treatments.
