Abstract

Aging is the primary risk factor for many cancer types, including lung adenocarcinoma (LUAD). To understand how aging-related alterations in the regulation of key cellular processes might affect LUAD risk and survival, we built individual-specific gene regulatory networks integrating gene expression, transcription factor protein-protein interaction, and sequence motif data, using the PANDA/LIONESS algorithms, for non-cancerous lung samples from the GTEx project and LUAD samples from TCGA. In healthy lung, pathways involved in cell proliferation and immune response were increasingly targeted with age; these aging-associated alterations were accelerated by smoking and resembled oncogenic shifts observed in LUAD. Aging-associated genes showed greater aging-biased targeting patterns in individuals with LUAD compared to healthier counterparts, a pattern suggestive of age acceleration. Using the drug repurposing tool CLUEreg, we found small molecule drugs that may alter the accelerated aging profiles we identified. We defined a network-informed aging signature that was associated with survival in LUAD.

Subject terms: Cancer, Genetics

Introduction

Lung cancer is second only to breast cancer worldwide in annual incidence and is the leading cause of cancer death. Lung cancer risk increases with age, and as the average age of the population increases worldwide, the prevalence of lung cancer is expected to continue growing^[37]1. In 2021, 75% of lung cancer fatalities were reported in individuals aged 65 and older^[38]2. While lung adenocarcinoma (LUAD) in younger adults is often diagnosed at more advanced stages than in older adults^[39]3, elderly individuals have more comorbidities and tend to be less tolerant of certain cancer therapeutics than younger individuals^[40]4. 
These differences are likely the result of aging-induced alterations in the regulation of key cellular processes^[41]5, but the mechanism by which age shifts the gene regulatory landscape to alter lung cancer risk and survival outcome is largely unknown. In this paper we address this critical gap in our understanding by building individual (person)-specific gene regulatory networks to gain insights into aging related changes in gene regulation that might influence the risk and prognosis of LUAD across all age groups. Additionally, we explore how aging-associated regulatory changes are further accelerated by tobacco smoking history, since lung diseases including LUAD are more prevalent among individuals with a history of smoking, compared to individuals who have never smoked in their lifetime^[42]6. Transcription factors (TF) have been established as known drivers of aging and cancer^[43]7. In recent research^[44]8 involving mouse single-cell RNA-sequencing atlas, it has been shown that within immune cells from different organs including lung, TF activity changes with age. This pattern was particularly dominant in tumor-associated macrophages. Aging-associated TF dysregulation was also observed in antigen processing and inflammation, both of which are key hallmarks in many age-related diseases including cancers. To understand aging-associated heterogeneity in the gene regulatory landscape of LUAD, we identified biological pathways that are differentially regulated by TFs in tumors from individuals of different ages and investigated if any of these age-associated changes in regulatory networks might influence survival and the response to chemotherapy, potentially favoring individuals exhibiting regulatory signatures akin to those found in younger individuals. We further validated our findings in two independent datasets of non-cancerous lung tissue and LUAD. 
Most studies investigating the role of aging in LUAD have focused on the mutational landscape of tumor among individuals across various age groups^[45]9,[46]10. Tumor mutations in several genes, including CDKN2A, KRAS, MDM2, MET, and PIK3CA, have been found to increase in frequency with the individual’s age, while the frequencies of mutation in ALK, ROS1, RET and ERBB2 show a decreasing trend with age^[47]11. ALK and EGFR mutations are high among younger individuals with LUAD, especially among females and nonsmokers^[48]10,[49]12,[50]13. Analysis of somatic interactions has indicated that EGFR-positive samples in younger individuals are more prone to concurrent mutations in PIK3CA, MET, TP53, and RB1 when compared to older individuals^[51]11. Age may influence both the number of mutations in a tumor and their evolutionary timing^[52]14. While germline mutations are more commonly identified in tumors from younger individuals, tumors in older individuals appear to be predominantly influenced by somatic mutations^[53]15. Such mutations clearly play a role in cancer risk and prognosis, acting in part, by altering the activity of biological pathways associated with cancer. However, changes in these pathways can only be partially explained by known mutations, indicating that other mechanisms of pathway activation might play a significant role^[54]16. Despite some studies in lung cancer that have reported altered expression of genes linked to survival^[55]17,[56]18, there has been limited research investigating aging-associated alterations in gene regulatory networks that influence the risk and prognosis of LUAD and lung cancer in general. 
We addressed this gap in our understanding by using the network-modeling approaches PANDA^[57]19 and LIONESS^[58]20 to derive person-specific gene regulatory networks for non-cancerous lung tissue samples from the Genotype Tissue Expression project (GTEx) and LUAD tumor samples from The Cancer Genome Atlas (TCGA), with a focus on evaluating age-associated genes and their regulation by TFs. This approach was motivated by multiple earlier network-modeling analyses that identified disease-relevant regulatory features in both healthy tissues and tumors^[59]21–[60]24. PANDA (Passing Attributes between Networks for Data Assimilation) is a computational method to synthesize three sources of information about the regulatory process and infer condition-specific regulatory networks connecting regulatory TFs and the genes they control. PANDA starts with a “motif prior” based on mapping TFs to likely promoter regions of the genome to establish candidate TF-gene regulatory connections, uses pairwise correlations between gene expression levels as a measure of possible co-regulation by the same TF, considers TF-TF protein-protein interaction (PPI) data as a sign of regulatory cooperation, and then uses a machine learning “message passing” framework to iteratively update the weights of regulatory edges, finally producing a bipartite network of directed edges from TFs to target genes. LIONESS (Linear Interpolation to Obtain Network Estimates for Single Samples) builds upon PANDA by enabling estimation of sample-specific gene regulatory networks. The conceptual framework underlying LIONESS is that a gene regulatory network estimated on a population is an average over the networks for individuals in that population, such that removing an individual perturbs that average estimate in a way that is, to first order, linear in the network edge weights. LIONESS begins with a PANDA network for the entire population. 
LIONESS then systematically removes each sample, one at a time, reconstructing the PANDA network for the remaining data, and uses linear interpolation between the gene regulatory networks with and without that sample to estimate the network for the left-out individual; the process repeats iteratively until a PANDA network has been generated for each member of the starting population. By analyzing individual-specific regulatory networks, we found increased TF targeting of pathways related to intracellular adhesion, cell proliferation, and immune response with age in healthy lung tissue. These aging-associated alterations are further increased by tobacco smoking and resemble oncogenic shifts in the regulatory landscape observed in LUAD, thereby suggesting a potential association between aging-associated dysregulation of biological pathways and an elevated risk of developing LUAD. Using a web-based drug repurposing tool CLUEreg, we also found potential geroprotective small molecule drug candidates that may be useful in reducing the risk of LUAD by reversing the aging-associated regulatory signatures. By constructing a network-informed aging signature for tumor samples based on the TF-targeting patterns of key biological pathways significantly changing with age in LUAD, we found that a lower aging signature is associated with better survival probability. In contrast, chronological age was not predictive of survival, thus demonstrating that the aging signature captures aspects of tumor biology not captured by chronological age alone. Using CLUEreg, we also found distinct small molecule drug candidates tailored to LUAD samples with varying aging signatures. 
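The leave-one-out interpolation that LIONESS performs, as described in this section, can be illustrated with a minimal sketch. This is not the pipeline used in the study: to keep the example self-contained, a simple Pearson correlation network stands in for PANDA as the aggregate network method, the expression data are synthetic, and the function names `lioness` and `correlation_network` are hypothetical.

```python
import numpy as np

def lioness(data, aggregate):
    """Estimate one network per sample by leave-one-out linear interpolation.

    data:      array of shape (n_samples, n_features)
    aggregate: function mapping a (samples x features) array to a network,
               i.e. an array of edge weights; it stands in for PANDA here.
    """
    n = data.shape[0]
    net_all = aggregate(data)                  # network from all samples
    nets = []
    for q in range(n):
        rest = np.delete(data, q, axis=0)      # leave sample q out
        net_rest = aggregate(rest)             # network without sample q
        # Linear interpolation: e_q = N * (e_all - e_rest) + e_rest
        nets.append(n * (net_all - net_rest) + net_rest)
    return np.stack(nets)

def correlation_network(x):
    """Toy aggregate network (assumption): Pearson correlation between features."""
    return np.corrcoef(x, rowvar=False)

rng = np.random.default_rng(0)
expr = rng.normal(size=(20, 5))                # 20 synthetic samples, 5 genes
sample_nets = lioness(expr, correlation_network)
print(sample_nets.shape)                       # (20, 5, 5)
```

When the aggregate network is exactly linear in the samples (for example, a plain average of per-sample quantities), this interpolation recovers each sample's contribution exactly; for nonlinear methods such as PANDA it is an approximation.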
In conclusion, our findings not only highlight the mechanisms underlying increased risk and poorer prognosis of LUAD associated with aging-induced gene regulatory alterations but also establish a potential avenue for leveraging individual-specific gene regulatory networks in designing personalized therapeutic interventions.

Results

Identifying aging-associated gene regulatory alterations in healthy human lung and geroprotective drug candidates

We used the PANDA + LIONESS algorithms to infer individual-specific gene regulatory networks using gene expression data from non-cancerous lung tissue samples from GTEx (Fig. [61]1). For every individual, the algorithm infers a bipartite gene regulatory network linking TFs to their target genes within that individual. By analyzing these networks across samples, we identified several genes (Fig. [62]2a) that are differentially targeted by TFs in the lung as a function of age. Among these genes, there are 1018 that exhibit significantly increased targeting by TFs with age (p < 0.05). Most significant among them are NNAT^[63]25, FBLN7^[64]26, SH3BP1^[65]27, CNTN1^[66]28, THEM5^[67]29, and FOXP4^[68]30; upregulation of these genes has been previously reported to be associated with cell proliferation and poorer prognosis in multiple cancers. We also find 404 genes that exhibit significantly decreased targeting by TFs with age (p < 0.05), including DUSP15^[69]31, ALDH1L2^[70]32, HPD^[71]33, GSTT2^[72]34, FOXI3^[73]35, and ZIC2^[74]36, all of which have been shown to be influential in predicting tumor progression and therapeutic efficacy in various cancer types, including non-small cell lung cancer.

Fig. 1. Schematic overview of the study. Top box, overview of our approach to constructing individual-specific networks using PANDA and LIONESS, which integrate information on protein-protein interactions (PPIs) between transcription factors (TFs), prior information on TF-gene motif binding, and gene expression data (in this case, from GTEx lung tissues and TCGA LUAD primary tumor samples, downloaded from Recount3). Bottom box, overview of the differential targeting analysis.

Fig. 2. Change in TF-targeting of genes with age. a Left: Volcano plot of genes that are differentially (increasingly or decreasingly) targeted by TFs over varying age in lung tissue samples from GTEx. The x-axis represents log fold change (logFC), which is defined as the change in gene indegree in response to a unit change in age. The y-axis represents the negative logarithm of the p values (−log10(P.Value)). b Right: Boxplot of the t-statistics of the age coefficient associated with the oncogenes and tumor suppressor genes (listed in the COSMIC database) from the limma analysis in GTEx. A positive value means these genes are targeted more with age on average and a negative value means these genes are targeted less with age on average. For comparison, we also show the same t-statistics for non-cancer genes (genes that are not annotated as oncogenes and/or tumor suppressor genes in the COSMIC database). Each boxplot ranges from the upper and lower quartiles with the median as the horizontal line. Outliers are marked by points. In our analysis we include genes that are explicitly marked as either “Oncogene” or “TSG” respectively in the COSMIC database, thus excluding all the genes that can work either as an oncogene or as a tumor suppressor gene, depending on the mutation. The reported p values correspond to the hypothesis testing with respect to alternative hypotheses reported in parentheses. The alternative hypotheses “<” or “>” denote the hypotheses “mean > 0” and “mean < 0” respectively.

We compared the list of aging-related genes with a known list of cancer-associated proto-oncogenes and tumor suppressor genes downloaded from the Catalog of Somatic Mutations in Cancer (COSMIC)^[79]37. Among genes increasingly targeted with age, there was only one proto-oncogene, XPO1, and among genes decreasingly targeted with age, there were four tumor suppressor genes (TSGs): DICER1, PIK3R1, POT1 and PTEN. We further examined the network topology surrounding all cancer-related genes annotated in the COSMIC database by analyzing gene targeting score in relation to age. Specifically, we compared the t-statistics of the age coefficients (normalized age gradients) for TF targeting of oncogenes and TSGs (Fig. [80]2b) and found that among the healthy samples in GTEx, TF-targeting increases with age on average for both oncogenes and TSGs (Wilcoxon rank sum test gives a p value of <4e-09 for oncogenes and 0.001 for TSGs). For comparison, non-cancer genes (that is, genes not annotated as oncogene or TSG in the COSMIC database) are also increasingly targeted by TFs with age (p value of Wilcoxon rank sum test is 2.2e-16), meaning that the gene regulatory networks inferred for individuals in GTEx increase in TF regulatory density as the age of the individual increases. Nevertheless, the mean increase in TF-targeting with age is the greatest for oncogenes (the aging slope of oncogenes is significantly larger than those of non-cancer genes and tumor suppressor genes, with p values equal to 0.001 and <2.2e-16, respectively). This indicates that although changes in regulation are a natural consequence of aging, the greatest changes occur in the regulatory neighborhoods of oncogenes, including genes that are common drug targets^[81]38 in LUAD, such as MYCN, ERBB3, and AKT1 (Fig. S.[82]1).
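The gene targeting score and age coefficient used in this analysis can be sketched as follows. This is a simplified illustration under stated assumptions rather than the study's procedure: the targeting score is taken to be the gene in-degree (the sum of incoming TF edge weights), an ordinary least-squares t-statistic replaces limma's moderated t-statistic, no covariates other than age are included, and the networks are synthetic. The function names are hypothetical.

```python
import numpy as np

def gene_indegree(networks):
    """Targeting score: sum of incoming TF edge weights for each gene.

    networks: array of shape (n_samples, n_tfs, n_genes)
    returns:  array of shape (n_samples, n_genes)
    """
    return networks.sum(axis=1)

def age_t_statistics(indegree, age):
    """Regress each gene's in-degree on age by ordinary least squares.

    Returns (slope, t_statistic) per gene for the age coefficient.
    """
    n = len(age)
    design = np.column_stack([np.ones(n), age])        # intercept + age
    coef, _, _, _ = np.linalg.lstsq(design, indegree, rcond=None)
    resid = indegree - design @ coef
    dof = n - design.shape[1]                          # residual degrees of freedom
    sigma2 = (resid ** 2).sum(axis=0) / dof            # residual variance per gene
    xtx_inv = np.linalg.inv(design.T @ design)
    se_age = np.sqrt(sigma2 * xtx_inv[1, 1])           # standard error of the slope
    return coef[1], coef[1] / se_age

# Synthetic example: 30 individuals, 4 TFs, 3 genes; gene 0 gains targeting with age.
rng = np.random.default_rng(0)
age = np.linspace(20, 80, 30)
networks = rng.normal(scale=0.1, size=(30, 4, 3))
networks[:, :, 0] += (0.05 * age / 4)[:, None]         # true in-degree slope: 0.05 per year
indegree = gene_indegree(networks)
slope, t_stat = age_t_statistics(indegree, age)
print(np.round(slope, 3), np.round(t_stat, 1))
```

Genes can then be ranked by the t-statistic of the age coefficient, which is the quantity compared across oncogenes, TSGs and non-cancer genes in the text.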
We performed gene set enrichment analysis (GSEA) with genes ranked by how much their targeting patterns change with age, and found (Fig. [83]3 leftmost column) that biological pathways associated with intracellular adhesion and cell proliferation, cell growth, and death have increasing TF targeting with age, including pathways annotated to adherens junction, apoptosis, hematopoietic cell lineage, cell adhesion molecules, and pathways in cancer. Pathways associated with immune response, including B-cell receptor signaling pathway, cytokine-cytokine receptor interaction, chemokine signaling pathway and intestinal immune network for IgA production, also show increased TF targeting with age. We confirmed these findings in an independent dataset, LGRC (Fig. [84]3 middle column).

Fig. 3. Aging-related changes in pathway targeting by transcription factors are similar to those in lung adenocarcinoma. Heatmap of normalized enrichment scores (NES) for pathways that are significantly (at FDR cutoff 0.05) differentially targeted by transcription factors with age among non-cancerous lung samples (GTEx). The first two columns exhibit NES from GSEA based on the age coefficients from the limma analysis of GTEx and LGRC. The third column shows NES from GSEA based on the difference between tumor samples from TCGA and healthy samples from GTEx.

Using genes that are differentially targeted by age as input (1018 increasingly targeted and 404 decreasingly targeted by TFs with age) to a web-based drug repurposing tool, CLUEreg^[87]39, we identified small molecule drug candidates (Supplementary Data [88]S1) with potential to reverse the aging-associated regulatory alterations in the gene regulatory networks from non-cancerous lung samples. CLUEreg compares these lists to the library of 19,791 drug-specific gene regulatory networks to identify drug candidates that are likely to reverse the targeting scores of the genes in the input lists. 
For each drug, CLUEreg computes a p value by resampling 10,000 random inputs of varying lengths and using them to construct a null distribution. For our given input gene lists, CLUEreg gave a list of 150 drugs as output. Among these drugs, we selected 112 drugs with p < 0.05 as our candidate drugs with the potential to reverse gene regulatory signatures of aging. Some of these drugs, henceforth referred to as “geroprotective drugs”, including Carnosol^[89]40, Curcumin^[90]41, Cucurbitacin B^[91]42, Isonicotinamide^[92]43, Meclofenoxate^[93]44, Scriptaid^[94]45, and Withaferin A^[95]46, have already been shown to have potential geroprotective effects in various animal models, including humans. Among these geroprotective drug candidates, we found several FDA-approved anti-cancer drugs, including Trametinib, Doxorubicin, Alisertib, Actinomycin-d, Toremifene, and Plumbagin, as well as several investigational drugs with potential anti-tumor effects including Avrainvillamide analogs^[96]47, aurora kinase inhibitors (MK-5108, AT-9283)^[97]48, Avicins^[98]49, HMN-214^[99]50, Chaetocin^[100]51, ron kinase inhibitors^[101]52, and Linifanib^[102]53, among others.

Tobacco smoking is associated with accelerated aging

To explore whether tobacco smoking is associated with an acceleration of the aging process, we compared the gene regulatory networks from non-cancerous samples between individuals with a history of smoking and individuals who have never smoked in their lifetime. We split the 1422 aging-associated genes we had previously identified into two sets: genes that exhibit increasing TF targeting with age (1018 genes) and those that show decreasing TF targeting (404 genes). For every gene in these two sets, we used limma^[103]54 to compute the age coefficient (Fig. [104]4) in a linear model for individuals with and without a history of smoking.

Fig. 4. Aging-related transcription factor-targeting of genes by smoking history. Boxplot of the rates of change in TF-targeting with age in GTEx (designated by the t-statistics from the limma analysis with interaction between age and smoking status) a Left: for 1018 genes that are increasingly targeted with age in healthy human lung (based on evidence from GTEx) and b Right: the same boxplot for 404 genes that are decreasingly targeted with age in healthy human lung (based on evidence from GTEx). “Ever” refers to the group of individuals with a prior history (both past and current) of tobacco smoking and “Never” refers to the group of individuals without a prior history of smoking.

We found that for genes with increasing age-associated TF targeting, the t-statistics of the age coefficients among individuals with a history of smoking have significantly greater positive values than among never-smokers (p < 2.2e-16). Similarly, for genes with decreasing TF-targeting with age, the t-statistics of the age coefficients among individuals with a history of smoking have significantly (p < 2.2e-16) larger negative values than those among individuals who have never smoked. In other words, for both kinds of aging-associated genes, the age gradients are significantly steeper for individuals with a history of smoking than for individuals who have never smoked in their lifetime. The steeper age gradients mean that the rate of change in gene regulation with age is faster among individuals with a history of smoking, compared to individuals who have never smoked. To visually represent the continuous changes in gene regulation with age, we plotted the aging trajectories (as described in “Methods”) for the non-cancerous samples from GTEx (Fig. S.[107]2 in the [108]Supplementary Material). For the genes which are increasingly targeted by TFs with age (left plot on Fig. [109]S2), the slope of the regression line is steeper among individuals who have a history of smoking, compared to individuals who have never smoked. 
For genes that are decreasingly targeted by TFs with age (right plot of Fig. [110]S2), however, we do not observe any significant difference between the slopes of the aging trajectories among individuals with or without a history of smoking. Nevertheless, these findings indicate that tobacco smoking is associated with an acceleration of the aging-induced alterations in gene regulation. We also validated these findings in the non-cancerous lung samples from the independent LGRC (a.k.a. [111]GSE47460) validation dataset (Fig. [112]S3). Taken together, we find that even in individuals without evidence of lung cancer, there are aging-associated changes in the TF-targeting of genes that are further accelerated by tobacco smoking, which may be linked to an increased risk of developing LUAD at a younger age among individuals with a history of smoking.

Aging-associated gene regulatory alterations in non-cancerous lung resemble oncogenic gene regulatory shifts observed in LUAD

To explore the association of aging with LUAD risk, we compared the targeting patterns of aging-associated biological pathways between non-cancerous samples from GTEx and LUAD samples from TCGA. We observe that aging-associated pathways involved in cell adhesion, cell proliferation and immune response (except for type I diabetes mellitus and allograft rejection pathways) are also more highly targeted in LUAD tumors than in non-cancerous lung (Fig. [113]3 rightmost column). This indicates that increased TF-targeting of these pathways with age might be a contributing factor to an elevated risk of developing LUAD among older adults and that those with the greatest regulatory targeting of these pathways might be at the greatest risk. LUAD among younger individuals, although less frequent, is often detected at more advanced stages compared to their older counterparts^[114]3. 
Given that we found age-acceleration of gene targeting to correlate with LUAD, we tested whether LUAD in younger individuals was also associated with patterns of accelerated aging, compared to healthier individuals of similar age. To test this hypothesis, we compared the TF-targeting pattern of 1422 aging-associated genes in normal adjacent lung samples from individuals with LUAD (from TCGA) versus non-cancerous lung samples from GTEx (Fig. [115]5). For the 1018 genes that exhibited increased targeting with age, we found that their mean TF-targeting was significantly higher (p < 2.2e-16) in the normal adjacent lung samples from younger individuals with LUAD (younger than the median age of 66), compared to individuals of similar age without LUAD (Fig. [116]5 left). In contrast, for older individuals with LUAD (older than the median age of 66) we did not find a significantly higher mean targeting of aging genes compared to healthy individuals of similar age.

Fig. 5. Individuals with lung adenocarcinoma show transcription factor targeting patterns indicative of accelerated aging, even in normal lung tissue, compared to those without the disease. Boxplot of the difference in transcription factor-targeting of genes in TCGA normal adjacent samples compared to GTEx non-cancerous lung for a Left: 1018 genes that are increasingly targeted with age in healthy human lung (based on evidence from GTEx) and b Right: 404 genes that are decreasingly targeted with age in healthy human lung (based on evidence from GTEx). The reported p values correspond to the hypothesis testing with respect to alternative hypotheses reported in parentheses. The alternative hypotheses “<” or “>” denote the hypotheses “mean > 0” and “mean < 0” respectively.
In other words, the gene regulatory patterns observed in the normal-adjacent lung tissues of younger individuals with LUAD are more like those found in older individuals than those of their healthy counterparts of the same age. This suggests that LUAD in younger individuals may be driven, in part, by age-accelerated changes in gene regulation, and that this acceleration may also be associated with the more aggressive tumor biology at diagnosis that we see in younger individuals.

Biological pathways differentially regulated in LUAD tumor across varying age

We analyzed gene regulatory networks of LUAD samples from TCGA and performed GSEA to identify biological pathways that are differentially regulated by TFs with age (Fig. [119]6 left column).

Fig. 6. Transcription-factor targeting patterns of genes in lung adenocarcinoma tumor samples vary by age. Normalized enrichment scores (NES) of the biological pathways that are significantly (at an FDR cutoff 0.05) differentially targeted with age by transcription factors in tumor samples from TCGA. Left column shows NES from GSEA on TCGA samples, and the right column shows NES for the same pathways from GSEA on [122]GSE68465 samples.

We found several pathways involved in cell signaling and cell proliferation that were increasingly targeted by TFs with age. When we compared these differentially targeted pathways to those differentially targeted with age in non-cancerous lung samples from GTEx, we found that both analyses had identified the pathway associated with cell adhesion molecules. However, there were many pathways with aging-associated regulatory changes found exclusively in tumor samples and not in healthy lung samples, including the NOD-like receptor signaling pathway, FC-epsilon RI signaling pathway, toll-like receptor signaling pathway and JAK-STAT signaling pathway, all of which have been associated with LUAD development, progression and outcome. 
In contrast, we found that metabolic pathways including oxidative phosphorylation, nitrogen metabolism, arginine and proline metabolism, and ascorbate and aldarate metabolism were decreasingly targeted by TFs with age. These results were validated in an independent dataset, [123]GSE68465 (Fig. [124]6 right column). Biological pathways associated with immune response that had been previously identified as increasingly targeted (at an FDR cutoff 0.05) with age in non-cancerous lung samples also showed increased targeting by TFs with age in tumors. We also found several additional immune-related pathways with age-dependent regulatory patterns in tumor that were not evident in non-cancerous samples. Such immune-related pathways include those involved in antigen processing and presentation, graft versus host disease, JAK-STAT signaling pathway, natural killer cell mediated cytotoxicity, primary immunodeficiency and T-cell receptor signaling pathway, all of which were increasingly targeted by TFs with age. Aging-related gene regulatory dysregulation in antigen processing was previously observed in murine models^[125]8. This difference between non-cancerous lung and LUAD tumor in the age-associated targeting patterns of immune pathways can be partially attributed to differences in immune cell infiltration by age in non-cancerous lung tissue as compared to tumor. Immune infiltration analysis (Fig. [126]S4) showed that the proportion of CD8+ central memory cells increased with age in both non-cancerous lung and tumor. However, aging-related changes in immune cell composition in tumor were distinct from those in healthy lungs for most immune cell types. For example, while the proportions of activated myeloid dendritic cells and B cells increased with age in LUAD tumors, in non-cancerous lung the proportion of these cells did not change significantly with age. 
The proportion of macrophages also increased with age in tumor; in non-cancerous lung, macrophages were more abundant in samples from younger individuals. This higher infiltration of immune cells in tumor with age might be associated with a higher targeting of immune pathways among older individuals with LUAD, as evidenced by the positive correlation between immune score and TF-targeting score of immune pathways (Table S.[127]1). In contrast, the proportion of CD8+ naïve T-cells and common myeloid progenitors showed a decreasing trend with age in tumor, while exhibiting no significant difference in composition across healthy lung samples of varying age (Fig. [128]S4).

Gene regulatory network-informed aging signature of tumor predicts survival in LUAD

We conducted survival analysis using the Cox proportional hazard model to understand whether the aging-associated regulatory patterns of biological pathways have any influence on the prognosis of LUAD for individuals of varying age. First, using a randomly chosen 50% of the TCGA samples as training data, we constructed an “aging signature” for tumor samples as follows: we fitted a penalized Cox proportional hazard model with LASSO penalty, using the targeting score (as defined in “Methods”) of 28 biological pathways (Fig. [129]6) as input. These pathways were earlier discovered to be significantly differentially targeted by TFs as a function of age in tumors. The penalized model selected 4 of these 28 pathways as significantly predictive of survival outcome in the training data: pathways associated with asthma, nitrogen metabolism, oxidative phosphorylation and ribosome. We defined the biological aging signature as the predicted value from this Cox regression model. 
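Because the aging signature is the predicted value of a Cox model, it can be computed for any sample as the model's linear predictor once the coefficients are known. The sketch below illustrates only that step and a median split into groups; it does not fit the penalized Cox model. The coefficient values are hypothetical placeholders, not the fitted values from this study, the targeting scores are synthetic, and the function names are illustrative.

```python
import numpy as np

def aging_signature(pathway_scores, coefficients):
    """Linear predictor of a fitted Cox model: one value per tumor sample.

    pathway_scores: (n_samples, n_pathways) TF-targeting scores
    coefficients:   (n_pathways,) fitted log-hazard coefficients
    """
    return pathway_scores @ coefficients

def median_split(signature):
    """Label each sample 'higher' or 'lower' relative to the median signature."""
    return np.where(signature > np.median(signature), "higher", "lower")

# Hypothetical coefficients for the four selected pathways (asthma, nitrogen
# metabolism, oxidative phosphorylation, ribosome); NOT the fitted values.
beta = np.array([0.4, -0.2, -0.3, 0.1])

rng = np.random.default_rng(1)
scores = rng.normal(size=(10, 4))      # synthetic targeting scores, 10 samples
signature = aging_signature(scores, beta)
groups = median_split(signature)
for value, group in zip(signature, groups):
    print(f"{value:+.3f}  {group}")
```

Splitting samples at the median signature is one simple way to form two groups whose survival curves can then be compared, for example with a Kaplan-Meier analysis.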
This network-informed “aging signature” of tumors is a linear combination of the TF-targeting scores of these 4 biological pathways and is uncorrelated with both chronological age (correlation = 0.021 with p = 0.7397) and clinical tumor stage (p value from ANOVA = 0.842). We applied the fitted penalized Cox regression model to the TCGA test data and the [130]GSE68465 independent validation dataset to compute the biological aging signature for each sample in these datasets. After dividing all test samples into two groups depending on whether their aging signature value was higher or lower than the median across all samples, we found that samples with a lower aging signature had significantly better survival probability than samples with a higher aging signature (Fig. [131]7 left; p = 0.015 from Kaplan-Meier survival model). For comparison, we split these same samples into two chronological subgroups based on whether samples were above or below the median chronological age (Fig. [132]7 right) and did not find any significant difference (p = 0.147) in survival probability. This indicates that the aging signature, defined using gene regulatory networks, is more informative than chronological age in predicting LUAD survival in TCGA. The results remained consistent even after adjusting for self-reported gender, race, smoking status, clinical tumor stage and therapy status (p values for the network-informed aging signature and chronological age were 0.049 and 0.095, respectively). In the validation dataset [133]GSE68465 (Fig. [134]S5), we also observed that individuals with a lower biological aging signature had better survival outcomes than individuals with a higher biological aging signature; however, the difference was not statistically significant (p = 0.124).

Fig. 7. The network-informed aging signature is more informative about survival probability in lung adenocarcinoma (LUAD) than chronological age. Kaplan-Meier plot for survival outcome in TCGA, a Left: split by median network-informed aging signature and b Right: by median chronological age into younger and older individuals with LUAD.

Drug repurposing with CLUEreg identifies distinct small molecule drugs for tumor samples with lower versus higher aging signature

To find potential targeted cancer therapeutics that might differ in efficacy depending on aging signatures, we split the TCGA tumor samples into two groups: one above and one below the median value of the network-informed aging signature. For each group, we separately used CLUEreg^[137]39 to identify small molecule drug candidates based on the differential regulatory patterns between tumor and healthy samples and obtained a list of 150 small molecule drugs for each of the two aging signature groups (Fig. [138]S6). While 59 small-molecule cancer therapeutics, including FDA-approved Cisplatin and Amifostine and investigational drugs Timosaponin and Cardamonin, appeared as potential drug candidates for both aging-signature groups, several other drugs appeared exclusively in only one of the two groups. Drugs including Homoharringtonine, Ingenol, Vatalanib (investigational), Midostaurin (investigational) and Ubenimex (investigational) appeared only for individuals with lower aging signature. Other drugs including Leucovorin, Actinomycin-d and Plumbagin appeared for the higher aging signature group alone. Several potential geroprotective drug candidates, including Meclofenoxate and Isonicotinamide, also appear in the lists of anti-tumor drugs. It is worth noting that we found 28 geroprotective drugs in the list for the higher aging signature group but only 5 geroprotective drugs for the lower aging signature group. 
This suggests that the information captured by the aging signature encompasses disease-relevant processes driving LUAD, which are intertwined with other aging-related processes that also contribute to the development and progression of the disease.

Discussion

LUAD, like most other solid tumors, is an age-biased disease in which older individuals generally have a greater risk, poorer prognosis and poorer response to most therapies compared to their younger counterparts^[139]4. However, tumors diagnosed in younger individuals are often detected at more advanced clinical stages, implying more severe disease biology. Earlier studies have found that across diverse cancer types, the mutational landscape of tumors in younger individuals is very different from that in older individuals^[140]11. However, mutational burden alone does not fully explain the mechanisms of disease, which are driven by the activity of biological pathways that are activated leading up to and during disease. To bridge this gap in our understanding of LUAD, we investigated how regulation of various genes and biological pathways changes with age using individual-specific gene regulatory networks. Analyzing gene regulatory networks that integrate gene expression, TF-binding motif and TF PPI data from both non-cancerous human lung and LUAD samples, we found aging-associated alterations in regulatory mechanisms involving key biological processes. Analyzing gene regulatory networks in GTEx lung samples, we found several genes with known relevance to cancer incidence and prognosis that were differentially targeted by TFs with age, including proto-oncogenes AKT1, ERBB3 and MYCN. Using pathway enrichment analysis, we saw a clearer picture of how age affects cancer-related processes in non-cancerous lung tissue, including altered regulation of intracellular adhesion, cell proliferation, and immune response.
By conducting a differential targeting analysis on non-cancerous lung samples and LUAD samples, we confirmed that targeting of these same biological processes also changes in tumors, and in the same direction as it does in “normal” aging. This suggests that aging-associated alterations in gene regulatory patterns of these pathways might be a contributing factor to a higher risk of LUAD development among older individuals. Further, we found that tobacco smoking was associated with an acceleration in the aging-associated gene regulatory changes, helping to explain the increased risk of LUAD incidence at a younger age among individuals with a history of smoking. Gene regulatory network analysis of LUAD identified an age-associated higher targeting of several biological pathways associated with immune response; these associations were not detected in non-cancerous lung samples from GTEx. Greater targeting of immune pathways with age was correlated with a higher infiltration of immune cells including active myeloid progenitors, B cells, macrophages and CD8+ central memory cells. It is worth noting that an increased infiltration of CD8+ central memory cells with age has been previously observed in peripheral blood, indicating that this pattern is not unique to lung tissues, but rather a systematic effect prevalent in other tissues as well^[141]55. We suspect that a higher targeting of immune pathways in conjunction with a higher proportion of immune cells among older individuals might contribute to an age-biased response to immunotherapy. This is concordant with evidence from earlier studies which demonstrated that while chemotherapy is more beneficial for younger individuals^[142]5, some immune checkpoint inhibitors provide greater benefit to adults aged 65 or older, compared to younger adults^[143]56,[144]57.
We constructed a network-informed “aging signature” for tumor samples, based on the TF-targeting patterns of 4 biological pathways (identified in TCGA and validated in [145]GSE68465) that exhibited significant differential targeting by TFs with age and impacted survival outcome in the training data. We found that individuals with a lower aging signature had better survival outcome compared to those with a higher aging signature in both the test data and validation data. In fact, within TCGA, this biological aging signature predicted survival better than the chronological age at the time of tumor diagnosis. The consistent theme of age-associated alterations in regulation being linked to LUAD suggested that network-based aging signatures might identify aging-related tailored therapeutics. Separately analyzing LUAD samples partitioned into low and high aging signature groups, we found 59 small-molecule cancer therapeutics, including FDA-approved Cisplatin and Amifostine, common to both aging-signature groups. But we also found that several drugs were exclusive to one aging signature group alone, meaning that considering age-related regulatory changes might be useful in determining personalized therapeutic protocols. Certain potential geroprotective drugs including Meclofenoxate and Isonicotinamide appeared in the lists of anti-tumor drugs, mostly for the higher aging signature group, as did a number of candidate drugs, such as Curcumin^[146]41, that have been shown to have geroprotective effects. Unfortunately, older adults are severely underrepresented in clinical trials for most cancers including LUAD, thereby impacting the validity of clinical guidelines in diseases with an age effect^[147]58. Our analysis underscores the importance of including individuals across the spectrum of disease-associated demographics in clinical trials.
It is important to note that our analysis is based on observational data alone and hence experimental validation is required to establish a causal relationship between the aging-associated regulatory changes identified by our analysis and the manifestation of tumors. Another limitation of our study is that the datasets used for discovery (GTEx and TCGA) and validation primarily consist of individuals of white and African American descent. Despite adjusting for the impact of race in our analysis, the applicability of our findings to other ethnicities may still be constrained due to underrepresentation in our data and the confounding effects of various social determinants of health on lung cancer. Further studies involving more diverse populations are necessary to confirm the validity of our results across a broader range of racial and ethnic backgrounds. Additionally, a more complete inclusion of social determinants such as individual socio-economic background^[148]59 is essential for the generalizability of our findings. Despite these limitations, our analysis provides interesting insights into the aging-associated alterations in gene regulation and their relevance in the clinical manifestation of LUAD, including some immediate implications in the context of personalized cancer therapy^[149]60. Based on our analysis we infer that aging-related changes in regulation of key biological processes involved in intra-cellular adhesion, cell proliferation and immune responses are associated with altered risk, prognosis and response to therapy in LUAD among individuals of varying age. Notably, we observed that even among individuals of similar age, individuals with a lower network-informed aging signature had better prognostic outcome than individuals with a higher aging signature.
This observation implies that chronological age alone does not provide substantial information on prescribing personalized therapy for LUAD and that gene regulatory networks can prove to be effective tools in facilitating more efficient personalized therapy design and improving prognosis in LUAD for individuals of varying age. What emerges from our analysis is an interesting picture of how aging influences LUAD. “Normal” aging in the lung is associated with alterations in the regulation of particular biological processes, and indeed, by inferring and analyzing gene regulatory network structure, we identified genes and biological processes that exhibit altered patterns of regulation with age. But, as is well known, not all individuals age at the same rate. When we examine LUAD, we find that greater changes in age-associated patterns of gene regulation are more strongly associated with disease than is chronological age. We also find that smoking results in an apparent acceleration of the aging-associated patterns of gene regulation in both the lungs of “healthy individuals with a history of smoking” and in “normal adjacent” tissue from individuals suffering from LUAD, consistent with the fact that smoking dramatically increases risk, progression, and severity, and affects response to therapy. This suggests that the regulatory changes that are captured in the aging signature we derived are, at the least, correlated with, if not causally linked to, LUAD disease processes. Looking at younger people with LUAD, we find that they also exhibit age acceleration in their “normal” lung tissue relative to their peers without LUAD. Differential regulation in tumors of younger people with LUAD represented a subset of the changes we saw in the tumors of older patients, which suggests that the pathways associated with these changes might be particularly important in understanding the severity of disease in younger individuals.
Finally, we found that even among individuals of similar age, those with a lower network-informed aging signature had better prognostic outcome than individuals with a higher aging signature. What all this means is that while chronological aging might have some effect on the risk of developing LUAD and its properties, the changes in aging-related regulation are far more important in estimating disease risk, in understanding disease processes, in identifying candidate therapies, and in designing aging-aware precision treatment protocols.

Methods

Discovery dataset

Uniformly processed RNA-Seq data were downloaded from the Recount3 database^[150]61 for two discovery datasets using R package “recount3” (version 1.4.0): (i) lung tissue samples from the GTEx Project^[151]62 (version 8) and (ii) LUAD samples from TCGA^[152]63 on May 26, 2022. We accessed clinical data for GTEx samples from the dbGap website ([153]https://dbgap.ncbi.nlm.nih.gov/) under study accession phs000424.v8.p2. Clinical data for TCGA samples were downloaded from Recount3. We refer to the GTEx samples as “non-cancerous lung samples” throughout our analysis. From 655 lung samples, 71 samples were removed because they were designated as “biological outliers” in the GTEx portal ([154]https://gtexportal.org/) for various reasons (as described in [155]https://gtexportal.org/home/faq). We analyzed the remaining 584 lung samples (187 female and 397 male) from GTEx. We removed two recurrent tumor samples from the TCGA data and included the remaining 541 primary tumor samples (293 female and 248 male) and 59 normal adjacent (34 female and 25 male) samples.
Table [156]1 summarizes the clinical characteristics of the datasets.

Table 1. Clinical characteristics of the discovery and validation datasets

Characteristic                   GTEx                    TCGA                TCGA                LGRC                [157]GSE68465
Type                             Healthy                 Tumor               Normal adjacent     Healthy             Tumor
Sample size                      584                     539                 59                  108                 443
Age
  Mean ± std (range)             54 ± 11.81 (21–70)      65 ± 9.91 (33–88)   66 ± 10.83 (42–86)  64 ± 11.35 (32–87)  64 ± 10.1 (33–87)
Gender
  Female (%)                     187 (32.02%)            291 (54.00%)        34 (57.63%)         59 (54.63%)         220 (49.66%)
  Male (%)                       397 (67.98%)            248 (46.00%)        25 (42.37%)         49 (45.37%)         223 (50.34%)
Race
  White (%)                      499 (85.44%)            411 (76.25%)        55 (93.22%)         -                   295 (66.60%)
  Black or African American (%)  70 (11.99%)             53 (9.83%)          4 (6.78%)           -                   12 (2.71%)
  Others (%)                     15 (2.57%)              75 (13.92%)         0                   -                   7 (1.58%)
  Unknown (%)                    -                       -                   -                                       129 (29.11%)
Smoking status
  History of smoking (%)         386 (66.10%)            448 (83.11%)        46 (77.97%)         65 (60.19%)         300 (67.72%)
  No history of smoking (%)      182 (31.16%)            77 (14.29%)         7 (11.86%)          32 (29.63%)         49 (11.06%)
  NA (%)                         16 (2.74%)              14 (2.60%)          6 (10.17%)          12 (10.18%)         94 (21.22%)
Tumor stage
  I (%)                          -                       295 (54.73%)        30 (50.85%)         -                   150 (33.86%)
  II (%)                         -                       126 (23.38%)        13 (22.03%)         -                   251 (56.66%)
  III (%)                        -                       84 (15.59%)         13 (22.03%)         -                   28 (6.32%)
  IV (%)                         -                       26 (4.82%)          2 (3.39%)           -                   12 (2.71%)
  NA (%)                         -                       8 (1.48%)           1 (1.69%)           -                   2 (0.45%)
Ischemic time (hours)
  Mean ± std (range)             8.04 ± 6.98 (0.0–24.4)  -                   -                   -                   -

Both GTEx and TCGA gene expression data were normalized by transcript per million (TPM), using the “getTPM” function in the Bioconductor package “recount” (version 1.20.0)^[159]64 on R version 4.1.2. Lowly expressed genes were filtered out by removing genes with counts <1 TPM in at least 10% of the samples (126 samples) in GTEx and TCGA combined, removing 36,386 genes, and keeping 27,470 genes.
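The normalization and filtering step described above can be illustrated with a minimal Python sketch. It is not the authors' R code: `counts_to_tpm` uses the generic TPM formula rather than the “getTPM” function from “recount”, and the function names, the dictionary layout and the literal reading of the rule (a gene is dropped when its TPM is below 1 in at least 10% of samples) are assumptions made for illustration.

```python
from math import ceil


def counts_to_tpm(counts, lengths_kb):
    """Convert one sample's read counts to TPM (generic formula, hypothetical helper).

    counts: read counts per gene; lengths_kb: gene lengths in kilobases.
    """
    rates = [c / length for c, length in zip(counts, lengths_kb)]
    total = sum(rates)
    return [r / total * 1e6 for r in rates]


def filter_low_expression(tpm_by_gene, min_tpm=1.0, sample_fraction=0.10):
    """Keep genes that are NOT lowly expressed.

    Assumed rule: a gene is removed when its TPM is below `min_tpm`
    in at least `sample_fraction` of the samples.
    tpm_by_gene: dict mapping gene -> list of TPM values (one per sample).
    """
    kept = {}
    for gene, values in tpm_by_gene.items():
        # Number of low-expression samples that triggers removal.
        threshold = ceil(sample_fraction * len(values))
        n_low = sum(v < min_tpm for v in values)
        if n_low < threshold:
            kept[gene] = values
    return kept
```

In practice the same rule would be applied to the combined GTEx and TCGA expression matrix; the dictionary-of-lists layout here is only for readability.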
To construct gene regulatory networks, we further removed those genes that were not present in the TF/target gene regulatory prior used for creating the gene regulatory networks (see section “Differential targeting analysis using sample-specific gene regulatory networks”). This filtering left 27,162 genes, including those on allosomes, which were used in subsequent analysis. For female samples in both GTEx and TCGA, gene expression values of all genes on the Y chromosome (36 genes in total) were replaced by “NA”. Principal component analysis (PCA) did not show any visible batch effect.

Validation dataset

We chose two independent studies for validation from the Gene Expression Omnibus (GEO) repository: [160]GSE47460^[161]65 (downloaded on Feb 12, 2023) and [162]GSE68465^[163]66 (downloaded on Jan 24, 2023). For validating our results on the lung samples from GTEx, we used [164]GSE47460, which consisted of microarray gene expression data for 582 samples in total from the Lung Genomics Research Consortium (LGRC). This study used Agilent-014850 Whole Human Genome Microarray 4x44K G4112F and Agilent-028004 SurePrint G3 Human GE 8x60K Microarray for expression profiling. Among these 582 samples, we used only the 108 samples (59 female and 49 male) from individuals who had no chronic lung disease by CT or pathology and hence were designated as “controls” within the study. [165]GSE68465 consists of microarray gene expression data for 443 LUAD samples (220 female and 223 male). This study used Affymetrix Human Genome U133A Array for expression profiling. Table [166]1 summarizes the clinical characteristics of the datasets analyzed. Normalized expression data and clinical data were downloaded from GEO using R package “GEOquery” version 2.62.2. Within every dataset, for genes with multiple probe sets, we kept the probe set with the highest standard deviation in expression levels across samples.
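The probe-selection rule for the microarray datasets can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `collapse_probes`, the dictionary inputs and the tie-breaking behavior (the first probe set encountered wins) are hypothetical, and each probe set is assumed to have values for at least two samples.

```python
from statistics import stdev


def collapse_probes(expression, probe_to_gene):
    """For each gene, keep the probe set with the highest standard deviation.

    expression: dict mapping probe ID -> list of expression values (one per sample).
    probe_to_gene: dict mapping probe ID -> gene symbol.
    Returns a dict mapping gene -> (selected probe ID, its expression values).
    """
    best = {}
    for probe, values in expression.items():
        gene = probe_to_gene.get(probe)
        if gene is None:
            continue  # skip probe sets without a gene annotation
        sd = stdev(values)
        # Replace the current choice only if this probe set is strictly more variable.
        if gene not in best or sd > best[gene][0]:
            best[gene] = (sd, probe, values)
    return {gene: (probe, values) for gene, (_, probe, values) in best.items()}
```

Selecting the most variable probe set favors the measurement most likely to carry signal across samples; other conventions, such as taking the highest mean, would change which probe set represents a gene.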
We discarded any genes that were not in the TF/target gene regulatory network prior that we used for creating the gene regulatory networks. This process left 13,575 genes in the [167]GSE47460 (LGRC) dataset and 11,725 genes in the [168]GSE68465 dataset, which we used to build gene regulatory network models. The LGRC data were not batch corrected because no visible batch effect was detected from PCA. The [169]GSE68465 dataset includes LUAD specimens from the following sources: University of Michigan Cancer Center (100 samples), University of Minnesota VA/CALGB (77 samples), Moffitt Cancer Center (79 samples), Memorial Sloan-Kettering Cancer Center (104 samples), and Toronto/Dana-Farber Cancer Institute (82 samples). The [170]GSE68465 data were batch corrected for these sources using the “ComBat” function implemented in the R package “sva” (version 3.42.0).

Constructing sample-specific gene regulatory networks

Gene regulatory networks for each sample were reconstructed by the PANDA^[171]19 and LIONESS^[172]20 algorithms using Python package netZooPy^[173]67 version 0.9.10, in both the discovery and the validation datasets. A schematic diagram of our network construction pipeline is given in Fig. [174]1. Three types of data were integrated to derive the regulatory networks: TF/target gene regulatory prior (obtained by mapping TF motifs from the Catalog of Inferred Sequence Binding Preferences (CIS-BP)^[175]68 to the promoter of their putative