Abstract

Purpose

Acute respiratory distress syndrome (ARDS) is a rapidly progressive, diffuse lung injury characterized by acute onset and high mortality. Its pathological mechanisms are still unclear, but alveolar macrophages have been shown to play an important role in inflammatory responses during ARDS. We aimed to identify biomarkers for the early diagnosis of ARDS so that patients can receive timely treatment.

Methods

Gene expression profiles were downloaded from the Gene Expression Omnibus (GEO) and screened for differentially expressed genes (DEGs). The genes upregulated in all the datasets were defined as circulating ARDS alveolar macrophage-related genes (cARDSAMGs). We performed a functional enrichment analysis to explore the potential biological functions of the cARDSAMGs, and we built protein–protein interaction networks. Gene set variation analysis (GSVA) was used to calculate the core gene set variation analysis (CGSVA) score for individual samples. Receiver operating characteristic (ROC) curve analysis was applied to the CGSVA score to evaluate its ability to diagnose ARDS.

Results

A total of 60 genes were upregulated in all ARDS datasets and were therefore designated cARDSAMGs. The cARDSAMGs were significantly involved in multiple inflammation-, immunity- and phagocytosis-related biological processes and pathways. In the protein–protein interaction network associated with host responses to ARDS, eight genes were identified as a core gene set: PTCRA, JAG1, C1QB, ADAM17, C1QA, MMP9, VSIG4 and TNFAIP3. ROC curve analysis showed that the CGSVA score may be considered a biomarker for ARDS: it was significantly higher in patients with ARDS than in healthy controls in both alveolar lavage fluid and whole blood.

Conclusion

The ARDS alveolar macrophage-related CGSVA score may be useful as a biomarker for ARDS.
Keywords: acute respiratory distress syndrome, lung injury, gene expression omnibus, differentially expressed genes

Introduction

Sepsis is a syndrome in which the body’s immune system overreacts to infection. It can lead to several life-threatening complications, one of which is acute respiratory distress syndrome (ARDS).[42]^1^,[43]^2 ARDS is an acute diffuse lung injury with high incidence. The pathophysiological basis of ARDS has been shown to include excessive and protracted systemic inflammation,[44]^3 but the exact mechanisms remain unknown. The main risk factors for ARDS include drinking, smoking, air pollution, hypoproteinemia, and diabetes mellitus.[45]^4–6 Current urgent treatment relies mainly on low tidal volume mechanical ventilation, but outcomes remain very poor.[46]^3 Although some progress has been made in treatment options in recent years, the mortality rate remains high.[47]^7

Previous studies have identified some genes that may serve as biomarkers for ARDS. High expression of interleukin (IL)-33 can elevate the expression of matrix metallopeptidases (MMP) 2 and 9 in patients with acute lung injury.[48]^8 SLC2A6 may be a critical biomarker for predicting the survival of sepsis patients.[49]^9 V-set immunoglobulin-domain-containing 4 (VSIG4) may be involved in lung injury by inducing phagocytosis.[50]^10 Activation of ERK1/2 by NGF1B depends on the MAPKKK c-Raf and eventually induces inflammation.
The NGF1B in the cytoplasm may be involved in the occurrence and development of ARDS.[51]^11 Down-regulation of the long non-coding RNA GAS5 decreases the expression of angiotensin-converting enzyme 2 (ACE2), leading to increased levels of microRNA (miR)-200c-3p, which promotes the progression of ARDS.[52]^12 MYC and STAT3 may be the key regulatory genes in the underlying dysfunction of sepsis-induced ARDS.[53]^13 In addition, a previous study showed that the gene set variation index can be used as a potential diagnostic tool for sepsis,[54]^14 and it may be considered a biomarker of bacterial and fungal sepsis.[55]^15

However, most previous studies focused on single genes or molecules in ARDS, and only a few examined sets of ARDS-related genes. Several studies confirmed that alveolar macrophage transcriptional programs are associated with ARDS.[56]^16–18 We hypothesized that these alveolar macrophage-related, ARDS-specific transcriptional programs may also be reflected in blood cells. To test this hypothesis, we identified genes upregulated in both alveolar macrophages and blood cells and constructed a corresponding protein–protein interaction network associated with the host response to ARDS. We found that eight genes formed a core gene set, and that the core gene set variation score may be used as a circulating biomarker for ARDS.

Materials and Methods

Data Collection and Processing

The data were downloaded from the Gene Expression Omnibus (GEO) database ([57]https://www.ncbi.nlm.nih.gov/geo/): [58]GSE116560,[59]^16 [60]GSE89953,[61]^19 [62]GSE76293[63]^20 and [64]GSE32707.[65]^21 [66]GSE116560 and [67]GSE89953 were based on [68]GPL6883. [69]GSE116560 was obtained from alveolar macrophages of 68 ARDS samples, while [70]GSE89953 was obtained from peripheral blood monocytes of 26 ARDS samples.
The two datasets were combined, and batch effects were then removed using the sva package in R.[71]^22 From [72]GSE76293, based on [73]GPL570, the gene profiles of neutrophils from 12 ARDS samples and 12 healthy controls were extracted for subsequent analysis. From [74]GSE32707, based on [75]GPL10558, the gene profiles of whole blood cells from 18 ARDS samples and 34 healthy controls were used for subsequent analysis. The data were normalized using the “normalizeBetweenArrays” function of the limma package[76]^23 in R. The probe sets were converted into gene symbols according to the annotation information of the three platforms. The workflow of the study is shown in [77]Figure 1.

Figure 1. Workflow chart of the study.

Principal Component Analysis (PCA) and Screening Differentially Expressed Genes (DEGs)

The ggbiplot package[80]^24 in R was used to perform the PCA.[81]^25 In the [82]GPL6883 datasets ([83]GSE116560 and [84]GSE89953), DEGs in alveolar macrophages were screened by comparison with peripheral blood monocytes. These DEGs may reflect the response of host monocyte-macrophages to ARDS.[85]^18 In [86]GSE76293, DEGs in ARDS neutrophils were screened by comparison with healthy controls, since neutrophils are the most abundant nucleated cells in whole blood. In [87]GSE32707, we screened the DEGs in whole blood between ARDS patients and healthy controls. All DEGs were screened using the limma package in R. The cut-off criteria were a false discovery rate (FDR)-adjusted P < 0.05 and |log2 (fold-change)| > 1. In this study, the upregulated genes common to all datasets were defined as circulating ARDS alveolar macrophage-related genes (cARDSAMGs).
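The screening rule and the definition of the common upregulated genes described above can be illustrated with a minimal Python sketch. It is not the limma-based R workflow used in the study: the function names, the table layout and the gene symbols (GENE_A to GENE_D) are hypothetical, and the numbers are toy values rather than study data. Only the thresholds (FDR-adjusted P < 0.05 and log2 fold-change > 1 for the upregulated side) come from the text.

```python
from typing import Dict, Set, Tuple

# Hypothetical layout: gene symbol -> (log2 fold-change, FDR-adjusted P value).
DegTable = Dict[str, Tuple[float, float]]


def upregulated(table: DegTable, fdr_cutoff: float = 0.05,
                lfc_cutoff: float = 1.0) -> Set[str]:
    """Genes with FDR-adjusted P < fdr_cutoff and log2 fold-change > lfc_cutoff."""
    return {gene for gene, (lfc, fdr) in table.items()
            if fdr < fdr_cutoff and lfc > lfc_cutoff}


def common_upregulated(*tables: DegTable) -> Set[str]:
    """Genes that are upregulated in every supplied comparison."""
    gene_sets = [upregulated(t) for t in tables]
    return set.intersection(*gene_sets) if gene_sets else set()


# Toy values for three comparisons (illustrative only, not results from the paper).
monocyte_macrophage = {"GENE_A": (2.1, 0.001), "GENE_B": (1.4, 0.03), "GENE_C": (-1.8, 0.002)}
neutrophil = {"GENE_A": (1.2, 0.01), "GENE_B": (0.6, 0.04), "GENE_C": (-2.0, 0.001)}
whole_blood = {"GENE_A": (1.6, 0.02), "GENE_B": (1.9, 0.2), "GENE_D": (3.0, 0.0001)}

common = common_upregulated(monocyte_macrophage, neutrophil, whole_blood)
print(sorted(common))  # ['GENE_A']
```

In this toy input only GENE_A passes both thresholds in all three comparisons, which mirrors how a gene would qualify as a cARDSAMG under the stated criteria.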
Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway Enrichment Analysis

To explore the potential biological functions of the cARDSAMGs, GO and KEGG pathway enrichment analyses were performed using the clusterProfiler package in R.[88]^26 GO and KEGG networks were clustered using the ClueGO plug-in[89]^26 in Cytoscape software.[90]^27 Differences associated with an FDR-adjusted P < 0.05 were considered statistically significant.

Analysis of Core Genes

Using the STRING database[91]^28 ([92]https://string-db.org/) and KEGG pathway enrichment analysis, a protein–protein interaction (PPI) network was constructed for the cARDSAMGs. Genes involved in ARDS-related pathways of interest were defined as core genes in the network. The GOSemSim package in R[93]^29 was applied to analyze the semantic similarities among GO terms for the cARDSAMGs, applying the following formulas: Pi = -lg(p) × |log2(fold-change)|; weight (W) value = semantic similarity × mean value of Pi.

Gene Set Variation Analysis (GSVA) and Receiver Operating Characteristic (ROC) Curve Analysis

The GSVA package in R[94]^30 was applied to calculate the core gene set variation analysis (CGSVA) score for individual samples in order to assess the concerted functional behavior of the core gene set. To further evaluate the potential diagnostic value of the CGSVA score for ARDS, an ROC curve analysis was performed using the pROC package in R.[95]^31

Results

Multiple Genes Identified as cARDSAMGs

PCA showed that global gene expression patterns could distinguish ARDS from controls in [96]GSE116560, [97]GSE89953 and [98]GSE76293, but not in [99]GSE32707 ([100]Figure S1). There were 8522 DEGs in [101]GSE116560 and [102]GSE89953, 6064 in [103]GSE76293, and 7713 in [104]GSE32707 ([105]Figure 2A). A set of 60 DEGs was overexpressed in all the datasets, and these genes were defined as cARDSAMGs ([106]Figure 2B).
Moreover, we obtained the P value of each of the 60 cARDSAMGs in the three datasets, together with its rank based on that P value ([107]Table S1). PCA showed that the expression patterns of the cARDSAMGs could not distinguish ARDS from healthy controls ([108]Figure S2), although the cARDSAMGs separated ARDS from controls better than the global gene expression patterns did ([109]Figures S1 and S2).

Figure 2. Multiple genes identified as cARDSAMGs. (A) Manhattan plot for DEGs. The three most significantly overexpressed genes are marked in yellow and labeled with their names. (B) Venn diagram to identify upregulated DEGs common to datasets from the three platforms. Abbreviation: DEGs, differentially expressed genes.

Biological Significance of cARDSAMGs

GO enrichment analysis showed that the changes in expression mainly involved genes related to metabolism- and inflammation-related biological processes ([112]Figure 3A). KEGG analysis showed that the DEGs were mainly involved in pathways related to ARDS, such as the Notch and tumor necrosis factor (TNF) signaling pathways, as well as complement and coagulation cascades ([113]Figure 3B). In addition, we performed the enrichment analysis using the 60 top-ranked upregulated genes in each of the datasets. As shown in [114]Figure S3, the biological processes (BPs) and KEGG pathways differed among the datasets, but most of them were related to immunity, inflammation and phagocytosis. Furthermore, we randomly selected 60 genes in the three datasets and identified their enriched GO terms. Most of these enriched GO terms were not significantly related to inflammation, immunity or phagocytosis, which provides further evidence that the cARDSAMG gene set was not simply a randomly overlapping set ([115]Figure S4).
The ClueGO analysis indicated that the DEGs were clustered mainly in biological processes such as bone resorption, collagen catabolic process, regulation of cartilage development, positive regulation of ATP biosynthetic process, negative regulation of IL-2 production and positive regulation of animal organ morphogenesis ([116]Figure 3C). No significant clustering of KEGG pathways was observed, based on a threshold of FDR-adjusted P < 0.05.

Figure 3. Biological significance of cARDSAMGs. (A) The significantly enriched biological processes, (B) KEGG pathways and (C) GO network of circulating ARDS alveolar macrophage-related genes in the three datasets. Different colors of nodes represent different functional groups. Abbreviations: ARDS, acute respiratory distress syndrome; KEGG, Kyoto Encyclopedia of Genes and Genomes; GO, Gene Ontology.

Core Genes in the PPI Network of ARDS

We constructed the PPI network of cARDSAMGs based on the STRING database. There were 25 nodes and 189 interaction pairs ([119]Figure 4A). Previous studies showed that some biological processes and pathways are related to ARDS, including immunity, inflammation and phagocytosis.[120]^32^,[121]^33 Therefore, we selected the following ARDS-related pathways as pathways of interest: the Notch, TNF and IL-17 signaling pathways, as well as complement and coagulation cascades. Subsequently, we identified eight cARDSAMGs as core genes involved in these pathways: PTCRA, JAG1, C1QB, ADAM17, C1QA, MMP9, VSIG4 and TNFAIP3. In the complement and coagulation cascades, antigen-antibody complexes act on C1QA and C1QB.
They further act on VSIG4, thereby inducing phagocytosis.[122]^34^,[123]^35 In the TNF signaling pathway, TNF acts on TNF receptor 1 (TNFR1), ultimately inducing tissue remodeling, autoimmune pathology, neutrophil recruitment and immunity to extracellular pathogens.[124]^36^,[125]^37 In the Notch signaling pathway, ADAM17 and JAG1 act on the cell surface receptor Notch, which in turn affects PTCRA. Also in the TNF signaling pathway, the binding of TNF to TNFR1 affects surface receptors, intracellular signaling and remodeling of the extracellular matrix.[126]^38^,[127]^39 In the IL-17 signaling pathway, IL-17 family signals activate downstream pathways that include NF-kappaB, MAPKs and C/EBPs, which further induce the expression of chemokines, cytokines and antimicrobial peptides[128]^40^,[129]^41 ([130]Figure 4B). The semantic similarities among the GO terms of the eight core genes are ranked in [131]Figure 4C, and the W values of the core genes, which may suggest the importance of these molecules in this dysfunction, are displayed in [132]Figure 4D.

Figure 4. Core genes in the PPI network of ARDS. (A) PPI network for alveolar macrophage-related genes in circulating acute respiratory distress syndrome. Ellipses represent genes; diamonds, pathways; and font size, relative degree. (B) Core genes and pathways of interest. (C) Semantic similarities among GO terms for the eight core genes. (D) Weight value of the eight core genes in the three datasets. Abbreviations: PPI, protein-protein interaction; GO, Gene Ontology.

CGSVA Score May Be Used as a Circulating Biomarker for ARDS

As shown in [135]Figure S5, C1QA, C1QB, MMP9, PTCRA and VSIG4 may serve as circulating diagnostic markers for ARDS: their areas under the ROC curves (AUCs) were higher than 0.7 in the three datasets. In addition, the CGSVA score was higher in the ARDS group than in the control group in all datasets ([136]Figure 5A).
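As a reading aid for the AUC values reported in this section, the following minimal Python sketch computes an AUC from two groups of scores using its rank-based interpretation: the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. It is a hand-written illustration, not the pROC-based analysis performed in the study; the function name and the scores are hypothetical toy values.

```python
from typing import Sequence


def auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """AUC as the fraction of case/control pairs in which the case scores higher.

    Ties contribute 0.5, which matches the Mann-Whitney formulation of the AUC.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy per-sample scores (illustrative only, not data from the study).
ards_scores = [0.42, 0.10, 0.35, -0.05]
control_scores = [-0.30, 0.12, -0.10, 0.00]

value = auc(ards_scores, control_scores)
print(value)  # 0.8125
```

Here 13 of the 16 case/control pairs are ordered correctly, giving 0.8125; an AUC of 0.5 would correspond to scores that do not separate the groups at all.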
The expression of the core genes is shown in [137]Figure 5A. The AUC for the CGSVA score was higher than 0.7 in all three datasets, which suggests that the score may be useful as a diagnostic marker for ARDS ([138]Figure 5B). The score in macrophages from the alveolar lavage fluid ([139]GSE116560 and [140]GSE89953) and in whole blood ([141]GSE76293 and [142]GSE32707) was higher in ARDS individuals than in healthy controls ([143]Figure 5C).

Figure 5. CGSVA score may serve as a circulating biomarker for ARDS. (A) (a) GSVA scores of the core gene set in the three datasets. Expression of core genes in (b) [146]GSE76293, (c) [147]GSE116560 & [148]GSE89953 and (d) [149]GSE32707. (B) ROC curves of the gene set GSVA score in the three datasets. (C) The GSVA scores in whole blood ([150]GSE76293 and [151]GSE32707) and alveolar lavage fluid ([152]GSE116560 and [153]GSE89953) were higher for patients with ARDS than for controls. Abbreviations: GSVA, gene set variation analysis; ROC, receiver operating characteristic; ARDS, acute respiratory distress syndrome.

Discussion

In this study, we explored differences in gene expression in whole blood and alveolar lavage between ARDS samples and healthy controls. We selected four gene expression datasets from three platforms, in which we found a common set of 60 DEGs upregulated in ARDS patients compared to healthy controls. Some of them have previously been reported to be related to ARDS. For example, immunohistochemical results showed that CD163 is overexpressed in ARDS patients.[154]^42 Highly expressed IL-33 can drive the expression of MMP9 and MMP2 by activating STAT3, a process that occurs mainly in alveolar macrophages in patients with lipopolysaccharide-induced acute lung injury.[155]^8 Mutations in the ALOX5 gene appear to be the most likely cause of inter-individual differences in ARDS progression.[156]^43 Subsequently, we performed a functional enrichment analysis for the cARDSAMGs.
The results showed that these cARDSAMGs were involved mainly in biological processes and pathways related to immunity, inflammation, phagocytosis and nucleic acid turnover. Some of these pathways have been related to ARDS in previous studies. Production of IL-1 and low concentrations of anti-inflammatory cytokines, such as IL-10 and IL-1 receptor antagonist, in the bronchoalveolar lavage fluid of early ARDS patients have been closely related to poor prognosis.[157]^44 Numerous studies have linked inflammation and phagocytosis to ARDS.[158]^32^,[159]^33^,[160]^45^,[161]^46

Moreover, we constructed a protein–protein interaction network to analyze the relationships among genes and pathways. We focused on the TNF, IL-17 and Notch signaling pathways, as well as complement and coagulation cascades. In complement and coagulation cascades, complement activation may induce phagocytosis,[162]^47 which plays an essential role in ARDS.[163]^48 TNF acts as an inflammatory factor, up-regulating the expression of surface receptors and promoting intracellular signaling and remodeling of the extracellular matrix.[164]^49 Some biomarkers are closely related to ARDS, including surfactant-related proteins and cytokines (IL-1, −2, −6, −8, −10 and −15, and TNF-α), as well as neutrophil activation markers (MMP9, leukotriene B4 and ferritin).[165]^32 In the IL-17 signaling pathway, IL-17A and IL-17B act through the surface receptors IL-17RA and IL-17RC to induce immunity and recruit neutrophils.[166]^49^,[167]^50 Neutrophil-derived mediators induce epithelial cell death by oxidation of soluble Fas ligand.[168]^33^,[169]^45

Eight genes corresponding to the selected pathways were identified as core genes: PTCRA, JAG1, C1QB, ADAM17, C1QA, MMP9, VSIG4 and TNFAIP3.
VSIG4 is a major component of the phagocytic system: it rapidly removes C3-modified particles.[170]^10 MMP9 is a secreted zinc metallopeptidase.[171]^51 MMP9 mediates protein degradation in the extracellular matrix of alveolar epithelial cells, mainly of intercellular junction proteins and proteins that anchor cells to the basement membrane.[172]^52 In addition, MMP9 is enriched mainly in macrophages and neutrophils,[173]^53^,[174]^54 and it is involved in the occurrence and development of emphysema and asthma.[175]^55^,[176]^56 TNFAIP3 was identified as a gene whose expression is rapidly induced by TNF. The protein encoded by this gene has been shown to inhibit nuclear factor (NF)-kappa B activation as well as TNF-mediated apoptosis.[177]^57 PTCRA encodes a single-pass type I membrane protein found in immature but not mature T cells. This protein forms a pre-T-cell receptor complex together with CD3 and T cell receptor beta chain (TCRB) complexes, thereby regulating the early development of T cells.[178]^58–60 C1QB may be associated with the immune response after injury.[179]^61

In addition, the CGSVA score may be used as a biomarker for ARDS patients. The CGSVA scores in both whole blood and alveolar macrophages were significantly higher in ARDS patients than in healthy controls. This may suggest that these alveolar macrophage-related, ARDS-specific transcriptional programs are also reflected in blood cells.

Our study has several limitations. Firstly, our predictions are based on bioinformatic analyses, and therefore further experimental verification is needed. Secondly, most alveolar macrophage genes show dynamic expression patterns, but we selected only genes that were significantly upregulated at all time points, without considering changes over time. Thirdly, whole transcriptome sequencing data of macrophages in whole blood and alveolar lavage fluid are currently difficult to find.
Whether these 8 cARDSAMGs may be used as diagnostic markers needs to be verified in a larger independent clinical sample. Fourthly, the gene set showed diagnostic performance in each of the datasets in this study, so we believe that it is specific for ARDS to some extent. However, whether it is specific to ARDS rather than to inflammation or infection in general still needs to be clarified. Finally, we need to further explore the upstream regulators of the core genes to reveal in more detail the dysregulated network in ARDS.

Conclusions

We identified a core gene set (PTCRA, JAG1, C1QB, ADAM17, C1QA, MMP9, VSIG4 and TNFAIP3) that may help predict ARDS. The ARDS alveolar macrophage-related CGSVA score may be helpful for the diagnosis of ARDS.

Funding Statement

This study was supported by the National Natural Science Foundation of China (81660132 and 81960343), the Reserve Cadre Training Program Science Foundation of the Second Affiliated Hospital of Guangxi Medical University (No. HBRC201805), the Guangxi Health Commission Key Laboratory of Emergency and Critical Medicine (The Second Affiliated Hospital of Guangxi Medical University) and the High-level Medical Expert Training Program of Guangxi “139” Plan Funding (G201903027).

Data Sharing Statement

The data were downloaded from the Gene Expression Omnibus (GEO) database ([180]https://www.ncbi.nlm.nih.gov/geo/): [181]GSE116560, [182]GSE89953, [183]GSE76293 and [184]GSE32707.

Disclosure

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References