Abstract Background Asthma, a principally T helper 2 (Th2) cell mediated immunological disease, is categorized into Th2-high and Th2-low endotypes. The influence of these endotypes on clinical characteristics and treatment responsiveness in asthma is yet to be completely understood. This study delves into the underlying molecular mechanisms of Th2 endotypes on asthma. Methods Transcriptomics data of airway epithelial and corresponding clinical information were sourced from the GEO. The co-expression modules were established by WGCNA. Cytoscape was applied to construct PPI networks, and hub genes were determined via the Cytohubba plugin. Additionally, a functional enrichment analysis was conducted on the co-expressed genes from the relevant modules. The relative abundances levels of 22 different types of immune cells in asthma patients were evaluated by CIBERSORT algorithm. Results There were 471 genes in the pink module significantly correlated with Th2 endotype. Overall, 151 DEGs were identified in the various Th2 endotypes, and 66 were obtained through intersection with the pink module. In the PPI network, the ten most important genes that regulate Th2 endotypes were selected as hub genes. In Th2-high endotype asthma, the hub genes were significantly related to γ-aminobutyric acid (GABA) pathways, indicating that hub genes can mainly regulate Th2-high endotype asthma through GABAergic system. Conclusions The severity of asthma is influenced by different Th2 endotypes. GABAergic related hub genes may provide innovative insights for the treatment of Th2-high asthma. Keywords: Asthma, Th2 cells, Gamma-aminobutyric acid (GABA), Mast cells 1. Introduction Asthma is a complex and highly heterogeneous disease defined by chronic inflammation of the airways, predominantly triggered by allergens [[29]1]. Chronic inflammation often culminates in airway remodeling, causing dyspnea, wheezing, and hypoxia due to narrowed airways [[30]2]. While most asthmatic patients can effectively relieved symptoms by medication, a subset of approximately 5–10% suffer from uncontrolled asthma [[31]3]. A deeper understanding of the influencing factors in disease severity and the mechanisms of different asthma endotypes may lead to improved therapeutic strategies. Recent advances have deepened our understanding of the immunological processes involved, in particular the role of T helper 2 (Th2) cells in the pathophysiology of asthma. Th2 cells play a central role in coordinating immune responses and have been implicated in driving the inflammatory process that characterizes asthma [[32]4]. Asthma has traditionally been viewed as a Th2-mediated disease, with its endotypes Th2-high and Th2-low distinguished by the influence of Th2 cells and related cytokines [[33]5]. Th2-high asthma is frequently linked with inflammation of the eosinophils [[34]6], while Th2-low comprises other forms such as neutrophilic asthma and paucigranulocytic asthma [[35]7]. This classification into endotype groups is advantageous as it allows for targeted therapy through distinct pathogenic molecular mechanisms [[36]8]. However, further investigation is required to determine how these asthma endotypes affect disease severity and which treatments are optimal for each endotype. In this context, our research aims to identify key genes and pathways that are strongly correlated with different asthma endotypes. Although the phenotypic features of asthma have been extensively studied, the specific gene expression profiles that dictate these features remain to be fully elucidated. We expect to uncover novel biomarkers that could serve both diagnostic and therapeutic purposes, thus advancing the field towards more personalized management strategies for asthma. 2. Materials and methods 2.1. Data retrieval Transcriptomic data and corresponding patient information were obtained from the GEO ([37]http://www.ncbi.nlm.nih.gov/geo/) [[38]9]. After exhaustive data filtering in the GEO, [39]GSE43696 and [40]GSE67472 dataset were selected in the study ([41]Supplementary Table 1). The [42]GSE43696 dataset, based on the Agilent-014850 Whole Human Genome Microarray 4 × 44 K G4112 F platform, discloses gene expression patterns of bronchial epithelial cells samples collected from 20 normal controls, 50 mild to moderate asthmatics, and 38 severe asthmatics. Furthermore, 105 gene expression profiles from the [43]GSE67472 dataset were acquired from the GEO using the Affymetrix Human Genome U133 Plus 2.0 Array platform. Asthmatic subjects in this dataset were categorized into Th2-high and Th2-low. The above data were obtained from diverse platforms using different methodologies; hence, the R package sva was used to eliminate batch effects within each group. 2.2. Construction analysis of co-expression module of asthma The WGCNA was employed to identify co-expression modules in tandem with the clinical variables of asthma. The WGCNA includes a set of functions designed to facilitate various facets of weighted correlation network analysis [[44]10]. The power value β for soft-thresholding was computed for each module using the pickSoftThreshold function within WGCNA, which offers a spectrum of power values (1–20) appropriate for network construction. A soft threshold of six was chosen due to it meeting the degree of independence of 0.90 at the minimum power value (R^2 = 0.90). Subsequently, hierarchical clustering using dynamic tree-cutting techniques separated gene clusters into distinct modules. The correlation strength of the modules was evaluated and depicted using a heatmap. Module-trait interactions were assessed by Pearson correlations between module eigengenes and endotypes, with the goal of identifying the specific gene set (module) that correlates highly with the endotypes [[45]11]. 2.3. Functional enrichment analysis The online resource DAVID was used to conduct functional annotation and pathway enrichment analysis, including GO enrichment and KEGG pathway analysis [[46][12], [47][13], [48][14]]. Statistical significance was ascribed to a P-value threshold of less than 0.05. 2.4. Identification of DEGs between Th2-high and Th2-low endotypes The DEGs between Th2-high and Th2-low groups were identified by use of the limma package [[49]15]. To adjust for potential false discoveries, we applied the Benjamini-Hochberg procedure, setting the FDR threshold below 0.01 and selecting for |fold change (FC)| greater than 2. Furthermore, functional annotation and pathway enrichment analyses were conducted on the DEGs using the WebGestaltR package [[50]13,[51]14,[52]16]. An FDR below 0.05 was regarded as statistically significant. 2.5. Construction of PPI network and identification of hub genes We mapped the genes from the key module that correlated with Th2 endotypes to the STRING database ([53]http://string-db.org) to assess their functional relationships, with a combination score above 0.4 was deemed significant [[54]17]. Visualization of the PPI network, which outlines the interaction topology among genes, was facilitated using Gephi software. GO enrichment were done with the ClueGO plug-in of Cytoscape [[55]18]. An FDR <0.05 was considered significant. 2.6. Gene set variation analysis (GSVA) GSVA was applied to ascertain the molecular pathways most significantly enriched within the Th2 endotype [[56]19]. A comparative enrichment score analysis for KEGG pathways of the two Th2 endotypes was conducted using the limma package [[57]15]. KEGG pathways exhibiting a |log2FC| greater than 0.1 and FDR below 0.05 were designated as the most differentially enriched molecular pathways between the two groups. 2.7. Correlation analysis of immune microenvironment and Th2 endotypes To quantify the composition of 22 immune cell types from gene expression patterns, we applied CIBERSORT [[58]20]. The correlations between immune cells and the two Th2 endotypes were explored to identify the most relevant immune cell. Further analysis was conducted to probe the associations between these immune cells and the identified hub genes. 2.8. Connectivity map (CMap) analysis The CMap ([59]https://clue.io/) was leveraged to investigate potential compounds that may modulate molecular pathways and genes implicated in the Th2 endotypes of asthma [[60]21]. This resource enables drug prediction according to gene expression profiles and discloses the mode of action (MoA) of compounds aimed at specific molecular pathways. The DEGs between the Th2-high and Th2-low groups were used to interrogate the CMap, with an emphasis on the most significantly overexpressed genes in each endotype as potential drug targets. Compound enrichment scores were calculated, identifying compounds with a negative enrichment score and a P-value of less than 0.05 as putative therapeutic agents for each asthma endotype. 3. Results 3.1. Identification of the Th2-related gene module in asthma via WGCNA The WGCNA tool was utilized to generate the co-expression module, using expression values of 15,603 genes from 213 asthma samples, and the impact of batch effect removal is illustrated in [61]Supplementary Fig. 1. The power value, an important parameter influencing the independence and average connectivity of the co-expression modules, was set to six. This threshold ensured an independence degree of up to 0.9 and facilitated enhanced connectivity within modules. This power value aided in constructing the co-expression module, leading to the identification of 20 unique gene co-expression modules in asthma ([62]Fig. 1A). Each module represented by a distinct color ([63]Supplementary Table 2). The clustering dendrogram, produced using the Dynamic Tree Cut package, showcased the hierarchical arrangement of gene modules ([64]Fig. 1B). Modules not part of this array, denoted in gray, were omitted from further analysis. The network heatmap plot revealed strong intramodular correlations, implying that genes with high co-expression may share analogous biological significance and function. The dendrogram and module assignment were depicted on the heatmap's left side, with the top representing different genes. The intensity of the yellow hue denoting the strength of the connection ([65]Fig. 1C). Moreover, an analysis of the modules' co-expression similarity was visualized, with modules clustered into two main groups based on their intercorrelations ([66]Fig. 1D). The module-trait heatmap indicated the pink module as having the most substantial correlation, with a Pearson correlation coefficient of 0.54 and P = 6 × 10^−13, identifying it as the pivotal module associated with the Th2 endotype in asthma ([67]Fig. 1E). Fig. 1. [68]Fig. 1 [69]Open in a new tab Identification of the Th2-related gene module in asthma via WGCNA. (A) Analysis of network topology for a range of soft-thresholding powers. (B) Clustering dendrograms of genes. Different colors represent various co-expression modules. (C) Heatmap illustrated the interactions among co-expressed genes, which depicted the Topological Overlap Matrix (TOM) across all analyzed genes. (D) Upper panel: Hierarchical clustering of module genes. Lower panel: Heatmap of the adjacencies in the module gene network. (E) Module-trait plot displayed the correlations between module eigengenes and the clinical traits of asthmatic patients. The table displays module eigengenes and their corresponding correlations and P-values for each trait. (For interpretation of the references to