Abstract
Lung cancer is one of the most invasive cancers, affecting more than a million people. Non-small cell lung cancer (NSCLC) constitutes up to 85% of all lung cancer cases, so identifying predictive biomarkers of NSCLC is essential for therapeutic purposes. Here we use a network theoretical approach to investigate the complex behavior of NSCLC gene-regulatory interactions. We meta-analyzed eight NSCLC microarray datasets (GSE19188, GSE118370, GSE10072, GSE101929, GSE7670, GSE33532, GSE31547, and GSE31210) to find differentially expressed genes (DEGs) and then constructed a protein–protein interaction (PPI) network. We analyzed its topological properties and identified significant modules of the PPI network using the Cytoscape Network Analyzer and the MCODE plug-in. From the PPI network, the top ten genes for each of six topological properties, namely closeness centrality, maximal clique centrality (MCC), maximum neighborhood component (MNC), radiality, edge percolated component (EPC), and bottleneck, were considered for key regulator identification. We then compared them with the top ten hub genes (those with the highest degrees) to find key regulator (KR) genes. Two genes, CDK1 and HSP90AA1, were common across the analysis, suggesting that they play a significant regulatory role in non-small cell lung cancer. In summary, our network theoretical study suggests CDK1 and HSP90AA1 as key regulator genes in the complex NSCLC network.
Keywords: non-small cell lung cancer, key regulator, differentially expressed genes, protein–protein interaction
1. Introduction
Lung cancer is one of the most invasive cancer types, causing more than 1.38 million deaths worldwide [1]. Non-small cell lung cancer (NSCLC) is a kind of epithelial lung cancer. It is markedly different from small cell lung carcinoma (SCLC) and accounts for about 85% of all lung cancer cases [2].
NSCLCs can be further sub-categorized into adenocarcinoma (32–40%), squamous cell carcinoma (25–30%), and large cell carcinoma (8–16%), based on the type of lung cell undergoing uncontrolled proliferation [3]. In adenocarcinoma, mucus-secreting cells proliferate uncontrollably. This type of cancer occurs mainly among non-smokers [4]. Compared with other types of lung cancer, it is more common in women than in men and is more likely to occur in younger people. It is usually found in the outer parts of the lung; thus, it is likely to be detected before it metastasizes [5]. In squamous cell carcinoma, on the other hand, the flat squamous cells that line the inside of the airways in the lungs proliferate. Here, the tumor is usually found in the central part of the lungs, near the bronchi. In large cell carcinoma, the tumor can grow in any part of the lung [6]. It grows and spreads very fast, making it harder to treat. Other subtypes of NSCLC, such as adenosquamous carcinoma and sarcomatoid carcinoma, are relatively less common.
Among the various causes, smoking history is highly correlated with the pathogenesis of lung cancer [7] and is, so far, the most crucial risk factor for the disease. Cigarette smoke contains more than 6000 components, most of which result in DNA damage [7,8]. Other risk factors for lung cancer include exposure to passive smoke, radon, and materials such as soot, beryllium, nickel, chromium, asbestos, or tar; genetic predisposition to lung cancer; and air pollution [9,10]. At the genomic level, DNA damage appears to be the primary cause of pathogenesis in many cancers, including NSCLC. Even though the DNA repair system can repair most of the damage, frequent exposure to smoke and other causative factors leaves it vulnerable [11].
Additionally, epigenetic silencing of DNA repair genes can arise at incompletely repaired sites formed during the repair of DNA double-strand breaks or other DNA damage [12,13]. This epigenetic silencing plays a crucial role in the molecular pathogenesis of NSCLC. At least nine genes that normally function in DNA repair pathways are often suppressed by promoter hypermethylation in NSCLC, namely NEIL1, WRN, MGMT, ATM, MLH1, MSH2, BRCA1, BRCA2, and XRCC5 [14,15,16,17,18,19,20]. In contrast, FEN1, a component of the DNA repair pathway, is expressed at an increased level due to hypomethylation of its promoter region in NSCLC [21].
The complex molecular underpinnings of NSCLC are yet to be fully understood, which limits the development of better treatment approaches and the identification of drug targets. Few foolproof treatment options are available. Current treatment options include surgical resection at the early stages. Cisplatin-based chemotherapy coupled with radiotherapy has been the standard treatment as the cancer advances in stage and metastasis sets in. In certain cases, however, a multimodal approach is taken in which third-generation cytotoxic and cytostatic agents, such as anti-VEGFR and anti-EGFR drugs, are prescribed [22]. Thus, further research into new molecular players, drugs, and combinatorial therapies is necessary to extend clinical benefit to a broader patient population and to improve outcomes in NSCLC. This study uses a network theoretical approach to unveil the complexity of non-small cell lung cancer.
2. Materials and Methods
2.1. Data Collection
Non-small cell lung cancer datasets were retrieved from the Gene Expression Omnibus (GEO) at NCBI (https://www.ncbi.nlm.nih.gov/geo/, accessed on 13 September 2020).
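As a bookkeeping aid, the eight GEO series used in this study and their control and cancer sample counts can be collected in a small lookup table and totaled. The Python sketch below is only an illustration: the names `DATASETS` and `summarize` are hypothetical, and the counts are copied by hand from this section, not fetched from GEO.

```python
# Sample counts as reported in this section: (controls, cancer samples).
# The dictionary and function names are illustrative, not from the paper.
DATASETS = {
    "GSE19188": (65, 91),
    "GSE118370": (6, 6),
    "GSE10072": (49, 58),
    "GSE101929": (34, 32),
    "GSE7670": (27, 27),
    "GSE33532": (20, 80),
    "GSE31547": (20, 30),
    "GSE31210": (20, 226),
}


def summarize(datasets):
    """Return (number of series, total controls, total cancer samples)."""
    controls = sum(n_control for n_control, _ in datasets.values())
    cases = sum(n_cancer for _, n_cancer in datasets.values())
    return len(datasets), controls, cases


n_series, total_controls, total_cases = summarize(DATASETS)
print(f"{n_series} series: {total_controls} controls, {total_cases} cancer samples")
```

With the counts given in this section, the sketch reports 8 series with 241 control and 550 cancer samples in total. The cohorts are also clearly unbalanced; GSE31210, for example, contributes 226 cancer samples against 20 controls.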
A total of eight datasets were included in this study: GSE19188 (65 control and 91 cancer patient samples), GSE118370 (six control and six cancer patient samples), GSE10072 (49 control and 58 cancer patient samples), GSE101929 (34 control and 32 cancer patient samples), GSE7670 (27 control and 27 cancer patient samples), GSE33532 (20 control and 80 cancer patient samples), GSE31547 (20 control and 30 cancer patient samples), and GSE31210 (20 control and 226 cancer patient samples). The datasets, with their control and patient sample numbers, are described in Table 1.
Table 1. List of datasets used in the meta-analysis.
Microarray Datasets Platforms Control Cases References