Abstract

Objective
This study aims to construct a knowledge map of polycystic ovary syndrome (PCOS)-cancer research through bibliometric analysis to elucidate its developmental trajectory and global research landscape, and further employ bioinformatics approaches to investigate the underlying molecular mechanisms linking PCOS and related cancer.

Methods
Utilizing the Web of Science Core Collection as the data source, English-language publications from 2015 to 2024 were retrieved. CiteSpace and VOSviewer were employed for co-occurrence analysis, co-citation network construction, cluster identification, and keyword burst detection. PCOS and endometrial cancer-related genes were extracted from the GeneCards database, followed by screening of overlapping genes for protein-protein interaction (PPI) network analysis to identify key targets. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis was performed to pinpoint critical signaling pathways.

Results
Publications on PCOS and cancer exhibited a significant and steady growth over the past decade, with the United States and China demonstrating prominent contributions in both output volume and collaborative networks. Frontiers in Endocrinology and Gynecological Endocrinology jointly ranked first in publication count, while The Journal of Clinical Endocrinology & Metabolism received the highest citations. Keyword co-occurrence cluster analysis revealed major research hotspots including endometrial cancer, gene expression, and cardiovascular disease. Bioinformatics analysis identified 250 overlapping genes between PCOS and endometrial cancer. PPI network analysis highlighted TP53 as the most critical hub gene, and KEGG enrichment analysis underscored the pivotal role of the PI3K/AKT signaling pathway.

Conclusion
By integrating bibliometric analysis with bioinformatics, this study systematically maps the knowledge structure, emerging trends, and molecular mechanisms linking PCOS and cancer.
Our findings specifically highlight the association between PCOS and endometrial cancer, which may be driven by dysregulation of the TP53 and PI3K/AKT signaling pathways. This work provides valuable insights that help researchers understand the foundational knowledge framework and identify emerging trends, potential collaborators, and mechanistic targets for future studies.

Keywords: polycystic ovary syndrome, cancer, PI3K/AKT signaling pathway, bibliometric analysis, CiteSpace, VOSviewer

Introduction

Polycystic Ovary Syndrome (PCOS) is a prevalent endocrine and metabolic disorder characterized by ovulatory dysfunction, hyperandrogenism, and polycystic ovarian morphology, predominantly affecting women of reproductive age.^1 The Global Burden of Disease (GBD) study reports a global age-standardized prevalence of PCOS of 1.68%, with marked geographical disparities, surpassing 7.9% in some countries.^2 Although PCOS is traditionally perceived as a disorder predominantly affecting the reproductive system, it is in fact a systemic condition encompassing metabolic, reproductive, cardiovascular, and psychological disturbances, accompanied by a spectrum of comorbidities.^3 The association between PCOS and neoplasms has garnered increasing attention, with compelling evidence from epidemiological investigations demonstrating a significantly elevated incidence of endometrial cancer, breast cancer, and ovarian cancer among affected individuals.^4

Bibliometrics, a discipline employing statistical methods to quantitatively analyze academic literature, investigates patterns in publication volume, authorship, citation networks, and keyword distribution.^5,6 It serves to uncover disciplinary trends, identify core research communities, assess scholarly impact, and inform scientific decision-making.
Despite rapid growth in research on PCOS and cancer, the global knowledge structure, research trajectories, and frontier topics linking PCOS to cancer have not been systematically explored. To address this gap, this study leverages the Web of Science Core Collection (WoSCC) database and employs CiteSpace, VOSviewer and Scimago Graphica to analyze and visualize scholarly outputs from 2015 to 2024. Complementarily, genes linked to PCOS and cancer are extracted from the GeneCards database and overlapped, followed by protein-protein interaction (PPI) network analysis to identify key targets and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment to pinpoint critical signaling pathways. By synthesizing these dual perspectives, this work aims to provide a macroscopic perspective and strategic guidance for future investigations into the PCOS-cancer association.

Materials and Methods

Literature Search Strategy

The data collection for this study was conducted on April 3, 2025, through the WoSCC (https://www.webofscience.com/wos/woscc/), with the search parameters configured as follows: databases were set to Science Citation Index Expanded (SCI-EXPANDED) and Social Sciences Citation Index (SSCI); the timeframe spanned from January 1, 2015, to December 31, 2024; document types were limited to research articles and reviews; and the language was restricted to English. The specific keyword combinations and detailed search strategy are provided in Table 1. The exported data were saved in plain text format and tab-delimited format to meet the input requirements of CiteSpace (6.4.R1) and VOSviewer, respectively.

Table 1.
Literature Search Strategy

Set  Results  Search Query
#1   16692    TS = (“Polycystic Ovary Syndrome” OR “PCOS” OR “Polycystic Ovarian Syndrome” OR “Ovary Syndrome, Polycystic” OR “Syndrome, Polycystic Ovary” OR “Ovarian Syndrome, Polycystic” OR “Polycystic Ovary Syndrome 1” OR “Sclerocystic Ovarian Degeneration” OR “Ovarian Degeneration, Sclerocystic” OR “Sclerocystic Ovary Syndrome” OR “Stein-Leventhal Syndrome” OR “Stein Leventhal Syndrome” OR “Syndrome, Stein-Leventhal” OR “Sclerocystic Ovaries” OR “Ovary, Sclerocystic” OR “Sclerocystic Ovary” OR “Ovarian Dysfunction” OR “Androgen Excess” OR “Hyperandrogenism” OR “Chronic Anovulation” OR “Polycystic Ovarian Morphology” OR “Polycystic Ovary Disease” OR “Insulin Resistance Syndrome”)
#2   2220761  TS = (“Cancer” OR “Neoplasm” OR “Malignancy” OR “Tumor” OR “Carcinoma*” OR “Oncogenesis” OR “Oncology” OR “Metastasis” OR “Endometrial Cancer” OR “Endometrial Neoplasm*” OR “Uterine Cancer” OR “Uterine Neoplasm*” OR “Ovarian Cancer” OR “Ovarian Neoplasm*” OR “Breast Cancer” OR “Breast Neoplasm*” OR “Gynecologic Cancer” OR “Gynecologic Neoplasm*” OR “Hormone-dependent Cancer” OR “Hormone-related Malignancy” OR “Reproductive Cancer” OR “Reproductive Neoplasm*”)
#3   1869     #1 AND #2

Analytical Tools

CiteSpace Parameter Settings

CiteSpace 6.4.R1 was used to construct literature co-citation networks and analyze keyword clustering and evolution.^7 The data retrieved from WoSCC were imported into CiteSpace with the following parameters: time slicing from 2015 to 2024 in one-year slices; node selection by g-index (k=10); network pruning with the Pathfinder algorithm, applied first to each slice network and again to the merged network; and visualization in which node diameter reflects frequency of occurrence and line width reflects co-occurrence strength.
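To make the g-index node-selection setting concrete, the following is a minimal Python sketch. It assumes the scaled form of the criterion, in which the largest rank g satisfying g² ≤ k × (sum of the g highest counts) is kept in each time slice. That form, the function name `scaled_g_index`, and the toy counts are assumptions for illustration only and do not reproduce CiteSpace internals.

```python
def scaled_g_index(counts, k=10):
    """Return the largest rank g such that g**2 <= k * (sum of the top-g counts).

    `counts` are per-item citation or occurrence counts within one time slice.
    `k` is the scale factor; a larger k admits more nodes into the network.
    This is an illustrative reading of the criterion, not CiteSpace's own code.
    """
    ranked = sorted(counts, reverse=True)
    g = 0
    running_total = 0
    for rank, count in enumerate(ranked, start=1):
        running_total += count
        if rank * rank <= k * running_total:
            g = rank
    return g


# Toy counts for one yearly slice (hypothetical values).
slice_counts = [3, 1, 0]
print(scaled_g_index(slice_counts, k=1))   # classic g-index on the toy data
print(scaled_g_index(slice_counts, k=10))  # scaled variant keeps more items
```

Under this reading, raising k loosens the threshold, so k=10 is a comparatively selective setting that keeps only the more frequently cited or occurring items in each yearly slice.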
VOSviewer Analysis Workflow

VOSviewer 1.6.19 was utilized for collaboration network and co-occurrence analysis.^8 WoSCC export files were converted to UTF-8 encoding, and the network layout was optimized using the LinLog/modularity normalization method. In the visualized networks, node size was proportional to either document count (publications) or citation count.

Application of Scimago Graphica

Scimago Graphica 1.0.25 was primarily employed to visualize country/region collaboration networks.^9 The GML-formatted country collaboration table obtained from VOSviewer was imported into Scimago Graphica, with parameters configured as follows: the label menu was set to “Country” and the cluster menu to “String”. The generated country collaboration network mapped node diameter to national publication volume and line thickness to inter-country collaboration frequency.

WPS Excel

WPS Excel 2023 was used for basic statistical analysis and for creating visualizations such as donut charts, bar charts, and column charts based on the extracted data.

Identification of Key Gene Targets and Enrichment Analysis

Target Gene Identification

High-relevance genes were retrieved from the GeneCards database (https://www.genecards.org/) using the keywords “Polycystic Ovary Syndrome” and “Endometrial Cancer”.^10 Genes with a relevance score >25 were selected. Overlapping high-score genes shared between both diseases were identified as candidate genes. The intersecting genes were obtained through the Bioinformatics.com.cn platform (http://www.bioinformatics.com.cn/).^11,12

PPI Network Construction

Overlapping genes were imported into the STRING database (https://cn.string-db.org/)^13 with the following parameters: Organism: Homo sapiens; Minimum interaction score: High confidence (0.700); Disconnected nodes hidden.
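The gene-selection and STRING filtering steps described so far can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the relevance scores and interaction scores below are made-up toy values, not GeneCards or STRING output, and the helper names (`high_relevance`, `overlap`, `filter_edges`, `connected_nodes`) are hypothetical.

```python
# Toy relevance scores: made-up values for illustration, NOT GeneCards output.
pcos_scores = {"TP53": 40.2, "AKT1": 55.1, "INS": 80.3,
               "ESR1": 31.0, "FSHR": 60.4, "IL6": 24.9}
ec_scores = {"TP53": 120.5, "AKT1": 70.8, "PTEN": 150.2,
             "ESR1": 45.6, "INS": 12.0, "IL6": 33.0}

# Toy interaction list (gene, gene, combined score): NOT STRING output.
toy_edges = [("TP53", "AKT1", 0.95), ("TP53", "ESR1", 0.60),
             ("AKT1", "ESR1", 0.65), ("TP53", "PTEN", 0.99)]


def high_relevance(scores, cutoff=25.0):
    """Genes whose relevance score is strictly above the cutoff."""
    return {gene for gene, score in scores.items() if score > cutoff}


def overlap(scores_a, scores_b, cutoff=25.0):
    """Candidate genes: high-relevance genes shared by both diseases."""
    return high_relevance(scores_a, cutoff) & high_relevance(scores_b, cutoff)


def filter_edges(edges, genes, min_score=0.700):
    """Keep interactions between candidate genes at or above min_score."""
    return [(u, v, s) for u, v, s in edges
            if u in genes and v in genes and s >= min_score]


def connected_nodes(edges):
    """Genes that still have at least one retained interaction."""
    return {node for u, v, _ in edges for node in (u, v)}


candidates = overlap(pcos_scores, ec_scores)
kept_edges = filter_edges(toy_edges, candidates)
hidden = candidates - connected_nodes(kept_edges)

print(sorted(candidates))  # shared genes above the relevance cutoff
print(kept_edges)          # interactions passing the 0.700 threshold
print(sorted(hidden))      # candidates hidden as disconnected nodes
```

In the toy data, a gene can pass the relevance cutoff in both diseases and still disappear from the network if none of its interactions reaches the confidence threshold, which is the effect of hiding disconnected nodes.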
The PPI network was exported in TSV format and visualized using Cytoscape software (version 3.7.2, https://cytoscape.org/).^14,15

KEGG Pathway Enrichment Analysis

The overlapping genes were subjected to KEGG pathway analysis using the clusterProfiler package^16,17 (v4.6.2) on R 4.4.2 (https://www.r-project.org/). Pathways were considered significant at a Benjamini-Hochberg false discovery rate (FDR) <0.05. Disease-related pathways were excluded. The top 10 enriched pathways were selected to construct a “gene-pathway” regulatory network, which was visualized in Cytoscape.

Results

Number of Articles Published in the Past Decade

A total of 1869 publications related to PCOS and cancer were identified through a comprehensive search of the WoSCC database spanning from January 2015 to December 2024. Following rigorous screening procedures, 1746 articles were ultimately included in this study. The annual publication count exhibited two distinct phases. During the first phase (2015–2020), the number of articles showed a gradual upward trend. In the second phase (2021–2024), publications surged in 2021 relative to 2020 (an increase of 68 articles, or 39.3%) and then plateaued in subsequent years. The cumulative number of citations for these articles demonstrated an exponential growth trend (Figure 1).

Figure 1. Trends in Annual Publications and Total Citations.

Country/Region Collaboration Analysis

The collaboration networks among countries/regions were clustered and visualized using Scimago Graphica (Figure 2A). The six collaboration clusters included the following countries/regions: Cluster 1: USA, Canada, United Kingdom, Spain, Italy, Poland, Finland, Denmark, China, and Australia; Cluster 2: Brazil and South Africa; Cluster 3: Turkey, India and Egypt; Cluster 4: Iran and Malaysia; Cluster 5: Romania; Cluster 6: China Taiwan.
The collaboration frequency and linkages between countries are shown in Figure 2B. China and the USA exhibited the highest international collaboration frequency. Additionally, China established close collaborations with Iran, the United Kingdom, and Canada, while the USA maintained frequent interactions with Turkey, Iran, and other countries. The overall network displayed a multi-hub structure, with key nodes such as China, the USA, Iran, and the United Kingdom, reflecting their dominant roles and connective capabilities in international collaboration. Annual publication counts by country are presented in Figure 2C. China’s output surged rapidly after 2015, increasing from 14 articles in 2015 to 95 in 2021 and remaining high in 2024 (84 articles). In contrast, the USA maintained stable output with minimal fluctuations. National contributions to the field are detailed in Figure 2D. China’s share rose from approximately 14% in 2015 to 39% in 2024, becoming the largest contributor, while the USA declined from 38% to 17% during the same period. To further quantify research performance, key metrics (publications, total citations, and average citations) for major countries are listed in Table 2. China ranked first in total publications (562 articles) with 11216 total citations but had a lower average citation rate (20.0 citations/article). The USA ranked second in publications (347 articles) but achieved the highest total citations (15929) and average citations (45.9 citations/article), reflecting its leading research quality and influence. Other countries, such as Australia (62.2 citations/article), France (57.4), and Italy (49.6), also demonstrated strong citation performance.

Figure 2. Global contribution and collaboration in PCOS and cancer research. (A) Country-level collaboration network. (B) Chord diagram of inter-country cooperation. (C) Annual publication trends of top 10 countries.
(D) Proportional contributions of top countries over time.

Table 2. Top 10 Countries by Publication Volume in PCOS and Cancer Research

Rank  Country/Region  Documents  Citations  Average Citations
1     China           562        11216      20.0
2     USA             347        15929      45.9
3     Italy           110        5454       49.6
4     Iran            107        2360       22.1
5     India           93         2188       23.5
6     United Kingdom  91         3917       43.0
7     Canada          66         1948       29.5
8     Australia       56         3483       62.2
9     Poland          54         1639       30.4
10    France          53         3042       57.4

Academic Institution Influence and Collaboration Analysis

Collaboration networks among institutional affiliations were visualized using VOSviewer. Chinese universities and research institutions dominated in quantity, forming dense collaborative clusters centered around institutions such as Fudan University and Shanghai Jiao Tong University, which not only established close collaborations with domestic partners but also developed stable international linkages with universities in Iran, Sweden, the USA, and other countries (Figure 3A). The top 10 institutions by publication volume included Fudan University (34 articles), Tehran University of Medical Sciences (31 articles), Zhejiang University (28 articles), Shanghai Jiao Tong University (27 articles), Sichuan University (25 articles), Shandong University (23 articles), Zhengzhou University (22 articles), Karolinska Institute (21 articles), China Medical University (20 articles), and Harvard Medical School (20 articles) (Figure 3B). These institutions demonstrated strong research activity in the field, with their total citation counts shown in Figure 3C and Table 3. Despite Fudan University ranking first in output, Shandong University led in academic impact with 1293 total citations. Similarly, Karolinska Institute and Harvard Medical School, despite moderate publication volumes, achieved 794 and 545 citations, respectively, highlighting their significant scholarly influence in this research domain.

Figure 3.
Institutional performance and collaboration in PCOS and cancer research. (A) Co-authorship network of research institutions. (B) Top 10 institutions ranked by publication volume. (C) Bubble chart of publications and citations by institution.

Table 3. Top 10 Research Institutions by Publication Volume and Citation Count

Rank  Organization                           Documents  Citations  Country
1     Fudan University                       34         603        China
2     Tehran University of Medical Sciences  31         757        Iran
3     Zhejiang University                    28         518        China
4     Shanghai Jiao Tong University          27         574        China
5     Sichuan University                     25         435        China
6     Shandong University                    23         1293       China
7     Zhengzhou University                   22         500        China
8     Karolinska Institute                   21         794        Sweden
9     China Medical University               20         243        China
10    Harvard Medical School                 20         545        USA

Journal Distribution and Reference Analysis

The collaboration networks and citation networks of the publishing journals were visualized using VOSviewer. In terms of journal publication volume, the top 10 journals included Frontiers in Endocrinology (51 articles), Gynecological Endocrinology (51 articles), Journal of Ovarian Research (38 articles), International Journal of Molecular Sciences (36 articles), Reproductive Sciences (32 articles), Nutrients (26 articles), Human Reproduction (24 articles), Journal of Clinical Endocrinology & Metabolism (22 articles), Molecular and Cellular Endocrinology (19 articles), and PLOS ONE (19 articles). These journals demonstrated significant academic influence in the research field, with detailed metrics including Total Citations, Total Link Strength, Impact Factor, and Average Citations per Article provided in Table 4. Notably, Journal of Clinical Endocrinology & Metabolism, despite its moderate publication count (22 articles), had high total citations (971) and a high average of 44.1 citations per article, reflecting its strong scholarly authority.
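Average citations per article is conventionally total citations divided by publication count. The short Python sketch below applies that definition to the Journal of Clinical Endocrinology & Metabolism figures quoted above (971 citations, 22 articles); the helper name `average_citations` is hypothetical and the one-decimal rounding is an assumption about how such values are reported.

```python
def average_citations(total_citations, publications):
    """Total citations divided by publication count, rounded to one decimal."""
    if publications <= 0:
        raise ValueError("publications must be a positive number")
    return round(total_citations / publications, 1)


# Figures quoted in the text for Journal of Clinical Endocrinology & Metabolism.
jcem_average = average_citations(971, 22)
print(jcem_average)
```

The same one-line calculation can be used to cross-check any average-citation value against its publication and citation counts.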
The inter-journal collaboration network is shown in Figure 4A, and the bubble map illustrating journal publication volume versus citation counts is presented in Figure 4B.

Table 4. Top 10 Journals by Publication Volume in PCOS and Cancer Research

Rank  Journal Name                                    Publications  Citations  Total Link Strength  IF (2023)  Average Citations
1     Frontiers in Endocrinology                      51            1110       53                   5.2        21.8
2     Gynecological Endocrinology                     51            556        70                   2.0        10.9
3     Journal of Ovarian Research                     38            634        80                   3.8        16.7
4     International Journal of Molecular Sciences     36            2467       51                   4.9        68.5
5     Reproductive Sciences                           32            636        80                   2.6        19.9
6     Nutrients                                       26            1187       24                   4.8        45.7
7     Human Reproduction                              24            679        46                   6.0        28.3
8     Journal of Clinical Endocrinology & Metabolism  22            971        73                   5.0        44.1
9     Molecular and Cellular Endocrinology            19            644        56                   3.8        33.9
10    PLOS ONE                                        19            388        29                   2.9        20.4

Figure 4. Journal visualization in PCOS and cancer research. (A) Co-citation network of source journals. (B) Bubble chart of publication volume and citation count. (C) Co-citation network of highly referenced journals.

Regarding cited references, the top 10 journals were The Journal of