Abstract Background: The relationship between oral contraceptive (OC) use and pancreatic cancer (PC) risk remains controversial, with inconsistent findings reported in observational studies. To clarify this relationship and better identify potential risk factors for PC prevention, more unbiased and robust approaches are needed. Methods: We investigated the potential causal relationship between OC use and PC risk using a two-sample Mendelian randomization (MR) analysis, with blood protein quantitative trait loci (pQTLs) as instrumental variables. To ensure the robustness of our findings, we performed a series of sensitivity analyses, colocalization analyses, and reverse MR. The causal effects of protein-coding genes on PC risk, as well as their expression patterns across different single-cell types, were subsequently investigated. To elucidate the potential pathogenic pathways, we conducted pathway enrichment analysis, protein–protein interaction (PPI) network analysis, and causal inference. Results: Our MR analysis identified five drug-targeted proteins significantly associated with PC risk. Higher levels of COMT, AGT, FN1, and UGT1A1, as well as lower levels of SERPINC1, were associated with an increased risk of PC. Among these, AGT, FN1, and COMT demonstrated consistent associations across sensitivity analyses and downstream analyses, providing robust evidence supporting their involvement in PC risk. Conclusions: This study provides genetic evidence suggesting, in European groups, a potential causal link between OC use and increased PC risk, possibly mediated through drug-targeted proteins such as AGT and FN1. These results highlight the importance for further research to elucidate the underlying mechanisms and assess the implications of OC use on PC risk. Keywords: oral contraceptive (OC), pancreatic cancer (PC), mendelian randomization (MR), protein quantitative trait loci (pQTLs), causal inference 1. Background Pancreatic cancer (PC) is a highly aggressive and lethal malignancy, characterized by an alarmingly high mortality rate [[38]1]. In the United States, the 5-year survival rate for PC remains approximately 10% [[39]2], and its incidence is increasing annually by 0.5% to 1.0%. It is projected to become the second leading cause of cancer-related deaths by 2030 [[40]3,[41]4,[42]5]. However, due to the absence of effective early diagnostic tools and preventive strategies, the public health burden of PC continues to grow. In recent years, the potential association between medication use and PC risk has garnered increasing attention. In particular, understanding how pharmacological agents may influence the development of PC through metabolic or hormonal regulation is critical for identifying risk factors and developing effective prevention strategies. Oral contraceptives (OCs) are widely prescribed medications for both contraception and endocrine disorder management, particularly for patients with polycystic ovary syndrome (PCOS), where OCs are a common therapeutic option [[43]6,[44]7]. Studies have suggested that OCs influence cancer progression by altering the estradiol-to-progesterone ratio, modulating immune responses, and affecting one-carbon metabolism [[45]8,[46]9]. In recent years, numerous observational studies, including cohort and case–control designs, have investigated the potential association between OC use and PC risk. However, the results remain inconsistent: while some studies suggest that long-term OC use may increase PC risk [[47]9,[48]10], others indicate a potential protective effect [[49]11,[50]12], and some report no significant association at all [[51]13]. These conflicting findings may result from various biases, including selection bias, information bias, and confounding by indication. The interpretation of the relationship between OC use and PC risk remains challenging due to the presence of multiple potential sources of bias. Given these discrepancies, there is a pressing need for more robust and rigorous approaches to assess the potential relationship between OCs and PC. Mendelian randomization (MR) [[52]14] offers a robust approach to address this issue. MR employs genetic variants as instrumental variables (IVs) to infer causal relationships between exposure factors and disease outcomes. Unlike conventional observational studies, MR minimizes the impact of confounding bias and reverse causation by leveraging the random assignment of genetic variants during conception. This method significantly enhances the validity of causal inference and has been extensively applied in pharmacological epidemiology to investigate drug–disease relationships [[53]15,[54]16,[55]17,[56]18,[57]19,[58]20,[59]21]. Recent advancements in technology have led to significant breakthroughs in large-scale proteomics studies, particularly in the identification of protein quantitative trait loci (pQTLs), which have opened new possibilities for inferring the effect of plasma drug proteins on PC using a two-sample MR method [[60]22,[61]23]. This study aimed to investigate the potential causal relationship between OC use and PC risk. A two-sample MR analysis was employed to examine the relationship between OCs and PC risk at the protein level by identifying relevant drug targets as exposure. To strengthen the validation of the findings, colocalization and sensitivity analyses were conducted. Additionally, we conducted differential expression analysis to validate the gene expression differences of drug target genes in pancreatic cancer tissues compared with normal pancreatic tissues. Furthermore, we integrated single-cell RNA sequencing data with enrichment analyses to identify the specific cell types enriched for drug target genes in pancreatic tissues, as well as the biological pathways through which these genes may influence pancreatic cancer progression. These comprehensive analyses provide valuable insights into the potential impact of OC use on PC risk. 2. Methods 2.1. Overall Design This study was based on publicly available summary-level datasets, and the overall study design is illustrated in [62]Figure 1. Briefly, the study was divided into three major components: In the first part, we employed pQTLs of drug-targeted proteins as potential instrumental variables, with PC as the outcome for MR analysis. The second part involved a series of sensitivity analyses and Bayesian colocalization analyses to validate the robustness and specificity of the MR findings. In the third part, we conducted differential expression analysis at the bulk-tissue level, investigated the cell type-specific expression of target genes using single-cell transcriptomic data, and explored their functional interactions through a protein–protein interaction (PPI) network analysis. All reported p-values were adjusted for multiple testing by the Benjamini–Hochberg (BH) procedure. Figure 1. [63]Figure 1 [64]Open in a new tab Flowchart of the study design. pQTL, protein quantitative trait loci; IV, instrumental variable; LD, linkage disequilibrium; MHC, major histocompatibility complex; GWAS, genome-wide association study; MR, Mendelian randomization; GEO, Gene Expression Omnibus. 2.2. Study Population and Datasets We defined PC as a malignant neoplasm of the pancreas, and identified patient cases based on the disease codes ICD-10 C25, ICD-9 157, and ICD-8 157. Summary-level genome-wide association study (GWAS) data for PC risk were obtained from the UK Biobank [[65]23] and FinnGen [[66]24] cohorts; after meta-analysis, the combined dataset included 772,167 individuals of European ancestry, comprising 2213 PC cases and 769,954 controls. In this study, plasma proteins were considered as the primary endpoints, with drug-targeted proteins serving as exposures. pQTL data for plasma protein levels were obtained from two large-scale proteomic studies: deCODE (covering 4907 proteins from 35,559 individuals) and UKB-PPP (covering 2923 proteins from 54,219 individuals). For each protein, only variants with a reference SNP ID (rsID) were retained. We selected instrumental variables for drug targets based on pQTL data from these two databases. 2.3. Identification of Drug Target Proteins We included commonly prescribed OCs for analysis and categorized them into subtypes—including combined oral contraceptives, progestin-only pills, and emergency contraceptive pills—to minimize potential confounding factors and improve the precision of the association assessment. Drug targets were obtained using from the DrugBank database to ensure accurate identification of drug–protein interactions [[67]25]. Given the widespread use and complex composition of combined oral contraceptives, we also considered additional proteins potentially affected by these formulations. To enhance coverage and ensure comprehensiveness, supplementary protein targets were retrieved from the Comparative Toxicogenomics Database (CTD) [[68]26]. 2.4. Selection of Instrumental Variables To ensure the validity of MR, IVs must satisfy three core assumptions during selection: (1) the IVs are robustly associated with the exposure; (2) the effect of the IVs on the outcome is mediated only through the exposure variable; and (3) the effect of the IVs is not confounded by other variables. In this study, we defined the transcriptional start site of a protein-coding gene within 1 Mb as cis, and prioritized cis-pQTLs as IVs due to their direct relevance to protein expression. The following criteria were applied to screen for IVs: (1) Selection of single nucleotide polymorphisms (SNPs) commonly associated with the exposure (protein) as having genome-wide significance (p-value < 5 × 10^−8) and with a minor allele frequency (MAF) > 0.01; (2) Independent SNPs were selected to reduce the risk of pleiotropy due to the linkage disequilibrium (LD) effect, with an LD threshold of r^2 < 0.01, and the genetic distance was set at 10 Mb [[69]27,[70]28]; (3) SNPs within the major histocompatibility complex (MHC) region (chr 6: 25.5–34.0 Mb) were excluded due to their complex LD structure and high polymorphism; (4) For R^2 and F-statistics, we used the following equations: R^2 = 2 × MAF × (1 − MAF) × β^2; F = R^2 × (N − 2)/(1 − R^2) [[71]29], where the F-statistic measures the strength of the genetic instruments and R^2 represents the proportion of variance in the exposure explained by the SNP. IVs with an F-statistic < 10 were removed to avoid weak instrument bias. 2.5. Mendelian Randomization To ensure the scientific validity and robustness of the analysis, data preprocessing and harmonization were performed prior to the formal MR analysis. Specifically, we harmonized the protein IVs with PC meta-GWAS data, extracted overlapped IVs between the filtered exposure and the outcome datasets, and removed incompatible or palindromic SNPs. Subsequently, appropriate MR methods were selected based on the number of IVs available to estimate causal effects and assess potential pleiotropy. When only a single IV was available, the Wald ratio method was applied. For exposures with two or more IVs, the inverse-variance weighted (IVW) method was employed, as it is the most efficient approach under the assumption that all IVs are valid [[72]30]. In cases with more than three IVs, the MR-Egger method was conducted to detect potential horizontal pleiotropy. If the MR-Egger intercept test results indicated no pleiotropy (p-value > 0.05), the IVW method was adopted. Otherwise, causal inference was based on MR-Egger results. Heterogeneity among IVs was assessed using Cochrane’s Q-test. When both the UKB and deCODE datasets provided significant results, the dataset with a greater number of IVs was selected for causal estimation. All MR analyses were conducted using the “TwoSampleMR” R package [[73]31]. 2.6. Sensitivity Analysis MR results can be influenced by the selection of IVs. In the sensitivity analysis, we re-selected the IVs using a relaxed LD threshold of r^2 < 0.1. Although using a larger set of IVs would increase statistical power, it can also introduce potential confounding effects [[74]16,[75]32]. Therefore, it is essential to ensure that the causal direction observed in the sensitivity analysis remained consistent with that of the primary MR analysis. To control for multiple testing, the Bonferroni correction was applied to identify significant associations. Furthermore, to minimize the risk of potential reverse causality bias, we conducted a reverse causality analysis by using PC as the exposure and cis-pQTLs as the outcome. For selecting IVs for PC, we set a p-value threshold of <5 × 10^−8 and an r^2 < 0.01 within a 10 MB window to ensure the validity of the IVs. 2.7. Colocalization Analysis To further validate the results, we performed a Bayesian colocalization analysis, which estimates the posterior probability that a given genomic locus contains a single causal variant influencing both protein levels and disease risk. This approach helps assess whether the observed associations between drug target proteins and PC risk are confounded by LD. For each protein, we selected independent cis-pQTL within a 500 kb window as the candidate colocalized regions. The analysis evaluates five hypotheses: (1) H[0]: no causal variant for either drug target protein or PC; (2) H[1]: a causal variant only for protein; (3) H[2]: a causal variant only for diseases; (4) H[3]: two different causal variants for protein and diseases, respectively; and (5) H[4]: one shared causal variant influencing for both traits. The posterior probability (PP) for each hypothesis is denoted as PPH[0], PPH[1], PPH[2], PPH[3], and PPH[4] [[76]33]. For proteins with multiple independent loci, colocalization analysis was conducted separately for each locus using meta summary-level data. Regions with the highest posterior probability of H[4] (PPH[4]) were considered to provide the strongest evidence for colocalization. We defined medium-support evidence of colocalization as a PPH[4] between 0.50 and 0.75, and high-support colocalization as a PPH[4] > 0.75 [[77]34]. Bayesian colocalization analysis was conducted using the ‘coloc’ package. 2.8. Differentially Expressed Genes To further validate the MR findings and explore whether the causal targets identified were differentially expressed in PC, we conducted a differential expression analysis between pancreatic tumor and normal pancreatic tissues. RNA transcriptomic data were obtained from 178 pancreatic cancer samples in The Cancer Genome Atlas (TCGA) “[78]https://www.cancer.gov/tcga” (accessed on 19 September 2024) and 167 normal pancreatic tissue samples from the Genotype-Tissue Expression (GTEx) project. Normalized data were retrieved from the UCSC Xena platform [[79]35]. Genes were defined as differentially expressed (DEGs) if they exhibited a Log[2] fold change (Log[2] FC) > 2 and an adjusted p-value < 0.05. Differential expression analysis was conducted using the Limma package to identify DEGs that met these criteria [[80]36]. This analysis provided additional evidence supporting the causal relevance of the drug targets implicated by MR. 2.9. Single Cell-Type Expression Analysis Single-cell RNA sequencing (scRNA-seq) provides valuable insights into the cell type-specific effects of drug targets and the underlying pathways involved in the development of PC. We analyzed scRNA-seq data derived from human pancreatic tumor tissues and adjacent normal tissues, available from the Gene Expression Omnibus (GEO) database [[81]37] ([82]GSE155698) [[83]38,[84]39]. This dataset included 16 pancreatic cancer samples and 3 adjacent normal samples. Prior to analysis, rigorous data preprocessing was performed to ensure high data quality for downstream applications. Specifically, using the “Seurat” package, we applied the following quality control criteria: retention of cells with gene counts between 600 and 4000, total RNA count per cell greater than 1000 but below the 97th percentile of the dataset, mitochondrial gene content below 10%, and hemoglobin gene percentage below 1%. After filtering, 23,432 cells and 24,725 genes were retained for analysis. Gene expression values were normalized and scaled using the NormalizeData and ScaleData functions in Seurat. Cell clustering was performed, and clusters were manually annotated based on canonical marker genes to facilitate cell type-specific downstream analyses. To investigate the role of genes encoding target proteins in specific cell types, we performed a differential expression analysis between pancreatic cancer and adjacent normal tissue cells within each cell cluster. This analysis was conducted using the Wilcoxon rank-sum test, implemented in the FindMarkers function in Seurat. Genes were considered differentially expressed if they exhibited an average |Log[2]FC| > 1 and an adjusted p-value < 0.05. Subsequently, we performed Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses on the identified differentially expressed genes. Particular attention was given to the biological process (BP) category in the GO enrichment to elucidate the potential biological functions and mechanisms underlying the cell type-specific DEGs [[85]40]. To further explore the cell type-specific expression of genes encoding pathogenic proteins within pancreatic cancer tissues, we performed differential expression analysis across single-cell clusters. Specifically, we compared gene expression levels between different cell types using the Wilcoxon rank-sum test. Genes with an average |Log2 FC| > 1 and an adjusted p-value < 0.05 were considered significantly enriched in specific cell types. 2.10. Protein Interaction Networks and Protein Function Queries To explore the known functions and biological roles of the target proteins, we utilized the STRING database [86]https://string-db.org/ (accessed on 26 December 2024) and conducted a comprehensive literature review. This approach enabled the exploration of protein–protein interactions (PPIs), the identification of potential functional associations, and the recognition of key molecules involved in relevant biological processes. By integrating evidence from both experimental studies and computational predictions, this provided valuable insights into the involvement of these proteins in cellular mechanisms and disease pathophysiology. 3. Results 3.1. MR Results Identified PC-Related OC Target Proteins In this study, the oral contraceptives investigated include ethinyl estradiol, levonorgestrel, desogestrel, cyproterone acetate, medroxyprogesterone acetate, dienogest, estradiol valerate, gestodene, spironolactone, finasteride, letrozole, and drospirenone. Drug-related targets were identified using the DrugBank database, resulting in 118 drug-related targets. Additionally, combined oral contraceptives, which are defined as compound prescriptions containing estrogen and progesterone, were also considered, and 48 interaction targets were retrieved from the CTD. Detailed drug target information is provided in [87]Supplementary Tables S1 and S2. After applying thresholds for p-values and r^2, we calculated F-statistics, with all genetic instruments having F-statistics > 10. Following the selection of IVs, a total of 319 SNPs were available to proxy 32 proteins in deCODE, and 256 SNPs were available to proxy 35 proteins in UKB-PPP. The primary analysis identified five proteins that were causally associated with PC at the blood level, as determined using IVW as the main MR method. These significant findings, along with the analytical approach, are illustrated in [88]Figure 2. Specifically, the genetically predicted levels of four proteins (AGT, COMT, FN1, and UGT1A1) were positively associated with increased PC risk, while SERPINC1 was negatively associated with PC risk. Detailed information on the corresponding drug target interactions and their effects on PC risk are provided in [89]Table 1. COMT and UGT1A1 are involved as substrates or inducers in the metabolism of hormonal drugs such as ethinyl estradiol, desogestrel, and estradiol valerate. In contrast, AGT, FN1, and SERPINC1 are known targets modulated by ethinyl estradiol [[90]41] and combined oral contraceptives, including formulations co-administered with levonorgestrel, gestodene, norgestimate, and dienogest [[91]42,[92]43,[93]44,[94]45,[95]46,[96]47]. Notably, while ethinyl estradiol combined with dienogest has been reported to influence SERPINC1 activity [[97]48], other COC formulations have been associated with either the upregulation or downregulation of these protein expression levels, depending on the specific drug combination and context. Figure 2. [98]Figure 2 [99]Open in a new tab MR estimation of causal relationship between drug target and PC. IVs: Number of instrumental variables; OR: Odds Ratio; CI: Confidence Interval. Table 1. Drug Target interactions. Protein Drug Drug Target Interactions Effect of the Protein on PC COMT ethinyl estradiol substrate Risk factors AGT ethinyl estradiol increased expression Risk factors ethinyl estradiol co-treated with levonorgestrel increased expression Risk factors FN1 ethinyl estradiol co-treated with gestodene increased expression Risk factors UGT1A1 ethinyl estradiol substrate|inducer Risk factors desogestrel inducer Risk factors estradiol valerate inducer Risk factors SERPINC1 ethinyl estradiol co-treated with levonorgestrel decreased expression Protective factors ethinyl estradiol co-treated with norgestimate decreased expression Protective factors ethinyl estradiol co-treated with dienogest decreased activity Protective factors ethinyl estradiol co-treated with gestodene decreased expression Protective factors [100]Open in a new tab 3.2. Sensitivity Analysis and Colocalization Analysis We conducted a series of sensitivity analyses to assess the robustness and reliability of our MR findings. First, the MR was repeated using additional sets of threshold-screened IVs (r^2 < 0.1 within 10 Mb). The directions of causal effects for all initially significant proteins remained consistent with the primary MR result. To account for multiple testing, all p-values were adjusted using the BH procedure, and four targets (AGT, COMT, FN1, and SERPINC1) passed the corrected significance threshold of 0.01 (calculated as 0.05/5). It is worth noting that although UGT1A1 was significant in the primary MR analysis, its association did not remain significant after multiple testing correction (adjusted p-value > 0.05), likely due to the large number of comparisons. Accordingly, we prioritized the exclusion of targets—such as UGT1A1—that failed to pass multiple testing correction or exhibited signs of heterogeneity or directional pleiotropy. We further performed reverse MR analysis using 1401 IVs for PC, and found no significant evidence of reverse causality influencing drug target protein levels, suggesting that the observed associations were not biased by reverse causal effects. Moreover, no heterogeneity was detected using Cochran’s Q test, and no horizontal pleiotropy was observed through the MR-Egger intercept test. These analyses provide strong evidence that the MR results are both robust and reliable. Despite the consistency across sensitivity analyses, a causal relationship between genetically determined protein levels and PC could still be influenced by LD, or it could reflect true shared causal variation between the two. To address these possibilities, we performed a colocalization analysis, with the results summarized in [101]Table 2. Among the five candidate proteins, the PPH[4] values for AGT (PPH[4] = 0.93) and SERPINC1 (PPH[4] = 0.99) showed high-support evidence for an association linked to a shared causal variant. Additionally, medium-support evidence of colocalization (PPH[4] > 0.5) was observed for FN1 (PPH[4] = 0.51) and UGT1A1 (PPH[4] = 0.59) with PC. These results suggest a high likelihood of shared causal variants between AGT, SERPINC1, FN1, and UGT1A1 protein levels and PC risk, indicating that the MR findings for these proteins not attributable to LD. However, the absence of significant colocalization results does not undermine the validity of the MR analysis [[102]49]. Table 2. Sensitivity analysis and colocalization analysis. Primary Step MR Sensitivity Step MR Colocalization Protein OR p-Value OR p-Value Adjust p-Value Revers Effect Sensitivity Analyses PPH[4] FN1 1.20 0.0086 1.71 0.003 0.015 No Passed 51% COMT 2.31 0.032 1.15 0.001 0.005 No Passed 21% AGT 2.52 0.035 2.16 0.002 0.01 No Passed 93.20% UGT1A1 1.31 0.045 1.26 0.02 0.1 No Failed 59.90% SERPINC1 0.12 0.0007 0.12 0.0007 0.0035 No Passed 100% [103]Open in a new tab 3.3. Identified Protein-Coding Genes Differentially Expressed Between Tumor and Normal Tissue To investigate whether the identified protein-coding genes are differentially expressed in cancer and normal tissues, we conducted a comprehensive differentially expressed analysis. We applied a stringent correction for the FDR to control for multiple testing, with an adjusted p-value threshold of <0.05. The analysis revealed marked differences in expression levels for several target genes. Specifically, FN1 (Log[2] FC = 6.2), UGT1A1 (Log[2] FC = 6.1), AGT (Log[2] FC = 3.31), and COMT (Log[2] FC = 2.4) were significantly upregulated in tumor tissues compared to normal tissues. In contrast, SERPINC1 showed no significant differential expression, indicating that its expression levels remained relatively stable between tumor and normal samples. As shown in [104]Figure 3A, nearly all of the identified protein-coding genes exhibited significant differential expression between tumor and normal tissues, with the notable exception of SERPINC1. Figure 3. [105]Figure 3 [106]Open in a new tab The results of DEGs and specific expression protein coding genes. (A) The DEGs in PC tissue for the protein-coding genes of the identified OCs; (B) Single cell-type expression in PC tissue represented the 11 cell clusters labeled and annotated; (C) The expression of identified protein coding gene in each cell type. 3.4. Single-Cell Type Expression in PC Tissues and Pathway Enrichment To investigate the cell type-specific expression patterns of the identified protein-coding genes and their potential roles in PC development within distinct cellular contexts, we analyzed scRNA-seq data from 16 samples obtained from the [107]GSE155698 dataset [[108]38,[109]39]. As shown in [110]Figure 3B, all cells were manually annotated and classified into 11 clusters: T cells, epithelial cells, NK cells, neutrophils, monocytes, mast cells, plasma cells, fibroblasts, B cells, endothelial cells, and unknown cells. Cell type-specific gene expression profiles are shown in [111]Figure 3C. After adjusting for p-value and performing the Wilcoxon rank-sum test, COMT was found to be ubiquitously expressed across all cell types, with the exception of monocytes. In contrast, AGT and FN1 were observed to exhibit highly cell type-specific expression patterns, with both genes predominantly expressed in fibroblasts compared to other cell clusters (Log[2] FC values of 1.28 and 3.94, respectively). Additionally, FN1 exhibited notable expression in endothelial cells. No significant or specific expression of SERPINC1 or UGT1A1 was detected in any particular cell type in the single-cell dataset. To further explore how the identified cell type-specific protein-coding gene influenced the progression of PC, we conducted a differential expression analysis between tumor and adjacent normal samples for each cell type, followed by KEGG and GO enrichment analyses. As displayed in [112]Figure 4A, the KEGG analysis revealed that DEGs were broadly involved in pancreatic secretion pathways across all cell types. Additionally, protein digestion and uptake pathways were enriched in nearly all cell types, except for acinar cells. Focusing on the BP enriched within each cell type, we observed distinct functional signatures. In B cells, mast cells, macrophages, neutrophils, NK cells, plasma cells, T cells, and endothelial cells, DEGs were significantly enriched in processes related to digestion, leukotriene metabolism, regulation of lipase activity, lipid catabolism, and the positive regulation of lipase activity. Notably, fibroblasts exhibited a distinct enrichment pattern, with DEGs predominantly associated with structural and extracellular organization, including collagen fibril organization and extracellular matrix (ECM) organization. These pathways have been well-documented in pancreatic cancer progression, particularly in facilitating tumor growth, metastasis, and chemoresistance [[113]38,[114]50,[115]51]. Figure 4. [116]Figure 4 [117]Open in a new tab Results of KEGG and PPI network. (A) Cluster-specific DEGs KEGG pathway. (B) The top 10 DEGs identified in fibroblasts ranked by |Log2 FC|. Green indicates that Log2 FC is negative, and red indicates Log2 FC is positive. (C) Identified protein interaction network. To further evaluate the involvement of the identified protein-coding genes in specific biological pathways, we calculated the Log[2] FC of DEGs in fibroblasts. Among these DEGs, two identified genes (AGT and FN1) demonstrated Log[2] FC values of 1.16 and 3.86, respectively. Notably, as shown in [118]Figure 4B, FN1 exhibited an exceptionally high expression level, ranking fourth in terms of |Log[2] FC| among all fibroblast DEGs. These findings suggest that FN1 and AGT may play critical roles in the progression of pancreatic cancer, particularly within the tumor microenvironment, and may represent promising therapeutic targets. 3.5. Protein PPI Network and Gene Function To explore the interactions between the identified plasma proteins, PPI network analysis was performed using the STRING database. As shown in [119]Figure 4C, with a confidence score of 0.4, interactions were observed between AGT, FN1, and SERPINC1, as well as between UGT1A1 and COMT. To further examine the roles of the identified proteins in specific biological pathways, we investigated their known functions. COMT is involved in neurotransmitter metabolism, specifically degrading catecholamine transmitters through methylation [[120]52]. AGT plays a critical role in maintaining fluid and electrolyte homeostasis and has been implicated in the pathogenesis of essential hypertension [[121]53]. UGT1A1 participates in the glucuronidation pathway, which is essential for detoxification and the regulation of various substances in metabolism [[122]54,[123]55]. SERPINC1 is associated with coagulation, functioning primarily by inhibiting the activation of serine proteases and thrombin. Importantly, FN1 is crucial for cell adhesion and migration, playing a vital role in the ECM [[124]56,[125]57,[126]58]; it contributes to the formation of structural fibers that facilitate tissue repair and supports the continuous remodeling of the ECM, thereby maintaining tissue strength and structural integrity [[127]59]. 4. Discussion PC remains a major global health challenge due to its high mortality rate and limited improvements in survival, despite recent medical advances. This highlights the urgent need for effective preventive strategies [[128]60]. In this context, identifying and mitigating risk factors, particularly the potential impact of OCs on PC, has become increasingly important. In recent years, numerous studies have investigated the relationship between OCs on PC risk. However, findings from observational studies have been inconsistent, largely due to inherent limitations such as selection bias, information bias, and confounding by indication [[129]13,[130]61,[131]62]. These methodological weaknesses impede the ability to draw reliable causal inferences. Given these issues, using advanced analytical methods is essential to gain a more accurate understanding of the causal relationship between OCs and PC. MR offers a robust method of addressing confounding by leveraging genetic variants as instrumental variables, allowing for a more accurate assessment of the causal relationship between OCs and PC [[132]63,[133]64]. In this study, we systematically investigated the causal relationship between OC drug target proteins and PC risk using MR. Our analysis was structured into three kay sections, each addressing a specific objective: First, we aimed to identify drug target proteins that are causally linked to PC and evaluate the potential effects of OCs on these targets using MR. Second, we performed sensitivity analyses and colocalization analyses to assess the robustness of our MR findings. Third, we conducted additional validation through differential expression analysis, single-cell transcriptomic analysis, and enrichment analysis to explore the cell-specific expression of the identified protein targets and their potential biological functions, particularly in pathways relevant to PC progression. Our results identified key drug targets, including FN1, AGT, and COMT, that may mediate the effect of OCs on PC risk. Specifically, our findings suggest that OCs increase the expression of FN1 and AGT and utilize COMT as a substrate. FN1, a crucial component of extracellular matrix (ECM) organization, plays a pivotal role in cancer progression and metastasis [[134]65,[135]66,[136]67,[137]68,[138]69,[139]70,[140]71,[141]72,[142]7 3]. Its upregulation in fibroblasts contributes to the fibrotic stroma of PC [[143]74,[144]75], thereby promoting tumor progression. Additionally, FN1 has been associated with drug resistance in PC and is considered a potential therapeutic target [[145]76,[146]77]. AGT has similarly been implicated in tumor growth in various cancers [[147]78,[148]79,[149]80], and our findings further support its role in PC. Our findings suggest that co-treatment with different contraceptive drugs upregulates the expression of both FN1 and AGT, which may contribute to an increased risk of PC. Although Bayesian colocalization analysis for COMT showed negative results, both tissue and single-cell analyses indicated a strong correlation between COMT expression and PC. COMT overexpression has been linked to early-stage carcinogenesis in PC and is markedly elevated in PC tissues, correlating with features such as early T stage [[150]81,[151]82,[152]83]. While the role of OCs in modulating COMT remains uncertain, evidence from animal models suggests that ethinyl estradiol can upregulate COMT expression, highlighting the need for further validation in human studies [[153]84,[154]85]. In conclusion, our study provides compelling evidence that OCs, particularly combined oral contraceptives, may increase PC risk by upregulating key drug target proteins, including FN1 and AGT. By leveraging MR to minimize confounding, we establish a robust causal relationship between OC use and elevated PC risk, supporting and extending findings from previous research. These results offer important insights into the underlying biological mechanisms of OC-associated PC risk. Further studies are warranted to validate these findings and to elucidate the precise molecular pathways involved. However, several limitations should be considered: First, our analyses were limited to individuals of European ancestry, which may restrict the generalizability of the findings to other ethnic populations. Second, we focused on the pQTLs of drug-targeted proteins by leveraging the most comprehensive and up-to-date datasets currently available, aiming to ensure maximal coverage. However, we acknowledge that some relevant pQTLs may have been excluded, which may provide further insights into the pathogenesis of PC. Third, although the use of stringent criteria for instrument selection enhances validity, it may also reduce statistical power and potentially overlook relevant associations. For example, while UGT1A1 and SERPINC1 showed significant associations in the initial MR analysis, these findings were not supported by sensitivity analyses or DEG analyses. In addition, we did not include trans-pQTLs, which may have limited a more comprehensive understanding of the roles of these proteins in PC. To address the limitations of our study, future research should prioritize the inclusion of ethnically diverse populations, particularly non-European groups, to improve the generalizability of the findings. Expanding the analysis to incorporate a wider range of pQTLs, including trans-pQTLs, may uncover additional drug targets involved in PC progression. Broader IV selection criteria and larger datasets will enhance the statistical power and may identify previously overlooked targets. Moreover, experimental studies are necessary to validate the regulation effects of OCs on COMT expression, especially in humans. Lastly, large-scale longitudinal cohort or case–control studies are also essential to confirm the causal relationship between OC use and PC risk, and to differentiate the effects of various OC formulations. Ultimately, these efforts will strengthen the evidence base for refining prevention strategies and therapeutic interventions. In summary, our study utilized MR to investigate the causal effects of plasma drug targets associated with OCs on PC risk. Among the identified targets, FN1 and AGT demonstrated the strongest association, suggesting that combined oral contraceptives may contribute to increased PC risk through the modulation of these proteins. These findings suggest that oral contraceptives, particularly combined oral contraceptives, may influence the risk of PC development. However, further experimental and clinical validation is necessary to confirm these associations and clarify the underlying mechanisms. Acknowledgments