Highlights
• RAMEN enables efficient and scalable construction of Bayesian networks from clinical data
• RAMEN integrates absorbing random walks and genetic algorithms to generate these networks
• RAMEN outperforms conventional statistical and network-based approaches
• RAMEN uncovers key disease indicators across diverse diseases and datasets

Motivation
Given patient clinical records, understanding the interactions among clinical variables and their impact on disease outcomes is crucial for advancing diagnostics and personalized medicine. Traditional statistical methods fail to capture indirect relationships, while Bayesian network learning methods are computationally inefficient and only infer general associations without prioritizing disease-relevant variables. Random walk- and genetic algorithm-based network inference (RAMEN) overcomes these limitations by integrating absorbing random walks and genetic algorithms, which efficiently learn Bayesian network structures while ensuring the network is target-variable oriented, that is, focused on disease outcomes. By leveraging clinical records, RAMEN uncovers complex variable interactions, enhancing disease understanding and informing the development of improved diagnostics and treatments.

Xiong et al. present RAMEN, an approach that integrates absorbing random walks and genetic algorithms to enable efficient and scalable construction of Bayesian networks from clinical data. Validated using data from diverse complex diseases, RAMEN achieves state-of-the-art accuracy and uncovers key variables linked to disease outcomes, enabling broad biomedical applications.

Introduction In recent years, large-scale outbreaks and chronic conditions have posed serious challenges to global health, reshaping lives and imposing substantial socioeconomic burdens.[39]^1^,[40]^2 Despite numerous efforts, our understanding of the underlying mechanisms of many diseases remains incomplete, hindering accurate diagnosis, prediction of disease trajectories, and the development of effective treatments. Many diseases exhibit high variability among patients, with distinct symptoms and clinical outcomes,[41]^3 yet the mechanisms driving this heterogeneity remain unclear. This gap in understanding restricts early diagnosis and targeted interventions for the most vulnerable individuals,[42]^4^,[43]^5 prolonging suffering and delaying care. The rapid accumulation of clinical and population-level datasets presents unprecedented opportunities to uncover disease mechanisms. Large-scale biobanks now integrate clinical, proteomic, and transcriptomic data, driving computational advancements.[44]^6^,[45]^7 In Quebec, Canada, the Biobanque québécoise de la COVID-19 (BQC19) has collected clinical data from over 6,000 COVID-19 patients, alongside proteomic and transcriptomic profiles for a subset.[46]^8 Similarly, the Lawson Health Research Institute in Ontario has collected clinical records on COVID-19 patients[47]^9 for studying long COVID. Beyond COVID-19, datasets such as MIMIC-III,[48]^10 which focuses on intensive care unit (ICU) patients, and CanCOLD,[49]^11 which tracks chronic obstructive pulmonary disease (COPD)-related variables, extend computational applications to other chronic diseases. Despite these resources, extracting actionable insights remains challenging and requires innovative computational approaches. Integrating clinical and large-scale datasets can help identify diagnostic markers and therapeutic targets. 
Advancing methods to bridge these data types and uncover disease mechanisms has the potential to transform precision medicine and global health. With the growing availability of clinical datasets, many studies have explored relationships between clinical variables and disease outcomes[50]^12^,[51]^13^,[52]^14^,[53]^15 using simple statistical methods (e.g., Pearson correlation[54]^15 and mutual information[55]^16). These approaches map direct associations between clinical variables but do not capture directionality, indirect interactions through intermediate variables, or complex interactions among multiple variables. These drawbacks limit their utility in uncovering underlying mechanisms. Bayesian networks (BNs),[56]^17^,[57]^18^,[58]^19 a class of probabilistic graphical models, address these limitations by inferring clinical variables indicative of disease outcomes (e.g., severity or mortality). BNs have demonstrated success in disease diagnostics, often outperforming physicians.[59]^20^,[60]^21 For example, BN-based models achieved state-of-the-art performance in neurodegenerative disease diagnosis while maintaining interpretability.[61]^22^,[62]^23^,[63]^24^,[64]^25^,[65]^26^,[66]^27 However, constructing BNs, particularly learning their structure, is computationally challenging due to the vast discrete search space, an NP-hard problem.[67]^28^,[68]^29 The combination of high-dimensional data and relatively low sample sizes makes conventional optimization methods such as backpropagation ineffective.[69]^30^,[70]^31^,[71]^32 In practice, structure learning is often regularized with prior knowledge to constrain the search space.[72]^33^,[73]^34 However, this is not a feasible approach for our study, as our objective is to identify previously unrecognized clinical variables influencing disease outcomes, including COVID-19 severity, long COVID, septicemia mortality, and COPD exacerbation. 
Imposing prior constraints would not only introduce bias but also be impractical, as many of these diseases remain poorly understood. To address the above limitations and fill the gap, here we introduce RAMEN (random walk- and genetic algorithm-based network inference), which glues absorbing random walks and a genetic algorithm to infer a BN representing the relationships between clinical variables and the disease outcome. The random walks are employed to rank and select the most relevant variables and connections to the disease-outcome variable to reduce the network complexity. A significant aspect of our methodology is the incorporation of a terminal absorbing node, symbolizing the disease outcome of interest, such as COVID-19 severity, within our clinical variable network. Following the preliminary network reconstruction through the absorbing random walk process, we further employ a genetic algorithm to refine and identify an optimized network structure. This optimized structure is more accurately aligned with the observed clinical variables, ensuring a precise representation of the relationships and interactions within the dataset. The choice of genetic algorithm is based on its suitability for exploring large discrete search space in our task,[74]^35^,[75]^36^,[76]^37 flexibility,[77]^36^,[78]^38 and the empirical evidence of its effectiveness.[79]^39^,[80]^40 After these two stages, RAMEN outputs a BN that models the complex relationship between the disease-outcome variable (e.g., COVID-19 severity) and other variables that are directly or indirectly connected to it. To examine the performance of RAMEN, we applied the method to three different COVID-19 cohorts from the BQC19 project and Lawson Health Research Institute, a septicemia cohort from MIMIC-III, and a COPD cohort (CanCOLD), examined the resulting network with multi-omics measurements and computational simulations, and compared RAMEN with other methods. 
We show that the resulting networks capture important disease-outcome indicators that can be validated via multi-omics, simulation, or literature. Moreover, RAMEN demonstrated superior performance over simple statistical methods by finding more relevant variables and indirect variables that cannot be found using simple statistical methods such as mutual information and Pearson correlation. Furthermore, this model has the potential to be generalized for analyzing clinical variable networks across a wide range of diseases, provided similar records of clinical variables are available. This broad applicability enhances its utility across diverse areas of medical research. Results Overview of RAMEN The RAMEN method operates in two sequential phases: an absorbing random walk and a genetic algorithm ([81]Figure 1). In phase 1, the algorithm initializes a fully connected network comprising all clinical variables, with edge weights corresponding to mutual information between node pairs. Random walks are then conducted using normalized mutual information as transition probabilities, terminating either after a predefined number of steps or upon reaching the target disease-outcome variable, which serves as an absorbing state. To construct the network skeleton, edges with significantly higher visit counts (q value ≤ 0.05) are retained. The statistical significance of edge visits is determined through a permutation test against a randomized background (see [82]STAR Methods and [83]Figure S1A for details). In phase 2, a genetic algorithm refines the network skeleton by iteratively generating and evaluating candidate BNs. The initial population consists of network structures derived from the absorbing random walk phase. New candidate networks are generated through crossover, whereby subnetworks from high-performing candidates are recombined, and mutation, which introduces small topological changes to explore alternative structures. 
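
The phase 1 walk described above can be illustrated with a minimal sketch. It assumes a toy weight table standing in for pairwise mutual information; the function name `absorbing_random_walks`, its parameters, and the example values are hypothetical and are not taken from the RAMEN implementation. The permutation test that turns visit counts into q values is omitted here.

```python
import random
from collections import Counter


def absorbing_random_walks(weights, target, n_walks=1000, max_steps=10, seed=0):
    """Count edge visits over walks that reach the absorbing target node.

    weights: dict mapping node -> {neighbor: non-negative weight}, e.g.
             pairwise mutual information (toy values in this sketch).
    Only walks that end at `target` contribute to the counts.
    """
    rng = random.Random(seed)
    starts = [node for node in weights if node != target]
    visits = Counter()
    for _walk in range(n_walks):
        node = rng.choice(starts)
        path = []
        for _step in range(max_steps):
            neighbors = list(weights[node])
            edge_weights = [weights[node][m] for m in neighbors]
            # random.choices normalizes the weights, so they act as
            # transition probabilities out of the current node.
            nxt = rng.choices(neighbors, weights=edge_weights, k=1)[0]
            path.append((node, nxt))
            node = nxt
            if node == target:  # absorbing state: stop and record the walk
                visits.update(path)
                break
    return visits


# Hypothetical three-variable example; "severity" is the absorbing node.
mi = {
    "age": {"crp": 0.2, "severity": 0.5},
    "crp": {"age": 0.2, "severity": 0.6},
    "severity": {},
}
counts = absorbing_random_walks(mi, target="severity")
```

In a full pipeline, the same counting would be repeated on a randomized background so that each edge's observed count can be compared against its null distribution.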
A selection process prioritizes networks with higher scores based on an entropy-based objective function, which assesses the network’s ability to effectively model the observed clinical data while maintaining interpretability. Over successive generations, the algorithm converges toward an optimized BN that captures the most informative relationships among clinical variables. This evolutionary approach allows for robust exploration of the network space, avoiding local optima and ensuring that the final network reflects biologically meaningful and statistically significant associations.

Figure 1. Overview of the RAMEN methodology
The RAMEN approach constructs Bayesian networks from clinical data through a sequential two-phase process. (Phase 1) Establishing the initial network via absorbing random walk-based permutation test. Beginning with preprocessed clinical data, this stage implements a permutation test via a random-walk strategy across a comprehensive network of all included variables, where nodes symbolize variables and edge weights indicate the mutual information among variable pairs. The process identifies stronger variable connections by tracking the frequency of edge traversal in successful random walks (ending at the target node). Edges with significantly higher traversal frequencies, as established through permutation testing, lay the groundwork for the network, preparing it for further enhancement. (Phase 2) Enhancing the network with a genetic algorithm. This stage further refines the Bayesian network structure. Starting with a set of initial network configurations derived from the early framework, the genetic algorithm applies crossover (merging two configurations) and mutation (applying random changes) to evolve these structures. Each cycle assesses the network structures against a specific scoring function, prioritizing those with superior scores for subsequent iterations. 
This cycle of refinement, through modification, assessment, and selection, persists until a stable score is achieved, culminating in an optimized network structure.

Building a COVID-19 severity network using the BQC19 hospitalized patient dataset

We applied RAMEN to the hospitalized patient cohort data from BQC19 to construct a BN. This BN delineates the complex interplay among clinical variables and their association with COVID-19 severity ([86]Figure 2A). Our analysis encompassed 2,018 hospitalized patients, with the dataset including 880 clinical variables before and 297 after data cleaning. The severity of COVID-19 within this cohort was classified into three categories: “not infected or mild,” “moderate,” and “severe or deceased.” The inferred clinical variable network captures relationships with COVID-19 severity that align with findings in the existing literature. Among the variables identified, several key examples linked to “COVID-19 severity” include “sex,”[87]^13^,[88]^41 “age,”[89]^13^,[90]^41 “BMI,”[91]^42^,[92]^43 “arterial hypertension,”[93]^44 “ALT,”[94]^45 “C-reactive protein (CRP) (highest value),”[95]^46 and “albumin (lowest value).”[96]^47 These variables represent just a sample of the broader network, illustrating the diverse factors impacting COVID-19 severity as observed in our study.

Figure 2. RAMEN unveils indicators of COVID-19 severity in BQC19 hospitalized patient data
(A) A streamlined network showcasing 231 of the most significant connections identified by RAMEN, indicative of COVID-19 severity. The full names of the variables are provided in [99]Data S1. The color and thickness of edges signify the connection strength (blue for weaker, red for stronger) based on mutual information metrics. Nodes are colored according to categories of clinical variables, with their size reflecting the strength of their correlation with COVID-19 severity. 
The diamond-shaped node represents the outcome variable, which is COVID-19 severity. (B) Comparison of AUROC for predicting COVID-19 severity using indicator variables, contrasting RAMEN-identified indicators against those identified through mutual information and Pearson correlation methods, with predictions made by support vector machines (SVMs). A higher AUROC suggests a greater relevance of the identified variables for severity prediction. Indicator variable selection by RAMEN is detailed in [100]STAR Methods and, to ensure a fair comparison, all compared methods use the same number (161) of top indicators. (C) Analysis of Shapley additive explanations (SHAP) values, providing one possible explanation of the significance of clinical variables identified by RAMEN in SVM-based predictions. These values illustrate the potential impact of variables on the model’s prediction, indicating whether they contribute toward a positive or negative outcome. The consistent color scheme across the x axis highlights variables identified as dependable predictors by SHAP. For clarity, the importance ranking assigned by RAMEN is shown in parentheses after each variable name. (D) Heatmaps illustrating the conditional distribution of COVID-19 severity levels (SEV) across the values of direct indicator variables, where the heatmap colors represent the proportion of patients within each severity category for given indicator values. This visual representation aids in understanding the correlation between specific clinical indicators and severity outcomes. To demonstrate the superior capability of RAMEN in identifying relevant indicators for COVID-19 outcomes compared to conventional statistical methods, we conducted a benchmark analysis focused on predicting COVID-19 severity based on the early record (within 1 month after diagnosis) of clinical variables. 
This involved training a support vector machine (SVM) classifier using indicator variables from the COVID-19 severity network established by RAMEN, and two additional SVMs, each utilizing the top variables identified by mutual information and Pearson correlation methods, respectively. The effectiveness of these variable selection methods was assessed based on the predictive performance of the SVMs. As shown in [101]Figure 2B, the area under the receiver-operating characteristic (AUROC) curves compare the performance of SVM classifiers trained on variables identified by each method. This result underscores RAMEN’s ability to uncover more pertinent COVID-19 outcome indicators than traditional statistical methods. The developed classifier can also be used to predict the outcome of the disease (such as the severity of COVID-19) based on the early clinical variable records from the first month of patient care. To assess the effectiveness of the indicators identified by RAMEN, we visualized the Shapley additive explanations (SHAP)[102]^48^,[103]^49 values in [104]Figure 2C as a reference point. This visualization details how each indicator contributes to the SVM’s positive or negative predictions, with the y axis representing the contribution magnitude and the x axis listing the RAMEN-identified indicator variables. The plot reveals a consistent pattern of the value distribution (as indicated by the colors) across both sides of the x axis, with many variables situated significantly away from the axis. In addition, among the top 20 important features identified by SHAP and RAMEN, there is substantial overlap: 19 of them are shared, and the remaining one ranks 23rd ([105]Figure S1B visualizes the overlap between top indicators found by RAMEN and SHAP). 
It is important to note, however, that while SHAP values provide a ranking of feature importance, this ranking is not necessarily the ground truth.[106]^50^,[107]^51 Additionally, SHAP values are not designed to infer network structures or identify complex relationships between variables (e.g., edges in a BN). RAMEN, by contrast, constructs BNs that capture both direct and indirect relationships between clinical variables. The observed consistency in rankings reinforces the reliability of the variables identified in the network but does not diminish the capability of RAMEN to infer network structure, which SHAP cannot achieve. The strong agreement between RAMEN’s indicators and those highlighted by SHAP thus serves as complementary evidence rather than a replacement. The association between identified indicators and COVID-19 severity is further elucidated through heatmaps, as shown in [108]Figure 2D. These heatmaps detail the relationship between COVID-19 severity levels and the pertinent indicators identified by RAMEN. Each heatmap illustrates the variation in the percentage of patients across different severity levels in relation to the values of variables directly linked to COVID-19 severity. Generally, for variables that are strongly connected to COVID-19 severity within our reconstructed network, there is a significant shift in the distribution of severity levels corresponding to the values of these indicator variables. Conversely, for variables deemed irrelevant, the severity distribution remains largely unaffected by changes in these variables. The four heatmaps showcased underscore RAMEN’s efficacy in pinpointing highly relevant indicators of COVID-19 severity.

Systematic validation of COVID-19 severity indicators identified by RAMEN using BQC19 multi-omics data

To validate the reliability of the COVID-19 severity indicators identified by RAMEN, we utilized the BQC19 multi-omics dataset. 
Our validation approach included a comparison of differentially expressed (DE) genes and proteins associated with each severity indicator with those associated with various levels of COVID-19 severity. This process involved examining the overlap between DE genes (from RNA-sequencing [RNA-seq] data) or proteins (identified through SomaScan 5K array) related to the indicators and those distinguishing between mild and severe COVID-19 cases. The heatmaps depicted in [109]Figure 3A demonstrate a significant overlap of DE genes between the indicators and COVID-19 severity, revealing distinct expression patterns across the range of indicator values. These findings suggest that a common set of genes may be involved in linking these indicators to COVID-19 severity, indicating underlying biological pathways. Furthermore, the BQC19 multi-omics dataset provides insights into the biological mechanisms potentially governing these relationships. [110]Figure S2A shows an additional example between “ARDS” and “severity.” Pathway enrichment analysis of the “common” DE genes associated with COVID-19 severity and its primary indicators, shown in [111]Figure 3B, identified significant pathways including “neutrophil degradation,”[112]^52^,[113]^53 “innate immune system,”[114]^54 “antimicrobial peptides,”[115]^55 and “heme signaling.”[116]^56 These pathways are implicated in modulating COVID-19 severity, suggesting mechanisms through which the indicators may influence disease severity. 
Severe COVID-19 is often characterized by acute respiratory distress syndrome (ARDS) associated with abnormal coagulation.[117]^57^,[118]^58 Moreover, several studies have pointed to neutrophilia, release of their granules, and neutrophil extracellular traps (NETs) as key pathological features of thrombotic complications driven by the immune system (also called immunothrombosis) in severe COVID-19,[119]^3^,[120]^59^,[121]^60^,[122]^61^,[123]^62^,[124]^63^,[125]^64^,[126]^65^,[127]^66^,[128]^67^,[129]^68^,[130]^69 linking “ARDS,” “neutrophil degradation,” “innate immune system,” and “heme signaling.” Taken together, the pathway enrichment performed using the severity indicators identified by RAMEN is congruent with the existing literature, supporting the validity and reliability of RAMEN. We have also conducted this analysis using SomaScan data, which is illustrated in [131]Figures S2B, S2C, and [132]S3.

Figure 3. Support for the COVID severity network edges from the RNA-seq data
(A) Analysis of gene expression across three groups of differentially expressed (DE) genes linked to example nodes “ARDS,” “Albumin,” and “BMI” that directly connect to COVID severity. For example, with “Albumin,” we first pinpoint DE genes associated with albumin variability (i.e., genes with expression changes in patients with varying albumin levels, denoted as $G_1$). Next, we identify DE genes linked to COVID severity ($G_2$). The “Common” group represents DE genes common to both sets ($G_1 \cap G_2$); the “Albumin” group illustrates DE genes exclusive to the albumin variable ($G_1 \cap \neg G_2$); and the “Severity” group shows DE genes unique to COVID-19 severity ($G_2 \cap \neg G_1$). (B) Identification of the top enriched pathways for each variable based on their common DE genes with the severity variable (the “Common” group). The x axis shows $-\log_{10}$ of the FDR-corrected p values. 
From the top to bottom are enrichment analyses we carried out for DE genes identified from the edge between variable “Acute Respiratory Distress Syndrome (ARDS)?” and variable “Severity,” the edge between “BMI” and “Severity,” and the edge between “Albumin” and “Severity.” (C) Validation of COVID-19 severity indicators using RNA-seq highlights RAMEN’s ability to uncover additional insights beyond those revealed by conventional statistical methods such as Pearson correlation and mutual information. Each method on the x axis (MI, mutual information; RAM, RAMEN; COR, Pearson correlation) classifies variables into indicators or non-indicators, with RNA-seq data providing the basis for ground truth. A variable is considered an indicator if its DE genes significantly overlap with those associated with COVID-19 severity, assessed via a hypergeometric test. The performance of each method is quantified using the F1 score from verifying the variables found by each method against the ground truth. RAMEN achieves a higher F1 score compared to statistics-based methods, indicating its ability to uncover relationships that extend beyond these methods. In addition, we performed systematic benchmarking of RAMEN against other statistical methods, such as mutual information and Pearson correlation, to assess its effectiveness in identifying severity indicators ([135]Figure 3C). In this benchmarking exercise, RNA-seq data served as the basis for establishing a definitive classification of ground truth. The hypergeometric test was applied to assess the congruence between DE genes from selected indicators and those associated with COVID-19 severity, using p values to determine statistical significance (see [136]STAR Methods). Clinical variables demonstrating a significant overlap of DE genes with COVID-19 severity were acknowledged as true indicators of severity for benchmarking purposes. The effectiveness of each method was assessed through the F1 score. 
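
The benchmarking logic described above (a hypergeometric test on DE-gene overlap to define ground-truth indicators, followed by an F1 score per method) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names, the gene universe size, and the toy sets are hypothetical, and the significance threshold used in the study is not reproduced here.

```python
from math import comb


def hypergeom_overlap_pvalue(universe_size, set_a, set_b):
    """One-sided p value: P(overlap >= observed) under random sampling.

    Treats set_b as a draw of len(set_b) genes, without replacement, from a
    universe of `universe_size` genes that contains the len(set_a) genes of set_a.
    """
    k = len(set_a & set_b)
    K, n, N = len(set_a), len(set_b), universe_size
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


def f1_score(predicted, truth):
    """F1 of a predicted indicator set against a ground-truth indicator set."""
    tp = len(predicted & truth)
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(truth)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy inputs: gene IDs as integers, variables as strings.
p_value = hypergeom_overlap_pvalue(10, {1, 2, 3}, {1, 2, 3})
score = f1_score({"a", "b"}, {"b", "c"})
```

A variable would be labeled a true indicator when its overlap p value falls below the chosen threshold, and each method's selected variables would then be scored against that label set with `f1_score`.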
According to [137]Figure 3C, correlation exhibited the lowest F1 score by a considerable margin, with mutual information showing significant improvement yet still trailing behind RAMEN. The combination of correlation and mutual information was also evaluated, resulting in a marginally improved F1 score, though still not surpassing RAMEN. These results underscore RAMEN’s capacity to identify severity indicators that traditional statistical methods may fail to detect. We have also carried out the same benchmarking using the SomaScan data, which is shown in [138]Figure S2D. Similarly, correlation exhibited the lowest F1 score by a considerable margin, with mutual information showing significant improvement. The combination of correlation and mutual information resulted in an improvement from the two methods individually. However, RAMEN still demonstrated a better score than all three methods. In addition to demonstrating the overall quality of the final network output, we conducted an ablation study to evaluate the contributions of the genetic algorithm (GA) beyond those of the random walk component in constructing the COVID-19 severity network. As shown in [139]Figure S1C, the random walk algorithm initially generates a skeleton network comprising 194 edges. The GA refines this skeleton by removing 112 edges and adding 814 new edges, significantly modifying and enriching the network structure. To quantify the importance of the GA’s modifications, we performed a binomial test to assess whether its operations were supported by RNA-seq data. Correct modifications were defined as adding RNA-supported edges or removing non-RNA-supported edges, while incorrect modifications involved removing RNA-supported edges or adding non-RNA-supported edges. The resulting binomial test yielded a highly significant p value of $1.29 \times 10^{-25}$, confirming that the GA’s modifications significantly improve the network beyond random actions. 
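
To make the binomial test on GA modifications concrete, the sketch below computes a one-sided exact p value for observing at least a given number of correct modifications. The null probability `p0 = 0.5` and the toy counts are assumptions chosen for illustration; the source does not state the null rate or the number of correct modifications, and the function names are hypothetical.

```python
from math import exp, lgamma, log


def log_binom_pmf(i, n, p):
    """Log of the binomial probability of exactly i successes in n trials."""
    log_coeff = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
    return log_coeff + i * log(p) + (n - i) * log(1 - p)


def binomial_test_greater(successes, trials, p0=0.5):
    """One-sided exact p value: P(X >= successes) for X ~ Binomial(trials, p0).

    p0 must lie strictly between 0 and 1. Working in log space keeps the
    terms finite when the number of trials is in the hundreds.
    """
    return sum(
        exp(log_binom_pmf(i, trials, p0)) for i in range(successes, trials + 1)
    )


# Hypothetical toy counts: 8 "correct" modifications out of 10.
p_toy = binomial_test_greater(8, 10)
```

Each added or removed edge counts as one trial, and a trial is a success when the modification agrees with the RNA-seq support criterion defined in the text.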
Beyond RNA-seq validation, the GA introduces indirect edges that provide additional biological and clinical insights, which are often missed by random walks. For instance, while random walks effectively capture direct connections, such as between CRP and COVID-19 severity, they fail to uncover indirect connections that further contextualize CRP-related mechanisms. The GA compensates for this by identifying indirect edges such as those linking CRP to “total WBC count,”[140]^70 “temperature:.1,”[141]^71 and “APTT (activated partial thromboplastin time) (HIGHEST value).”[142]^72 These additional connections, supported by literature, provide richer insights into CRP’s clinical role. Similarly, while random walks identify the direct connection between “creatinine” and COVID-19 severity,[143]^73^,[144]^74^,[145]^75 they do not capture related variables such as “does the patient have other comorbidities?” (e.g., chronic kidney disease[146]^76), “sex,”[147]^77^,[148]^78 and “non-ST-elevation myocardial infarction (NSTEMI)”[149]^79^,[150]^80 The GA fills this gap by uncovering these indirect relationships, all of which are supported by existing studies. This ablation study demonstrates that while random walks excel at identifying direct relationships, they lack the capacity to reveal many clinically relevant indirect edges. The GA complements the random walks by refining the network and incorporating these indirect connections, significantly enhancing the network’s utility and interpretability. In addition to analyzing COVID-19 severity, we also applied RAMEN to the BQC19 outpatient COVID-19 cohort and a dataset from Lawson Health Research Institute[151]^9 to investigate long COVID. Using the BQC19 dataset, RAMEN constructed a network of long-COVID-related variables. We assessed key early indicators recorded within 1 month after diagnosis through SVM-based disease-outcome prediction, SHAP analysis, visualization, and literature validation ([152]Figures S4A–S4D). 
The network successfully identified clinically relevant variables, such as “age,”[153]^81 “chest pain,”[154]^5 “joint pain,”[155]^82 “runny nose,”[156]^82 and “shortness of breath,”[157]^5 while excluding irrelevant ones such as “chronic kidney disease,” which lacks established links to long COVID.[158]^83 We further performed RNA-seq and SomaScan heatmap analyses along with pathway enrichment analysis ([159]Figures S2E–S2I and [160]S5). RAMEN’s selected indicators improved SVM classification performance compared to other statistical methods ([161]Figure S4B), with SHAP analysis confirming their predictive strength ([162]Figure S4C). Distribution shifts of long COVID values based on indicator presence ([163]Figure S4D) and Pearson’s chi-squared test ([164]Table S1) further validated their relevance. To assess the robustness of RAMEN, we applied it to an independent long COVID dataset from the Lawson Health Research Institute[165]^9 and output a long COVID network ([166]Figure S4E). Among the eight overlapping variables with the BQC19 dataset, RAMEN consistently identified four key indicators—“chest pain,” “anosmia/ageusia,” “dyspnea,” and “headache”—supported by prior studies.[167]^5^,[168]^84^,[169]^85^,[170]^86 The impact of these indicators on long COVID is visualized through heatmaps ([171]Figure S4F), highlighting their significance. RAMEN unveils variable relationships beyond mutual information or Pearson correlation for COVID-19 outcomes To demonstrate that RAMEN can uncover additional information beyond conventional statistical methods, particularly in identifying network edges that cannot be detected by other approaches, we compared RAMEN against Pearson correlation and mutual information. [172]Figures 4A and 4B illustrate that RAMEN identified numerous edges that the other two methods failed to capture. This is particularly evident in [173]Figure 4A, where 58.1% of the network edges are unique to RAMEN. 
In the full severity network, this proportion increases to 79.8%. It is important to note that [174]Figure 4B only displays the top 25% of edges ranked by mutual information. As a result, the proportion of RAMEN-unique edges in this subset is reduced to 22.5%.

Figure 4. RAMEN identifies effective indicator variables that cannot be found using mutual information or Pearson correlation
(A) The long COVID network, where purple edges represent connections significant only to RAMEN, and green edges are also identified by mutual information or correlation. (B) Similar network for COVID-19 severity, with purple indicating edges found exclusively by RAMEN and green representing those also found by mutual information or correlation. The full names of the variables are provided in [177]Data S1. (C) Heatmaps visualizing DE genes associated with “Platelets” and “COVID-19 severity.” The three groups of DE genes correspond to the unique DE genes of the two variables and common DE genes. (D) Pathway enrichment based on the common DE genes in (C). (E) A barplot demonstrating RAMEN’s ability to detect disease-relevant edges missed by Pearson correlation and mutual information. Using RNA-seq data as ground truth (see [178]STAR Methods for details), among all the edges that cannot be found using Pearson correlation, the column “Not Corr, RAMEN” shows the percentage of disease-outcome-relevant edges found by RAMEN. “Not Corr, Not RAMEN” shows those that also cannot be found using RAMEN. Likewise, “Not MI, RAMEN” corresponds to the percentage of true edges missed by mutual information but found by RAMEN, and “Not MI, Not RAMEN” are the ones found by neither. “Random” is the performance of randomly selecting edges. The p values of the binomial tests (see details in the [179]STAR Methods section [180]quantification and statistical analysis) indicate that RAMEN is accurate in finding edges missed by other methods. 
This suggests that RAMEN has additional power in detecting disease-relevant edges compared to Pearson correlation and mutual information. In [181]Figure 4A, the long COVID network analysis revealed 151 associations (edges) identified by RAMEN that were not detected by traditional statistical methods. By leveraging an absorbing random walk approach combined with a genetic algorithm, RAMEN uncovered numerous associations that, despite lacking a strong direct correlation, significantly influenced the random walk’s progression toward the absorbing node (disease outcome). Among these were key indicators of long COVID, such as “BMI:—long Covid,”[182]^87 and clinically relevant edges such as “COPD (emphysema, chronic bronchitis)?—long Covid,” which aligns with known disease associations.[183]^88 This finding raises the question of whether COPD itself may serve as an indicator of long COVID. An alternative explanation is that overlapping respiratory symptoms between COPD and long COVID make it difficult to distinguish their etiologies. Whether this connection represents a true biological association remains to be determined. Nonetheless, this result highlights RAMEN’s ability to critically interrogate data and uncover associations that may offer additional insights into pathogenesis. Beyond direct associations with long COVID, RAMEN also identified indirect connections, such as “BMI:—rheumatologic disease?—long Covid,”[184]^89^,[185]^90 providing deeper insights into how clinical variables interact in long COVID. Supporting this interpretation, Mendelian randomization studies have demonstrated a causal link between BMI and rheumatoid arthritis.[186]^91^,[187]^92 The inflammatory component of rheumatologic disease driven by BMI could potentially interact with the long-term manifestations of SARS-CoV-2 infection. RAMEN-identified relationships thus provide testable hypotheses that may enhance our understanding of long COVID across different patient subgroups. 
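To make the role of the absorbing node concrete, the following minimal Python sketch treats the disease outcome as the absorbing state of a random walk over a small weighted variable graph and computes the expected number of steps each variable needs to reach it. This is a textbook absorbing Markov chain calculation and not RAMEN's implementation: the function name, the toy weight matrix, and the use of expected hitting time as the summary statistic are all assumptions made for illustration.

```python
import numpy as np


def expected_steps_to_outcome(weights, outcome):
    """Expected number of random-walk steps from each variable to the outcome.

    weights: square, non-negative matrix of edge weights between variables
             (hypothetical association strengths, for illustration only).
    outcome: index of the node treated as absorbing (the disease outcome).
    Returns a dict mapping each non-outcome node index to its expected steps.
    """
    weights = np.asarray(weights, dtype=float)
    transient = [i for i in range(len(weights)) if i != outcome]

    # Only the non-outcome rows matter: the walk stops once it is absorbed.
    rows = weights[transient]
    row_sums = rows.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every non-outcome node needs at least one edge")
    # Row-normalize the weights into transition probabilities.
    transition = rows / row_sums

    # Q holds transitions among the transient (non-outcome) nodes only.
    q = transition[:, transient]
    # Solve (I - Q) t = 1; t is the expected time to absorption.
    # np.linalg.solve raises LinAlgError if some node can never reach the outcome.
    ones = np.ones(len(transient))
    steps = np.linalg.solve(np.eye(len(transient)) - q, ones)
    return {node: float(t) for node, t in zip(transient, steps)}


# Toy chain: variable 0 -- variable 1 -- outcome (node 2), unit weights.
toy_weights = [[0, 1, 0],
               [1, 0, 1],
               [0, 1, 0]]
print(expected_steps_to_outcome(toy_weights, outcome=2))
```

In this toy chain, variable 1 is adjacent to the outcome and needs 3 expected steps, while variable 0 has no direct edge to the outcome and needs 4. Variable 0 still has a finite hitting time through variable 1, which is the kind of indirect relationship that a direct pairwise correlation with the outcome would not score.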
In [188]Figure 4B, in the compact COVID severity network, correlational methods failed to detect 52 edges that RAMEN identified (they failed to detect 739 in the complete network), for example, “creatinine (HIGHEST value)—COVID severity,”[189]^93 “respiratory rate (associated with BP above):—COVID severity,”[190]^94 and “COVID severity—acute kidney injury?”[191]^95 Acute kidney injury is a condition that often develops in patients affected with COVID-19, and not only was RAMEN able to capture it while naive methods cannot, RAMEN was also able to capture the correct edge direction. Another noteworthy edge that only RAMEN identified is “platelet (LOWEST value)—COVID severity.”[192]^96 Platelets play a major role in the immune system and have been found to be an indicator of COVID severity. The inability of naive methods to detect such edges highlights their limitations compared to RAMEN. In [193]Figure 4C, similar to the previous figures, we demonstrate the overlap of DE genes between “platelets” and “COVID severity,” revealing distinct expression patterns associated with the variables’ differing values. This analysis suggests the biological mechanisms underlying the predicted relationship between platelet levels and COVID-19 outcomes. In [194]Figure 4D, a pathway enrichment analysis of the “common” DE genes related to both platelets and COVID severity identified several significant pathways. 
These pathways include “reactome interferon alpha beta signaling,”[195]^97 “reactome interferon signaling,” “reactome cytokine signaling in the immune system,”[196]^98 “reactome antiviral mechanism by IFN-stimulated genes,”[197]^99 “reactome DDX58 IFIH1-mediated induction of interferon alpha beta,”[198]^100 “WP type I interferon induction and signaling during SARS-CoV-2 infection,”[199]^101 “Kegg Medicus reference hydrolysis of sphingomyelin,”[200]^102 and “reactome OAS antiviral response.”[201]^103 Type I interferons are established immunological mediators of COVID-19 severity.[202]^104^,[203]^105^,[204]^106 Interestingly, platelets are key regulators of coagulation, a process that can be severely disrupted in severe COVID-19, leading to life-threatening conditions.[205]^107 These identified pathways provide a biological context for the edges predicted by the analysis, confirming their relevance to the severity of COVID-19. We further utilized RNA-seq data to validate the edges identified by RAMEN, particularly those missed by traditional correlational methods (mutual information and Pearson correlation). Our validation method is based on the principle that two variables are considered connected if they exhibit significant overlap in their DE genes, as outlined in [206]STAR Methods. Notably, RAMEN demonstrated a significantly greater capability to identify genomics-supported edges missed by correlation methods. This contrast becomes even more pronounced when examining edges that both RAMEN and the correlational methods failed to predict; these missed edges do not exhibit any enrichment in the number of supported edges compared to random selection, indicating no significant genomics support. [207]Figure 4E shows that RAMEN’s precision in detecting genomics-supported edges far exceeds that of random chance, as validated by the p values from binomial tests. 
Conversely, the edges missed by RAMEN showed no significant difference in support from the RNA data compared to randomly selected edges. These results affirm RAMEN’s effectiveness in uncovering relevant and supported edges overlooked by conventional statistical methods.
RAMEN demonstrates broad applicability across diverse disease studies
To demonstrate RAMEN’s adaptability beyond COVID-19, we applied it to two additional disease cohorts: septicemia and COPD. RAMEN requires no disease-specific constraints or prior medical knowledge, making it broadly applicable across clinical contexts. For septicemia, we used the MIMIC-III database,[208]^10 a publicly available ICU patient record dataset. From this, a subset of 715 samples with 227 variables was selected, with patient mortality (STATUS: “ALIVE”/“DEAD”) as the outcome. For COPD, we analyzed the CanCOLD dataset, which captures 3,778 clinical visit records from over 1,500 individuals across nine Canadian sites. After filtering, the dataset included 220 variables, with COPD exacerbation in the past 12 months (Exa12: 1 = yes, 0 = no) as the outcome. RAMEN was applied to both datasets using the same preprocessing and analysis workflow as for COVID-19 datasets. [209]Figure 5A illustrates the networks inferred by RAMEN for septicemia (left) and COPD (right), capturing direct and indirect relationships. Node sizes in the networks reflect feature importance ranked by RAMEN, aligning closely with SHAP analysis results ([210]Figure 5B). RAMEN identifies clinically relevant variables supported by literature and visual analysis. For example, in the septicemia network, lower platelet counts are strongly associated with “DEAD” status, as shown in the leftmost heatmap in [211]Figure 5C.
This finding aligns with studies identifying thrombocytopenia as a predictor of mortality in sepsis.[212]^108^,[213]^109 Similarly, higher AST (aspartate aminotransferase) levels, linked to liver dysfunction, are associated with increased mortality,[214]^110^,[215]^111 as shown in the second heatmap in [216]Figure 5C. For the COPD dataset, RAMEN identifies significant relationships between clinical variables and disease outcomes. For instance, the third heatmap in [217]Figure 5C demonstrates that higher SGRQ (St. George’s Respiratory Questionnaire) scores are associated with more frequent COPD exacerbations, consistent with studies linking SGRQ scores to exacerbation severity.[218]^112^,[219]^113 Similarly, the final heatmap in [220]Figure 5C shows that lower post-bronchodilator FEV1/FVC (forced expiratory volume in 1 s/forced vital capacity) values (“MAXFEV[1]FVCP_POST”) are strongly associated with COPD exacerbations. This is consistent with the established role of FEV1/FVC ratios in COPD diagnosis and management.[221]^114^,[222]^115
Figure 5. RAMEN identifies indicator variables and constructs disease-relevant networks across diseases using MIMIC-III and CanCOLD data
(A) RAMEN-derived networks for septicemia (136 outcome-relevant variables, left) and COPD (22 outcome-relevant variables, right). Node colors represent variable types, and edge colors indicate connection intensity, as shown in the legend. Node sizes reflect RAMEN’s importance scores (for details, see [225]STAR Methods), indicating the relevance of each variable to the disease outcome. Diamond-shaped nodes represent outcome variables, specifically septicemia death and COPD exacerbation. These results demonstrate RAMEN’s applicability to multiple diseases. (B) SHAP values quantify the importance of indicator variables based on their impact on disease-outcome prediction. Values in parentheses indicate RAMEN’s feature importance rankings.
The alignment between SHAP rankings and RAMEN’s selections underscores the method’s robustness in identifying key variables. (C) Heatmaps illustrating the distribution of informative indicator variables for septicemia and COPD, further emphasizing RAMEN’s ability to uncover disease-relevant insights across a range of diseases. The plots reveal significant shifts in patient distributions across different disease outcomes based on the values of key indicator variables.
RAMEN’s performance in these diverse datasets demonstrates its ability to construct meaningful networks for septicemia and COPD, uncovering disease-relevant interactions supported by clinical and biological evidence. Beyond identifying key variables, RAMEN outperforms benchmarked methods in finding informative disease indicators, as shown in our systematic benchmarking in the next subsection. These findings underscore RAMEN’s versatility in adapting to heterogeneous datasets and its capability to uncover clinically significant insights across diverse disease contexts. By extending the analysis beyond COVID-19, we establish RAMEN as a broadly applicable tool for clinical network reconstruction and analysis.
Systematic benchmarking highlights RAMEN’s superiority over alternative methods
Building on previous sections demonstrating RAMEN’s ability to construct meaningful networks across diverse disease datasets, we conducted comprehensive benchmarking to evaluate its performance against established statistical and BN learning methods. Specifically, we compared RAMEN to mutual information and Pearson correlation as representative statistical measures, and to two BN learning frameworks, pgmpy[226]^116 and bnlearn.[227]^117 These comparisons underscore RAMEN’s enhanced performance in uncovering complex relationships in disease-related network structures. [228]Figure 6A compares edge connection predictions on the COVID-19 severity dataset using RNA-seq as ground truth.
RAMEN achieved the highest F1 score, outperforming all other methods, including pgmpy and bnlearn. To further evaluate edge prediction accuracy, we used a simulation dataset with a known ground-truth network generated via the Erdős-Rényi model[229]^118 using NetworkX.[230]^119 The dataset contained 100 nodes and 1,000 samples. [231]Figure 6B shows the edge connection benchmarking performed using this simulation dataset, in which RAMEN significantly outperforms pgmpy, bnlearn, mutual information, and correlation. Furthermore, using this simulation dataset with a known ground-truth BN, we compared edge direction predictions. In this task, a correct edge requires both correct connection and correct direction. Traditional methods, such as mutual information and Pearson correlation, cannot be included in this comparison because they cannot provide directional edges. RAMEN demonstrated superior performance compared to BN learning methods ([232]Figure 6C). [233]Figures 6D–6F show a comprehensive disease-outcome indicator prediction benchmarking, which extends the comparison to BN methods and datasets from all diseases. The results, based on the performance of SVMs trained on the selected indicators, show that RAMEN consistently outperforms pgmpy and bnlearn, further highlighting its ability to identify highly informative variables across diverse datasets.
Figure 6. RAMEN outperforms other methods in systematic benchmarking
(A and B) Comparison of edge connection prediction performance using the COVID-19 dataset (A) and the simulation dataset with a known ground-truth network (B). The y axis represents the F1 score for edge connection prediction across the methods listed on the x axis. For the COVID-19 dataset, RNA-seq data are used to validate the predicted edges (as detailed in [236]STAR Methods), while the simulation dataset provides a known ground truth for edge connections.
RAMEN achieves superior performance compared to all other methods, particularly excelling over other Bayesian network learning approaches. (C) Edge direction prediction performance using the simulation dataset with a known ground truth for edge directions. A true positive requires the correct prediction of both the edge connection and its direction. The comparison is restricted to Bayesian network methods capable of predicting edge direction. RAMEN demonstrates a significant performance advantage over these methods. (D–F) Evaluation of indicator variables identified by different methods based on the classification performance of SVM models trained with these variables. RAMEN achieves results that are comparable to or better than those of all other methods, further emphasizing its superior ability to identify informative variables across different disease studies. p values (∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001, ∗∗∗∗p < 0.0001) were generated by Student’s t tests with n = 5 technical replicates. Boxplots show the interquartile range (IQR), with the median represented by a solid line. Whiskers extend to the most extreme data points within 1.5 times the IQR from the first and third quartiles. (D) COVID-19, (E) septicemia, and (F) COPD.
Through extensive benchmarking across multiple datasets (COVID-19, septicemia, COPD, and a simulation dataset), RAMEN consistently outperforms other methods in edge connection and direction prediction, as well as disease indicator identification, in our evaluation. Furthermore, RAMEN excels in computational efficiency, addressing critical limitations of BN structure learning, which is known to be NP-hard.[237]^29^,[238]^120 Many existing methods fail to handle large-scale networks due to excessive runtime and memory requirements.
RAMEN mitigates the computational challenges associated with the exponential search space of Bayesian structures by integrating an absorbing random walk with a genetic algorithm, improving scalability in practical scenarios involving large clinical datasets. For instance, on the largest COVID-19 severity dataset with 880 initial variables (297 variables after filtering) and 2,018 samples, RAMEN completed the analysis in 2 h 46 min. Compared to other BN learning methods, RAMEN is 51.4 times faster than bnlearn (142 h 15 min) and 28.3 times faster than pgmpy (78 h 10 min). Detailed runtime and space complexity comparisons are provided in [239]Table S2. These advantages make RAMEN particularly effective for uncovering meaningful clinical insights in large-scale datasets.
Discussion
In this study, we developed RAMEN, a scalable and efficient framework for BN structure learning to uncover complex relationships in clinical records data. Traditional statistical methods often fail to capture indirect associations, while BN approaches are hindered by high computational complexity. RAMEN overcomes these limitations by integrating absorbing random walks and a genetic algorithm, enabling the efficient discovery of complex clinically relevant interactions. We validated RAMEN’s ability to identify disease-associated variables and reconstruct meaningful clinical networks using both clinical variable datasets and multi-omics data. RAMEN is a scalable and efficient framework for learning target-variable-oriented BN structures. By integrating absorbing random walks with a genetic algorithm, RAMEN circumvents the exponential complexity of traditional network structure learning, enabling substantially faster runtimes and improved performance. For instance, on the largest COVID-19 severity dataset, RAMEN is more than 25 times faster than the second-fastest method, pgmpy.
This computational efficiency makes RAMEN one of the few methods capable of handling large, real-world clinical datasets within practical time frames. Compared to traditional statistical methods, a distinguishing feature of RAMEN is its ability to reconstruct both direct and indirect relationships, along with accurately inferring directional edges. This capability provides a more comprehensive and informative network representation compared to methods such as Pearson correlation and mutual information, which often fail to capture such nuances. The accurate identification of directional edges enables deeper insights into disease mechanisms, offering valuable information for hypothesis generation and experimental design. RAMEN’s target-variable-oriented design ensures that the inferred networks are directly tied to clinical or biological outcomes, enhancing their interpretability and practical utility. This focus makes RAMEN particularly suited for identifying disease mechanisms and actionable targets, facilitating meaningful applications in precision medicine and translational research. To substantiate its predictions, RAMEN leverages multi-omics validation, integrating diverse datasets to cross-verify network structures. This approach bridges computational predictions with biological evidence, enhancing confidence in the results and uncovering deeper insights into the molecular mechanisms underlying complex diseases. In comprehensive benchmarking, RAMEN consistently outperformed statistical and BN tools across diverse datasets, demonstrating superior accuracy, reliability, and relevance. Its combination of computational efficiency and scalability, ability to accurately identify complex relationships, focus on target-variable-oriented analysis, and incorporation of multi-omics validation establish RAMEN as a transformative tool for analyzing complex disease datasets and advancing biological and clinical research. 
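The edge-level benchmarking summarized above distinguishes edge connection (is there an edge between two variables?) from edge direction (is that edge also oriented correctly?). A minimal Python sketch of how such F1 scores can be computed against a known ground-truth network follows. The function name and the toy edge lists are hypothetical, and the exact scoring procedure used in the study is the one described in STAR Methods, which this sketch does not claim to reproduce.

```python
def edge_f1(true_edges, predicted_edges, directed):
    """F1 score of predicted edges against a ground-truth edge list.

    Edges are (source, target) pairs. With directed=False an edge counts as
    correct if the two endpoints match in either order (edge connection);
    with directed=True the orientation must match as well (edge direction).
    """
    if directed:
        truth = set(true_edges)
        predicted = set(predicted_edges)
    else:
        # frozenset drops the orientation, so (a, b) and (b, a) coincide.
        truth = {frozenset(edge) for edge in true_edges}
        predicted = {frozenset(edge) for edge in predicted_edges}

    true_positives = len(truth & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


# Hypothetical ground truth and prediction over four variables.
ground_truth = [("a", "b"), ("b", "c"), ("c", "d")]
prediction = [("a", "b"), ("c", "b"), ("a", "d")]

connection_f1 = edge_f1(ground_truth, prediction, directed=False)
direction_f1 = edge_f1(ground_truth, prediction, directed=True)
print(connection_f1, direction_f1)
```

In this toy case the predicted edge between "b" and "c" counts toward the connection score but not the direction score because it is reversed, so the direction F1 (1/3) is lower than the connection F1 (2/3). Methods that output only undirected associations, such as Pearson correlation or mutual information, can be scored only in the `directed=False` setting.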
While RAMEN has been demonstrated within the context of COVID-19, septicemia, and COPD, its framework is designed with the flexibility to examine relationships among clinical variables across diverse diseases. Because RAMEN does not incorporate any dataset-specific priors, it can in principle be applied to any dataset with a target variable. This opens the possibility of extending RAMEN to other domains, including diseases not covered in this study and datasets with different types of outcomes. In addition, although BNs do not inherently represent causal relationships, future iterations of RAMEN could be developed to integrate additional information for discovering causal relationships between variables, further enhancing its utility and interpretability. RAMEN provides a scalable framework for analyzing complex clinical datasets, enabling the identification of biomarkers directly tied to clinically relevant outcomes. Its target-variable-oriented design ensures meaningful insights, while multi-omics validation enhances biological relevance, bridging computational predictions and experimental evidence. RAMEN is particularly well suited for studying complex diseases by uncovering variable interactions that inform personalized diagnostics, risk stratification, and therapeutic strategies. By addressing challenges in biomarker discovery and large-scale clinical data analysis, RAMEN offers researchers a practical tool to improve disease research and support meaningful clinical applications.
Limitations of the study
Although RAMEN has been comprehensively evaluated using clinical data, multi-omics data, and simulated data across various diseases, further analysis with larger multi-omics datasets is warranted. The current sparsity of matched multi-omics and clinical data limits broader validation across additional diseases and in-depth mechanistic insights.
Future work focusing on more comprehensive joint analyses of clinical and multi-omics data could uncover previously unrecognized disease mechanisms and biomarkers.
Resource availability
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Jun Ding (jun.ding@mcgill.ca).
Materials availability
This study did not generate new unique reagents.
Data and code availability
• The BQC19, CanCOLD, and Lawson Health Research Institute Long COVID datasets contain patient information and are not publicly available. Researchers interested in accessing the BQC19 dataset can request access through the Biobanque Québécoise de la COVID-19 ([240]https://en.quebeccovidbiobank.ca/). The CanCOLD dataset is available for purchase via the CanCOLD research program ([241]https://cancold.ca/). For inquiries regarding the Lawson Health Research Institute Long COVID dataset, please contact the authors directly. The MIMIC-III dataset is publicly available at [242]https://mimic.mit.edu/.
• All original code has been deposited at Zenodo ([243]https://doi.org/10.5281/zenodo.14879675) and is publicly available as of the date of publication.
• Any additional information required to reanalyze the data reported in this paper is available from the [244]lead contact upon request.
Acknowledgments