Highlights
* RAMEN enables efficient and scalable construction of Bayesian
  networks from clinical data
* RAMEN integrates absorbing random walks and genetic algorithms to
  generate these networks
* RAMEN outperforms conventional statistical and network-based
  approaches
* RAMEN uncovers key disease indicators across diverse diseases and
  datasets
Motivation
Given patient clinical records, understanding the interactions among
clinical variables and their impact on disease outcomes is crucial for
advancing diagnostics and personalized medicine. Traditional
statistical methods fail to capture indirect relationships, while
Bayesian network learning methods are computationally inefficient and
only infer general associations without prioritizing disease-relevant
variables. Random walk- and genetic algorithm-based network inference
(RAMEN) overcomes these limitations by integrating absorbing random
walks and genetic algorithms, which efficiently learn Bayesian network
structures while ensuring the network is target-variable
oriented—focusing on disease outcomes. By leveraging clinical records,
RAMEN uncovers complex variable interactions, enhancing disease
understanding and informing the development of improved diagnostics and
treatments.
__________________________________________________________________
Xiong et al. present RAMEN, an approach that integrates absorbing
random walks and genetic algorithms to enable efficient and scalable
construction of Bayesian networks from clinical data. Validated using
data from diverse complex diseases, RAMEN achieves state-of-the-art
accuracy and uncovers key variables linked to disease outcomes,
enabling broad biomedical applications.
Introduction
In recent years, large-scale outbreaks and chronic conditions have
posed serious challenges to global health, reshaping lives and imposing
substantial socioeconomic burdens.[39]^1^,[40]^2 Despite numerous
efforts, our understanding of the underlying mechanisms of many
diseases remains incomplete, hindering accurate diagnosis, prediction
of disease trajectories, and the development of effective treatments.
Many diseases exhibit high variability among patients, with distinct
symptoms and clinical outcomes,[41]^3 yet the mechanisms driving this
heterogeneity remain unclear. This gap in understanding restricts early
diagnosis and targeted interventions for the most vulnerable
individuals,[42]^4^,[43]^5 prolonging suffering and delaying care. The
rapid accumulation of clinical and population-level datasets presents
unprecedented opportunities to uncover disease mechanisms. Large-scale
biobanks now integrate clinical, proteomic, and transcriptomic data,
driving computational advancements.[44]^6^,[45]^7 In Quebec, Canada,
the Biobanque québécoise de la COVID-19 (BQC19) has collected clinical
data from over 6,000 COVID-19 patients, alongside proteomic and
transcriptomic profiles for a subset.[46]^8 Similarly, the Lawson
Health Research Institute in Ontario has collected clinical records on
COVID-19 patients[47]^9 for studying long COVID. Beyond COVID-19,
datasets such as MIMIC-III,[48]^10 which focuses on intensive care unit
(ICU) patients, and CanCOLD,[49]^11 which tracks chronic obstructive
pulmonary disease (COPD)-related variables, extend computational
applications to other chronic diseases.
Despite these resources, extracting actionable insights remains
challenging and requires innovative computational approaches.
Integrating clinical and large-scale datasets can help identify
diagnostic markers and therapeutic targets. Advancing methods to bridge
these data types and uncover disease mechanisms has the potential to
transform precision medicine and global health. With the growing
availability of clinical datasets, many studies have explored
relationships between clinical variables and disease
outcomes[50]^12^,[51]^13^,[52]^14^,[53]^15 using simple statistical
methods (e.g., Pearson correlation[54]^15 and mutual
information[55]^16). These approaches map direct associations between
clinical variables but do not capture directionality, indirect
interactions through intermediate variables, and complex interactions
among multiple variables. These drawbacks limit their utility in
uncovering underlying mechanisms. On the other hand, Bayesian networks
(BNs),[56]^17^,[57]^18^,[58]^19 a class of probabilistic graphical
models, address these limitations by inferring clinical variables
indicative of disease outcomes (e.g., severity or mortality). BNs have
demonstrated success in disease diagnostics, often outperforming
physicians.[59]^20^,[60]^21 For example, BN-based models achieved
state-of-the-art performance in neurodegenerative disease diagnosis
while maintaining
interpretability.[61]^22^,[62]^23^,[63]^24^,[64]^25^,[65]^26^,[66]^27
However, constructing
BNs, particularly learning their structure, is computationally
challenging due to the vast discrete search space—an NP-hard
problem.[67]^28^,[68]^29 The combination of high-dimensional data and
relatively low sample sizes makes conventional optimization methods
such as backpropagation ineffective.[69]^30^,[70]^31^,[71]^32 In
practice, structure learning is often regularized with prior knowledge
to constrain the search space.[72]^33^,[73]^34 However, this is not a
feasible approach for our study, as our objective is to identify
previously unrecognized clinical variables influencing disease
outcomes, including COVID-19 severity, long COVID, septicemia
mortality, and COPD exacerbation. Imposing prior constraints would not
only introduce bias but also be impractical, as many of these diseases
remain poorly understood.
To address these limitations, we introduce RAMEN (random walk- and
genetic algorithm-based network inference), which combines absorbing
random walks and a genetic algorithm to infer a
BN representing the relationships between clinical variables and the
disease outcome. The random walks are employed to rank and select the
most relevant variables and connections to the disease-outcome variable
to reduce the network complexity. A significant aspect of our
methodology is the incorporation of a terminal absorbing node,
symbolizing the disease outcome of interest, such as COVID-19 severity,
within our clinical variable network. Following the preliminary network
reconstruction through the absorbing random walk process, we further
employ a genetic algorithm to refine and identify an optimized network
structure. This optimized structure is more accurately aligned with the
observed clinical variables, ensuring a precise representation of the
relationships and interactions within the dataset. We chose a genetic
algorithm for its suitability for exploring large discrete search
spaces,[74]^35^,[75]^36^,[76]^37 its flexibility,[77]^36^,[78]^38 and
empirical evidence of its effectiveness.[79]^39^,[80]^40 After these
two stages, RAMEN outputs a
BN that models the complex relationship between the disease-outcome
variable (e.g., COVID-19 severity) and other variables that are
directly or indirectly connected to it. To examine the performance of
RAMEN, we applied the method to three different COVID-19 cohorts from
the BQC19 project and Lawson Health Research Institute, a septicemia
cohort from MIMIC-III, and a COPD cohort (CanCOLD), examined the
resulting network with multi-omics measurements and computational
simulations, and compared RAMEN with other methods. We show that the
resulting networks capture important disease-outcome indicators that
can be validated via multi-omics, simulation, or literature. Moreover,
RAMEN outperformed simple statistical methods such as mutual
information and Pearson correlation by finding more relevant variables,
including indirect ones that these methods cannot detect. Furthermore,
this model has the potential to
be generalized for analyzing clinical variable networks across a wide
range of diseases, provided similar records of clinical variables are
available. This broad applicability enhances its utility across diverse
areas of medical research.
Results
Overview of RAMEN
The RAMEN method operates in two sequential phases: an absorbing random
walk and a genetic algorithm ([81]Figure 1). In phase 1, the algorithm
initializes a fully connected network comprising all clinical
variables, with edge weights corresponding to mutual information
between node pairs. Random walks are then conducted using normalized
mutual information as transition probabilities, terminating either
after a predefined number of steps or upon reaching the target
disease-outcome variable, which serves as an absorbing state. To
construct the network skeleton, edges with significantly higher visit
counts (q value ≤ 0.05) are retained. The statistical significance of
edge visits is determined through a permutation test against a
randomized background (see [82]STAR Methods and [83]Figure S1A for
details). In phase 2, a genetic algorithm refines the network skeleton
by iteratively generating and evaluating candidate BNs. The initial
population consists of network structures derived from the absorbing
random walk phase. New candidate networks are generated through
crossover, whereby subnetworks from high-performing candidates are
recombined, and mutation, which introduces small topological changes to
explore alternative structures. A selection process prioritizes
networks with higher scores based on an entropy-based objective
function, which assesses the network’s ability to effectively model the
observed clinical data while maintaining interpretability. Over
successive generations, the algorithm converges toward an optimized BN
that captures the most informative relationships among clinical
variables. This evolutionary approach allows for robust exploration of
the network space, avoiding local optima and ensuring that the final
network reflects biologically meaningful and statistically significant
associations.
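The phase 1 walk described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The variable names and weights are hypothetical toy values, the uniform choice of start node is an assumption, and the permutation test that converts visit counts into q values is omitted.

```python
import random
from collections import Counter

def absorbing_random_walks(weights, target, n_walks=1000, max_steps=10, seed=0):
    """Count edge traversals in walks that end at the absorbing `target` node.

    weights: dict mapping node -> {neighbor: non-negative weight}
             (for example, pairwise mutual information).
    """
    rng = random.Random(seed)
    edge_visits = Counter()
    # Assumption of this sketch: walks start uniformly at any non-target node.
    starts = [node for node in weights if node != target]
    for _ in range(n_walks):
        node = rng.choice(starts)
        path = []
        for _ in range(max_steps):
            neighbors = list(weights[node])
            raw = [weights[node][nb] for nb in neighbors]
            # random.choices normalizes the weights, so they act as
            # transition probabilities out of the current node.
            nxt = rng.choices(neighbors, weights=raw, k=1)[0]
            path.append(frozenset((node, nxt)))
            node = nxt
            if node == target:  # absorbing state: the walk stops here
                break
        if node == target:  # only successful walks contribute visits
            edge_visits.update(path)
    return edge_visits

# Hypothetical toy network; the weights are illustrative, not study data.
mi = {
    "age": {"crp": 0.30, "bmi": 0.10, "severity": 0.40},
    "crp": {"age": 0.30, "bmi": 0.05, "severity": 0.60},
    "bmi": {"age": 0.10, "crp": 0.05, "severity": 0.02},
    "severity": {},
}
visits = absorbing_random_walks(mi, target="severity")
```

In this sketch, edges that carry more of the probability flow toward the outcome node accumulate more visits, which is the quantity the permutation test would then compare against a randomized background.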
Figure 1.
Overview of the RAMEN methodology
The RAMEN approach constructs Bayesian networks from clinical data
through a sequential two-phase process. (Phase 1) Establishing the
initial network via absorbing random walk-based permutation test.
Beginning with preprocessed clinical data, this stage implements a
permutation test via a random-walk strategy across a comprehensive
network of all included variables, where nodes symbolize variables and
edge weights indicate the mutual information among variable pairs. The
process identifies stronger variable connections by tracking the
frequency of edge traversal in successful random walks (ending at the
target node). Edges with significantly higher traversal frequencies, as
established through permutation testing, lay the groundwork for the
network, preparing it for further enhancement. (Phase 2) Enhancing the
network with a genetic algorithm. This stage further refines the
Bayesian network structure. Starting with a set of initial network
configurations derived from the early framework, the genetic algorithm
applies crossover (merging two configurations) and mutation (applying
random changes) to evolve these structures. Each cycle assesses the
network structures against a specific scoring function, prioritizing
those with superior scores for subsequent iterations. This cycle of
refinement, through modification, assessment, and selection, persists
until a stable score is achieved, culminating in an optimized network
structure.
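The crossover, mutation, and selection cycle of phase 2 can be sketched as follows. This is a simplified illustration under stated assumptions, not RAMEN's implementation. Candidate edges all point forward in a fixed variable order, so every edge subset is acyclic. The scoring function is a toy stand-in for the entropy-based objective. All variable names are hypothetical.

```python
import random

def evolve(skeleton, candidate_edges, score, generations=50, pop_size=20, seed=0):
    """Evolve edge sets by crossover, mutation, and score-based selection."""
    rng = random.Random(seed)
    edges = sorted(candidate_edges)
    # Start from copies of the skeleton produced by the random-walk phase.
    population = [frozenset(skeleton)] * pop_size
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            a, b = rng.sample(population, 2)
            # Crossover: keep edges shared by both parents and inherit each
            # edge present in only one parent with probability 0.5.
            child = set(a & b) | {e for e in sorted(a ^ b) if rng.random() < 0.5}
            # Mutation: toggle (add or remove) one randomly chosen edge.
            child ^= {rng.choice(edges)}
            children.append(frozenset(child))
        # Selection: keep the top-scoring structures among parents and children.
        population = sorted(population + children, key=score, reverse=True)[:pop_size]
    return population[0]

# Hypothetical variables, ordered so that every candidate edge points forward.
nodes = ["age", "bmi", "crp", "severity"]
candidates = {(u, v) for i, u in enumerate(nodes) for v in nodes[i + 1:]}
reference = {("age", "crp"), ("crp", "severity"), ("bmi", "severity")}

def toy_score(structure):
    # Stand-in objective: negative count of edges that differ from `reference`.
    return -len(structure ^ reference)

best = evolve({("age", "severity")}, candidates, toy_score)
```

Because parents are kept alongside children during selection, the best score never decreases between generations, which mirrors the "persists until a stable score is achieved" behavior described in the caption.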
Building a COVID-19 severity network using the BQC19 hospitalized patient
dataset
In our study, the RAMEN methodology was applied to the hospitalized
patient cohort data from BQC19, resulting in the development of a BN.
This BN delineates the complex interplay among clinical variables and
their association with COVID-19 severity ([86]Figure 2A). Our analysis
encompassed 2,018 hospitalized patients, with the dataset including 880
clinical variables before and 297 after data cleaning. The
severity of COVID-19 within this cohort was classified into three
categories: “not infected or mild,” “moderate,” and “severe or
deceased.” The inferred clinical variable network captures
relationships with COVID-19 severity that align with findings in the
existing literature. Among the variables identified, several key
examples linked to “COVID-19 severity” include “sex,”[87]^13^,[88]^41
“age,”[89]^13^,[90]^41 “BMI,”[91]^42^,[92]^43 “arterial
hypertension,”[93]^44 “ALT,”[94]^45 “C-reactive protein (CRP) (highest
value),”[95]^46 and “albumin (lowest value).”[96]^47 These variables
represent just a sample of the broader network, illustrating the
diverse factors impacting COVID-19 severity as observed in our study.
Figure 2.
RAMEN unveils indicators of COVID-19 severity in BQC19 hospitalized
patient data
(A) A streamlined network showcasing 231 of the most significant
connections identified by RAMEN, indicative of COVID-19 severity. The
full names of the variables are provided in [99]Data S1. The color and
thickness of edges signify the connection strength (blue for weaker,
red for stronger) based on mutual information metrics. Nodes are
colored according to categories of clinical variables, with their size
reflecting the strength of their correlation with COVID-19 severity.
The diamond-shaped node represents the outcome variable, which is
COVID-19 severity.
(B) Comparison of AUROC for predicting COVID-19 severity using
indicator variables, contrasting RAMEN-identified indicators against
those identified through mutual information and Pearson correlation
methods, with predictions made by support vector machines (SVMs). A
higher AUROC suggests a greater relevance of the identified variables
for severity prediction. Indicator variable selection by RAMEN is
detailed in [100]STAR Methods and, to ensure a fair comparison, all
compared methods use the same number (161) of top indicators.
(C) Analysis of Shapley additive explanations (SHAP) values, providing
one possible explanation of the significance of clinical variables
identified by RAMEN in SVM-based predictions. These values illustrate
the potential impact of variables on the model’s prediction, indicating
whether they contribute toward a positive or negative outcome. The
consistent color scheme across the x axis highlights variables
identified as dependable predictors by SHAP. For clarity, the
importance ranking assigned by RAMEN is shown in parentheses after each
variable name.
(D) Heatmaps illustrating the conditional distribution of COVID-19
severity levels (SEV) across the values of direct indicator variables,
where the heatmap colors represent the proportion of patients within
each severity category for given indicator values. This visual
representation aids in understanding the correlation between specific
clinical indicators and severity outcomes.
To demonstrate the superior capability of RAMEN in identifying relevant
indicators for COVID-19 outcomes compared to conventional statistical
methods, we conducted a benchmark analysis focused on predicting
COVID-19 severity based on the early record (within 1 month after
diagnosis) of clinical variables. This involved training a support
vector machine (SVM) classifier using indicator variables from the
COVID-19 severity network established by RAMEN, and two additional
SVMs, each utilizing the top variables identified by mutual information
and Pearson correlation methods, respectively. The effectiveness of
these variable selection methods was assessed based on the predictive
performance of the SVMs. [101]Figure 2B compares the area under the
receiver-operating characteristic curve (AUROC) of SVM classifiers
trained on variables identified by each
method. This result underscores RAMEN’s ability to uncover more
pertinent COVID-19 outcome indicators than traditional statistical
methods. The developed classifier can also be used to predict the
outcome of the disease (such as the severity of COVID-19) based on the
early clinical variable records from the first month of patient care.
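As a rough illustration of the metric used in this benchmark, the sketch below computes a binary AUROC from classifier scores using the pairwise-ranking definition. It is an assumption-laden simplification: the study has three severity classes and uses SVMs, while this example is binary and the scores are hypothetical values standing in for the outputs of classifiers trained on two different variable sets.

```python
def auroc(labels, scores):
    """Binary AUROC: probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical outcome labels (1 = severe) and scores from two classifiers.
labels = [0, 0, 1, 1]
scores_informative = [0.1, 0.4, 0.35, 0.8]   # e.g., trained on informative variables
scores_uninformative = [0.5, 0.5, 0.5, 0.5]  # e.g., trained on irrelevant variables

auc_informative = auroc(labels, scores_informative)
auc_uninformative = auroc(labels, scores_uninformative)
```

A variable set that carries more information about the outcome should yield scores that rank severe cases above mild ones more often, and therefore a higher AUROC.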
To assess the effectiveness of the indicators identified by RAMEN, we
visualized the Shapley additive explanations (SHAP)[102]^48^,[103]^49
values in [104]Figure 2C as a reference point. This visualization
details how each indicator contributes to the SVM’s positive or
negative predictions, with the y axis representing the contribution
magnitude and the x axis listing the RAMEN-identified indicator
variables. The plot reveals a consistent pattern of the value
distribution (as indicated by the colors) across both sides of the x
axis, with many variables situated significantly away from the axis. In
addition, among the top 20 important features identified by SHAP and
RAMEN, there is substantial overlap: 19 of them are shared, and the
remaining one ranks 23rd ([105]Figure S1B visualizes the overlap
between top indicators found by RAMEN and SHAP). It is important to
note, however, that while SHAP values provide a ranking of feature
importance, it is not necessarily the ground truth.[106]^50^,[107]^51
Additionally, they are not designed to infer network structures or
identify complex relationships between variables (e.g., edges in a BN).
RAMEN, by contrast, constructs BNs that capture both direct and
indirect relationships between clinical variables. The observed
consistency in rankings reinforces the reliability of the variables
identified in the network but does not diminish the capability of RAMEN
to infer network structure, which SHAP cannot achieve. The strong
agreement between RAMEN’s indicators and those highlighted by SHAP thus
serves as complementary evidence rather than a replacement.
The association between identified indicators and COVID-19 severity is
further elucidated through heatmaps, as shown in [108]Figure 2D. These
heatmaps detail the relationship between COVID-19 severity levels and
the pertinent indicators identified by RAMEN. Each heatmap illustrates
the variation in the percentage of patients across different severity
levels in relation to the values of variables directly linked to
COVID-19 severity. Generally, for variables that are strongly connected
to COVID-19 severity within our reconstructed network, there is a
significant shift in the distribution of severity levels corresponding
to the values of these indicator variables. Conversely, for variables
deemed irrelevant, the severity distribution remains largely unaffected
by changes in these variables. The four heatmaps showcased underscore
RAMEN’s efficacy in pinpointing highly relevant indicators of COVID-19
severity.
Systematic validation of COVID-19 severity indicators identified by RAMEN
using BQC19 multi-omics data
To validate the reliability of the COVID-19 severity indicators
identified by RAMEN, we utilized the BQC19 multi-omics dataset. Our
validation approach included a comparison of differentially expressed
(DE) genes and proteins associated with each severity indicator with
those associated with various levels of COVID-19 severity. This process
involved examining the overlap between DE genes (from RNA-sequencing
[RNA-seq] data) or proteins (identified through SomaScan 5K array)
related to the indicators and those distinguishing between mild and
severe COVID-19 cases. The heatmaps depicted in [109]Figure 3A
demonstrate a significant overlap of DE genes between the indicators
and COVID-19 severity, revealing distinct expression patterns across
the range of indicator values. These findings suggest that a common set
of genes may be involved in linking these indicators to COVID-19
severity, indicating underlying biological pathways. Furthermore, the
BQC19 multi-omics dataset provides insights into the biological
mechanisms potentially governing these relationships. [110]Figure S2A
shows an additional example between “ARDS” and “severity.” Pathway
enrichment analysis of the “common” DE genes associated with COVID-19
severity and its primary indicators, shown in [111]Figure 3B,
identified significant pathways including “neutrophil
degradation,”[112]^52^,[113]^53 “innate immune system,”[114]^54
“antimicrobial peptides,”[115]^55 and “heme signaling.”[116]^56 These
pathways are implicated in modulating COVID-19 severity, suggesting
mechanisms through which the indicators may influence disease severity.
Severe COVID-19 is often characterized by acute respiratory distress
syndrome (ARDS) associated with abnormal coagulation.[117]^57^,[118]^58
Moreover, several studies have pointed to neutrophilia, release of
their granules, and neutrophil extracellular traps (NETs) as key
pathological features of thrombotic complications driven by the immune
system (also called immunothrombosis) in severe
COVID-19,[119]^3^,[120]^59^,[121]^60^,[122]^61^,[123]^62^,[124]^63^,
[125]^64^,[126]^65^,[127]^66^,[128]^67^,[129]^68^,[130]^69 linking “ARDS,”
“neutrophil degradation,” “innate immune system,” and “heme signaling.”
Taken together, the pathway enrichment performed using the severity
indicators identified by RAMEN is congruent with the existing
literature, supporting the validity and reliability of RAMEN. We have
also conducted this analysis using SomaScan data, which is illustrated
in [131]Figures S2B, S2C, and [132]S3.
Figure 3.
Support for the COVID severity network edges from the RNA-seq data
(A) Analysis of gene expression across three groups of differentially
expressed (DE) genes linked to example nodes “ARDS,” “Albumin,” and
“BMI” that directly connect to COVID severity. For example, with
“Albumin,” we first pinpoint DE genes associated with albumin
variability (i.e., genes with expression changes in patients with
varying albumin levels, denoted as $G_1$). Next, we identify DE genes
linked to COVID severity ($G_2$). The “Common” group represents DE
genes common to both sets ($G_1 \cap G_2$); the “Albumin” group
illustrates DE genes exclusive to the albumin variable
($G_1 \cap \neg G_2$); and the “Severity” group shows DE genes unique
to COVID-19 severity ($G_2 \cap \neg G_1$).
(B) Identification of the top enriched pathways for each variable based
on their common DE genes with the severity variable (the “Common”
group). The x axis shows the negative log10 of FDR-corrected p
values. From the top to bottom are enrichment analyses we carried out
for DE genes identified from the edge between variable “Acute
Respiratory Distress Syndrome (ARDS)?” and variable “Severity,” the
edge between “BMI” and “Severity,” and the edge between “Albumin” and
“Severity.”
(C) Validation of COVID-19 severity indicators using RNA-seq highlights
RAMEN’s ability to uncover additional insights beyond those revealed by
conventional statistical methods such as Pearson correlation and mutual
information. Each method on the x axis (MI, mutual information; RAM,
RAMEN; COR, Pearson correlation) classifies variables into indicators
or non-indicators, with RNA-seq data providing the basis for ground
truth. A variable is considered an indicator if its DE genes
significantly overlap with those associated with COVID-19 severity,
assessed via a hypergeometric test. The performance of each method is
quantified using the F1 score from verifying the variables found by
each method against the ground truth. RAMEN achieves a higher F1 score
compared to statistics-based methods, indicating its ability to uncover
relationships that extend beyond these methods.
In addition, we performed systematic benchmarking of RAMEN against
other statistical methods, such as mutual information and Pearson
correlation, to assess its effectiveness in identifying severity
indicators ([135]Figure 3C). In this benchmarking exercise, RNA-seq
data served as the basis for establishing a definitive classification
of ground truth. The hypergeometric test was applied to assess the
congruence between DE genes from selected indicators and those
associated with COVID-19 severity, using p values to determine
statistical significance (see [136]STAR Methods). Clinical variables
demonstrating a significant overlap of DE genes with COVID-19 severity
were acknowledged as true indicators of severity for benchmarking
purposes. The effectiveness of each method was assessed through the F1
score. According to [137]Figure 3C, correlation exhibited the lowest F1
score by a considerable margin, with mutual information showing
significant improvement yet still trailing behind RAMEN. The
combination of correlation and mutual information was also evaluated,
resulting in a marginally improved F1 score, though still not
surpassing RAMEN. These results underscore RAMEN’s capacity to identify
severity indicators that traditional statistical methods may fail to
detect. We have also carried out the same benchmarking using the
SomaScan data, which is shown in [138]Figure S2D. Similarly,
correlation exhibited the lowest F1 score by a considerable margin,
with mutual information showing significant improvement. The
combination of correlation and mutual information improved on either
method alone. However, RAMEN still
demonstrated a better score than all three methods.
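The two calculations behind this benchmark, a hypergeometric test for DE-gene overlap and an F1 score against the resulting ground truth, can be sketched in plain Python. The function and parameter names are illustrative, and the inputs below are toy values, not study data. The exact thresholds and gene universe used in the study are described in STAR Methods and are not reproduced here.

```python
from math import comb

def hypergeom_overlap_pvalue(n_universe, n_set1, n_set2, n_overlap):
    """Upper-tail P(X >= n_overlap) for the overlap of two gene sets.

    Models drawing n_set2 genes from a universe of n_universe genes,
    of which n_set1 belong to the first set.
    """
    total = comb(n_universe, n_set2)
    upper = min(n_set1, n_set2)
    tail = sum(
        comb(n_set1, k) * comb(n_universe - n_set1, n_set2 - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

def f1_score(predicted, truth):
    """F1 of a predicted indicator set against a ground-truth set."""
    tp = len(predicted & truth)
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(truth)
    return 2 * precision * recall / (precision + recall)

# Toy numbers: 10 genes in total, two sets of 5 genes that overlap completely.
p_full_overlap = hypergeom_overlap_pvalue(10, 5, 5, 5)

# Toy indicator sets with hypothetical variable names.
f1 = f1_score({"crp", "albumin", "bmi"}, {"albumin", "bmi", "ards"})
```

A small p value marks a variable as a ground-truth indicator, and each method's selected variables are then scored against that set with F1.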
In addition to demonstrating the overall quality of the final network
output, we conducted an ablation study to evaluate the contributions of
the genetic algorithm (GA) beyond those of the random walk component in
constructing the COVID-19 severity network. As shown in
[139]Figure S1C, the random walk algorithm initially generates a
skeleton network comprising 194 edges. The GA refines this skeleton by
removing 112 edges and adding 814 new edges, significantly modifying
and enriching the network structure. To quantify the importance of the
GA’s modifications, we performed a binomial test to assess whether its
operations were supported by RNA-seq data. Correct modifications were
defined as adding RNA-supported edges or removing non-RNA-supported
edges, while incorrect modifications involved removing RNA-supported
edges or adding non-RNA-supported edges. The resulting binomial test
yielded a highly significant p value of $1.29 \times 10^{-25}$,
confirming that the GA’s modifications significantly improve the
network beyond random actions. Beyond RNA-seq validation, the GA
introduces indirect edges that provide additional biological and
clinical insights, which are often missed by random walks. For
instance, while random walks effectively capture direct connections,
such as between CRP and COVID-19 severity, they fail to uncover
indirect connections that further contextualize CRP-related mechanisms.
The GA compensates for this by identifying indirect edges such as those
linking CRP to “total WBC count,”[140]^70 “temperature:.1,”[141]^71 and
“APTT (activated partial thromboplastin time) (HIGHEST value).”[142]^72
These additional connections, supported by literature, provide richer
insights into CRP’s clinical role. Similarly, while random walks
identify the direct connection between “creatinine” and COVID-19
severity,[143]^73^,[144]^74^,[145]^75 they do not capture related
variables such as “does the patient have other comorbidities?” (e.g.,
chronic kidney disease[146]^76), “sex,”[147]^77^,[148]^78 and
“non-ST-elevation myocardial infarction (NSTEMI).”[149]^79^,[150]^80 The
GA fills this gap by uncovering these indirect relationships, all of
which are supported by existing studies. This ablation study
demonstrates that while random walks excel at identifying direct
relationships, they lack the capacity to reveal many clinically
relevant indirect edges. The GA complements the random walks by
refining the network and incorporating these indirect connections,
significantly enhancing the network’s utility and interpretability.
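The logic of the binomial test on GA modifications can be sketched as below. The null success probability of 0.5 and the one-sided alternative are assumptions made for illustration, since the exact test configuration is given in STAR Methods. The edge sets are toy values with hypothetical names, not the 112 removals and 814 additions reported above.

```python
from math import comb

def count_correct_modifications(added, removed, supported):
    """Correct = added edges that are supported + removed edges that are not."""
    correct = len(added & supported) + len(removed - supported)
    total = len(added) + len(removed)
    return correct, total

def binomial_test_greater(successes, trials, p_null=0.5):
    """One-sided P(X >= successes) under Binomial(trials, p_null).

    Direct summation; adequate for modest `trials`, but may lose precision
    or overflow for very large counts.
    """
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Toy example with hypothetical edge labels.
added = {"e1", "e2", "e3", "e4", "e5", "e6", "e7"}
removed = {"e8", "e9", "e10"}
supported = {"e1", "e2", "e3", "e4", "e5", "e6", "e9", "e10"}

correct, total = count_correct_modifications(added, removed, supported)
p_value = binomial_test_greater(correct, total)
```

Here six of the seven added edges are supported and one of the three removed edges is unsupported, so seven of ten modifications count as correct.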
In addition to analyzing COVID-19 severity, we also applied RAMEN to
the BQC19 outpatient COVID-19 cohort and a dataset from Lawson Health
Research Institute[151]^9 to investigate long COVID. Using the BQC19
dataset, RAMEN constructed a network of long-COVID-related variables.
We assessed key early indicators recorded within 1 month after
diagnosis through SVM-based disease-outcome prediction, SHAP analysis,
visualization, and literature validation ([152]Figures S4A–S4D). The
network successfully identified clinically relevant variables, such as
“age,”[153]^81 “chest pain,”[154]^5 “joint pain,”[155]^82 “runny
nose,”[156]^82 and “shortness of breath,”[157]^5 while excluding
irrelevant ones such as “chronic kidney disease,” which lacks
established links to long COVID.[158]^83 We further performed RNA-seq
and SomaScan heatmap analyses along with pathway enrichment analysis
([159]Figures S2E–S2I and [160]S5). RAMEN’s selected indicators
improved SVM classification performance compared to other statistical
methods ([161]Figure S4B), with SHAP analysis confirming their
predictive strength ([162]Figure S4C). Distribution shifts of long
COVID values based on indicator presence ([163]Figure S4D) and
Pearson’s chi-squared test ([164]Table S1) further validated their
relevance. To assess the robustness of RAMEN, we applied it to an
independent long COVID dataset from the Lawson Health Research
Institute[165]^9 and output a long COVID network ([166]Figure S4E).
Among the eight overlapping variables with the BQC19 dataset, RAMEN
consistently identified four key indicators—“chest pain,”
“anosmia/ageusia,” “dyspnea,” and “headache”—supported by prior
studies.[167]^5^,[168]^84^,[169]^85^,[170]^86 The impact of these
indicators on long COVID is visualized through heatmaps
([171]Figure S4F), highlighting their significance.
RAMEN unveils variable relationships beyond mutual information or Pearson
correlation for COVID-19 outcomes
To demonstrate that RAMEN can uncover additional information beyond
conventional statistical methods, particularly in identifying network
edges that cannot be detected by other approaches, we compared RAMEN
against Pearson correlation and mutual information. [172]Figures 4A and
4B illustrate that RAMEN identified numerous edges that the other two
methods failed to capture. This is particularly evident in
[173]Figure 4A, where 58.1% of the network edges are unique to RAMEN.
In the full severity network, this proportion increases to 79.8%. It is
important to note that [174]Figure 4B only displays the top 25% of
edges ranked by mutual information. As a result, the proportion of
RAMEN-unique edges in this subset is reduced to 22.5%.
Figure 4.
RAMEN identifies effective indicator variables that cannot be found
using mutual information or Pearson correlation
(A) The long COVID network, where purple edges represent connections
significant only to RAMEN, and green edges are also identified by
mutual information or correlation.
(B) Similar network for COVID-19 severity, with purple indicating edges
found exclusively by RAMEN and green representing those also found by
mutual information or correlation. The full names of the variables are
provided in [177]Data S1.
(C) Heatmaps visualizing DE genes associated with “Platelets” and
“COVID-19 severity.” The three groups of DE genes correspond to the
unique DE genes of the two variables and common DE genes.
(D) Pathway enrichment based on the common DE genes in (C).
(E) A barplot demonstrating RAMEN’s ability to detect disease-relevant
edges missed by Pearson correlation and mutual information. Using
RNA-seq data as ground truth (see [178]STAR Methods for details), among
all the edges that cannot be found using Pearson correlation, the
column “Not Corr, RAMEN” shows the percentage of
disease-outcome-relevant edges found by RAMEN. “Not Corr, Not RAMEN”
shows those that also cannot be found using RAMEN. Likewise, “Not MI,
RAMEN” corresponds to the percentage of true edges missed by mutual
information but found by RAMEN, and “Not MI, Not RAMEN” are those
found by neither method. “Random” is the performance of randomly
selecting edges. The p values of the binomial tests (see details in the
[179]STAR Methods section [180]quantification and statistical analysis)
indicate that RAMEN is accurate in finding edges missed by other
methods. This suggests that RAMEN has additional power in detecting
disease-relevant edges compared to Pearson correlation and mutual
information.
In [181]Figure 4A, the long COVID network analysis revealed 151
associations (edges) identified by RAMEN that were not detected by
traditional statistical methods. By leveraging an absorbing random walk
approach combined with a genetic algorithm, RAMEN uncovered numerous
associations that, despite lacking a strong direct correlation,
significantly influenced the random walk’s progression toward the
absorbing node (disease outcome). Among these were key indicators of
long COVID, such as “BMI:—long Covid,”[182]^87 and clinically relevant
edges such as “COPD (emphysema, chronic bronchitis)?—long Covid,” which
aligns with known disease associations.[183]^88 This finding raises the
question of whether COPD itself may serve as an indicator of long
COVID. An alternative explanation is that overlapping respiratory
symptoms between COPD and long COVID make it difficult to distinguish
their etiologies. Whether this connection represents a true biological
association remains to be determined. Nonetheless, this result
highlights RAMEN’s ability to critically interrogate data and uncover
associations that may offer additional insights into pathogenesis.
Beyond direct associations with long COVID, RAMEN also identified
indirect connections, such as “BMI:—rheumatologic disease?—long
Covid,”[184]^89^,[185]^90 providing deeper insights into how clinical
variables interact in long COVID. Supporting this interpretation,
Mendelian randomization studies have demonstrated a causal link between
BMI and rheumatoid arthritis.[186]^91^,[187]^92 The inflammatory
component of rheumatologic disease driven by BMI could potentially
interact with the long-term manifestations of SARS-CoV-2 infection.
RAMEN-identified relationships thus provide testable hypotheses that
may enhance our understanding of long COVID across different patient
subgroups.
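To make the intuition behind the absorbing random walk concrete, the following is a minimal sketch and not RAMEN's implementation. It assumes a toy graph with hypothetical node names ("A", "B", "outcome") and hypothetical edge weights, and it computes the probability that a walk starting at a variable is absorbed at the outcome node within a fixed number of steps. The function name `absorption_probability` is introduced here for illustration only.

```python
# Toy absorbing random walk (illustrative only; not RAMEN's actual code).
# Node names and edge weights below are hypothetical.

def absorption_probability(weights, start, target, max_steps):
    """Probability that a walk from `start` reaches `target` within `max_steps` steps.

    weights[u][v] is a non-negative weight; each step moves from u to v with
    probability proportional to weights[u][v]. `target` is absorbing.
    """
    if start == target:
        return 1.0
    mass = {start: 1.0}   # probability mass on nodes that are not yet absorbed
    absorbed = 0.0        # probability mass that has reached the target
    for _ in range(max_steps):
        next_mass = {}
        for node, p in mass.items():
            out_edges = weights.get(node, {})
            total = sum(out_edges.values())
            if total == 0:
                continue  # dead end: this mass never reaches the target
            for neighbor, w in out_edges.items():
                step = p * w / total
                if neighbor == target:
                    absorbed += step
                else:
                    next_mass[neighbor] = next_mass.get(neighbor, 0.0) + step
        mass = next_mass
    return absorbed


# "A" has only a weak direct link to the outcome but a strong link to "B",
# which in turn is strongly linked to the outcome.
toy_weights = {
    "A": {"B": 0.8, "outcome": 0.2},
    "B": {"outcome": 0.9, "A": 0.1},
}

print(round(absorption_probability(toy_weights, "A", "outcome", 1), 3))  # 0.2
print(round(absorption_probability(toy_weights, "A", "outcome", 2), 3))  # 0.92
```

In this toy example, the direct edge alone accounts for an absorption probability of 0.2 after one step, whereas allowing a second step through "B" raises it to 0.92. This mirrors, in a simplified way, how a variable with a weak direct correlation can still strongly influence the walk toward the outcome.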
In [188]Figure 4B, in the compact COVID severity network, correlational
methods failed to detect 52 edges that RAMEN identified (they failed to
detect 739 in the complete network), for example, “creatinine (HIGHEST
value)—COVID severity,”[189]^93 “respiratory rate (associated with BP
above):—COVID severity,”[190]^94 and “COVID severity—acute kidney
injury?”[191]^95 Acute kidney injury often develops in patients with
COVID-19. RAMEN not only captured this association, which the naive
methods missed, but also inferred the correct edge direction. Another
noteworthy edge that only RAMEN
identified is “platelet (LOWEST value)—COVID severity.”[192]^96
Platelets play a major role in the immune system and have been found to
be an indicator of COVID severity. The inability of naive methods to
detect such edges highlights their limitations compared to RAMEN.
In [193]Figure 4C, similar to the previous figures, we demonstrate the
overlap of DE genes between “platelets” and “COVID severity,” revealing
distinct expression patterns associated with the variables’ differing
values. This analysis points to biological mechanisms that may underlie
the predicted relationship between platelet levels and COVID-19 outcomes.
In [194]Figure 4D, a pathway enrichment analysis of the “common” DE
genes related to both platelets and COVID severity identified several
significant pathways. These pathways include “reactome interferon alpha
beta signaling,”[195]^97 “reactome interferon signaling,” “reactome
cytokine signaling in the immune system,”[196]^98 “reactome antiviral
mechanism by IFN-stimulated genes,”[197]^99 “reactome DDX58
IFIH1-mediated induction of interferon alpha beta,”[198]^100 “WP type I
interferon induction and signaling during SARS-CoV-2
infection,”[199]^101 “Kegg Medicus reference hydrolysis of
sphingomyelin,”[200]^102 and “reactome OAS antiviral
response.”[201]^103 Type I interferons are established immunological
mediators of COVID-19 severity.[202]^104^,[203]^105^,[204]^106
Interestingly, platelets are key regulators of coagulation, a process
that can be severely disrupted in severe COVID-19, leading to
life-threatening conditions.[205]^107 These identified pathways provide
a biological context for the edges predicted by the analysis,
confirming their relevance to the severity of COVID-19.
We further utilized RNA-seq data to validate the edges identified by
RAMEN, particularly those missed by traditional correlational methods
(mutual information and Pearson correlation). Our validation method is
based on the principle that two variables are considered connected if
they exhibit significant overlap in their DE genes, as outlined in
[206]STAR Methods. Notably, RAMEN demonstrated a significantly greater
capability to identify genomics-supported edges missed by correlation
methods. This contrast becomes even more pronounced when examining
edges that both RAMEN and the correlational methods failed to predict;
these missed edges do not exhibit any enrichment in the number of
supported edges compared to random selection, indicating no significant
genomics support. [207]Figure 4E shows that RAMEN’s precision in
detecting genomics-supported edges far exceeds that of random chance,
as validated by the p values from binomial tests. Conversely, the edges
missed by RAMEN showed no significant difference in support from the
RNA data compared to randomly selected edges. These results affirm
RAMEN’s effectiveness in uncovering relevant and supported edges
overlooked by conventional statistical methods.
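As a rough illustration of the binomial test used in this comparison, the sketch below computes a one-sided p value for observing at least `k` genomics-supported edges among `n` selected edges, given a baseline support rate `p0` estimated from random edge selection. The one-sided formulation, the function name, and the counts in the example are assumptions made for illustration; the exact test configuration is described in STAR Methods.

```python
from math import comb


def binomial_upper_tail(k, n, p0):
    """One-sided binomial p value: P(X >= k) for X ~ Binomial(n, p0)."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


# Hypothetical counts, not taken from the paper: 12 of 20 edges found only by
# one method are genomics-supported, versus a 30% support rate for random edges.
p_value = binomial_upper_tail(k=12, n=20, p0=0.3)
print(f"one-sided p value: {p_value:.4g}")
```

A small p value here would indicate that the selected edges are supported more often than expected from random selection, which is the pattern reported for RAMEN-specific edges and not for the edges that RAMEN missed.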
RAMEN demonstrates broad applicability across diverse disease studies
To demonstrate RAMEN’s adaptability beyond COVID-19, we applied it to
two additional disease cohorts: septicemia and COPD. RAMEN requires no
disease-specific constraints or prior medical knowledge, making it
broadly applicable across clinical contexts. For septicemia, we used
the MIMIC-III database,[208]^10 a publicly available ICU patient record
dataset. From this, a subset of 715 samples with 227 variables was
selected, with patient mortality (STATUS: “ALIVE”/“DEAD”) as the
outcome. For COPD, we analyzed the CanCOLD dataset, a study capturing
3,778 clinical visit records of over 1,500 individuals across nine
Canadian sites. After filtering, the dataset included 220 variables,
with COPD exacerbation in the past 12 months (Exa12: 1 = yes, 0 = no)
as the outcome. RAMEN was applied to both datasets using the same
preprocessing and analysis workflow as for COVID-19 datasets.
[209]Figure 5A illustrates the networks inferred by RAMEN for
septicemia (left) and COPD (right), capturing direct and indirect
relationships. Node sizes in the networks reflect feature importance
ranked by RAMEN, aligning closely with SHAP analysis results
([210]Figure 5B). RAMEN identifies clinically relevant variables
supported by literature and visual analysis. For example, in the
septicemia network, lower platelet counts are strongly associated with
“DEAD” status, as shown in the leftmost heatmap in [211]Figure 5C. This
finding aligns with studies identifying thrombocytopenia as a predictor
of mortality in sepsis.[212]^108^,[213]^109 Similarly, higher AST
(aspartate aminotransferase) levels, linked to liver dysfunction, are
associated with increased mortality,[214]^110^,[215]^111 as shown in
the second heatmap in [216]Figure 5C. For the COPD dataset, RAMEN
identifies significant relationships between clinical variables and
disease outcomes. For instance, the third heatmap in [217]Figure 5C
demonstrates that higher SGRQ (St. George’s Respiratory Questionnaire)
scores are associated with more frequent COPD exacerbations, consistent
with studies linking SGRQ scores to exacerbation
severity.[218]^112^,[219]^113 Similarly, the final heatmap in
[220]Figure 5C shows that lower post-bronchodilator FEV1/FVC (forced
expiratory volume in 1 s/forced vital capacity) values
(“MAXFEV1FVCP_POST”) are strongly associated with COPD exacerbations.
This is consistent with the established role of FEV1/FVC ratios in
COPD diagnosis and management.[221]^114^,[222]^115
Figure 5.
RAMEN identifies indicator variables and constructs disease-relevant
networks across diseases using MIMIC-III and CanCOLD data
(A) RAMEN-derived networks for septicemia (136 outcome-relevant
variables, left) and COPD (22 outcome-relevant variables, right). Node
colors represent variable types, and edge colors indicate connection
intensity, as shown in the legend. Node sizes reflect RAMEN’s
importance scores (for details, see [225]STAR Methods), indicating the
relevance of each variable to the disease outcome. Diamond-shaped nodes
represent outcome variables, specifically septicemia death and COPD
exacerbation. These results demonstrate RAMEN’s applicability to
multiple diseases.
(B) SHAP values quantify the importance of indicator variables based on
their impact on disease-outcome prediction. Values in parentheses
indicate RAMEN’s feature importance rankings. The alignment between
SHAP rankings and RAMEN’s selections underscores the method’s
robustness in identifying key variables.
(C) Heatmaps illustrating the distribution of informative indicator
variables for septicemia and COPD, further emphasizing RAMEN’s ability
to uncover disease-relevant insights across a range of diseases. The
plots reveal significant shifts in patient distributions across
different disease outcomes based on the values of key indicator
variables.
RAMEN’s performance in these diverse datasets demonstrates its ability
to construct meaningful networks for septicemia and COPD, uncovering
disease-relevant interactions supported by clinical and biological
evidence. Beyond identifying key variables, RAMEN outperforms
benchmarked methods in finding informative disease indicators, as shown
in our systematic benchmarking in the next subsection. These findings
underscore RAMEN’s versatility in adapting to heterogeneous datasets
and its capability to uncover clinically significant insights across
diverse disease contexts. By extending the analysis beyond COVID-19, we
establish RAMEN as a broadly applicable tool for clinical network
reconstruction and analysis.
Systematic benchmarking highlights RAMEN’s superiority over alternative
methods
Building on previous sections demonstrating RAMEN’s ability to
construct meaningful networks across diverse disease datasets, we
conducted comprehensive benchmarking to evaluate its performance
against established statistical and BN learning methods. Specifically,
we compared RAMEN to mutual information and Pearson correlation as
representative statistical measures, and to two BN learning frameworks,
pgmpy[226]^116 and bnlearn.[227]^117 These comparisons underscore
RAMEN’s stronger performance in uncovering complex relationships in
disease-related network structures.
[228]Figure 6A compares edge connection predictions on the COVID-19
severity dataset using RNA-seq as ground truth. RAMEN achieved the
highest F1 score, outperforming all other methods, including pgmpy and
bnlearn. To further evaluate edge prediction accuracy, we used a
simulation dataset with a known ground-truth network generated via the
Erdős-Rényi model[229]^118 using NetworkX.[230]^119
The dataset contained 100 nodes and 1,000 samples. [231]Figure 6B shows
the edge connection benchmarking performed using this simulation
dataset, whereby RAMEN significantly outperforms pgmpy, bnlearn, mutual
information, and correlation. Furthermore, using this simulation
dataset with its known ground-truth BN, we also compared edge direction
prediction. In this task, a correct edge requires both
correct connection and correct direction. Traditional methods, such as
mutual information and Pearson correlation, cannot be included in this
comparison because they cannot provide directional edges. RAMEN
demonstrated superior performance compared to BN learning methods
([232]Figure 6C). [233]Figures 6D–6F show a comprehensive
disease-outcome indicator prediction benchmarking, which extends the
comparison to BN methods and datasets from all diseases. The results,
based on SVM performance trained on the selected indicators, show that
RAMEN consistently outperforms pgmpy and bnlearn, further highlighting
its ability to identify highly informative variables across diverse
datasets.
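The difference between the edge connection and edge direction tasks can be sketched as follows. This is a minimal illustration under stated assumptions, not the benchmarking code used in the study: the edge sets are hypothetical, and the helper names `f1_score` and `undirected` are introduced here for clarity. For the connection task, edges are compared ignoring orientation; for the direction task, an edge counts as correct only if its orientation also matches.

```python
def f1_score(predicted, truth):
    """F1 score between a set of predicted edges and a set of true edges."""
    true_positives = len(predicted & truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


def undirected(edges):
    """Drop orientation so that (u, v) and (v, u) count as the same edge."""
    return {frozenset(edge) for edge in edges}


# Hypothetical directed edges (source, target); not taken from the paper.
truth = {("a", "b"), ("b", "c"), ("c", "d")}
predicted = {("a", "b"), ("c", "b"), ("a", "d")}

# Connection task: a-b and b-c are both recovered, so 2 of 3 edges match.
connection_f1 = f1_score(undirected(predicted), undirected(truth))
# Direction task: only a->b matches in both connection and direction.
direction_f1 = f1_score(predicted, truth)

print(round(connection_f1, 3))  # 0.667
print(round(direction_f1, 3))   # 0.333
```

Because the direction task is strictly harder, its F1 score can never exceed the connection F1 score for the same prediction, which is why undirected measures such as mutual information and Pearson correlation are excluded from that comparison.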
Figure 6.
RAMEN outperforms other methods in systematic benchmarking
(A and B) Comparison of edge connection prediction performance using
the COVID-19 dataset (A) and the simulation dataset with a known
ground-truth network (B). The y axis represents the F1 score for edge
connection prediction across the methods listed on the x axis. For the
COVID-19 dataset, RNA-seq data are used to validate the predicted edges
(as detailed in [236]STAR Methods), while the simulation dataset
provides a known ground truth for edge connections. RAMEN achieves
superior performance compared to all methods, particularly excelling
over other Bayesian network learning approaches.
(C) Edge direction prediction performance using the simulation dataset
with a known ground truth for edge directions. A true positive requires
the correct prediction of both the edge connection and its direction.
The comparison is restricted to Bayesian network methods capable of
predicting edge direction. RAMEN demonstrates a significant performance
advantage over these methods.
(D–F) Evaluation of indicator variables identified by different methods
based on the classification performance of SVM models trained with
these variables. RAMEN achieves results that are comparable to or
better than those of all other methods, further emphasizing its
superior ability to identify informative variables across different
disease studies. p values (∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001, ∗∗∗∗p < 0.0001)
were generated by Student’s t tests with n = 5 technical replicates.
Boxplots show the interquartile range (IQR), with the median
represented by a solid line. Whiskers extend to the most extreme data
points within 1.5 times the IQR from the first and third quartiles. (D)
COVID-19, (E) septicemia, and (F) COPD.
Through extensive benchmarking across multiple datasets—including
COVID-19, septicemia, COPD, and a simulation dataset—RAMEN consistently
outperforms other methods in edge connection and direction prediction,
as well as disease indicator identification, in our evaluation.
Furthermore, RAMEN excels in computational efficiency, addressing
critical limitations of BN structure learning, which is known to be
NP-hard.[237]^29^,[238]^120 Many existing methods fail to handle
large-scale networks due to excessive runtime and memory requirements.
RAMEN mitigates the computational challenges associated with the
exponential search space of Bayesian structures by integrating an
absorbing random walk with a genetic algorithm, improving scalability
in practical scenarios involving large clinical datasets. For instance,
on the largest COVID-19 severity dataset with 880 initial variables
(297 variables after filtering) and 2,018 samples, RAMEN completed the
analysis in 2 h 46 min. Compared to other BN learning methods, RAMEN is
51.4 times faster than bnlearn (142 h 15 min) and 28.3 times faster
than pgmpy (78 h 10 min). Detailed runtime and space complexity
comparisons are provided in [239]Table S2. These advantages make RAMEN
particularly effective for uncovering meaningful clinical insights in
large-scale datasets.
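The reported speedup factors follow directly from the runtimes given above, as the short check below shows. The variable and function names are introduced here for illustration only.

```python
def to_minutes(hours, minutes):
    """Convert a runtime given as hours and minutes into total minutes."""
    return hours * 60 + minutes


ramen_runtime = to_minutes(2, 46)      # 166 min
bnlearn_runtime = to_minutes(142, 15)  # 8,535 min
pgmpy_runtime = to_minutes(78, 10)     # 4,690 min

print(round(bnlearn_runtime / ramen_runtime, 1))  # 51.4
print(round(pgmpy_runtime / ramen_runtime, 1))    # 28.3
```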
Discussion
In this study, we developed RAMEN, a scalable and efficient framework
for BN structure learning to uncover complex relationships in clinical
records data. Traditional statistical methods often fail to capture
indirect associations, while BN approaches are hindered by high
computational complexity. RAMEN overcomes these limitations by
integrating absorbing random walks and a genetic algorithm, enabling
the efficient discovery of complex clinically relevant interactions. We
validated RAMEN’s ability to identify disease-associated variables and
reconstruct meaningful clinical networks using both clinical variable
datasets and multi-omics data.
RAMEN is a scalable and efficient framework for learning
target-variable-oriented BN structures. By integrating absorbing random
walks with a genetic algorithm, RAMEN circumvents the exponential
complexity of traditional network structure learning, enabling
substantially faster runtimes and improved performance. For instance,
on the largest COVID-19 severity dataset, RAMEN is more than 25 times
faster than the second-fastest method, pgmpy. This computational
efficiency makes RAMEN one of the few methods capable of handling
large, real-world clinical datasets within practical time frames.
Compared to traditional statistical methods, a distinguishing feature
of RAMEN is its ability to reconstruct both direct and indirect
relationships, along with accurately inferring directional edges. This
capability provides a more comprehensive and informative network
representation compared to methods such as Pearson correlation and
mutual information, which often fail to capture such nuances. The
accurate identification of directional edges enables deeper insights
into disease mechanisms, offering valuable information for hypothesis
generation and experimental design. RAMEN’s target-variable-oriented
design ensures that the inferred networks are directly tied to clinical
or biological outcomes, enhancing their interpretability and practical
utility. This focus makes RAMEN particularly suited for identifying
disease mechanisms and actionable targets, facilitating meaningful
applications in precision medicine and translational research. To
substantiate its predictions, RAMEN leverages multi-omics validation,
integrating diverse datasets to cross-verify network structures. This
approach bridges computational predictions with biological evidence,
enhancing confidence in the results and uncovering deeper insights into
the molecular mechanisms underlying complex diseases. In comprehensive
benchmarking, RAMEN consistently outperformed statistical and BN tools
across diverse datasets, demonstrating superior accuracy, reliability,
and relevance. Its combination of computational efficiency and
scalability, ability to accurately identify complex relationships,
focus on target-variable-oriented analysis, and incorporation of
multi-omics validation establish RAMEN as a transformative tool for
analyzing complex disease datasets and advancing biological and
clinical research.
While RAMEN has been demonstrated within the context of COVID-19,
septicemia, and COPD, its framework is designed with the flexibility to
examine relationships among clinical variables across diverse diseases.
This is because RAMEN does not incorporate any dataset-specific priors,
so it can theoretically be applied to any dataset with a target
variable. This opens the possibility of extending RAMEN to other
domains, including diseases not covered in this study and datasets with
different types of outcomes. In addition, although BNs do not inherently
represent causal relationships, future iterations of RAMEN could be
developed to integrate additional information for discovering causal
relationships between variables, further enhancing its utility and
interpretability.
RAMEN provides a scalable framework for analyzing complex clinical
datasets, enabling the identification of biomarkers directly tied to
clinically relevant outcomes. Its target-variable-oriented design
ensures meaningful insights, while multi-omics validation enhances
biological relevance, bridging computational predictions and
experimental evidence. RAMEN is particularly well suited for studying
complex diseases by uncovering variable interactions that inform
personalized diagnostics, risk stratification, and therapeutic
strategies. By addressing challenges in biomarker discovery and
large-scale clinical data analysis, RAMEN offers researchers a
practical tool to improve disease research and support meaningful
clinical applications.
Limitations of the study
Although RAMEN has been comprehensively evaluated using clinical data,
multi-omics data, and simulated data across various diseases, further
analysis with larger multi-omics datasets is warranted. The current
sparsity of matched multi-omics and clinical data limits broader
validation across additional diseases and in-depth mechanistic
insights. Future work focusing on more comprehensive joint analyses of
clinical and multi-omics data could uncover previously unrecognized
disease mechanisms and biomarkers.
Resource availability
Lead contact
Further information and requests for resources and reagents should be
directed to and will be fulfilled by the lead contact, Jun Ding
(jun.ding@mcgill.ca).
Materials availability
This study did not generate new unique reagents.
Data and code availability
* •
The BQC19, CanCOLD, and Lawson Health Research Institute Long COVID
datasets contain patient information and are not publicly
available. Researchers interested in accessing the BQC19 dataset
can request access through the Biobanque Québécoise de la COVID-19
([240]https://en.quebeccovidbiobank.ca/). The CanCOLD dataset is
available for purchase via the CanCOLD research program
([241]https://cancold.ca/). For inquiries regarding the Lawson
Health Research Institute Long COVID dataset, please contact the
authors directly. The MIMIC-III dataset is publicly available at
[242]https://mimic.mit.edu/.
* •
All original code has been deposited at Zenodo
([243]https://doi.org/10.5281/zenodo.14879675) and is publicly
available as of the date of publication.
* •
Any additional information required to reanalyze the data reported
in this paper is available from the [244]lead contact upon request.
Acknowledgments