Abstract
Systems characterization of immune landscapes in health, disease and
clinical intervention cases is a priority in modern medicine.
High-throughput transcriptomes accumulated from gene-knockout (KO)
experiments are crucial for deciphering target KO signaling pathways
that are impaired by KO genes at the systems-level. There is a demand
for integrative platforms. This article describes the PathwayKO
platform, which has integrated state-of-the-art methods of pathway
enrichment analysis, statistics analysis, and visualizing analysis to
conduct cutting-edge integrative pathway analysis in a pipeline fashion
and decipher target KO signaling pathways at the systems-level. We
focus on describing the methodology, principles and application
features of PathwayKO. First, we demonstrate that the PathwayKO
platform can be utilized to comprehensively analyze real-world mouse KO
transcriptomes ([27]GSE22873 and [28]GSE24327), which reveal systemic
mechanisms underlying the innate immune responses triggered by
non-infectious extensive hepatectomy (2 hours after 85% liver resection
surgery) and infectious CASP-model sepsis (12 hours after CASP-model
surgery). Strikingly, our results indicate that both cases hit the same
core set of 21 KO MyD88-associated signaling pathways, including the
Toll-like receptor signaling pathway, the NFκB signaling pathway, the
MAPK signaling pathway, and the PD-L1 expression and PD-1 checkpoint
pathway in cancer, alongside the pathways of bacterial, viral and
parasitic infections. These findings suggest common fundamental
mechanisms between these immune responses and offer informative cues
that warrant future experimental validation. Such mechanisms in mice
may serve as models for humans and ultimately guide formulating the
research paradigms and composite strategies to reduce the high
mortality rates of patients in intensive care units who have undergone
successful traumatic surgical treatments. Second, we demonstrate that
the PathwayKO platform model-based assessments can effectively evaluate
the performance difference of pathway analysis methods when benchmarked
with a collection of proper transcriptomes. Together, such advances in
methods for deciphering biological insights at the systems-level may
benefit the fields of bioinformatics, systems immunology and beyond.
Keywords: systems immunology, signaling pathways, network analysis,
inflammation, innate immunity, microbial immunity, bioinformatics
1. Introduction
Systems characterization of immune landscapes in health, disease, and
clinical intervention cases has become a priority in modern medicine.
For instance, to reduce the high mortality rates of patients in the
intensive care unit (ICU), insights into the complex mechanisms
underlying systems immunology triggered by infectious or non-infectious
traumatic surgical treatments must be further investigated through both
bench-experimental and computational analyses ([29]1–[30]4).
High-throughput transcriptomes accumulated in gene-knockout (KO)
experiments are crucial for deciphering target KO signaling pathways
that are impaired by KO genes at the systems-level ([31]1, [32]5). KEGG
pathways, as dominant examples, are curated manually with literature
and experimental evidence, thus intuitively visualizing the signaling
pathways that are supported by the literature and/or experimental
evidence ([33]6). However, existing methods of pathway enrichment
analysis may not produce consistent results, or identify true target
signaling pathways owing to intrinsic defects, as discussed in the
literature ([34]5, [35]7, [36]8).
Pathway enrichment analysis methods have evolved over decades from
non-topology-based to topology-based approaches; the latter category
generally performs better than the former category ([37]5). The eminent
non-topology-based methods include SAFE ([38]9), GSEA ([39]10), GSA
([40]11) and PADOG ([41]12). These methods employ entire gene sets.
Unlike over-representation methods (e.g., the hypergeometric test)
where a hypergeometric or binomial distribution is normally assumed
([42]13, [43]14), advanced methods (e.g., SAFE and GSEA) are built on
empirical distribution functions ([44]9, [45]10). These statistical
methods may not fit real data, thus preventing accurate predictions, as
discussed in the literature ([46]5, [47]14). The pioneering
topology-based methods include ROntoTools_PE ([48]15), SPIA ([49]16)
and ROntoTools_pDIS ([50]5). These methods deploy differentially
expressed genes (DEGs) with topology information on gene-gene
interactions ([51]5). Nonetheless, the criteria for selecting DEGs
remain debatable ([52]5). For instance, by a traditional approach, an
arbitrary number of genes (e.g., top 5% or 10% of total genes present
in the available KEGG pathways) are selected by using an arbitrary
cutoff p-value (e.g., p<0.05) or fold-change (e.g., absolute
log2FC>1.5) or their conjunction ([53]5, [54]7, [55]8). These arbitrary
cutoffs to select DEGs hinder a fair and reasonable comparisons under
the same context, as discussed in the literature ([56]5, [57]7).
Meanwhile, approaches to evaluating performance differences among
pathway analysis methods have recently evolved from traditional
disease-target-pathways ([58]7) toward known-KO pathways ([59]5,
[60]17). High-throughput transcriptomes accumulated in gene-KO
experiments have offered unique opportunities to decipher target KO
signaling pathways that are truly impaired by KO genes at the
systems-level ([61]1, [62]5). Thus, developing integrative platforms to
conduct cutting-edge integrative pathway analysis and to evaluate the
performance difference of methods has become an imminent frontier of
research in the field.
We recently developed an in-house PathwayKO package and used it to
analyze a transcriptome ([63]GSE24327) that was recovered from mouse
spleens 12 hours after a colon ascendens stent peritonitis (CASP)-model
surgery coupled with Myd88 gene-KO experiments ([64]1, [65]2). The
original bench-experiments suggested that the MyD88-deficient phenotype
attenuated proinflammatory responses and thus reduced the mortality
rate after CASP-model surgery ([66]2). To elucidate the mechanisms
underlying the observed phenomena, we designed the following three
subtypes of [67]GSE24327 data according to the original
bench-experiments ([68]2): GSE24327_A (septic KO MyD88 vs. septic WT)
for comparing septic null (Myd88 ^–/–) with septic wild-type mice,
GSE24327_B (septic KO MyD88 vs. untreated WT) for comparing septic null
(Myd88 ^–/–) with untreated wild-type mice, and GSE24327_C (septic WT
vs. untreated WT) for comparing septic wild-type with untreated
wild-type mice ([69]1). With the PathwayKO package, we successfully
identified 21 KO MyD88-associated signaling pathways from each subtype
and illustrated numerous key regulators (including ligands, receptors,
adapters, transducers, transcription factors and cytokines) that were
coordinately, significantly and differentially expressed at the
systems-level, and were precisely marked on those target KO signaling
pathways ([70]1). Our results revealed the mechanisms underlying
systems immunology triggered by the infectious CASP-model sepsis from
the bioinformatics analysis perspective ([71]1). We discussed the
observed phenomena, including the “systemic syndrome”, “cytokine
storm”, and “KO MyD88 attenuation”, as well as the proposed hypothesis
of “spleen-mediated immune-cell infiltration” ([72]1). We thereby
anticipated that these mechanisms may serve as models for humans, and
ultimately facilitate formulating research paradigms and composite
strategies for the early diagnosis and prevention of sepsis ([73]1).
This case study appeals that the PathwayKO package has the potential to
be constantly updated and widely used by the community.
The present article aims to describe the PathwayKO platform, focusing
on its methodology, principles and application features. We applied the
platform to analyze a real-world transcriptome ([74]GSE22873) as a case
study to elucidate the mechanisms underlying systems immunology
triggered by non-infectious extensive hepatectomy in single- and
double-null mice ([75]3). We illustrate that the platform incorporates
state-of-the-art methods of pathway enrichment analysis, statistics
analysis, and visualizing analysis to conduct the cutting-edge
integrative pathway enrichment analysis, which allows excavating target
KO signaling pathways at the systems-level. We highlight that this
integrated platform possesses but is not limited to the following
advantageous features:
* The differentially expressed genes (DEGs) are statistically
determined from data and are ready for pathway analysis under the
same context;
* The ROC curves and key metrics are simultaneously calculated, both
across methods and across data, in a pipeline fashion;
* The target KO signaling pathways are significantly identified and
differentially marked by key regulators, which are coordinately,
significantly and differentially expressed at the systems-level;
* The ROC curve-based statistics analysis is conducted under the same
conditions to evaluate the performance difference of methods;
* Both interactive and pipeline modes can be chosen for desired
computations.
Moreover, we demonstrate the application of the PathwayKO platform
model-based assessments, and the results suggest that the PathwayKO
platform can effectively evaluate the performance difference of pathway
analysis methods when benchmarked on a collection of proper
transcriptomes.
2. Framework overview of the PathwayKO platform
The PathwayKO platform ([76] Figure 1 ) currently incorporates and
drives a set of internal and external packages ([77] Figure 1A )
coupled with diverse dependencies ([78] Figure 1B ) to pursue the
integrative (I) preprocessing, (II) ROC-AUC calculating, (III)
statistics analyzing, and (IV) visualizing processes. All external
packages are adapted from the literature ([79]9–[80]20). The PathwayKO
platform as systems software has integrated state-of-the-art methods of
pathway enrichment analysis, statistics analysis, and visualizing
analysis ([81] Figures 1A, B ), which work in a pipeline fashion ([82]
Figures 1C, D ). The pathway enrichment analysis methods ([83]
Figure 1A ) include the packages of non-topology- and topology-based
methods, such as SAFE ([84]9), GSEA ([85]10), GSA ([86]11), PADOG
([87]12), ROntoTools_PE ([88]15), ROntoTools_pDIS ([89]5) and SPIA
([90]16), which are widely used by the community. The statistics
analysis methods ([91] Figure 1B ) include the packages of changepoint
([92]18) and pROC ([93]19). The visualizing analysis methods ([94]
Figure 1B ) include the packages of pROC ([95]19) and Pathview
([96]20).
Figure 1.
[97]Figure 1
[98]Open in a new tab
Framework overview of the PathwayKO platform. (A) Framework of the
PathwayKO platform as systems software. (B) Processes, modules, methods
and dependencies integrated in the PathwayKO platform. (C) Pipeline
analysis with a collection of data, both across methods and across
data. (D) Parallel computation.
3. Methodology, principles and application features of the PathwayKO platform
This article aims to describe the PathwayKO platform from the
perspectives of methodology, principles and application features with
results from some of its modules ([99] Supplemental Figures S1 -[100]
S4 ), as exemplified by real-world case studies of [101]GSE24327
([102]1, [103]2) and [104]GSE22873 ([105]3); these data were downloaded
(as of March 19, 2021) from [106]https://ncbi.nlm.nih.gov/geo/. The 333
mouse KEGG signaling pathways were downloaded (as of March 19, 2021)
from [107]https://www.kegg.jp/. Users should install the PathwayKO
platform and utilize this platform to complete tasks following the
tutorials step-by-step, provided as online supplemental materials
([108] Supplemental Figures S1 -[109] S4 ). Users may also follow
instructions in the user’s manual offered at
[110]https://github.com/allenaigit/pathwayko/tree/main/inst/docs/Users_
manual.pdf.
3.1. Preprocessing process
3.1.1. Feature 1: Automatic generation of intermediate data from one data
The preprocess module can preprocess the given GEO data (GSEXXX_RAW.tar
and GSEXXX_series_matrix.txt.gz) stored in the assigned working
directory (e.g., PathwayKO_platform) by utilizing the oligo ([111]21)
and limma ([112]22, [113]23) packages for RMA normalization, which will
generate intermediate data (pData.csv and PREP.RData). The preprocess
module can handle most types of GEO data produced by the major types of
machines thus far. Meanwhile, the makeSPIAdata module adapted from the
SPIA package ([114]16) can parse KEGG pathways to generate intermediate
data (mmuSPIA.RData) ready for pathway enrichment analysis by
topology-based methods (SPIA, ROntoTools_PE, and ROntoTools_pDIS). Only
topology-based methods can utilize the topology information of
gene-gene interactions ([115]5, [116]16). All resulting output files
are stored in the new directories automatically created and named after
the given data ([117] Supplemental Figure S2 ).
3.2. ROC-AUC calculating process
3.2.1. Feature 2: Automatic selection of DEGs from one data
The main module pathwayko can automatically select DEGs through
enabling the HES (high-edge-score) approach ([118]17), in addition to
classical approaches ([119]5, [120]7, [121]8). The main module
pathwayko drives the external changepoint package ([122]18), by which
the change-point analysis method (see [123]Figure 1 ) makes a
statistical decision on choosing the differentially expressed genes
(DEGs) based on the distribution of edge scores ([124] Figure 2 ) that
are constructed from data ([125]17). The resulting output directory
will be automatically created and named after the given data; the
directory contains a set of output files including the list of DEGs,
the list of knockout (KO) KEGG pathways, and the KO gene-associated
subnetwork.
Figure 2.
[126]Figure 2
[127]Open in a new tab
Schema of the high-edge-score (HES) approach with the change-point
analysis method for statistically selecting differentially expressed
genes. Explanations are presented in the main text. (A) A global graph
is constructed. (B) An edge score for two genes (X, Y) in the global
graph is calculated. (C) A change-point is statistically determined.
(D) High score edges are statistically determined. (E) The distribution
of edge scores is generated with each of three optional HES thresholds
(HES1, HES2 and HES3).
The principles behind the high-edge-score (HES) approach and the
change-point analysis method for statistically selecting differentially
expressed genes (DEGs) were modified from the literature ([128]6,
[129]17, [130]18, [131]23, [132]24), as briefly depicted below ([133]
Figure 2 ). First, a global graph is constructed ([134] Figure 2A ) by
the external KEGGgraph package ([135]24) based on available KEGG
pathways ([136]6); this graph comprises all known interactions among
the entire gene sets present in the KEGG pathways. Second, the edge
scores for all edges in the global graph are calculated ([137]
Figure 2B ). Given two genes (X and Y) connected by an edge in the
global graph, suppose FC[X] and FC[Y] are the values of the expression
fold-change (FC) of X and Y, respectively, while pX and pY are the
probability of observing FC[X] and FC[Y] just by chance, then the edge
score between X and Y is calculated by the following formula (1):
[MATH:
EdgeSco
reXY
=|FCX|·(1−pX)+|F
CY|·(1−pY) :MATH]
(1)
where the FC value for gene X (or Y), FC[X] (or FC[Y] ), is calculated
by comparing the expression values of KO samples versus normal samples
in each data. The p-value for gene X (or Y), pX (or pY), is calculated
between the same groups (i.e., KO samples vs. normal samples) by using
a moderated t-test ([138]17). The FC values and p-values for all of
these genes are calculated by using the functions eBayes and topTable
from the external limma package ([139]23). Third, a change-point is
statistically determined ([140] Figure 2C ) by using the change-point
analysis method from the external changepoint package ([141]18), where
a change-point is defined to be an inflection point after which the
distribution curve becomes flat ([142]18). Fourth, important edges are
statistically determined ([143] Figure 2D ). They are connected by DEGs
that are determined by a chosen HES threshold once initialized. Such
DEGs connecting important edges are statistically selected, and ready
for subsequent pathway enrichment analysis. Fifth, the distribution of
edge scores is generated with three optional HES thresholds defined
([144] Figure 2E ). HES1 is the least HES (the lower boundary of the
change point), the beginning of the flat area of a curve, recommended
as default; HES2 is the extra 25% HES (25% safety margin of the change
point), top 75% of the remaining scores to avoid selecting an
overwhelming number of DEGs that may introduce false positives; and
HES3 is the maximal HES (the upper boundary of the change point), the
last edge connected by last two genes with the highest score. Finally,
the differentially expressed genes (DEGs) are statistically selected by
HES1, HES2 or HES3.
3.2.2. Feature 3: Building ROC curves and calculating key metrics in a
pipeline fashion from one data
The main module pathwayko can conduct the desired computations in a
pipeline fashion, i.e., across methods over one data (see [145]Figure 1
). These computations include (i) building an ROC curve, (ii) computing
the key metrics ([146] Table 1 ), (iii) computing the AUC (with 95% CI,
confidence interval) for the full area under the entire ROC curve, (iv)
computing the partial AUCs (pAUC_SP and pAUC_SE) for the specific
regions focusing on 90–100% specificity and sensitivity in both
original and corrected formats ([147]25), and (v) computing the 95% CIs
for specificity and sensitivity. The entire set of key metrics ([148]
Table 1 ) for one data are then formatted as a summary output file
(SUM.RData). The resulting output files are stored in a new directory
automatically generated and named after the data. Such
batch-computations intensively consume computing resources (CPUs,
memory and storage). This is a time-limiting step.
Table 1.
Key metrics used for the ROC curve-based statistics analysis.
Metrics Definition
False discovery rate (FDR)
[MATH:
FDR=FPKOFPK
O+TPKO :MATH]
([149]2)
False positive rate (FPR)
[MATH:
FPR=FPKOFPK
O+TNKO :MATH]
([150]3)
False negative rate (FNR)
[MATH:
FNR=FNKOFNK
O+TPKO :MATH]
([151]4)
True positive rate (TPR)
[MATH: TPR(Sensitivity)=TPKOTPKO<
/mi>+FNKO
mrow> :MATH]
([152]5)
True negative rate (TNR)
[MATH: TNR(Specificity)=TNKOTNKO<
/mi>+FPKO
mrow> :MATH]
([153]6)
Accuracy
[MATH:
Accurac
y=TPKO+TNKO<
mi>TPKO+FPKO+TNKO+FNKO
:MATH]
([154]7)
Precision
[MATH:
Precisi
on=TPKOTPKO<
mo>+FPKO
:MATH]
([155]8)
Recall
[MATH:
Recall=
TPKO
TPKO+FN<
mi>KO :MATH]
([156]9)
p-Threshold Youden’s_Threshold = max(Specificity+Sensitivity) ([157]10)
AUC Full area under the entire ROC curve ([158]11)
pAUC_SP Partial area for a region of 90–100% specificity ([159]12)
pAUC_SE Partial area for a region of 90–100% sensitivity ([160]13)
[161]Open in a new tab
The principles behind the main module pathwayko (see [162]Figure 1 )
are modified from the literature ([163]5, [164]19, [165]26), as briefly
depicted below. (i) By our definition, a true positive KO (TPKO)
signaling pathway is a pathway that contains the knockout (KO) gene and
was correctly identified to be a significantly impacted by that KO gene
(e.g., at the pathway-level p-value < 0.001). (ii) A false positive KO
(FPKO) signaling pathway is a pathway that does not contain the KO
gene, and was not significantly impacted by that KO gene, but was still
identified to be significantly impacted. A true negative KO (TNKO)
signaling pathway is a pathway that does not contain the KO gene and
was not significantly impacted by that KO gene; thus, it was not
reported to be significantly impacted. A false negative KO (FNKO)
signaling pathway is a pathway that contains the KO gene and was
significantly impacted by that KO gene, but was not reported to be
significantly impacted. (iii) For such a true-false case, the response
versus prediction with a probability allows us to employ the external
pROC package ([166]19) to compute a set of key metrics defined by the
above terms ([167] Table 1 ). (iv) An ROC curve represents the tradeoff
between specificity and sensitivity for every possible p-value
([168]19). The Youden’s best p-value threshold (denoted as p-Threshold)
is a p-value that defines an optimal point (specificity, sensitivity)
on an ROC curve ([169]26), where the sum of specificity and sensitivity
is maximal ([170]19, [171]25). Each point on an ROC curve represents a
true KO (both true positive and false negative) signaling pathway in
our cases. (v) Some key metrics with local properties (FDR, FPR, FNR,
specificity, sensitivity, accuracy, precision and recall) collected at
the p-threshold ([172] Table 1 ) can be used to conduct a local
comparison; others with global properties (AUC, pAUC_SP and pAUC_SE)
can be applied to perform a global comparison. And these key metrics
are appropriate for evaluating the performance difference of pathway
analysis methods and for assessing the quality of data, both in terms
of the ROC curve-based statistics analysis.
3.3. Statistics analyzing process
3.3.1. Feature 4: The two-dimensional probability evidence plots
The internal evidenceplot module adapted from the SPIA package
([173]16) can construct the two-dimensional probability evidence plots.
The Adj.p-value is used to control the false discovery rate (FDR) for
multiple testing ([174]27, [175]28). Pathways above the oblique blue
(or red) line are significant at 5% after BH-FDR (or Bonferroni)
correction of the global p-values (pG), i.e., an adjusted Fisher’s
product of pPERT and pNDE ([176]16). Such plots can differentiate each
data under comparison after having been analyzed by the SPIA method in
view of the resolution along the X- and Y-axes ([177]16). The highest
resolution along the Y-axis, as indicated by the fine distribution,
suggests that the most likely target pathways are identified, thus
coinciding with its superiority among top-ranked TPKO signaling
pathways (e.g., a list of top-30 ranked).
3.3.2. Feature 5: The ROC curve-based statistical hypothesis testing
The internal roctest module adapted from the pROC package ([178]19) can
conduct the ROC curve-based statistical hypothesis testing both on the
two ROC curves themselves (via the venkatraman method) and on the
values of AUC, pAUC_SP and pAUC_SE (via the bootstrap method). The
output files are stored in a new directory (roctest) automatically
generated and named after the module.
3.3.3. Feature 6: The ROC curve-based statistics analysis
The internal combineresult module can combine individual results of
each data across methods, and format them to be a summary output file
(STATS.RData). Thereby, the internal violinplot module and the internal
wilcoxtest module can conduct the ROC curve-based statistics analysis,
comparing the individual key metrics ([179] Table 1 ) among the methods
under comparison. All resulting output files are stored in the
respective directories (combineresult, violinplot, and wilcoxtest),
which are automatically created and named after the modules.
3.4. Visualizing process
3.4.1. Feature 7: Highlighting target signaling pathways with key regulators
The internal filtertrue module can extract the signaling pathways, and
the internal pathwayview module adapted from the Pathview package
([180]20) can render them. The resulting output files are stored in the
two directories automatically generated and named after the two modules
(filtertrue and pathwayview). Key regulators, which are impacted by the
same KO gene and are coordinately, significantly up- or down-regulated,
can be differentially marked on the target signaling pathways. These
rendered signaling pathways illuminate putative mechanisms underlying
the KO phenotype at the systems-level.
4. Advanced batch-execution features of the PathwayKO platform
4.1. The work-flow logic of utilizing individual modules
4.1.1. Feature 8: Optional modes for desired tasks
The work-flow logic of utilizing individual modules of the PathwayKO
platform is indicated below ([181] Figure 3 ). The modules can be
executed in interactive mode or in pipelined batch-execution mode.
Figure 3.
Figure 3
[182]Open in a new tab
The modes of running individual modules of the PathwayKO platform.
4.2. Analyzing a collection of data in a pipeline fashion
4.2.1. Feature 9: All-in-all computations in batch-execution mode
After a number of data having been preprocessed in a step-by-step
manner and stored in the user’s working directory ([183] Supplemental
Figure S2 ), the user may conduct a batch-execution by typing the
pathwayko_batch () command within the continued R session ([184]
Supplemental Figure S3 ). This command will start sequential
operations: (i) to scan and identify data generated previously by the
preprocess module; and (ii) to apply the main pathwayko module to
conduct a batch-execution, performing the integrative KO pathway
enrichment analysis in a pipeline fashion, both across methods and
across data ([185] Supplemental Figure S3 ).
4.3. Evaluating the performance of methods with a benchmark
4.3.1. Feature 10: Evaluating the performance of methods under the same
conditions
Once a collection of benchmark data have been previously preprocessed
and stored in a user’s working directory ([186] Supplemental Figure S2
), the user should start a new R session to complete the desired
batch-computations by following the offered tutorials, where some key
parameters can be initialized in a step-by-step manner ([187]
Supplemental Figure S3 ). All resulting output files should be finally
stored in the new directories that are automatically created and named
after the modules ([188] Supplemental Figure S4 ).
5. Results
5.1. Case study 1: Elucidating the mechanisms underlying systems immunology
triggered by sterile extensive hepatectomy
5.1.1. Assignment of transcriptomes reflecting the original bench-experiments
The original bench-experiments suggested that (i) wild-type (WT) mice
had a high mortality rate after sterile 85% liver resection; (ii)
single null Myd88 (Myd88 ^–/–) mice had greatly reduced survival rates;
(iii) double null Myd88_ Ager (Myd88 ^–/– /Ager ^–/–) mice had even
more greatly reduced survival rates; but (iv) single null Ager (Ager
^–/–) mice had drastically increased survival rates, which suggested
opposing roles between MyD88 (encoded by Myd88) KO and RAGE (encoded by
Ager) KO ([189]3). The [190]GSE22873 GEO data were created from tissues
recovered from the remnants of mouse livers 2 hours after the sterile
85% liver resection in the indicated deficient and wild-type mice
([191]3). These data were not subjected to bioinformatics analysis,
similar to our perspectives, although some DEGs and pathways were
suggested by primary analyses ([192]3) with Pathway Express ([193]16).
To investigate the mechanisms underlying such phenomena raised by the
original bench-experiments ([194]3) from the perspective of systems
immunology, we applied the PathwayKO platform to analyze the following
six subtypes of data we assigned: GSE22873_M (KO Myd88 vs. WT),
GSE22873_MA (KO Myd88_Ager vs. WT), GSE22873_A (KO Ager vs. WT),
GSE22873_MAvM (KO Myd88_Ager vs. KO Myd88), GSE22873_MAvA (KO
Myd88_Ager vs. KO Ager) and GSE22873_MvA (KO Myd88 vs. KO Ager). The
results are presented as follows.
5.1.2. The distribution of edge scores automatically estimated from a global
graph
The distribution of edge scores varied drastically among data ([195]
Figure 4 ). Each showed a distinction, as marked by an HES threshold
(HES1, HES2 and HES3, respectively), and reflected the different
effects of experimental treatments.
Figure 4.
[196]Figure 4
[197]Open in a new tab
The distribution of edge scores with drastic variations among data. (A)
GSE22873_M; (B) GSE22873_MA; (C) GSE22873_A; (D) GSE22873_MAvM; (E)
GSE22873_MAvA; (F) GSE22873_MvA.
5.1.3. The differentially expressed genes statistically selected by an HES
threshold
DEGs were statistically selected by using an HES threshold, as
described in Methods (see [198]Figures 2 , [199]4 ). By the HES1, HES2
and HES3 thresholds, respectively, GSE22873_M separately had 122 DEGs,
38 DEGs and 2 DEGs ([200] Supplemental Table S1 ) with the top
10-ranked genes (Myd88, Cxcl1, Fos, Nfatc3, Ccr1l1, Cxcr2, Ccr4, Rela,
Jun and Xcr1); GSE22873_MA had 130 DEGs, 57 DEGs and 2 DEGs ([201]
Supplemental Table S2 ) with the top 10-ranked genes (Myd88, Cxcl1,
H2-Bl, Ccr1l1, Ccr10, Cxcr2, Rela, Xcr1, Ccr1 and Cxcr5); and
GSE22873_A had 135 DEGs, 27 DEGs and 2 DEGs ([202] Supplemental Table
S3 ) with the top 10-ranked genes (H2-Bl, H2-T24, H2-M3, H2-Q6, Cdkn1a,
Tapbp, AP1g1, Ap1b1, Ap1m1 and H2-M10.1). Similarly, by the HES1
threshold, there were 55 DEGs (with top-ranked H2-Bl and H2-T24) in
GSE22873_MAvM ([203] Supplemental Table S4 ); 67 DEGs (with top-ranked
Myd88 and Cxcl1) in GSE22873_MAvA ([204] Supplemental Table S5 ); and
347 GEGs (with top-ranked H2-Bl, H2-M10.1, Myd88 and Cxcl1) in
GSE22873_MvA ([205] Supplemental Table S6 ). Of note, rather than Ager
itself, the MHC/antigen-related genes were ranked in the top 10 in
GSE22873_A, which compared RAGE-null with wild-type mice. None of the
above lists contained Ager, suggesting that (i) Ager itself was not yet
significantly differentially expressed as early as 2 hours post-surgery
when comparing RAGE-null against wild-type mice in GSE22873_A (KO Ager
vs. WT), born by the transcriptome, in our current analysis; and (ii)
RAGE (Ager)-mediated MHC/antigen-related genes play the most important
roles, implying epigenetic effects.
5.1.4. The KO gene-associated subnetwork automatically constructed from DEGs
The KO gene-associated subnetwork was comprised of the statistically
selected DEGs ([206] Figure 5 ). The landscapes of these subnetworks
explicitly revealed the global effects of experimental treatments at
the systems-level. For instance, the top-ranked two genes (Myd88 and
Cxcl1) connected the last edge, as highlighted by the HES3 threshold
(see [207]Figure 4 ); this result suggests that the KO Myd88-signaling
stimulated the highest responsive expression of Cxcl1 in both
GSE22873_M ([208] Supplemental Table S1 ) and GSE22873_MA ([209]
Supplemental Table S2 ). Similarly, the top-ranked two genes (H2-Bl and
H2-T24) connected the last edge (see [210]Figure 4 ), suggesting that
the KO Ager-signaling stimulated the highest responsive expression of
H2-Bl and H2-T24 in GSE22873_A ([211] Supplemental Table S3 ). In
addition, the remaining pairs displayed drastic variations, reflecting
subtractive responses between the indicated pairs ([212] Figure 5 ).
Figure 5.
[213]Figure 5
[214]Open in a new tab
The KO gene-associated subnetworks comprised of DEGs statistically
determined by HES1 and HES2. Landscapes are displayed since many node
labels are unreadable by humans. Representative key nodes (e.g., Myd88,
Cxcl1 and H2-Bl) are enlarged for clarity. (A) GSE22873_M, (B)
GSE22873_MA, and (C) GSE22873_A at HES1; (D) GSE22873_M, (E)
GSE22873_MA, and (F) GSE22873_A at HES2; (G) GSE22873_MAvM, (H)
GSE22873_MAvA, and (I) GSE22873_MvA at HES1.
5.1.5. The ROC curves with statistical features automatically built in a
pipeline fashion
The ROC curves with statistical features ([215] Figure 6 ) were
generated in a pipeline fashion, both across methods and across data.
These results indicate the diverse landscapes of experimental
treatments (data), which can discriminate the performance difference of
pathway analysis methods under comparison. (i) An AUC with 95% CI and
Youden’s best p-value threshold (noted shortly as p-Threshold) were
marked on each ROC curve, indicating an optimal point (specificity,
sensitivity) for a method. (ii) The 95% CIs of both specificity and
sensitivity were marked on each ROC curve, indicating the deviation for
a method on one data point. For instance, the drastic variation in 95%
CIs shown in GSE22873_A (KO Ager vs. WT) data suggested weak signals
over noise, which might impact the statistical hypothesis testing
([216]19). (iii) The partial AUCs (pAUC_SP and pAUC_SE) of 90–100%
specificity and sensitivity were marked on each ROC curve, implying a
bias toward a higher specificity and sensitivity, respectively. (iv)
The multiple ROC curves with AUCs intuitively displayed the superiority
of methods when benchmarked on the same data ([217] Figure 6 ).
Figure 6.
[218]Figure 6
[219]Open in a new tab
ROC curves with statistical features. GSE22873_A (KO Ager vs. WT)
displays a weak signal to noise ratio, implying a lower degree of
response to the stimulus. (A) GSE22873_M; (B) GSE22873_MA; (C)
GSE22873_A.
5.1.6. The ROC curve-based statistical hypothesis testing based on key
metrics
The ROC curve-based statistical hypothesis testing was conducted to
indicate the significance level of the difference between the two
methods ([220] Figure 7 ). For instance, two methods (SPIA vs. GSEA)
had a significant difference (e.g., p-value < 0.05) when benchmarked on
GSE22873_M and GSE22873_MA, respectively, in view of both ROC curves
themselves (via the venkatraman method) and the values of AUC, pAUC_SP
and pAUC_SE (via the bootstrap method). SPIA was better than GSEA in
these two cases ([221] Figures 7A, B ). However, the same two methods
(SPIA vs. GSEA) showed no significant difference (e.g., p-value > 0.1)
when benchmarked on GSE22873_A ([222] Figure 7C ). The reason is likely
attributed to the drastic variations in the 95% CIs of specificity and
sensitivity in GSE22873_A (see [223]Figure 6C ), which impact the
resampling-based bootstrap assessments used for the ROC curves-based
statistical hypothesis testing ([224]19).
Figure 7.
[225]Figure 7
[226]Open in a new tab
ROC curves-based statistical hypothesis testing of key metrics. The
top-panel is testing the two ROC curves themselves (via the venkatraman
method). The remaining panels are testing the AUC, pAUC_SP and pAUC_SE
values (via the bootstrap method). (A) GSE22873_M; (B) GSE22873_MA; (C)
GSE22873_A.
5.1.7. The two-dimensional probability evidence plots
The two-dimensional probability evidence plots were created for six
data ([227] Figure 8 ), which discriminate these data in view of the
resolution along the X- and Y-axes. The highest resolution along the
Y-axis suggests that the most likely target pathways were identified by
SPIA, as indicated by the fine distribution.
Figure 8.
[228]Figure 8
[229]Open in a new tab
Two-dimensional probability evidence plots. A randomly assigned
signaling pathway (green dot) located above or below the lines of
significance level suggests the intuitive difference in fine
distributions among data. (A) GSE22873_M; (B) GSE22873_MA; (C)
GSE22873_A; (D) GSE22873_MAvM; (E) GSE22873_MAvA; (F) GSE22873_MvA.
5.1.8. The true positive knockout signaling pathways statistically identified
We identified diverse true KO signaling pathways, which contained
target KO gene(s), including 21 from GSE22873_M (KO Myd88 vs. WT), 23
from GSE22873_MA (KO Myd88_ Ager vs. WT), 3 from GSE22873_A (KO Ager
vs. WT) and 24 from GSE22873_MvA (KO Myd88 vs. KO Ager) ([230] Table 2
; [231]Supplementary Table S7 ). The majority of them were identified
to be TPKO signaling pathways at a statistical significance level
(e.g., pGFdr < 0.001 or 0.005) by our definitions described in Methods.
(i) All 21 pathways from GSE22873_M were TPKO signaling pathways. (ii)
Twenty-two out of 23 pathways were TPKO signaling pathways, whereas
only 1 (mmu05010: Alzheimer disease) out of 23 pathways from
GSE22873_MA was a false negative knockout (FNKO) signaling pathway;
i.e., this pathway contained a target KO gene and should have been
significantly impacted, but was not identified to be positive at a
significance level (pGFdr > 0.005). (iii) Only 1 (mmu04933: AGE-RAGE
signaling pathway in diabetic complications) out of 3 pathways from
GSE22873_A was a TPKO signaling pathway, whereas the remaining two
pathways were FNKO signaling pathways (pGFdr > 0.005). (iv)
Twenty-three out of 24 pathways from GSE22873_MvA were TPKO signaling
pathways, whereas only 1 (mmu05150: Staphylococcus aureus infection)
out of 24 pathways was an FNKO signaling pathway (pGFdr > 0.005) ([232]
Supplementary Table S7 ).
Table 2.
True KO signaling pathways identified by SPIA from [233]GSE22873.
Order Pathway ID Pathway Name Status pSize DEGs (%) pGFdr
Pathways identified from GSE22873_M (KO Myd88 vs. WT)
1 mmu04620 Toll-like receptor signaling pathway Inhibited 87 36 (41.38)
1.03E-33
2 mmu05161 Hepatitis B Inhibited 149 36 (24.16) 1.20E-24
3 mmu05140 Leishmaniasis Inhibited 66 23 (34.85) 9.51E-20
4 mmu05152 Tuberculosis Inhibited 158 32 (20.25) 2.20E-19
5 mmu04621 NOD-like receptor signaling pathway Inhibited 148 30 (20.27)
6.91E-18
6 mmu05162 Measles Inhibited 131 28 (21.37) 6.91E-18
7 mmu05142 Chagas disease (American trypanosomiasis) Inhibited 99 23
(23.23) 4.43E-15
8 mmu05169 Epstein-Barr virus infection Inhibited 185 29 (15.68)
1.16E-14
9 mmu05164 Influenza A Inhibited 143 26 (18.18) 1.40E-14
10 mmu05145 Toxoplasmosis Inhibited 105 22 (20.95) 2.62E-14
11 mmu05170 Human immunodeficiency virus 1 infection Inhibited 194 28
(14.43) 2.16E-13
12 mmu05135 Yersinia infection Inhibited 116 22 (18.97) 7.84E-13
13 mmu05133 Pertussis Inhibited 67 17 (25.37) 2.36E-12
14 mmu04064 NF-kappa B signaling pathway Inhibited 92 18 (19.57)
1.94E-10
15 mmu05168 Herpes simplex virus 1 infection Inhibited 340 31 (9.12)
3.64E-10
16 mmu05132 Salmonella infection Inhibited 193 24 (12.44) 5.73E-10
17 mmu05134 Legionellosis Inhibited 51 13 (25.49) 2.56E-09
18 mmu05235 PD-L1/PD-1 checkpoint pathway in cancer Inhibited 84 15
(17.86) 1.38E-08
19 mmu04010 MAPK signaling pathway Inhibited 281 25 (8.90) 3.65E-07
20 mmu05144 Malaria Inhibited 48 9 (18.75) 1.23E-05
21 mmu05143 African trypanosomiasis Inhibited 31 7 (22.58) 0.000126
Pathways identified from GSE22873_MA (KO Myd88_Ager vs. WT)
1 mmu05170 Human immunodeficiency virus 1 infection Inhibited 194 61
(31.44) 2.18E-50
2 mmu04620 Toll-like receptor signaling pathway Inhibited 87 44 (50.57)
1.24E-45
3 mmu05169 Epstein-Barr virus infection Activated 185 53 (28.65)
8.65E-41
4 mmu05142 Chagas disease (American trypanosomiasis) Inhibited 99 35
(35.35) 2.30E-29
5 mmu05162 Measles Inhibited 131 38 (29.01) 1.42E-28
6 mmu05161 Hepatitis B Inhibited 149 40 (26.85) 1.63E-28
7 mmu05135 Yersinia infection Inhibited 116 35 (30.17) 1.61E-26
8 mmu05235 PD-L1/PD-1 checkpoint pathway in cancer Inhibited 84 31
(36.91) 1.94E-26
9 mmu05168 Herpes simplex virus 1 infection Inhibited 340 50 (14.71)
6.55E-24
10 mmu04621 NOD-like receptor signaling pathway Activated 148 33
(22.30) 1.24E-20
11 mmu05140 Leishmaniasis Inhibited 66 23 (34.85) 3.92E-19
12 mmu05164 Influenza A Inhibited 143 31 (21.68) 7.33E-19
13 mmu05152 Tuberculosis Inhibited 158 31 (19.62) 1.74E-17
14 mmu05133 Pertussis Inhibited 67 21 (31.34) 2.00E-16
15 mmu05145 Toxoplasmosis Inhibited 105 23 (21.91) 1.60E-14
16 mmu04064 NF-kappa B signaling pathway Inhibited 92 22 (23.91)
5.09E-14
17 mmu05132 Salmonella infection Inhibited 193 26 (13.47) 5.18E-11
18 mmu04933 AGE-RAGE signaling pathway in diabetics Activated 98 18
(18.37) 1.58E-09
19 mmu05134 Legionellosis Inhibited 51 13 (25.49) 7.03E-09
20 mmu04010 MAPK signaling pathway Inhibited 281 22 (7.83) 6.66E-06
21 mmu05144 Malaria Inhibited 48 9 (18.75) 2.00E-05
22 mmu05143 African trypanosomiasis Inhibited 31 7 (22.58) 0.000149
23 mmu05010 Alzheimer disease Activated 311 13 (4.18) 0.158326
Pathways identified from GSE22873_A (KO Ager vs. WT)
1 mmu04933 AGE-RAGE signaling pathway in diabetics Activated 98 10
(10.20) 0.002314
2 mmu05150 Staphylococcus aureus infection Activated 76 3 (3.95)
0.476408
3 mmu05010 Alzheimer disease Activated 311 8 (2.57) 0.638874
[234]Open in a new tab
True positive knockout (TPKO) signaling pathway (pGFdr < 0.001 or
0.005) and false negative knockout (FNKO) signaling pathway (pGFdr >
0.005) are distinguishable. Bold pathways are discussed in the main
text.
We categorized target TPKO signaling pathways into the following two
groups ([235]1): the basal signaling pathway that is solely responsible
for the KO-gene phenotype; and the composite signaling pathway that
embeds the named basal signaling pathway. Here, the basal signaling
pathway for KO Myd88 was the Toll-like receptor signaling pathway
(mmu04620), while the composite signaling pathways were the remaining
Myd88-containing signaling pathways ([236] Table 2 ; [237]Supplementary
Table S7 ). Similarly, the basal signaling pathway for KO Ager was the
AGE-RAGE signaling pathway in diabetic complications (mmu04933), while
the composite signaling pathways were the remaining Ager-containing
pathways ([238] Table 2 ; [239]Supplementary Table S7 ). Diverse target
TPKO signaling pathways were mainly linked to innate inflammatory
responses.
5.1.9. The target TPKO signaling pathways highlighted at the systems-level
The target TPKO signaling pathways identified by the SPIA method (see
[240]Table 2 ; [241]Supplementary Table S7 ) were rendered with key
regulatory proteins (encoded by genes). Key regulators included
ligands, receptors, adapters, transducers, transcription factors and
cytokines, which were coordinately, significantly and differentially
expressed at the systems-level. Such key regulators may constitute a
signaling-axis, as marked on a target TPKO signaling pathway. Some
top-ranked target TPKO signaling pathways are highlighted below as
examples ([242] Figures 9 –[243] 11 ).
Figure 9.
[244]Figure 9
[245]Open in a new tab
Comparison of the rendered Toll-like receptor signaling pathway
(mmu04620) identified by the SPIA method. This target basal TPKO
signaling pathway is solely responsible for the Myd88-null phenotype
and is significantly inhibited in deficient mice compared to wild-type
mice. The opposing roles of the (A) single Myd88-null and (B) double
Myd88_Ager-null mutants highlight the global effects of key regulators
(e.g., MyD88, NFκB, AP1 and cytokines) on orchestrating innate
inflammatory responses at the systems-level. Treatment (C) reflects the
subtractive difference between treatments (A, B). (A) GSE22873_M; (B)
GSE22873_MA; (C) GSE22873_MAvM.
Figure 11.
Figure 11
[246]Open in a new tab
Comparison of the rendered target TPKO signaling pathways identified by
the SPIA method from GSE22873_MvA (KO Myd88 vs. KO Ager). Explanations
are presented in the main text. (A) The MAPK signaling pathway
(mmu04010). (B) The NF-kappa B signaling pathway (mmu04064). (C) The
PD-L1 expression and PD-1 checkpoint pathway in cancer (mmu05235).
5.1.9.1. Scenario 1
The Toll-like receptor signaling pathway (mmu04620) was the target
basal signaling pathway that was solely responsible for the Myd88-null
phenotype in both single Myd88-null and double Myd88_Ager-null mutants
([247] Figure 9 ). This pathway was significantly inhibited in
deficient mice compared to wild-type mice (see [248]Table 2 ;
[249]Figure 9 ). More than 40% of its 87 critical genes were DEGs that
were coordinately, significantly and differentially up- or
down-regulated as early as 2 hours post-surgery (see [250]Table 2 ).
The opposing roles of single Myd88-null and double Myd88_Ager-null
mutants highlighted the global effects of key regulators (MyD88, NFκB,
AP1 (FOS, JUN) and cytokines) on orchestrating innate inflammatory
responses at the systems-level. (i) Key regulators in GSE22873_M ([251]
Figure 9A ) included significantly down-regulated TLR2, MyD88 and p38
as well as significantly up-regulated TAB2, IRF7 and AP1 (FOS, JUN).
These regulators were part of the TLR2-MyD88-TABs-AP1-ILs/CCL axis,
which orchestrated the significant down-regulation of downstream
inflammatory cytokines (TNFα, IL6, IL12, RANTES (CCL5), IFNβ and IP10
(CXCL10)), and reduced the proinflammatory, anti-viral and chemotactic
effects at the systems-level. (ii) Key regulators in GSE22873_MA ([252]
Figure 9B ) included significantly down-regulated MyD88, PI3K and IP10
(CXCL10) as well as significantly up-regulated FADD, AKT, NFκB and
IRF7. Such regulators constituted the TLRs-MyD88-IRAKs-TABs-NFκB axis,
the TLRs-PI3K-AKT-NFκB axis, the TLRs-MyD88-IRAKs-IRF7 axis, and the
TLRs-MyD88-FADD-CASP8 axis. Most of these axes regulated the expression
of inflammatory cytokines and induced proinflammatory and chemotactic
effects. In addition, the TLRs-MyD88-FADD-CASP8 axis promoted
apoptosis. (iii) The subtractive difference between the single
Myd88-null and double Myd88_Ager-null mutants ([253] Figures 9A, B ) is
displayed in [254]Figure 9C . Ager-null mice, backgrounded in
Myd88-null, induced down-regulation of PI3K and AP1 (FOS, JUN), but
up-regulation of IRF5, IL6 and MIP-1α (CCL3); these factors further
regulated the proinflammatory and chemotactic effects.
5.1.9.2. Scenario 2
The AGE-RAGE signaling pathway in diabetic complications (mmu04933) was
the target basal signaling pathway that was solely responsible for the
Ager-null phenotype in both single Ager-null and double Myd88_Ager-null
mutants ([255] Figure 10 ). This pathway was significantly activated in
the indicated deficient mice compared to wild-type mice (see
[256]Table 2 ; [257]Figure 10 ). Less than 20% of its 98 critical genes
were DEGs that were coordinately, significantly and differentially up-
or down-regulated as early as 2 hours post-hepatectomy in the two
treatments (see [258]Table 2 ). The opposing roles of the single
Ager-null and double Myd88_ Ager-null mutants highlighted the global
effects of key regulators (NFκB, PIM1 and MCP1 (CCL2 and CCL12)) on
orchestrating innate inflammatory responses at the systems-level. (i)
Key regulators in GSE22873_MA ([259] Figure 10A ) included
significantly down-regulated PI3K as well as significantly up-regulated
AKT and NFκB, which were part of the AGEs-RAGE-PI3K-AKT-NFκB-MCP1 axis
of the embedded PI3K-AKT signaling pathway (mmu04151). Activation of
this axis determined the significant down-regulation of downstream
inflammatory cytokine MCP1 (CCL2 and CCL12), thereby reduced the
inflammation, atherosclerosis and thrombogenesis effects at the
systems-level. (ii) Key regulators in GSE22873_A ([260] Figure 10B )
constituted multiple axes involved in regulating the single Ager-null
phenotype. Firstly, key regulators included significantly
down-regulated PLC and significantly up-regulated NFκB, and were part
of the AGEs-RAGE-PLC-PKC-NFκB axis. Activation of this axis, coupled
with the embedded MAPK signaling pathway (mmu04010) and the calcium
signaling pathway (mmu04020), induced inflammation, atherosclerosis and
thrombogenesis, in addition to RAGE-mediated positive feedback.
Secondly, key regulators including significantly up-regulated STAT5 and
PIM1 were part of the AGEs-RAGE-JAK2-STAT3-PIM1 axis. This axis,
coupled with the embedded JAK-STAT signaling pathway (mmu04630),
regulated vascular remodeling. Alternatively, the
AGEs-RAGE-JAK2-STAT5-CycD1/CDK4 axis, coupled with the cell cycle
signaling pathway, regulated cellular proliferation, leading to renal
hypertrophy. Finally, the key regulator AT1R was significantly
up-regulated and part of the AGT-AT1R-TGFβ-TGFβR-SMADs-p27/ECM axis.
This axis, coupled with the embedded TGFβ signaling pathway (mmu04350),
regulated cell hypertrophy and mesangial matrix expansion. Strikingly,
RAGE (encoded by Ager) itself was not significantly up- or
down-regulated in both cases when comparing the indicated null mutants
against wild-type mice as early as 2 hours post-surgery in our current
analysis, which implies that RAGE likely has an indirect effect of
unclear epigenetics on regulating the single Ager-null (KO RAGE)
phenotype at the systems-level. In addition, the subtractive difference
between the two cases ([261] Figures 10A, B ) is displayed in
[262]Figure 10C . Myd88-null mice, backgrounded in Ager-null, induced
down-regulation of PI3K, AP1 (FOS, JUN) and NFκB, but up-regulation of
p21RAS, which further up-regulated the production of IL6, which is
responsible for inflammation.
Figure 10.
Figure 10
[263]Open in a new tab
Comparison of the rendered AGE-RAGE signaling pathway in diabetic
complications (mmu04933) identified by the SPIA method. This target
basal TPKO signaling pathway is solely responsible for the Ager-null
(KO RAGE) phenotype, and is significantly activated in deficient mice
compared to wild-type mice as early as 2 hours post-surgery. The
opposing roles of the (A) double Myd88_ Ager-null and (B) single
Ager-null mutants highlight the global effects of key regulators (e.g.,
NFkB, MCP1 and PIM1) on orchestrating innate inflammatory responses at
the systems-level. Treatment (C) reflects the subtractive difference
between treatments (A, B). (A) GSE22873_MA; (B) GSE22873_A; (C)
GSE22873_MAvA.
5.1.9.3. Scenario 3
The treatment GSE22873_MvA (KO Myd88 vs. KO Ager) for a contrast
between the Myd88-null and Ager-null mutants was interesting and
plausible to elaborate the subtractive effects of KO Myd88 against KO
Ager. We identified 22 out of 23 significant TPKO signaling pathways
(pGFdr < 0.001 or 0.005) from it ([264] Supplemental Table S7 ).
Firstly, two basal TPKO signaling pathways were significantly altered
([265] Supplemental Table S7 ; [266]Figure S5 ). (i) The Toll-like
receptor signaling pathway (mmu04620) ([267] Supplemental Table S7 ;
[268]Figure S5A ) was ranked 2nd (with 55.17% DEGs out of the 87
critical genes) and was significantly inhibited. Key regulators (TLR2,
TLR7/8, MyD88, CTSK, TAB1, p105, p38 and NFκB) were significantly
down-regulated, whereas key regulators (MD2, PI3K, IRAK1, IRF7 and AP1
(FOS, JUN)) were significantly up-regulated. Such regulators
constituted the TLR2-MyD88-IRAK1-TAB1-NFκB axis, the TLR2-PI3K-AKT-NFκB
axis, the TLR7/8-MyD88-IRAK1-IRF7 axis, and the TLR2-MyD88-FADD-CASP8
axis. Activation of these axes resulted in up-regulation of
inflammatory cytokines (IL1β and IL6) and down-regulation of
inflammatory cytokines (IL12 and RANTES (CCL5)), which provoked
proinflammatory effects and chemotactic (neutrophils and immature
dendritic cells) effects at the systems-level. (ii) The AGE-RAGE
signaling pathway in diabetic complications (mmu04933) ([269]
Supplemental Table S7 ; [270]Figure S5B ) was ranked 7th (with 36.73%
DEGs out of the 98 critical genes) and was significantly activated. Key
regulators (PLC, p38 and NFκB) were significantly down-regulated,
whereas key regulators (PKC, PI3K, STAT3, STAT1/3, p21RAS, AP1 (FOS,
JUN) and EGR1) were significantly up-regulated. Most importantly,
activation of (i) the AGE-RAGE-PI3K-AKT-NFκB axis, coupled with the
embedded PI3K-AKT signaling pathway, up-regulated IL1 and IL6, which
are responsible for inflammation, as well as up-regulated VEGF, which
is responsible for angiogenesis; (ii) alternatively, activation of the
AGE-RAGE-PLC-PI3K-p38-AP1 axis or the AGE-RAGE-PI3K-ROS-p21RAS-p38-AP1
axis, coupled with the embedded MAPK signaling pathway and the calcium
signaling pathway, up-regulated the expression of IL1, IL6 and VEGF;
(iii) activation of the AGE-RAGE-JAK2-STAT3-axis, coupled with the
embedded JAK-STAT signaling pathway, down-regulated the expression of
PIM1, which is responsible for vascular remodeling; and (iv) activation
of the AGE-RAGE-PKCβII-JNK-EGR1-axis increased the expression of
EGR1-dependent genes leading to vascular dysfunction.
Secondly, multiple composite TPKO signaling pathways were also
targeted, as described below ([271] Figure 11 ; [272]Supplemental Table
S7 ). (i) The MAPK signaling pathway (mmu04010) ([273] Figure 11A ) was
ranked 12th and was significantly activated (with 20.64% DEGs of the
281 critical genes) ([274] Supplemental Table S7 ). On the classical
MAPK pathway, several key regulators (CACN, RTK, cPLA2 and NFκB) were
significantly down-regulated, whereas other key regulators (GF, G12,
GRB2, SOS, Ras, PKA, RSK2 and cFOS) were significantly up-regulated,
which induced proliferation and differentiation. On the JNK/p38 MAPK
pathway, several key regulators (MyD88, TAB1 and p38) were
significantly down-regulated, whereas other key regulators (IL1,
IRAK1/4, NFAT4 and AP1 (FOS, JUN)) were significantly up-regulated,
which stimulated proliferation, differentiation and inflammation. (ii)
The NFκB signaling pathway (mmu04064) ([275] Figure 11B ) was ranked
21st and was significantly inhibited (with 21.74% DEGs out of the 92
critical genes) ([276] Supplemental Table S7 ). Several key regulators
(MyD88, NFκB (p50) and MIP2) were significantly down-regulated, whereas
other key regulators (IL1β, MD2 and IRAK1/4) were significantly
up-regulated, which resulted in increased IL1β leading to positive
feedback and decreased MIP2 (CXCL1~CXCL3) leading to reduced
inflammation. (iii) The PD-L1 expression and PD-1 checkpoint pathway in
cancer (mmu05235) ([277] Figure 11C ) was ranked 15th and was
significantly inhibited (with 33.33% DEGs of the 84 critical genes).
Several key regulators (EGFR, TLR, MyD88, NFκB and p38) were
significantly down-regulated, whereas other key regulators (PI3K, Ras,
STAT, JUN, NFAT and AP1 (FOS, JUN)) were significantly up-regulated.
Activation of the EGF-EGFR-Ras-Erk-AP1 axis coupled with the MAPK
signaling pathway, the EGF-EGFR-PI3K-NFκB axis coupled with the
PI3K-AKT signaling pathway, or the PAMPs-TLR-MyD88-NFAT/NFκB axis
coupled with the Toll-like receptor signaling pathway stimulated the
production of PD-L1 (CD274). The PD-L1/PD-1 complex led to further
downstream effects, such as increased apoptosis as well as decreased
IL-2 production, T-cell activation, effector T-cell development, and
cell cycle progression. Auxiliary signaling pathways including the
Toll-like receptor signaling pathway, the MAPK signaling pathway, the
PI3K/AKT signaling pathway, the calcium signaling pathway, and the
T-cell receptor signaling pathway were also embedded in the indicated
target composite signaling pathways.
5.2. Case study 2: Evaluating the performance difference of pathway analysis
methods
5.2.1. Assignment of a benchmark
Based on the above comprehensive characterizations of individual data
in this study and in our previous publication ([278]1), we conclude
that GSE22873_M, GSE22873_MA, GSE24327_A, GSE24327_B and GSE24327_C
have strong signal to noise ratios when generated from the original
bench-experiments ([279]2, [280]3). These datasets are qualified to be
included in a collection of qualified data that serves as a proper
benchmark used to exemplify the different tasks in the present study
([281] Supplemental Figure S2 ).
5.2.2. All-in-all computations through batch-execution in a pipeline fashion
The operational scripts depict batch-executions, exemplifying how
target KO signaling pathways are automatically deciphered at the
systems-level in a pipeline fashion, both across methods and across
data ([282] Supplemental Figure S3 ). It takes approximately 25 minutes
to complete the batch-computations across seven methods over one data
on our high-performance workstation with the assigned configuration
(see [283]Figure 1 ; [284]Supplemental Figure S1 ). This is a
time-limiting step.
5.2.3. Evaluating the performance difference of methods under the same
conditions
The ROC curve-based statistical analyses (e.g., violin plot and Wilcox
test) are suitable for assessing the performance difference of the
pathway analysis methods under comparison ([285] Supplemental Figure S4
). The violin plot graph suggested that the topology-based methods
(SPIA, ROntoTools_PE and ROntoTools_pDIS) generally performed better
than the non-topology-based methods (GSEA, GSA, SAFF and PADOG); and
that SPIA was overall better than ROntoTools_PE and ROntoTools_pDIS
([286] Figure 12 ). These results coincide well with the findings of
each individual data (see [287]Figures 6 , [288]7 , plus [289]Figure 2
therein Ref ([290]1). However, the results of the Wilcox test (data not
shown) on the global metrics (AUC, pAUC_SP and pAUC_SE) suggested that
no significant difference was observed among methods when benchmarked
on this collection of data.
Figure 12.
Figure 12
[291]Open in a new tab
Violin plot of key metrics indicates the performance difference of
seven methods when benchmarked on a collection of five data. The
topology-based methods (SPIA, ROntoTools_PE and ROntoTools_pDIS)
generally perform better than the non-topology-based methods (GSEA,
GSA, SAFF and PADOG). SPIA is overall better than ROntoTools_PE and
ROntoTools_pDIS.
5.2.4. Key metrics suitable for a local comparison among methods
Collected at the optimal p-Threshold, the key metrics (FDR, FPR, FNR,
specificity, sensitivity, accuracy, precision and recall) reflect the
local properties of the ROC curves and are suitable for a local
comparison among methods (see [292]Table 1 ). With the chosen
benchmark, the topology-based methods (SPIA, ROntoTools_PE and
ROntoTools_pDIS) performed better than the non-topology-based methods
(GSEA, GSA, SAFF and PADOG) in terms of p-Threshold and FDR; and GSEA
displayed the poorest performance among the seven tested methods in
terms of specificity, sensitivity, accuracy and precision ([293]
Figure 12 ).
5.2.5. Key metrics suitable for a global comparison among methods
The key metrics including AUC, pAUC_SP and pAUC_SE reflect the overall
properties of the ROC curves and are appropriate for a global
comparison among methods (see [294]Table 1 ). The partial AUCs (pAUC_SP
and pAUC_SE) of the 90–100% specificity and sensitivity may display
drastic variations on a case-by-case basis, thus measuring possible
bias toward a higher specificity (true negative rate) or higher
sensitivity (true positive rate). With the chosen benchmark,
ROntoTools_PE performed better than ROntoTools_pDIS in terms of AUC,
pAUC_SP and pAUC_SE; and SPIA had a higher specificity (pAUC_SP) but a
lower sensitivity (pAUC_SE) than GSEA ([295] Figure 12 ).
6. Discussion
6.1. Mechanisms underlying systems immunology triggered by sterile extensive
hepatectomy
Our results endorse the main findings of the original bench-experiments
in view of the signaling effects on innate immune responses after
extensive hepatectomy ([296]3), as summarized below. (i) MyD88 and RAGE
distinctly modulated inflammation, proliferation and apoptosis. (ii)
Deletion of MyD88 gene (Myd88 null) significantly decreased the
survival rate. (iii) Deletion of RAGE gene (Ager null) significantly
increased the survival rate. (iv) Deletion of RAGE gene modulated NFκB
activation, up-regulated PIM1 expression and increased phospho-STAT3.
These results suggest that blockade of RAGE might rescue liver remnants
from the multiple signals that preclude adaptive proliferation
triggered primarily by MyD88 signaling pathways. We hypothesize
mechanisms underlying systems immunology triggered by the sterile 85%
liver resection as early as 2 hours post-hepatectomy in the liver
remnants of wild-type mice ([297] Figure 13 ). We exemplify five TPKO
signaling pathways, including the Toll-like receptor signaling pathway,
the AGE-RAGE signaling pathway in diabetic complications, the MAPK
signaling pathway, the NFκB signaling pathway, and the PD-L1 expression
and PD-1 checkpoint pathway in cancer ([298] Figure 13 ). Nonetheless,
it does not necessarily mean that other target TPKO signaling pathways
(see [299]Table 2 ; [300]Supplemental Table S7 ) should be excluded,
nor should these target pathways be altered at the same time or in a
single event; rather, these pathways are targeted because of massive
signaling-crosstalk at the systems-level (see [301]Figures 9 –[302] 11
; [303]Supplemental Figure S5 ), which are similar to those scenarios
identified from the CASP-model sepsis previously discussed in detail
([304]1). Hence, our results suggest diverse postinjury dysfunctions of
the liver at the systems-level. This increases the difficulty of early
diagnosis and prevention of liver failure after extensive hepatectomy
because it is unlikely that systemic failures can be precisely
attributed to a specific type of TPKO signaling pathway. Altogether,
the hypothetical mechanisms we proposed from bioinformatics analysis
and systems immunology provide novel informative cues that warrant
bench-experimental validation at the systems-level in the future. We
anticipate that such systems immunology in mice may serve as models for
humans and ultimately guide formulating the research paradigms and
composite strategies for the early diagnosis and prevention of diverse
dysfunctions of livers post-hepatectomy in ICU patients who have
undergone successful 85% hepatectomy for removing sick tissue and
transplanting healthy tissue ([305]3).
Figure 13.
[306]Figure 13
[307]Open in a new tab
Hypothetical mechanisms underlying systems immunology triggered by
extensive hepatectomy 2 hours post-resection in liver remnants. Key
regulators are up (red)- and down (green)-regulated and marked on the
indicated signaling axis coupled with target TPKO signaling pathways.
These regulators coordinate MyD88 (Myd88) vs. RAGE (Ager) signaling
transduction, thus orchestrating the regulation of innate inflammatory
responses at the systems-level. The MyD88 signaling pathway responsible
for proinflammatory responses is essential for survival consequent to
85% resection of the liver. MyD88 action stimulates the activation of
NFκB and the subsequent up-regulation of key genes (including IL6)
involved in liver regeneration responses. In contrast, RAGE opposes the
actions of MyD88 signaling by suppressing NFκB, thereby reducing the
activation of NFκB and the consequent production of cytokines
(including IL1, IL1β, IL6, IL12 and CCL5), and by suppressing
IL6-mediated phosphorylation of STAT3, which down-regulates PIM1 and
reduces hyperplastic responses. The AP1 complex (FOS and JUN) acts as a
hub in the present study. (A) Hepatocyte cytoplasm; (B) Hepatocyte
nucleus; (C) Hepatocytes microenvironment (Tumor cells and T cells).
Our results offer a better understanding of the mechanisms underlying
the innate inflammatory responses triggered by sterile 85% liver
resection ([308]3) from the perspectives of bioinformatics analysis and
systems immunology ([309] Figure 13 ). The in vitro original
bench-experiments suggested that some key regulators (RAGE, TRL4, NFκB
and PIM1) were significantly increased at 6 hours when comparing
hepatectomy wild-type mice to sham-surgery wild-type mice ([310]3).
Unfortunately, the microarray profiling data from sham-surgery
wild-type mice were absent from the transcriptome of [311]GSE22873
([312]https://ncbi.nlm.nih.gov/geo/query/acc.cgi/GSE22873) produced
from tissues sampled at 2 hours post-hepatectomy ([313]3), which
hindered us from comparing hepatectomy against sham surgery in
wild-type mice. However, in the present study, we can infer a virtual
treatment, GSE22873_MvA (KO Myd88 vs. KO Ager), that reflects a virtual
comparison between resected wild-type mice and untreated wild-type mice
(see [314]Figure 11 ; [315]Supplementary Figure S5 ). By which, we can
elaborate the subtractive effects of KO Myd88 over KO Ager by
summarizing the up- and down-regulation of key regulators, as
highlighted on the indicated signaling axis with target TPKO signaling
pathways (see [316]Figure 11 ; [317]Supplementary Figure S5 ). Thereby,
we can further depict innate inflammatory responses at the
systems-level to stimulus from the sterile 85% liver resection in
wild-type mice, where the influences of attenuation or enhancement by
the deletion of MyD88 and RAGE are inversely adapted ([318] Figure 13
). The critical regulators are differentially marked on target TPKO
signaling pathways, such as TLR2/4, MyD88, RAGE, NFκB, PI3K, STAT3,
NFAT, PIM1, IL1, IL1β, IL6, IL12 and RANTES (CCL5) (see [319]Figure 11
; [320]Supplementary Figure S5 ). These results are consistent with in
vitro bench-experiment evidence ([321]3, [322]29–[323]34) and in line
with recent reviews ([324]4, [325]35–[326]37). Intriguingly, the AP1
(FOS, JUN) complex emerges as a hub for the first time in this setting,
and this complex plays a critical role in the signal transduction of
DAMPs, PAMPs and AGEs that are triggered by extensive liver resection
([327] Figure 13 ). DAMPs are released by damaged and dying cells (such
as tumor cells) as endogenous host-derived molecules to promote sterile
inflammation ([328]4). DAMPs can also lead to the development of
numerous inflammatory diseases, including autoimmune diseases,
metabolic disorders, neurodegenerative diseases, and cancer ([329]3,
[330]4). The recognition of DAMPs is essential for tissue repair and
regeneration ([331]3, [332]4). The NFAT/AP1 transcriptional complex
should be further explored, as a hub involved in regulating
AP1-targeted genes at the systems-level ([333] Figure 13 ). In addition
to hepatocytes responding to stimulus from 85% hepatectomy ([334]
Figures 13A, B ), the hepatocyte microenvironment is crucial to
maintain proliferation, differentiation, apoptosis and inflammation
([335] Figures 13B, C ). Our findings are in line with recent advances
in literature ([336]38, [337]39). The NFAT/AP1 transcriptional complex
was recently modulated by compounds that disrupt the NFAT : AP1:DNA
complex, impairing transcription of cytokine genes (including IL2)
important for the effector immune response ([338]38). The transcription
factor nuclear factor of activated T cells (NFAT) has a key role in
both T-cell activation and tolerance and has emerged as an important
target of immune modulation ([339]38). NFAT directs the effector arm of
the immune response in the presence of activator protein-1 (AP1), and
T-cell anergy/exhaustion in the absence of AP1 ([340]38). The
PIM1/NFATc1 signaling pathway was recently suggested to be associated
with impaired fibrosis resolution in aged mice after bleomycin injury
([341]39).
Nevertheless, we did not recapture the original bench-experiment
observation that the RAGE-dependent suppression of Glo1 (i.e., a
detoxification pathway for pre-AGEs) enhanced the levels of AGEs and
fueled a mechanism suppressing IL6 action ([342]3). The RAGE-mediated
down-regulation of Glo1 increased the production of AGEs (i.e., RAGE
ligand) in the remnant, and AGEs via RAGE antagonized IL6, which in
turn reduced the phosphorylation of STAT3 and thus blocked the
up-regulation of PIM1 in wild-type mice compared to RAGE-null mice
([343]3). In fact, in vitro bench-experiments (Affymetrix gene arrays,
quantitative real-time PCR/ChIP assays, immunoblotting assays, etc.)
assessing target chemicals (AGE, IL6, phospho-STAT3 and PIM1) in
hepatocyte remnants 2 hours post-injury affirmed the observation that
AGE-RAGE suppressed the IL6-induced phosphorylation of STAT3 and the
up-regulation of PIM1 expression ([344]3). These results were
consistent with data from ex vivo bench-experiments in which
hepatocytes isolated from both WT and RAGE-null mice 2 hours after 85%
liver resection were incubated under the indicated conditions ([345]3).
Remarkably, AGE molecules can mediate extracellular signals (e.g.,
inflammation, aging, oxidative stress, ischemia-reperfusion and high
glucose) to the RAGE receptor, but RAGE (encoded by Ager) itself was
not significantly up- or down-regulated in our current analysis when
comparing deficient mice to wild-type mice (KO Ager vs. WT) (see
[346]Figure 10 ; [347]Supplemental Figure S5). Our findings tentatively
suggest that RAGE may indirectly (via unclear epigenetics) regulate the
KO RAGE (null Ager) phenotype and that more differentially expressed
genes (DEGs) may be targeted and involved (see [348]Figures 9 –[349] 11
; [350]Supplemental Figure S5 ; [351]Tables S1 -[352] S6 ); further
study is advised. We speculate that the expression of RAGE itself and
the RAGE-dependent suppression of Glo1 impairing IL-6 activity were not
dominant enough to be significantly detected as early as 2 hours
post-surgery, born by the transcriptome, when comparing deficient mice
to wild-type mice (see [353]Figure 10 ; [354]Supplemental Table S3 ).
The attractive role of RAGE-mediated epigenetics remains elusive.
6.2. Lessons from both non-infectious extensive hepatectomy and infectious
CASP-model surgery
Strikingly, our data indicate that sterile extensive hepatectomy
([355]3) and septic CASP-model surgery ([356]1, [357]2) share 21 KO
MyD88-associated target TPKO signaling pathways, including the
Toll-like receptor signaling pathway, the NFκB signaling pathway, the
MAPK signaling pathway, and the PD-L1 expression and PD-1 checkpoint
pathway in cancer, alongside the pathways of bacterial, viral and
parasitic infections (see [358]Table 2 ; [359]Supplementary Table S7 ).
These findings suggest that the two cases share common fundamental
mechanisms underlying the innate inflammatory responses as early as 2
hours and 12 hours post-surgery, respectively ([360]2, [361]3). We
infer that the down-regulation of MyD88-signaling, as marked on target
TPKO signaling pathways (see [362]Figure 9 ), should significantly
diminish proinflammatory responses by eliminating the downstream
“cytokine storm”, similar to the scenario we previously discussed in
infectious CASP-model sepsis ([363]1). We speculate that such target
TPKO signaling pathways could orchestrate the innate immune responses
at the early stage of post-surgery before diverse dysfunctions of
post-injured organs occur ([364]2, [365]3). Our results offer valuable
informative cues of systems immunology that warrant bench-experiment
validation at the systems-level in the future. We anticipate that
common fundamental mechanisms for both surgeries in mice may serve as
models for humans and ultimately guide formulating research paradigms
and prevention strategies for the early diagnosis and prevention of
dysfunctions of multiple organs at the early stage of post-surgery,
which in turn should reduce the high mortality rates of ICU patients
who have undergone successful traumatic surgical treatments.
Furthermore, our results favor recent proposals that TLRs, MyD88, RAGE
and AGEs should have complex interactions to coordinate diverse
mechanisms underlying innate immune responses at the systems-level
([366]4, [367]40–[368]42). Systems immunology triggered by MyD88- and
TLRs-signaling in a wide range of immune-associated diseases, including
mechanisms involved in diabetic complications ([369]43), cardiovascular
disease ([370]44), metabolic disease ([371]44) and other diseases
associated with innate inflammatory signaling pathways ([372]4,
[373]45), deserves to be elusive in the context of gene-KO experiments
through both experimental and computational analysis. Therefore,
systems characterization of the immune landscapes of innate immune
responses may dominate such research topics. We expect that the
PathwayKO platform offers powerful tools for exploring such important
topics and beyond.
6.3. The fundamental advantages of the PathwayKO platform
The PathwayKO platform has currently incorporated several eminent
methods that have been widely used by the community (see [374]Figure 1
). It was not our intention to choose the most prestigious method(s) to
integrate into the PathwayKO platform; rather, we intended to create an
integrated platform, flexible enough to incorporate promising methods
and to evaluate them under the same context. Multiple aspects were
considered as follows. (i) The choice of method is data-dependent since
each method may have individual bias toward specific data ([375]5,
[376]7). It is difficult to predict in advance which method should be
chosen ([377]5, [378]7, [379]8). (ii) Readers should be free to make
their own judgments on proper method(s) toward data with complete
resulting outputs in hand (see [380]Supplemental Figures S2 -[381] S4
); customers should also choose their favorite method(s) for routine
analysis. (iii) It is possible for readers to remove methods that are
performing consistently worse from the integrated PathwayKO platform by
simply opting out of them when initializing parameters for a
batch-execution (see [382]Supplemental Figures S1 , [383]S2 ). (iv) The
PathwayKO platform remains open to be constantly updated by
incorporating new promising methods. Accordingly, we have established
three fundamental advantages for the integrated PathwayKO platform, as
highlighted below.
Firstly, a ground truth is chosen for the PathwayKO platform based on
whether or not the knockout gene itself is included in the predicted
pathway to evaluate method performance, modified from the literature
([384]5). We defined the TPKO (true positive knockout) signaling
pathway as the pathway that comprises of, and is significantly impacted
by, the KO gene, and is correctly identified as a significantly
positive pathway (e.g., at the pathway-level p-value < 0.0001). We
further defined a set of terms modified from the literature ([385]5),
including FPKO, TNKO and FNKO signaling pathways, as described in the
Methods section. Thereby, we defined a set of derivative metrics
modified from the literature ([386]19), including FDR, FPR, FNR,
specificity, sensitivity, accuracy, precision, recall, AUC, pAUC_SP and
pAUC_SE (see [387]Table 1 ). Such that the universal rules in computer
science for direct measurements on a prediction accuracy can be fitted
to pathway enrichment analysis, which differs from indirect
measurements used by the community ([388]5, [389]7, [390]8, [391]46).
Secondly, for such a true-false case of prediction, the response versus
prediction with a probability allows us employing the external pROC
package ([392]19) to build ROC curves and compute the set of key
metrics (see [393]Table 1 ). The performance difference of pathway
enrichment analysis methods can be evaluated in terms of the ROC
curve-based statistics analysis for the first time in this setting (see
[394]Figures 6 –[395] 8 , [396]12 ). Moreover, each point on an ROC
curve represents a true KO (both TPKO and FNKO) signaling pathway in
our cases; and an ROC curve represents the tradeoff between specificity
and sensitivity for every possible p-value ([397]19). The Youden’s best
p-value threshold (denoted as p-Threshold) is a p-value that defines an
optimal point (specificity, sensitivity) on an ROC curve ([398]19,
[399]26), where the sum of specificity and sensitivity is maximal
([400]19, [401]25). Hence, collected at the optimal p-Threshold, some
key metrics (FDR, FPR, FNR, specificity, sensitivity, accuracy,
precision and recall) with local properties can be used to conduct a
local comparison, while others (AUC, pAUC_SP and pAUC_SE) with global
properties can be employed to perform a global comparison (see
[402]Table 1 ). And such metrics are appropriate for evaluating the
performance difference of methods (see [403]Figure 12 ) and for
assessing the quality of data (see [404]Figures 6 –[405] 8 ), both in
terms of the ROC curve-based statistics analysis.
Finally, the HES (high-edge-scores) approach and the change-point
analysis method ([406]17, [407]26) are employed to statistically select
DEGs (see [408]Figures 2 , [409]4 , [410]5 ) in a pipeline fashion.
This approach exerts two advantages: (i) a fair comparison can be made
among the methods integrated in the PathwayKO platform in a pipeline
fashion, rather than conducted one by one in a manually-interfering
manner. This also saves hundreds of hours from non-stop computations
under the same conditions when a large-scale collection of (benchmark)
data should be analyzed through a large-scale batch-computation (see
[411]Supplemental Figures S2 , [412]S3 ); and (ii) the resulting output
files are automatically created and named after each data (see
[413]Supplemental Figures S3 , [414]S4 ). Thereby, a fair comparison
can be pursued under the same conditions, and thus a fair choice of
method (or data) can be made in the same context. These features allow
readers to obtain complete and broad insights into customer data in a
timely manner at the systems-level ([415]1).
6.4. The prospects of the PathwayKO platform
We note some limitations in the current status of the PathwayKO
platform. Firstly, the complete resulting output files may be further
integrated for more displaying ways, e.g., integrating the results into
one output after applying a voting or merging mechanism ([416]7,
[417]8). Secondly, the PathwayKO platform deserves to explore other
kinds of signaling pathways ([418]47) (e.g., beyond the KO
gene-associated signaling pathways defined by the choice of ground
truth). For example, it might be possible that (i) a gene set
containing a KO gene does not represent an actual regulated pathway (in
the respective tissue and under respective condition, e.g., owing to
epigenetic effects or missing co-factors); and (ii) the gene set of a
truly affected pathway might not contain the KO gene. Thirdly, the
PathwayKO platform does not fit to other kinds of data sources beyond
GEO and KEGG pathways ([419]47). Fourthly, a large-scale benchmark
(constituted by KO transcriptomes) deserves to be created because a
larger benchmark will yield a fairer comparison to evaluate the
performance difference of methods under the same conditions ([420]5).
For which, qualified data must be selected, whose strong signals over
noise must be validated through comprehensive characterizations, as
revealed in the present study (see [421]Figure 6 ). Finally, a
large-scale benchmarking study toward gold-standard benchmarking
remains to be explored, as another major challenge in the field
([422]5, [423]7, [424]8, [425]46). All of these prospects deserve to be
explored via updating the platform in the future.
7. Conclusion
This article exemplified case studies associated with systems
immunology to elaborate the methodology, principle and application
features of the PathwayKO platform. The PathwayKO platform can
comprehensively analyze gene-knockout (KO) transcriptomes to uncover
mechanisms underlying systems immunology triggered by non-infectious
extensive hepatectomy born by [426]GSE22873 in the present study and by
infectious CASP-model surgery born by [427]GSE24327 in our previous
publication. The PathwayKO platform model-based assessments on the
performance of pathway analysis methods can also effectively evaluate
the performance difference of methods when benchmarked on a collection
of KO transcriptomes. A proper method toward data can be inferred.
Taken together, we recommend the PathwayKO platform be applicable to
broad fields (e.g., immunology, microbiology, genetics, pharmacology,
cancers biology, cell and developmental biology, etc.) as long as KO
transcriptomes are available.
The PathwayKO platform is suitable for systems characterization of
immune molecular landscapes of innate inflammatory responses in health,
disease and clinical intervention cases through analyzing
high-throughput transcriptomes from gene-knockout (KO) experiments.
Real-world case studies suggest that both cases ([428]GSE22873 and
[429]GSE24327) in the study share the same core set of 21 KO
MyD88-associated target signaling pathways, including the Toll-like
receptor signaling pathway, the NFκB signaling pathway, the MAPK
signaling pathway, and the PD-L1 expression and PD-1 checkpoint pathway
in cancer, alongside the pathways of bacterial, viral and parasitic
infections. These findings suggest common fundamental mechanisms
between the two cases and offer valuable insights into a better
understanding of mechanisms underlying the innate inflammatory
responses triggered by the non-infectious extensive hepatectomy (2
hours after 85% liver resection surgery in [430]GSE22873) and the
infectious CASP-model sepsis (12 hours after CASP-model surgery in
[431]GSE24327). Our results thus provide novel informative cues from
the perspectives of bioinformatics analysis and systems immunology,
which warrant further experimental validation in mice and may serve as
models for humans.
Data availability statement
The original contributions presented in the study are included in the
article/[432] Supplementary Material . Further inquiries can be
directed to the corresponding authors.
Author contributions
HA and YA designed the project HA designed and implemented the
PathwayKO platform, conducted computations and analyzed data. HA, FM
and YA interpreted results, wrote manuscript and approved the final
manuscript. All authors contributed to the article and approved the
submitted version.
Acknowledgments