Abstract
Discovering dominant patterns and exploring dynamic behaviors, especially critical state transitions and tipping points, in high-dimensional time-series data are challenging tasks in the study of real-world complex systems, which demand interpretable data representations to facilitate comprehension of both spatial and temporal information within the original data space. This study proposes a general and analytical ultralow-dimensionality reduction method for dynamical systems, named spatial-temporal principal component analysis (stPCA), that fully represents the dynamics of a high-dimensional time series by only a single latent variable without distortion, transforming high-dimensional spatial information into one-dimensional temporal information based on nonlinear delay-embedding theory. The dynamics of this single variable are analytically solved and theoretically preserve the temporal properties of the original high-dimensional time series, thereby accurately and reliably identifying the tipping point before an upcoming critical transition. Its applications to real-world datasets, such as individual-specific heterogeneous ICU records, demonstrate the effectiveness of stPCA, which quantitatively and robustly provides early-warning signals of the critical/tipping state for each patient.
Keywords: critical state transition, interpretable data representation,
spatial‐temporal PCA, ultralow‐dimensionality reduction
__________________________________________________________________
The proposed spatial‐temporal principal component analysis (stPCA)
method analytically reduces high‐dimensional time‐series data to a
single latent variable by transforming spatial information into
temporal dynamics. By preserving the temporal properties of the
original data, stPCA effectively identifies critical transitions and
tipping points. It provides robust early‐warning signals, demonstrating
effectiveness on both simulation and real‐world datasets.
1. Introduction
Numerous physical and biological processes are characterized by high-dimensional nonlinear dynamical systems, in which critical state transitions often occur.[1] However, for most real-world systems that are too complex to be described by an explicit model, we have to study their dynamics, especially identifying the tipping point before an upcoming critical transition, through high-dimensional time-series or sequential data. The exploratory analysis of these systems requires data dimensionality reduction and linear or nonlinear data representations,[2] which are critical in dynamical analysis, pattern recognition, and visualization.[3] It remains a challenging problem to efficiently perform dimensionality reduction and explore ultralow-dimensional dynamics in an interpretable way for complex dynamical systems. A variety of data-driven techniques have been developed to facilitate this task, including t-distributed stochastic neighbor embedding (t-SNE),[4] uniform manifold approximation and projection (UMAP),[5] multidimensional scaling (MDS),[6] local linear embedding (LLE),[7] and isometric feature mapping (ISOMAP).[8] Nevertheless, these approaches generally lack the desirable linear and temporal interpretability, which is essential for a lucid depiction of location, direction, distance, and dynamics in the low-dimensional representation space.
Linear dimensionality reduction methods, including principal component analysis (PCA),[9] linear discriminant analysis,[10] sparse dictionary learning (SDL),[11] and independent component analysis,[12] offer interpretability by preserving the relations among variables through linear transformations. For example, PCA linearly projects the original data onto a lower-dimensional subspace such that the projected latent variables capture the maximum variance in the representation space. However, these existing methods do not leverage the temporal or inherent relations among sequentially observed samples within the original data; hence, explaining the dynamic nature of time-series data with these methods is difficult. Moreover, these methods are sensitive to outliers and thereby susceptible to noise interference. These limitations are a barrier to the practical use of data representation for real-world dynamical systems.
Deep neural networks (DNNs) have emerged as powerful tools for sequential data analysis and dynamic modeling, leveraging deep learning techniques to handle nonlinear embeddings and intricate patterns in dynamical systems; examples include recurrent neural networks (RNNs)[13] and their low-rank variants,[14,15] temporal convolutional networks,[16] autoencoders (AEs),[17] deep belief networks (DBNs),[18] transformers,[19] CEBRA,[20] LFADS,[21] CANDyMan,[22] and MARBLE.[23] Despite the state-of-the-art performance of these DNN architectures, traditional RNNs can be computationally intensive and may suffer from issues such as vanishing or exploding gradients, which can limit their ability to capture all relevant information. Additionally, many DNN architectures still demand significant computational and memory resources, which can be limiting in resource-constrained environments or scenarios with strict latency requirements.
Recently, Koopman operator theory has offered an alternative perspective, suggesting that linear superposition may characterize nonlinear dynamics through the infinite-dimensional linear Koopman operator,[24-27] but obtaining its representations has proven difficult in all but the simplest systems. One approach to finding tractable finite-dimensional representations of the Koopman operator is to use DNNs;[28] however, they still face challenges related to the requirement of large training samples and other computational issues. Dynamic mode decomposition (DMD)[24-26] is another commonly employed method for approximating the Koopman operator from time-series data. It decomposes the system dynamics into temporal modes, each of which is associated with a fixed oscillation frequency and decay/growth rate. Some variants of the DMD algorithm mitigate noise interference by using techniques such as a Schweppe-type Huber maximum-likelihood estimator to flag outliers, incorporating regularizers and constraints on the decomposition parameters, or deriving a maximum a posteriori (MAP) estimator for systems corrupted by multiplicative noise.[29-31] Nonetheless, these DMD methods may require extensive time-series data and involve the computation of pseudo-inverses of a linear matrix operator and truncation of the spectrum, all of which potentially impact accuracy. Another method for representing dynamical systems is sparse identification of nonlinear dynamics (SINDy),[32] which combines sparsity-promoting techniques and machine learning with nonlinear dynamical systems to discover governing equations, but its accuracy highly depends on the choice of measurement coordinates and the sparsifying function.
Furthermore, the dimensionality reduction methods mentioned above generally require multiple variables or principal components to represent the original high-dimensional data. In other words, multiple representation components are usually mandatory for reconstructing a dynamical system by those methods to avoid information loss, which leads to challenges in efficiently visualizing or quantifying nonlinear phenomena, e.g., the critical slowing down (CSD) effect.[33,34] In fact, when a dynamical system approaches a bifurcation point from a stable steady state, the state space is generically constrained to a center manifold, which typically has a codim-1 local bifurcation with a dominant real eigenvalue, such as the saddle-node bifurcation; thus, a one-dimensional variable can roughly represent the center manifold near the tipping point. A natural question arises for analyzing observed high-dimensional time-series data: is it possible to devise an ultralow-dimensionality reduction method with only a single latent variable that fully represents or reconstructs the whole dynamics of the original, yet unknown, high-dimensional system?
Based on the generalized Takens' embedding theory,[35,36] the dynamics of a system can be topologically reconstructed from a delay-embedding scheme if L > 2d > 0, where d is the box-counting dimension of the attractor of the original system and L is the embedding dimension. By assuming that the steady state of a high-dimensional system is contained in a low-dimensional manifold, which generally holds for dissipative real-world systems, the spatial-temporal information (STI)[37] transformation has been theoretically derived from the delay-embedding theory. This approach transforms the spatial information of high-dimensional data into the temporal dynamics of any (one) target variable, i.e., a one-dimensional delay embedding of this single variable, thus exploiting not only inter-sample (cross-sample) information but also intra-sample information embedded in the high-dimensional/spatial data. Several methods, such as randomly distributed embedding,[37] the autoreservoir neural network,[38] and the spatiotemporal information conversion machine,[39] have been developed for predicting short-term time series within the framework of STI transformation, which depends on an explicit target variable from the high-dimensional data.
In this study, following the STI transformation but relying on a latent variable, we develop a novel ultralow-dimensionality reduction framework, spatial-temporal principal component analysis (stPCA), to fully represent the dynamics of high-dimensional time-series data by only a single latent variable. This framework enables effective detection of impending critical transitions in dynamical systems with codim-1 local bifurcations by examining the fluctuation of this single variable, which is derived by transforming high-dimensional spatial information into low-dimensional temporal information. On the basis of a solid theoretical background in nonlinear dynamics and delay-embedding theory (Figure 1A), stPCA naturally imposes constraints on samples from the inherent temporal characteristics of dynamical systems, rendering it an applicable method for representing a complex dynamical system with almost no information loss. Furthermore, this single representation variable is obtained analytically from a delay embedding-based optimization problem by solving a characteristic equation, rather than by an iterative numerical optimization algorithm that generally depends on the initial values of the parameters W,[40] and it theoretically preserves the dynamical properties of the original high-dimensional system.
Figure 1.
Schematic illustration of stPCA. A) The given high-dimensional "spatial" information X^t can be transformed into one-dimensional "temporal" information Z^t based on the STI equations derived from the delay-embedding theorem. B) Given the high-dimensional time series X, stPCA seeks the transformation matrix W and the projected Hankel matrix Z. The optimization objective consists of two terms: the first maximizes the variance of the projected variable Z, and the second ensures that the projected Hankel matrix Z satisfies the delay-embedding condition. The analytical solution of the optimization problem is obtained by solving a characteristic equation H(X)V = αV, where H(X) is a block-tridiagonal matrix. C) Through stPCA, the high-dimensional time series X is effectively reduced to the Hankel matrix Z, which is actually formed by a single variable. On the other hand, PCA reduces the high-dimensional time series X to Y. Notably, both the single latent variable z from stPCA and the multiple latent variables from PCA can topologically reconstruct the original dynamical system.
Specifically, stPCA employs a spatial-temporal transformation, denoted W, merging i) the linear interpretability of PCA and ii) the temporal dynamics of delay embeddings (Figure 1B). On the one hand, similar to PCA, stPCA maximizes the projected variance to attain a proper feature transformation (i.e., explainable features), ensuring that the latent variable can quantify critical state transitions based on CSD theory; on the other hand, it requires the projected latent variable to adhere to the delay-embedding constraints (i.e., temporal features);[38,39] thus, this ultralow one-dimensional variable can theoretically be used to topologically reconstruct the original high-dimensional dynamical system. The variable is therefore obtained by simultaneously optimizing the projected variance and the consistency with the delay embedding for the representative data. The existence of a closed-form solution makes stPCA analytically tractable and offers a fast computational route even for large datasets.
Owing to the single representative variable of stPCA, it becomes convenient to explore, quantify, and even predict the short-term nonlinear phenomena/behaviors of real-world dynamical systems from observed high-dimensional time-series data, e.g., the deterioration of diseases,[41] climate changes,[42] and the occurrence of strong earthquakes.[43] In particular, such a single representative variable derived from stPCA can be directly utilized to detect early-warning signals of a critical state or tipping point just before a sudden event based on the CSD effect (a state-transition condition).[33,41,44] We applied stPCA first to three representative models, including coupled Lorenz systems,[45,46] multiple-node networks with Michaelis-Menten forms,[47] and a DREAM4 dataset.[48] Subsequently, we extended the analysis to real-world scenarios, i.e., the Medical Information Mart for Intensive Care III (MIMIC-III)[49] and Intensive Care IV (MIMIC-IV)[50] datasets, with the aim of identifying the tipping points of patients during their ICU hospitalizations. With the dynamics of a single variable derived by stPCA from heterogeneous high-dimensional ICU data, we propose a decision-making scheme for ICU discharge with the goal of improving patient survival rates and ICU performance. In addition, stPCA was utilized to analyze a single-cell RNA sequencing dataset of human embryonic stem cells,[51] successfully detecting the cell fate transition of differentiation into definitive endoderm. Finally, we predicted the risk of transitioning from a healthy state to type 2 diabetes (T2D) based on continuous glucose monitoring data.[52] All of these results validate the effectiveness and efficiency of the proposed stPCA as a new ultralow-dimensionality reduction tool for dynamically decoding high-dimensional time series.
2. Results
2.1. The stPCA Framework for High-Dimensional Time-Series Data: An Analytical
and Explainable Approach to a Dynamical System
The proposed stPCA framework is a dynamic dimensionality reduction method designed to maintain the interpretability of a linear subspace while representing a high-dimensional dynamical system with a single projected variable through a delay-embedding strategy. For an observed high-dimensional time series $X^t = (x_1^t, x_2^t, \ldots, x_n^t)'$ with $n$ variables and $t = 1, 2, \ldots, m$, we construct a corresponding delayed vector $Z^t = (z^t, z^{t+1}, \ldots, z^{t+L-1})'$ by a delay-embedding strategy with $L > 1$ as the embedding dimension (Figure 1A), where the symbol "$'$" denotes the transpose of a vector. Clearly, $X^t$ is a spatial vector of $n$ variables observed at a single time point $t$, while $Z^t$ is a temporal vector formed by only one variable $z$ at many time points $t, t+1, \ldots, t+L-1$. According to Takens' embedding theory and its generalized versions, such as a delay-embedding scheme, $Z^t$ topologically reconstructs the equivalent dynamics of the original system $X^t$ if $L > 2d > 0$, where $d$ is the box-counting dimension of the attractor of $X^t$, via the spatial-temporal information (STI) transformation equations $\Phi(X^t) = Z^t$ shown in the Materials and Methods section. In this way, $Z = [Z^1, Z^2, \ldots, Z^m]$ is a Hankel matrix formed by an extended vector $z = (z^1, z^2, \ldots, z^m, z^{m+1}, \ldots, z^{m+L-1})$ of a single variable. Then, aiming to identify the critical state transition of a complex dynamical system, we linearize the STI equations as $Z = WX$. Based on the delay-embedding requirement for $Z$, the latter $m-1$ elements in the $i$-th row are identical to the former $m-1$ elements in the $(i+1)$-th row of the matrix $Z$; thus, we can establish the following equation:

$$W_i P = W_{i+1} Q \quad (1)$$

where $P$ and $Q$ are two submatrices of $X$, i.e., $P = [X^2\; X^3 \cdots X^m]$, $Q = [X^1\; X^2 \cdots X^{m-1}]$, and $W_i = (w_{i1}, w_{i2}, \ldots, w_{in})$, $i = 1, 2, \ldots, L-1$ (see the Experimental Section for details). More explanations about the linear embedding are provided in the Discussion section.
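The delay-embedding (Hankel) constraint behind Equation (1) can be sketched numerically: if Z is a Hankel matrix built from a single series z, the last m − 1 entries of each row equal the first m − 1 entries of the next row, which is exactly the shift condition the submatrices P and Q impose on W. The sizes below are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of the Hankel shift property underlying Equation (1).
# Sizes m and L are illustrative assumptions.
m, L = 6, 3
z = np.arange(1.0, m + L)            # extended series z^1 .. z^{m+L-1}

# Hankel matrix Z (L x m): row i holds z^{i+1} .. z^{i+m}
Z = np.array([z[i:i + m] for i in range(L)])

# Shift property: row i moved left one step matches row i+1 moved right.
assert np.allclose(Z[:-1, 1:], Z[1:, :-1])

# The submatrices of the data X used in Equation (1) follow the same idea:
X = np.vstack([z[:m], 2 * z[:m]])    # toy 2-variable "data" matrix
P, Q = X[:, 1:], X[:, :-1]           # P = [X^2 ... X^m], Q = [X^1 ... X^{m-1}]
```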
On the other hand, to obtain $W$ such that $Z$ can be computed as $Z = WX$ given $X = [X^1\; X^2 \cdots X^m]$ with $n$ variables $\{x_1^t, x_2^t, \ldots, x_n^t\}$, i.e., $Z_i = W_i X$, it is necessary to minimize the reconstruction error or, equivalently, to maximize the variance $\|Z_i\|_2^2$ of the projected variable. The sample mean of $X$ is removed beforehand, as is done in PCA. Thus, the loss function is

$$\underset{W}{\operatorname{minimize}}\; -(1-\lambda)\sum_{i=1}^{L}\|W_i X\|_2^2 + \lambda \sum_{i=1}^{L-1}\|W_i P - W_{i+1} Q\|_2^2 \quad (2)$$

$$\text{s.t.} \quad \sum_{i=1}^{L} W_i W_i^T = 1 \quad (3)$$

where $\lambda$ is a predetermined regularization parameter between 0 and 1. The first term of Equation (2) is the typical PCA loss, which guarantees the maximum projected variance and enables identification of the critical state transition by examining the fluctuation of $z$, as detailed in the Materials and Methods section. The second term makes $Z$ a Hankel matrix, i.e., governed by the delay-embedding constraint according to Equation (1). Theoretically, the obtained single variable $z^t$ can be used to topologically reconstruct the dynamics of the original system $X^t$, thus characterizing the dynamical features of the whole original system.
Given $X$, in this work, an analytical solution to the optimization problem in Equation (2) is obtained by solving the following characteristic equation:

$$H(X)V = \alpha V \quad (4)$$

where $H$ is a function that converts $X$ into a block-tridiagonal matrix, and $\alpha$ and $V$ denote the dominant eigenvalue and a corresponding eigenvector of the matrix $H(X)$, respectively. The matrix $W$ is obtained by reshaping the vector $V$ (Figure 1B), which then yields the Hankel matrix $Z$ (Figure 1A). The main time cost comes from solving for the dominant eigenvalue of $H(X)$. Utilizing the tridiagonal matrix algorithm,[53] the time complexity is O(nL), which ensures the high efficiency of the stPCA algorithm.
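The eigen-solve can be illustrated concretely: the objective in Equation (2) is a quadratic form in the stacked vector V = vec(W), so under the unit-norm constraint (3) the constrained minimizer is an extreme eigenvector of a block-tridiagonal matrix assembled from X, P, and Q. The block layout and sign conventions below are our reconstruction from Equation (2), not necessarily the paper's exact H(X), and NumPy's dense `eigh` stands in for the faster tridiagonal-matrix algorithm mentioned in the text.

```python
import numpy as np

# Hedged sketch of the analytical stPCA solve: assemble the quadratic
# form of Equation (2) as a block-tridiagonal matrix H and take the
# eigenvector of its smallest eigenvalue. All sizes are illustrative.
rng = np.random.default_rng(0)
n, m, L, lam = 5, 40, 4, 0.9
X = rng.standard_normal((n, m))
X -= X.mean(axis=1, keepdims=True)       # remove the sample mean, as in PCA
P, Q = X[:, 1:], X[:, :-1]

H = np.zeros((L * n, L * n))
for i in range(L):                        # diagonal blocks
    blk = -(1 - lam) * X @ X.T            # PCA (variance) term
    if i < L - 1:                         # from lam*||W_i P - W_{i+1} Q||^2
        blk += lam * P @ P.T
    if i > 0:
        blk += lam * Q @ Q.T
    H[i*n:(i+1)*n, i*n:(i+1)*n] = blk
for i in range(L - 1):                    # off-diagonal (cross) blocks
    H[i*n:(i+1)*n, (i+1)*n:(i+2)*n] = -lam * P @ Q.T
    H[(i+1)*n:(i+2)*n, i*n:(i+1)*n] = -lam * Q @ P.T

vals, vecs = np.linalg.eigh(H)            # eigenvalues in ascending order
V = vecs[:, 0]                            # minimizer of V' H V with ||V|| = 1
W = V.reshape(L, n)                       # reshape V back into W, as in Figure 1B
Z = W @ X                                 # projected (approximately Hankel) matrix
```

Note that the constraint (3) is exactly the unit norm of V, which `eigh`'s normalized eigenvectors satisfy automatically.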
2.2. Ultralow‐Dimensionality Data Representation and High Dynamic Robustness:
Application of stPCA to Time‐Series Data of a Coupled Lorenz Model
To illustrate the mechanism of the proposed framework, a series of coupled Lorenz models[45,46]

$$\dot{X}(t) = G(X(t); P) \quad (5)$$

was employed to generate synthetic time-series datasets under different noise conditions, where $G(\cdot)$ is the nonlinear function of the Lorenz system with $X(t) = (x_1^t, \ldots, x_n^t)'$, $P$ is a parameter vector, and $n$ is the dimension of $X(t)$. More details about the Lorenz system are provided in Note S4 (Supporting Information).
To study the intrinsic dynamical information of the variable $z$ obtained via stPCA, singular value decomposition (SVD) is carried out on the Hankel matrix $Z$ stacked from $z$:

$$Z = \begin{bmatrix} z^1 & z^2 & \cdots & z^m \\ z^2 & z^3 & \cdots & z^{m+1} \\ \vdots & \vdots & \ddots & \vdots \\ z^L & z^{L+1} & \cdots & z^{m+L-1} \end{bmatrix}_{L \times m} = U \Sigma R \quad (6)$$

The columns of the matrices $U$ and $R$ after SVD are ordered hierarchically according to their capability to represent the columns and rows of $Z$. Usually, $Z$ can be approximated at low rank by utilizing the first $r$ columns of $U$ and $R$.[20,21] The initial $r$ columns of $R$ represent a time series of the intensity of each column of $U\Sigma$.[54] In this manner, we can decompose the single variable $z$ into multiple principal-component projections (Figure 2C,D).
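The decomposition of Equation (6) can be sketched as follows; the series z here is synthetic (a sum of two sinusoids), purely for illustration, and NumPy's SVD returns the transpose of R.

```python
import numpy as np

# Sketch of Equation (6): SVD of the Hankel matrix of a single series z,
# then a rank-r approximation. z is a synthetic, illustrative signal.
m, L, r = 50, 20, 4
t = np.arange(m + L - 1)
z = np.sin(0.2 * t) + 0.5 * np.sin(0.05 * t)

# Hankel matrix Z (L x m): row i holds z^{i+1} .. z^{i+m}
Z = np.array([z[i:i + m] for i in range(L)])
U, s, Rt = np.linalg.svd(Z, full_matrices=False)   # Rt is R transposed

# Low-rank approximation from the first r components; the rows of Rt
# (columns of R) are time series of the intensity of each mode U*Sigma.
Z_r = U[:, :r] @ np.diag(s[:r]) @ Rt[:r, :]
pcs = Rt[:r, :]                                    # component projections
```

Because each sinusoid contributes a rank-2 subspace to the Hankel matrix, r = 4 already reconstructs this particular Z almost exactly.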
Figure 2.
Synthetic time‐series datasets of a coupled Lorenz model were generated
in noise‐free and noisy situations. A–I) For the strong noise‐perturbed
scenario (noise intensity σ = 20), the dimensionality reduction results
of a time series with length m = 50 and embedding dimension L = 20 of
the 60D‐coupled (n = 60) Lorenz system. A) The original
high‐dimensional time series X. B) The heatmap of the matrix
[MATH: Z^
:MATH]
obtained by stPCA. C) The one‐dimensional variable z obtained by stPCA.
D) Projections of the top three principal components (denoted as PC for
short in the figure) obtained from the Hankel matrix from stPCA. E) The
projected phase of the one‐dimensional variable z. F) The heatmap of
the projections Y of X obtained by PCA. G) The proportions of PC
variances from PCA. H) Projections of the top three PCs from PCA. I)
The projected phase of the #1 PC projection from PCA. J) Projections of the top two PCs from stPCA and PCA for a low-dimensional system (n = 3) with m = 10, L = 5, and noise intensity σ = 0. K–N) For dimensionalities n = 3, 9, 15, and 30, the curves of the top two PC projections from stPCA and PCA, together with the PCFD between each curve in the strong noise-perturbed scenario and the corresponding noise-free curve in Figure 2J, computed for stPCA and PCA, respectively.
To quantify the similarity between the noisy and noise-free principal-component projections of the stPCA and PCA algorithms, we designed a metric called the principal component Fréchet distance (PCFD) based on the Fréchet distance[55,56] (Note S7, Supporting Information). Given a metric space $(\mathcal{H}, D)$ with $D$ as the distance metric in $\mathcal{H}$, for two curves $y: [b_1, b_2] \to \mathcal{H}$ and $z: [b_1', b_2'] \to \mathcal{H}$, their discrete Fréchet distance (DFD) is given as

$$\mathrm{DFD}(y, z) = \inf_{\beta_y, \beta_z} \max_{t \in [0,1]} D\big(y(\beta_y(t)),\; z(\beta_z(t))\big) \quad (7)$$

where $\beta_y$ and $\beta_z$ are arbitrary continuous nondecreasing functions from $[0, 1]$ onto $[b_1, b_2]$ and $[b_1', b_2']$, respectively. To better evaluate the distance between two projection trajectories, we normalize the curves/projections as $\tilde{y} = \mathrm{Normalize}(y)$ and $\tilde{z} = \mathrm{Normalize}(z)$; any commonly used normalization method, such as the z-score, can be applied. Then

$$\mathrm{PCFD}(y, z) = \min\big\{\mathrm{DFD}(\tilde{y}, \tilde{z}),\; \mathrm{DFD}(-\tilde{y}, \tilde{z})\big\} \quad (8)$$
By plotting the projections of the first three columns of the matrix R in Equation (6) for one noise-free case (m = 50, L = 20, n = 20, σ = 0), we found that the projections of the top principal components from Z by stPCA are very similar to those by PCA (Figure S5, Supporting Information), with PCFD values of 0.108, 0.291, and 0.254 for the top three components, respectively, which indicates that the one-dimensional variable z actually contains as much information as the multiple projected variables from PCA. This characteristic is evidence that our stPCA method achieves efficient dimensionality reduction to an ultralow-dimensional space with almost no information loss compared with traditional PCA. The analysis results obtained by stPCA and PCA for more cases are presented in Figure S1 (Supporting Information). Furthermore, both stPCA and PCA were applied in two scenarios: one with the original sample order and another with a randomly shuffled sample order, as shown in Figure S2 (Supporting Information).
In addition, we discuss the situation with additive noise, where n = 60, m = 50, L = 20, and the noise intensity σ = 20 (Figure 2A). The results are illustrated in Figure 2B–I. The projections of the top three principal components from stPCA in the presence of noise closely resemble those of the noise-free situation, shown in Figure S5 (Supporting Information). Moreover, Figure 2D,H demonstrates the high robustness of stPCA against noise in contrast to PCA, which may be due to the inherent temporal constraints that the dynamical system in a steady state imposes on stPCA. The results of more cases with different noise intensities are illustrated in Figure S5 (Supporting Information). The quantified PCFD values for different embedding dimensions L are presented in Table S1 (Supporting Information).
We also applied stPCA and PCA to short-term time series with m = 10 and L = 5. With the relatively low dimensionality n = 3 and a noise intensity σ = 0 in the system of Equation (5), the projections of the top two principal components obtained from stPCA and PCA are illustrated in Figure 2J. When strong noise is introduced (σ = 20), the top two principal components for dimensionalities n = 3, 9, 15, and 30 are illustrated in Figure 2K–N. Clearly, at low dimensionality (n = 3), both stPCA and PCA exhibit unsatisfactory performance in the presence of noise compared with the noise-free situation (Figure 2K). As the dimensionality increases, the #1-#2 projection curve obtained from stPCA becomes smoother, and the PCFD between this curve and that of stPCA in Figure 2J decreases from 1.6 to 0.23 (Figure 2L–N). Meanwhile, the projection curve from PCA always exhibits irregular fluctuations owing to noise perturbations. The improved performance of stPCA with the increasing dimensionality of the system in Equation (5) indicates that the inclusion of additional relevant variables expands the spatial information. According to Takens' embedding theory, such extra spatial information can be transformed into temporal information, thereby enhancing the robustness of stPCA to noise, which also demonstrates the ability of stPCA to convert spatiotemporal information. More results are given in Figure S3 (Supporting Information). We also applied stPCA to longer time series of a coupled Lorenz system and the DREAM4 dataset; the results are shown in Figures S4 and S7 (Supporting Information).
2.3. The Early‐Warning Signals of a Critical Transition: Application of stPCA
to the Tipping Point Detection of Multiple‐Node Networks
To illustrate how stPCA effectively characterizes high-dimensional time series, two network models were employed:[47] a sixteen-node/variable system with a fold bifurcation (Figure 3A) and an eighteen-node/variable system with a Hopf bifurcation (Figure 3E). More details of these two networks are presented in Figures S8 and S9 and in Note S5 (Supporting Information). From these systems, two datasets were generated by varying the parameter τ from 1 to 310 and from 1 to 190, respectively (Tables S2 and S3, Supporting Information). The critical transition points were set at τ = 210 (Hopf bifurcation) and τ = 100 (fold bifurcation). To investigate the different states under the varying parameter of a dynamical system, a sliding-window scheme was implemented to divide the entire process into smaller windows. For each window, the dynamics of the high-dimensional time series are reduced to those of a latent variable z from stPCA (Figure 3B,F). Based on CSD or the dynamic network biomarker (DNB) theory,[44] an abrupt increase in the standard deviation (SD) of $z = (z^1, z^2, \ldots, z^m, z^{m+1}, \ldots, z^{m+L-1})$ (i.e., in the fluctuations of the z dynamics within a sliding window) indicates an upcoming critical transition (Figure 3C,G). The details are shown in the Materials and Methods section.
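The sliding-window warning scheme can be sketched as follows. In the actual pipeline, each window's high-dimensional series would first be reduced to z by stPCA; here z is a synthetic one-dimensional series whose fluctuation grows after a simulated tipping point (mimicking CSD), and the abrupt rise of the windowed SD serves as the early-warning signal. The window size, baseline, and threshold factor are illustrative assumptions.

```python
import numpy as np

# Sketch of the sliding-window SD early-warning scheme. z stands in for
# the stPCA latent variable; its noise level jumps at a simulated
# tipping point, as critical slowing down predicts.
rng = np.random.default_rng(1)
T, tip = 300, 210
noise = np.where(np.arange(T) < tip, 0.1, 1.0)   # fluctuation grows at the tip
z = rng.standard_normal(T) * noise

win = 20                                          # sliding-window length (assumed)
sd = np.array([z[i:i + win].std() for i in range(T - win)])

# First window whose SD abruptly exceeds 3x the early baseline (assumed rule)
baseline = sd[:50].mean()
alarm = int(np.argmax(sd > 3 * baseline))
```

With this synthetic series the alarm fires close to the simulated tipping point, which is the behavior Figure 3C,G illustrates for the network models.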
Figure 3.
Application of stPCA to tipping point detection in simulated datasets generated from multiple-node network models. A) A 16-node model of a 16-dimensional dynamical system with a Hopf bifurcation. B) The observed dynamics of each node/variable as the parameter τ varies for the 16-node model. To capture the early-warning signals of the tipping points before the critical transitions of the dynamical system, the original high-dimensional time series is partitioned into sliding windows, and the one-dimensional latent variable z is obtained by stPCA from each sliding window. C) In each sliding window, the standard deviation (SD) of the latent variable z is calculated as the parameter τ and the regularization parameter λ vary for the 16-node model. D) The red and gray curves represent the average embedding error and all embedding errors, respectively, as λ varies. E) An 18-node model of an 18-dimensional dynamical system with a fold bifurcation. F) The observed dynamics of each node/variable as the parameter τ varies for the 18-node model; the one-dimensional latent variable z is obtained by stPCA from each sliding window. G) The SD of z within each sliding window as the parameters τ and λ vary for the 18-node model. H) The red and gray curves represent the average embedding error and all embedding errors, respectively, as λ varies for the 18-node model.
In addition, when the regularization parameter λ varies from 0.78 to 0.98, stPCA can effectively detect the imminent critical transitions of systems with a fold bifurcation (Figure 3C) and a Hopf bifurcation (Figure 3G) through the one-dimensional dynamics of z. However, the embedding error

$$E_{\mathrm{embedding}} = \|Z - \hat{Z}\|_F$$

increases accordingly with decreasing λ (Figure 3D,H). To satisfy the delay-embedding condition so that $\hat{Z}$ can reconstruct the original high-dimensional dynamics, it is necessary to keep the embedding error below 2, which suggests that λ typically ranges from 0.9 to 1.
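The embedding error can be sketched numerically: for λ < 1, Z = WX is only approximately Hankel, and a natural Ẑ is the nearest Hankel matrix, obtained here by averaging anti-diagonals. This is a common construction, and the paper's exact definition of Ẑ may differ.

```python
import numpy as np

# Sketch of the embedding error E = ||Z - Z_hat||_F used to choose lambda.
# Z_hat is built by averaging the anti-diagonals of Z (an assumed, common
# way to project onto the set of Hankel matrices).
def hankelize(Z):
    L, m = Z.shape
    z = np.zeros(L + m - 1)
    for k in range(L + m - 1):                  # k indexes anti-diagonals
        vals = [Z[i, k - i] for i in range(max(0, k - m + 1), min(L, k + 1))]
        z[k] = np.mean(vals)
    return np.array([z[i:i + m] for i in range(L)])

rng = np.random.default_rng(2)
Z = rng.standard_normal((4, 10))                # a non-Hankel "projection"
Z_hat = hankelize(Z)
E_embedding = np.linalg.norm(Z - Z_hat)         # Frobenius-norm embedding error
```

For a matrix that is already Hankel, the error is exactly zero, which is why small values of E indicate that the delay-embedding constraint is effectively satisfied.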
2.4. ICU Decision‐Making: Real‐World Application of stPCA on MIMIC Datasets
The MIMIC‐III database^[ [153]^49 ^] is a widely used resource for
studying critical care patients.^[ [154]^57 , [155]^58 ^] It comprises
deidentified health‐related data associated with patients who stayed in
the critical care units of the Beth Israel Deaconess Medical Center
between 2001 and 2012 and contains comprehensive clinical information
collected from the ICU stays. The percentage of patients who
experienced death within a 5‐year period was 2.3%, and unplanned 30‐day
readmissions were 12.9%; therefore, making accurate judgments about
whether a patient is ready for discharge from the ICU is of utmost
importance in optimizing life‐saving measures with the constraints of
available medical resources. By analyzing these heterogeneous
high‐dimensional noisy data, we aimed to gain insights into the complex
nature of critical care conditions and contribute to improving patient
management and decision‐making processes.
In this study, we first applied stPCA to a subset of the MIMIC-III database. We randomly selected 10 000 patients from this database. After a rigorous preprocessing procedure shown in Figure S14 (Supporting Information), which involved data cleaning and quality control measures, a final dataset comprising 2908 patients was obtained (Figure 4A). The data of each patient were represented by a heterogeneous high-dimensional time series, i.e., a matrix consisting of at least 10 time points indicating the number of hours since his/her admission and containing no fewer than five diagnosis-related indicators. Additional details of the preprocessing process are provided in Figure S11 (Supporting Information).
Figure 4.
Performance of stPCA applied to the MIMIC-III dataset, with patients divided
into three groups by the duration of their ICU stay. A) Patient admission data were extracted from
the ICU PDMS. According to the screening criteria, we obtained 2908
patient‐specific matrices that represent tested disease indicators at
different hours. The patients were divided into three groups, with B–D)
representing patients who stayed in the ICU for 1–24 h, E–G)
representing patients who stayed in the ICU for 25–48 h, and H–J)
representing patients who stayed in the ICU for 49–96 h. For each
group, the high‐dimensional disease indicators of each patient were
dimensionally reduced to a single variable by stPCA. Then, a DTW scheme
was adopted to align these time series so that they have the same
length. B) Each colored curve corresponds to the SD of the
one‐dimensional variable for each patient, and the black curve
represents the average SD. Then, time series clustering was performed
on these curves, resulting in four clusters. Notably, Cluster 3
exhibits strong fluctuations in the later stages. C) Survival analysis
results for the four clusters with an ICU duration of 1–24 h. D)
Proportion of patients in the four clusters with an ICU duration of
1–24 h who experienced the highest disease mortality rate. E) Time
series clustering results for patients with an ICU duration of 25–48 h,
with Cluster 3 exhibiting strong fluctuations in the later stages. F)
Survival analysis results for the four clusters with ICU durations of
25–48 h. G) Proportion of patients in the four clusters with ICU
durations of 25–48 h who experienced diseases with the highest
mortality rates. H) Time series clustering results for patients with
ICU durations of 49–96 h, with Cluster 2 exhibiting strong fluctuations
in the later stages. I) Survival analysis results for the four clusters
with ICU durations of 49–96 h. J) Proportion of patients in the four
clusters with ICU durations of 49–96 h who experienced diseases with
the highest mortality rates.
For each patient, the high‐dimensional time series was reduced to one
variable z by stPCA with the sliding‐window scheme, and then the
fluctuation of z, Fl^z, was calculated for the kth sliding window, i.e.,
$$Fl^z_{(k)} = \mathrm{SD}\big(z_{(k)1}, z_{(k)2}, \ldots, z_{(k)m}, z_{(k)m+1}, \ldots, z_{(k)m+L-1}\big)$$
(Figure S12, Supporting Information). Based on the individual variations
in ICU stay duration,
we classified the curves of Fl^z into four groups: Group 1 (ICU stay
duration 1–24 h), Group 2 (ICU stay duration 25–48 h), Group 3 (ICU
stay duration 49–96 h), and Group 4 (ICU stay duration exceeding 4
days). To explore underlying dynamic patterns among patients in each
group, we first used the dynamic time warping (DTW) algorithm[59, 60]
to align the curves of Fl^z so that they have a consistent length. The
details of the DTW algorithm are presented in Note S8 (Supporting
Information).
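As an illustration of the fluctuation index defined above, the sliding‐window standard deviation can be sketched in a few lines of Python (the window length w, corresponding to m + L − 1 in the formula, is a free parameter here; stPCA itself and the DTW alignment are omitted):

```python
import numpy as np

def fluctuation_index(z, w):
    """Sliding-window standard deviation Fl^z of a 1-D latent series z.

    For window k, the SD of the w consecutive values z_k, ..., z_{k+w-1}
    is returned; the result has len(z) - w + 1 entries.
    """
    z = np.asarray(z, dtype=float)
    return np.array([z[k:k + w].std() for k in range(len(z) - w + 1)])
```

For a flat series the index is zero, and it rises only in windows that contain a fluctuation, which is the behavior the clustering step exploits.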
Then, the SD curves were clustered[60, 61] within each
group based on the aligned time series. Four clusters were obtained
within each group (Figure 4B,E,H). Notably, there is a common
pattern among Cluster 3 of Group 1 (Figure 4C), Cluster 3 of Group
2 (Figure 4F), and Cluster 2 of Group 3 (Figure 4I), all of
which exhibit relatively strong fluctuations in z in the later stages,
correlating with the poorest recorded prognosis within the
corresponding cluster. In addition, there was a significant difference
(p‐value < 0.05) in the prognoses of patients among different clusters
in each group. Furthermore, by examining the proportion of patients
within each cluster who suffered from diseases/diagnoses with the
highest mortality rates (Figure 4D,G,J), we found that in the clusters
with poor prognoses, the proportions of patients afflicted by
high‐mortality diagnoses were also higher.
Based on these observations, we speculate that for an individual
patient, the significant fluctuations of z from stPCA may indicate a
critical condition, warranting close monitoring and ongoing treatment
until the Fl^z index stabilizes.
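The DTW step used above to equalize the lengths of the Fl^z curves follows the textbook dynamic‐programming recurrence; the sketch below is a generic illustration, not the exact alignment procedure of the paper (which is detailed in Note S8, Supporting Information):

```python
import math

def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D series.

    d[i][j] holds the minimal cumulative cost of aligning the first i
    points of a with the first j points of b, using the standard
    match/insert/delete recurrence with absolute-difference cost.
    """
    n, m = len(a), len(b)
    d = [[math.inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]
```

Because DTW allows one point to align with several, two series tracing the same shape at different lengths obtain distance zero, which is what makes it suitable for comparing stays of different durations.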
Furthermore, on the basis of Fl^z, i.e., the fluctuation of the latent
variable z, and 2–5 diagnosis‐specific indicators, we proposed a new
scheme to determine whether a patient should be discharged from the ICU
at time point t, i.e., ICU decision‐making, as follows:
In recent hours $\{t - wl + 1, \ldots, t - 1, t\}$, where wl represents
the window length, if the average Fl^z for a patient is significantly
lower than the highest value observed since admission, it indicates a
relatively stable condition after treatment. An index idx_z(t) of z is
defined as follows:
$$\mathrm{idx}_z(t) = \max_{j}\,\mathrm{mean}\big(S_{\mathrm{past}}(j)\big)\,/\,\mathrm{mean}\big(S_{\mathrm{recent}}\big) \tag{9}$$
where $S_{\mathrm{past}}(j)$ represents a vector of standard deviations
of z during a past period (time window j), i.e.,
$S_{\mathrm{past}}(j) = \big(Fl^z_{(j-wl+1)}, \ldots, Fl^z_{(j-1)}, Fl^z_{(j)}\big)$,
with $j = wl, wl+1, \ldots, t-1$. In addition, the vector
$S_{\mathrm{recent}}$ represents the standard deviations of z over the
most recent wl time points, i.e.,
$S_{\mathrm{recent}} = \big(Fl^z_{(t-wl+1)}, \ldots, Fl^z_{(t-1)}, Fl^z_{(t)}\big)$.
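As a minimal illustration of Equation (9), assuming fl holds the Fl^z values observed up to the current time t:

```python
import numpy as np

def idx_z(fl, wl):
    """Discharge index of Equation (9) (sketch).

    fl : Fl^z values up to the current time t (fl[-1] = Fl^z_(t));
    wl : window length. The largest mean fluctuation over all past
    windows is divided by the mean over the most recent wl points.
    """
    fl = np.asarray(fl, dtype=float)
    s_recent = fl[-wl:]                                   # S_recent
    # S_past(j) for j = wl, ..., t-1 (each window ends before the last point)
    past_means = [fl[j - wl + 1:j + 1].mean() for j in range(wl - 1, len(fl) - 1)]
    return max(past_means) / s_recent.mean()
```

A large ratio thus means the recent fluctuation is much smaller than its historical peak, i.e., the patient has stabilized relative to the worst observed period.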
During the recent period, the 2–5 diagnosis‐specific items $x_i^t$ that
are most closely associated should fall within the normal range (see
Table S4, Supporting Information). This can be evaluated using the
following condition:
$$\mathrm{itmFlg}(t) = \sum_{i=1}^{K} \#\big(x_i^t\big) \tag{10}$$
with item flags
$$\#\big(x_i^t\big) = \begin{cases} 1, & Lb_i \le x_i^t \le Ub_i,\\ 0, & \text{otherwise}, \end{cases}$$
where K is the number of selected items, and $Lb_i$ and $Ub_i$
represent the lower and upper bounds of the normal range for $x_i^t$,
respectively.
Then, the decision for ICU discharge at time point t is described as
follows:
$$\mathrm{Decision}(t) = \begin{cases} 1, & \mathrm{idx}_z(t) \ge FC \ \text{and} \ \mathrm{itmFlg}(t) = K,\\ 0, & \text{otherwise}, \end{cases} \tag{11}$$
where FC represents the fold‐change threshold. If Decision(t) = 1,
i.e., a positive decision at time t, then the patient's condition is
stable at time point t, thereby indicating suitability for ICU
discharge. Otherwise, if Decision(t) = 0, i.e., a negative decision at
time t, then the patient should stay in the ICU for further treatment
and monitoring. An illustration of this discharge scheme is presented
in Figure S13 (Supporting Information).
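Equations (10) and (11) reduce to a simple rule; the sketch below is illustrative only (the normal‐range bounds and the fold‐change threshold fc are hypothetical values, not ones taken from this study):

```python
def itm_flg(x, bounds):
    """Equation (10): count of indicators x_i^t inside their normal
    range [Lb_i, Ub_i]."""
    return sum(1 for xi, (lb, ub) in zip(x, bounds) if lb <= xi <= ub)

def decision(idx, x, bounds, fc):
    """Equation (11): discharge (1) iff idx_z(t) >= FC and all K
    selected items lie in their normal ranges."""
    return int(idx >= fc and itm_flg(x, bounds) == len(bounds))
```

For example, with K = 2 indicators whose (hypothetical) normal ranges are (3, 7) and (60, 100), a patient with idx_z = 4.0 and readings (5, 90) would receive a positive decision, whereas an out‐of‐range reading or a low idx_z blocks discharge.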
A set of standardized metrics is employed to evaluate the performance
of the ICU discharge decision for a specific method, as follows. When
Decision(t) = 1, if the patient is discharged within a period of 5 h
after t or all the relevant items thereafter are within the normal
range, then this decision is considered a true positive (TP);
otherwise, it is a false positive (FP). When Decision(t) = 0, if
the patient is currently receiving treatment in the ICU, then this
decision is considered a true negative (TN); otherwise, this decision
is considered a false negative (FN). The F1‐score, accuracy (ACC), true
positive rate (TPR)/recall, false positive rate (FPR), and precision
metrics are calculated based on TP, FP, TN, and FN. The detailed
descriptions and calculation formulas for these metrics are provided in
Note S6 (Supporting Information).
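These are the standard confusion‐matrix quantities; a minimal sketch computing them from the four counts:

```python
def decision_metrics(tp, fp, tn, fn):
    """Standard evaluation metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                      # TPR
    return {
        "precision": precision,
        "recall": recall,
        "fpr": fp / (fp + tn),
        "acc": (tp + tn) / (tp + fp + tn + fn),
        "f1": 2 * precision * recall / (precision + recall),
    }
```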
Based on the aforementioned scheme, the overall performance of
stPCA‐based decision‐making was assessed over the course of time since
admission (Figure 5A). The precision and F1‐score are ≈0.6 and
0.7, respectively, while the TPR/recall and ACC metrics are all greater
than 0.8 when t ≥ 30 h. When t < 30 h, the slightly lower F1‐score and
TPR are due to the limited number of discharged patients. The
performances across five age cohorts are shown in Figure 5B. Moreover,
Figure 5C presents the decision performance of stPCA for the top‐8
diagnoses with the highest mortality rates: sepsis, intracranial
hemorrhage, pneumonia, congestive heart failure, altered mental status,
abdominal pain, stroke, and chest pain. Across various metrics, our
decision scheme based on stPCA consistently demonstrates favorable
results. Figure 5D illustrates the assessment of ICU discharge for
four subjects diagnosed with sepsis, intracranial hemorrhage,
pneumonia, and congestive heart failure. When t ≥ 40 h, 38 h, and 36 h
for Subjects 1, 2, and 3, respectively, we determined that Decision
(t) = 1; thus, Subjects 1–3 met the criteria for ICU discharge after
treatment, which is consistent with their positive prognoses, as
recorded. Conversely, Subject 4 exhibited inadequate treatment response
and experienced a deteriorating state over time, which is consistent
with the recorded negative prognosis.
Figure 5.
Performance of the decision for ICU discharge. A) For all samples
across diagnoses of 2908 patients, the performance of stPCA‐based
discharge decision was assessed for each hour since admission. B)
TPR/Recall and Precision across five age cohorts are presented. C) For
the top eight diagnoses with the highest mortality rate, we evaluate
the performance of our discharge decision metrics based on the stPCA
algorithm. D) We present four typical cases (patients suffering from
sepsis, intracranial hemorrhage, pneumonia, and congestive heart
failure). Combining the dimensionality reduction results of stPCA and
2–5 diagnosis‐related indicators, we can make decisions on whether
patients should be discharged from the ICU. The red interval indicates
that the patient should continue receiving ICU treatment or
observation, while the blue interval suggests that the patient's
condition is relatively stable and that they can be discharged from the
ICU.
2.5. Performance Comparisons of Dimensionality Reduction Methods on the
MIMIC‐III and MIMIC‐IV Datasets
To further evaluate the performance of the stPCA algorithm in ICU
decision‐making, stPCA is compared with ten commonly used
dimensionality reduction or data representation methods: (1) PCA,[9]
(2) LLE,[7] (3) t‐SNE,[4] (4) MDS,[6] (5) ISOMAP,[8] (6) AE,[17]
(7) variational autoencoder (VAE),[62] (8) MARBLE,[23]
(9) CANDyMan,[22] and (10) HAVOK,[26] all of which can potentially be
applied to the decision‐making process of ICU patient discharge. The
ten methods for comparison are listed below, with hyperparameters
mainly adopted from the original references. Further