Abstract

Discovering dominant patterns and exploring dynamic behaviors, especially critical state transitions and tipping points, in high-dimensional time-series data are challenging tasks in the study of real-world complex systems; they demand interpretable data representations that facilitate comprehension of both the spatial and the temporal information within the original data space. This study proposes a general and analytical ultralow-dimensionality reduction method for dynamical systems, named spatial-temporal principal component analysis (stPCA), which fully represents the dynamics of a high-dimensional time series by only a single latent variable without distortion, transforming high-dimensional spatial information into one-dimensional temporal information on the basis of nonlinear delay-embedding theory. The dynamics of this single variable is solved analytically and theoretically preserves the temporal properties of the original high-dimensional time series, thereby accurately and reliably identifying the tipping point before an upcoming critical transition. Applications to real-world datasets, such as individual-specific heterogeneous ICU records, demonstrate the effectiveness of stPCA, which quantitatively and robustly provides early-warning signals of the critical/tipping state for each patient.

Keywords: critical state transition, interpretable data representation, spatial-temporal PCA, ultralow-dimensionality reduction

The proposed spatial-temporal principal component analysis (stPCA) method analytically reduces high-dimensional time-series data to a single latent variable by transforming spatial information into temporal dynamics. By preserving the temporal properties of the original data, stPCA effectively identifies critical transitions and tipping points. It provides robust early-warning signals, demonstrating effectiveness on both simulated and real-world datasets.
1. Introduction

Numerous physical and biological processes are characterized by high-dimensional nonlinear dynamical systems, in which critical state transitions often occur.^[1] However, most real-world systems are too complex to be described by an explicit model, so their dynamics, and in particular the tipping point before an upcoming critical transition, must be studied through high-dimensional time-series or sequential data. The exploratory analysis of these systems requires data dimensionality reduction and linear or nonlinear data representations,^[2] which are critical in dynamical analysis, pattern recognition, and visualization.^[3] It remains a challenging problem to efficiently perform dimensionality reduction and explore ultralow-dimensional dynamics in an interpretable way for complex dynamical systems. A variety of data-driven techniques have been developed to facilitate this task, including t-distributed stochastic neighbor embedding (t-SNE),^[4] uniform manifold approximation and projection,^[5] multidimensional scaling (MDS),^[6] local linear embedding (LLE),^[7] and isometric feature mapping (ISOMAP).^[8] Nevertheless, these approaches generally lack the desirable linear and temporal interpretability, which is essential for a lucid depiction of location, direction, distance, and dynamics in the low-dimensional representation space. Linear dimensionality reduction methods, including principal component analysis (PCA),^[9] linear discriminant analysis,^[10] sparse dictionary learning (SDL),^[11] and independent component analysis,^[12] offer interpretability by preserving the relations among variables through linear transformations.
For example, PCA projects the original data onto a lower-dimensional subspace in a linear form such that the projected latent variables capture the maximum variance in the representation space. However, these existing methods do not leverage the temporal or inherent relations among sequentially observed samples within the original data; hence, it is difficult to explain the dynamic nature of time-series data with them. Besides, these methods are sensitive to outliers and thereby susceptible to noise interference. These limitations are a barrier to the practical use of data representation for real-world dynamical systems. Deep neural networks (DNNs) have emerged as powerful tools in sequential data analysis and dynamic modeling, leveraging deep learning techniques to handle nonlinear embedding and intricate patterns in dynamical systems; examples include recurrent neural networks (RNNs)^[13] and their low-rank variants,^[14, 15] temporal convolutional networks,^[16] autoencoders (AEs),^[17] deep belief networks (DBNs),^[18] transformers,^[19] CEBRA,^[20] LFADS,^[21] CANDyMan,^[22] and MARBLE.^[23] Despite the state-of-the-art performance of these DNN architectures, traditional RNNs can be computationally intensive and may suffer from issues such as vanishing or exploding gradients, which can affect their ability to capture all relevant information. Additionally, many DNN architectures still demand significant computational and memory resources, which can be limiting in resource-constrained environments or in scenarios with strict latency requirements. Recently, Koopman operator theory has offered an alternative perspective, suggesting that a linear superposition may characterize nonlinear dynamics through the infinite-dimensional linear Koopman operator,^[24, 25, 26, 27] but obtaining its representations has proven difficult in all but the simplest systems.
One approach to finding tractable finite-dimensional representations of the Koopman operator is DNNs;^[28] however, they still face challenges related to the requirement of large training samples and other computational issues. Dynamic mode decomposition (DMD)^[24, 25, 26] is another commonly employed method for approximating the Koopman operator from time-series data. It decomposes the system dynamics into temporal modes, each of which is associated with a fixed oscillation frequency and decay/growth rate. Some variants of the DMD algorithm mitigate noise interference by using techniques such as a Schweppe-type Huber maximum-likelihood estimator to flag outliers, incorporating regularizers and constraints on the decomposition parameters, or deriving a maximum a posteriori (MAP) estimator for systems corrupted by multiplicative noise.^[29, 30, 31] Nonetheless, these DMD methods may require extensive time-series data and involve the computation of pseudo-inverses of a linear matrix operator and the truncation of the spectrum, all of which potentially impact accuracy. Another method for representing dynamical systems is sparse identification of nonlinear dynamical systems,^[32] which combines sparsity-promoting techniques and machine learning with nonlinear dynamical systems to discover governing equations, but its accuracy highly depends on the choice of measurement coordinates and the sparsifying function. Furthermore, the dimensionality reduction methods mentioned above generally require multiple variables or principal components to represent the original high-dimensional data.
In other words, multiple representation components are usually mandatory for reconstructing a dynamical system with those methods to avoid information loss, which leads to challenges in efficiently visualizing or quantifying nonlinear phenomena, e.g., the critical slowing down (CSD) effect.^[33, 34] In fact, when a dynamical system approaches a bifurcation point from a stable steady state, the state space is generically constrained to a center manifold, which typically corresponds to a codim-1 local bifurcation with a dominant real eigenvalue, such as the saddle-node bifurcation; thus, a one-dimensional variable can roughly represent the center manifold near the tipping point. A natural question arises when analyzing observed high-dimensional time-series data: is it possible to devise an ultralow-dimensionality reduction method with only a single latent variable that fully represents or reconstructs the whole dynamics of the original, yet unknown, high-dimensional system? Based on the generalized Takens' embedding theory,^[35, 36] the dynamics of a system can be topologically reconstructed from a delay-embedding scheme if L > 2d > 0, where d is the box-counting dimension of the attractor of the original system and L is the embedding dimension. By assuming that the steady state of a high-dimensional system is contained in a low-dimensional manifold, which is generally satisfied for dissipative real-world systems, the spatial-temporal information (STI)^[37] transformation has been derived theoretically from the delay-embedding theory. This approach transforms the spatial information of high-dimensional data into the temporal dynamics of any (one) target variable, i.e., a one-dimensional delay embedding of this single variable, thus exploiting not only inter-sample (cross-sample) information but also intra-sample information embedded in the high-dimensional/spatial data.
Several methods, such as randomly distributed embedding,^[37] the autoreservoir neural network,^[38] and the spatiotemporal information conversion machine,^[39] have been developed for predicting short-term time series within the framework of the STI transformation; all of them depend on an explicit target variable from the high-dimensional data. In this study, following the STI transformation but relying on a latent variable, we develop a novel ultralow-dimensionality reduction framework, spatial-temporal principal component analysis (stPCA), to fully represent the dynamics of high-dimensional time-series data by only a single latent variable. This framework enables effective detection of impending critical transitions in dynamical systems with a codim-1 local bifurcation by examining the fluctuation of this single variable, which is derived by transforming high-dimensional spatial information into low-dimensional temporal information. On the basis of a solid theoretical background in nonlinear dynamics and delay-embedding theory (Figure 1A), stPCA naturally imposes constraints on samples from the inherent temporal characteristics of dynamical systems, rendering it applicable for representing a complex dynamical system with almost no information loss. Furthermore, this single representation variable is obtained analytically from a delay-embedding-based optimization problem by solving a characteristic equation, rather than by an iterative numerical optimization algorithm that generally depends on the initial values of the parameters W,^[40] and it theoretically preserves the dynamical properties of the original high-dimensional system.

Figure 1. Schematic illustration of stPCA. A) The given high-dimensional "spatial" information X^t can be transformed into one-dimensional "temporal" information Z^t based on the STI equations derived from the delay-embedding theorem.
B) Given the high-dimensional time series X, stPCA seeks the transformation matrix W and the projected Hankel matrix Z. The optimization objective consists of two terms: the first term aims to maximize the variance of the projected variable Z, and the second term ensures that the projected Hankel matrix Z satisfies the delay-embedding condition. The analytical solution to the optimization problem is obtained by solving a characteristic equation H(X)V = αV, where H(X) is a block-tridiagonal matrix. C) Through stPCA, the high-dimensional time series X is effectively reduced to the Hankel matrix Z, which is actually formed by a single variable. By contrast, PCA reduces the high-dimensional time series X to Y. Notably, both the single latent variable z from stPCA and the multiple latent variables from PCA can topologically reconstruct the original dynamical system.

Specifically, stPCA employs a spatial-temporal transformation denoted as W, merging i) the attributes of linear interpretation from PCA and ii) the temporal dynamics from delay embeddings (Figure 1B). On the one hand, similar to PCA, stPCA requires the maximum projected variance to attain a proper feature transformation (i.e., explainable features), ensuring the capability of a latent variable to quantify critical state transitions based on CSD theory; on the other hand, it requires the projected latent variable to adhere to the delay-embedding constraints (i.e., temporal features);^[38, 39] thus, this ultralow one-dimensional variable can theoretically be used to topologically reconstruct the original high-dimensional dynamical system. Therefore, it can be obtained by simultaneously optimizing the projected variance and the consistency with the delay embedding of the representative data. The existence of a closed-form solution makes stPCA analytically tractable and offers a computationally fast way to handle even large datasets.
Owing to the single representative variable of stPCA, it becomes convenient to explore, quantify, and even predict the short-term nonlinear phenomena/behaviors of real-world dynamical systems from observed high-dimensional time-series data, e.g., the deterioration of diseases,^[41] climate changes,^[42] and the occurrence of strong earthquakes.^[43] In particular, such a single representative variable derived from stPCA can be directly utilized to detect early-warning signals of a critical state or tipping point just before a sudden event based on the CSD effect (a state transition condition).^[33, 41, 44] We applied stPCA first to three representative models, including coupled Lorenz systems,^[45, 46] multiple-node networks with Michaelis–Menten forms,^[47] and a DREAM4 dataset.^[48] Subsequently, we extended the analysis to real-world scenarios, i.e., the Medical Information Mart for Intensive Care III (MIMIC-III)^[49] and Intensive Care IV (MIMIC-IV)^[50] datasets, with the aim of identifying the tipping points of patients during their ICU hospitalizations. With the dynamics of a single variable derived by stPCA from heterogeneous high-dimensional ICU data, we propose a decision-making scheme for ICU discharge with the goal of improving patient survival rates and ICU performance. In addition, stPCA was utilized to analyze a single-cell RNA sequencing dataset of human embryonic stem cells,^[51] successfully detecting the cell fate transition of differentiation into definitive endoderm. Finally, we predicted the risk of transitioning from a healthy state to type 2 diabetes (T2D) based on continuous glucose monitoring data.^[52] All of these results validate the effectiveness and efficiency of the proposed stPCA as a new ultralow-dimensionality reduction tool for dynamically decoding high-dimensional time series.

2. Results

2.1.
The stPCA Framework for High-Dimensional Time-Series Data: An Analytical and Explainable Approach to a Dynamical System

The proposed stPCA framework is a dynamic dimensionality reduction method designed to maintain the interpretability of a linear subspace while representing a high-dimensional dynamical system with a single projected variable through a delay-embedding strategy. For an observed high-dimensional time series X^t = (x_1^t, x_2^t, …, x_n^t)′ with n variables and t = 1, 2, …, m, we construct a corresponding delayed vector Z^t = (z^t, z^{t+1}, …, z^{t+L−1})′ by a delay-embedding strategy with L > 1 as the embedding dimension (Figure 1A), where the symbol ′ represents the transpose of a vector. Clearly, X^t is a spatial vector of n variables observed at a single time point t, while Z^t is a temporal vector formed by only one variable z at L consecutive time points t, t+1, …, t+L−1. According to Takens' embedding theory and its generalized versions, such as the delay-embedding scheme, Z^t topologically reconstructs the equivalent dynamics of the original system X^t if L > 2d > 0, where d is the box-counting dimension of the attractor of X^t, via the spatial-temporal information transformation (STI) equations Φ(X^t) = Z^t shown in the Materials and Methods section. In this way, Z = [Z^1, Z^2, …, Z^m] is a Hankel matrix formed by an extended vector z = (z^1, z^2, …, z^m, z^{m+1}, …, z^{m+L−1})′ of a single variable. Then, aiming to identify the critical state transition of a complex dynamical system, we linearize the STI equations as Z = WX.
Based on the delay-embedding requirement for Z, the latter m − 1 elements in the i-th row are identical to the former m − 1 elements in the (i + 1)-th row of the matrix Z; thus, we can establish the following equation:

W_i P = W_{i+1} Q (1)

where P and Q are two submatrices of X, i.e., P = [X^2 X^3 ⋅⋅⋅ X^m], Q = [X^1 X^2 ⋅⋅⋅ X^{m−1}], and W_i = (w_{i1}, w_{i2}, ⋅⋅⋅, w_{in}), i = 1, 2, ⋅⋅⋅, L − 1 (see the Experimental Section for details). More explanations about the linear embedding are provided in the Discussion section. On the other hand, to obtain W such that Z can be computed through Z = WX given X = [X^1 X^2 ⋅⋅⋅ X^m] with n variables {x_1^t, x_2^t, …, x_n^t}, i.e., Z_i = W_i X, it is necessary to minimize the reconstruction error or, equivalently, to maximize the variance ‖Z_i‖_2^2 of the projected variable. The sample mean of X is removed beforehand, as is done in PCA. Thus, the loss function is

minimize_W  −(1 − λ) Σ_{i=1}^{L} ‖W_i X‖_2^2 + λ Σ_{i=1}^{L−1} ‖W_i P − W_{i+1} Q‖_2^2 (2)

s.t.  Σ_{i=1}^{L} W_i W_i^T = 1 (3)

where λ is a predetermined regularization parameter between 0 and 1. The first term of Equation (2) is the typical PCA loss, which guarantees the maximum projected variance and enables identification of the critical state transition by examining the fluctuation of z, as detailed in the Materials and Methods section. The second term makes Z a Hankel matrix, i.e., governed by the delay-embedding constraint according to Equation (1). Theoretically, the obtained single variable z^t can be used to topologically reconstruct the dynamics of the original system X^t, thus characterizing the dynamical features of the whole original system.
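For illustration, the two terms of the objective in Equations (2) and (3) can be evaluated directly; the sketch below builds the submatrices P and Q from a toy data matrix X and computes the loss for an arbitrary (random) unit-norm W, which is a placeholder rather than the analytical stPCA solution.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, L = 5, 30, 4        # variables, time points, embedding dimension
lam = 0.95                # regularization parameter lambda in (0, 1)

X = rng.standard_normal((n, m))
X = X - X.mean(axis=1, keepdims=True)    # remove the sample mean, as in PCA

# Submatrices used in the delay-embedding constraint W_i P = W_{i+1} Q (Equation (1))
P = X[:, 1:]     # [X^2 ... X^m]
Q = X[:, :-1]    # [X^1 ... X^{m-1}]

W = rng.standard_normal((L, n))
W = W / np.linalg.norm(W)                # enforce sum_i W_i W_i^T = 1 (Equation (3))

variance_term = sum(np.linalg.norm(W[i] @ X) ** 2 for i in range(L))
embedding_term = sum(np.linalg.norm(W[i] @ P - W[i + 1] @ Q) ** 2
                     for i in range(L - 1))
loss = -(1 - lam) * variance_term + lam * embedding_term   # Equation (2)
Z = W @ X                                # projected (approximately Hankel) matrix
```

For the analytical solution, W would instead be obtained from the characteristic equation of the block-tridiagonal matrix H(X); minimizing this loss numerically over all unit-norm W would only approximate it.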
Given X, in this work, an analytical solution to the optimization problem in Equation (2) is obtained by solving the following characteristic equation:

H(X)V = αV (4)

where H is a function that converts X into a block-tridiagonal matrix, and α and V denote the dominant eigenvalue and a corresponding eigenvector of the matrix H(X), respectively. The matrix W is obtained by reshaping the vector V (Figure 1B), which then leads to the Hankel matrix Z (Figure 1A). The main time cost comes from solving for the dominant eigenvalue of H(X). Utilizing the tridiagonal matrix algorithm,^[53] the time complexity is O(nL), which ensures the high efficiency of the stPCA algorithm.

2.2. Ultralow-Dimensionality Data Representation and High Dynamic Robustness: Application of stPCA to Time-Series Data of a Coupled Lorenz Model

To illustrate the mechanism of the proposed framework, a series of coupled Lorenz models^[45, 46]

Ẋ(t) = G(X(t); P) (5)

were employed to generate synthetic time-series datasets under different noise conditions, where G(·) is the nonlinear function of the Lorenz system with X(t) = (x_1^t, …, x_n^t)′, P is a parameter vector, and n is the dimension of X(t). More details about the Lorenz system are provided in Note S4 (Supporting Information). To study the intrinsic dynamical information of the variable z obtained via stPCA, singular value decomposition (SVD) is carried out on the Hankel matrix Z stacked from z:

Z = [ z^1 z^2 ⋯ z^m ; z^2 z^3 ⋯ z^{m+1} ; ⋮ ; z^L z^{L+1} ⋯ z^{m+L−1} ]_{L×m} = UΣR′ (6)

The columns of the matrices U and R after SVD are structured hierarchically according to their capability to represent the columns and rows of Z.
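A minimal sketch of the decomposition in Equation (6): a single series z (synthetic here, as a placeholder) is stacked into an L × m Hankel matrix, whose SVD then yields the hierarchically ordered modes together with a rank-r truncation.

```python
import numpy as np

m, L = 50, 20
t = np.arange(m + L - 1)
z = np.sin(0.3 * t) + 0.1 * np.cos(1.1 * t)   # placeholder one-dimensional series

# Hankel matrix Z: column t holds (z^t, z^{t+1}, ..., z^{t+L-1})'
Z = np.column_stack([z[j:j + L] for j in range(m)])      # shape (L, m)

U, s, Rt = np.linalg.svd(Z, full_matrices=False)         # Z = U Sigma R'
r = 3
Z_r = U[:, :r] @ np.diag(s[:r]) @ Rt[:r, :]              # rank-r approximation
rel_err = np.linalg.norm(Z - Z_r) / np.linalg.norm(Z)    # small for a low-rank z
```

The first r rows of `Rt` (i.e., columns of R) are the principal component projections of z referred to in the text.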
Usually, Z can be recovered via a low-rank approximation utilizing the first r columns of U and R.^[20, 21] The initial r columns of R represent the time series of the intensity of each column of UΣ.^[54] In this manner, we can decompose the single variable z into multiple principal component projections (Figure 2C,D).

Figure 2. Synthetic time-series datasets of a coupled Lorenz model were generated in noise-free and noisy situations. A–I) For the strongly noise-perturbed scenario (noise intensity σ = 20), the dimensionality reduction results of a time series with length m = 50 and embedding dimension L = 20 of the 60D coupled (n = 60) Lorenz system. A) The original high-dimensional time series X. B) The heatmap of the matrix Ẑ obtained by stPCA. C) The one-dimensional variable z obtained by stPCA. D) Projections of the top three principal components (denoted as PCs in the figure) obtained from the Hankel matrix from stPCA. E) The projected phase of the one-dimensional variable z. F) The heatmap of the projections Y of X obtained by PCA. G) The proportions of PC variances from PCA. H) Projections of the top three PCs from PCA. I) The projected phase of the #1 PC projection from PCA. J) The projections of the top 2 PCs from stPCA or PCA for a low-dimensional system (n = 3) with m = 10, L = 5, and noise intensity σ = 0. K–N) For dimensionalities n = 3, 9, 15, and 30, the curves of the top 2 PC projections from stPCA and PCA, along with the PCFD between the curve in Figure 2J and that in the strongly noise-perturbed scenario by stPCA and PCA, respectively.

To quantify the similarity between the noisy and noise-free cases for the principal component projections of the stPCA and PCA algorithms, we designed a metric called the principal component Fréchet distance (PCFD) based on the Fréchet distance^[55, 56] (Note S7, Supporting Information).
Given a metric space (H, D) with D as the distance metric in H, for two curves y: [b_1, b_2] → H and z: [b_1, b_2] → H, their discrete Fréchet distance (DFD) is given as

DFD(y, z) = inf_{β_y, β_z} max_{t ∈ [0, 1]} D(y(β_y(t)), z(β_z(t))) (7)

where β_y and β_z are arbitrary continuous nondecreasing functions from [0, 1] onto [b_1, b_2]. To better evaluate the distance between the two projection trajectories, we normalize the curves/projections by y* = Normalize(y) and z* = Normalize(z); any commonly used normalization method, such as the z-score, can be applied. Then

PCFD(y, z) = min{DFD(y*, z*), DFD(y*, −z*)} (8)

where the sign flip accounts for the direction ambiguity of principal components. By plotting the projections of the first three columns of the matrix R in Equation (6) for one noise-free case (m = 50, L = 20, n = 20, σ = 0), we found that the projections of the top principal components from Z by stPCA are very similar to those by PCA (Figure S5, Supporting Information), with PCFD values of 0.108, 0.291, and 0.254 for the top three components, respectively, which indicates that the one-dimensional variable z actually contains as much information as the multiple projected variables from PCA. This characteristic is evidence that our stPCA method achieves efficient dimensionality reduction to an ultralow-dimensional space with almost no information loss compared with traditional PCA. The analysis results obtained by stPCA and PCA for more cases are presented in Figure S1 (Supporting Information). Furthermore, both stPCA and PCA were applied in two scenarios: one with the original sample order and another with a randomly shuffled sample order, as shown in Figure S2 (Supporting Information). In addition, we discuss the situation with additive noise, where n = 60, m = 50, L = 20, and the noise intensity σ = 20 (Figure 2A). The results are illustrated in Figure 2B–I.
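Equations (7) and (8) can be sketched with the standard dynamic program for the discrete Fréchet distance (Eiter–Mannila); note that the sign flip in `pcfd` is our reading of Equation (8), since principal component projections are defined only up to sign.

```python
import numpy as np

def discrete_frechet(y, z):
    """Discrete Fréchet distance between two curves given as (n_points, dim)
    arrays, via the classic Eiter-Mannila dynamic program."""
    y, z = np.asarray(y, float), np.asarray(z, float)
    p, q = len(y), len(z)
    d = np.linalg.norm(y[:, None, :] - z[None, :, :], axis=2)  # pairwise distances
    ca = np.empty((p, q))
    ca[0, 0] = d[0, 0]
    for i in range(1, p):
        ca[i, 0] = max(ca[i - 1, 0], d[i, 0])
    for j in range(1, q):
        ca[0, j] = max(ca[0, j - 1], d[0, j])
    for i in range(1, p):
        for j in range(1, q):
            ca[i, j] = max(min(ca[i - 1, j], ca[i - 1, j - 1], ca[i, j - 1]),
                           d[i, j])
    return ca[-1, -1]

def pcfd(y, z):
    """PCFD of Equation (8): z-score each coordinate, then take the smaller
    DFD over the sign flip of one curve."""
    norm = lambda a: (a - a.mean(axis=0)) / a.std(axis=0)
    ys, zs = norm(np.asarray(y, float)), norm(np.asarray(z, float))
    return min(discrete_frechet(ys, zs), discrete_frechet(ys, -zs))
```

For instance, `pcfd` evaluates to zero whenever one projection curve is the (normalized) mirror image of the other, which a raw DFD would penalize.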
The projections of the top three principal components from stPCA in the presence of noise closely resemble those from the noise-free situation, as shown in Figure S5 (Supporting Information). Moreover, Figure 2D,H demonstrates the high robustness of stPCA against noise in contrast to PCA, which may be due to the inherent temporal constraints that the dynamical system in a steady state imposes on stPCA. The results of more cases with different noise intensities are illustrated in Figure S5 (Supporting Information), and the quantified PCFD values for different embedding dimensions L are presented in Table S1 (Supporting Information). We also applied stPCA and PCA to short-term time series with m = 10 and L = 5. With a relatively low dimensionality n = 3 and a noise intensity σ = 0 in system Equation (5), the projections of the top two principal components obtained from stPCA and PCA are illustrated in Figure 2J. When strong noise is introduced (σ = 20), the top two principal components for dimensionalities n = 3, 9, 15, and 30 are illustrated in Figure 2K–N. Clearly, at low dimensionality (n = 3), both stPCA and PCA exhibit unsatisfactory performance in the presence of noise compared with the noise-free situation (Figure 2K). As the dimensionality increases, the #1–#2 projection curve obtained from stPCA becomes smoother, and the PCFD between this curve and that of stPCA in Figure 2J decreases from 1.6 to 0.23 (Figure 2L–N). Meanwhile, the projection curve from PCA always exhibits irregular fluctuations owing to the noise perturbations. The improved performance of stPCA as the dimensionality of system Equation (5) increases indicates that the inclusion of additional relevant variables expands the available spatial information.
According to Takens' embedding theory, such extra spatial information can be transformed into temporal information, thereby enhancing the robustness of stPCA to noise, which also demonstrates the ability of stPCA to convert spatiotemporal information. More results are presented in Figure S3 (Supporting Information). We also applied stPCA to longer time series of a coupled Lorenz system and to the DREAM4 dataset; the results are shown in Figures S4 and S7 (Supporting Information).

2.3. The Early-Warning Signals of a Critical Transition: Application of stPCA to the Tipping Point Detection of Multiple-Node Networks

To illustrate how stPCA effectively characterizes high-dimensional time series, two network models were employed:^[47] a sixteen-node/variable system with a Hopf bifurcation (Figure 3A) and an eighteen-node/variable system with a fold bifurcation (Figure 3E). More details of these two networks are presented in Figures S8 and S9 and in Note S5 (Supporting Information). From these systems, two datasets were generated by varying the parameter τ from 1 to 310 and from 1 to 190, respectively (Tables S2 and S3, Supporting Information). The critical transition points were set at τ = 210 (Hopf bifurcation) and τ = 100 (fold bifurcation), respectively. To investigate the different states as the parameters of a dynamical system vary, a sliding-window scheme was implemented to divide the entire process into smaller windows. For each window, the dynamics of the high-dimensional time series is reduced to that of a latent variable z from stPCA (Figure 3B,F). Based on CSD or dynamic network biomarker (DNB) theory,^[44] an abrupt increase in the standard deviation (SD) of z = (z^1, z^2, ⋅⋅⋅, z^m, z^{m+1}, ⋅⋅⋅, z^{m+L−1}) (i.e., the fluctuation of the z dynamics in a sliding window) indicates an upcoming critical transition (Figure 3C,G).
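The sliding-window early-warning scheme can be sketched as follows. As a simple stand-in for the stPCA latent variable z (the closed-form stPCA solver is not reproduced here), each window is projected onto its leading principal direction, and the toy data ramp up their fluctuation level halfway through to mimic an approaching transition; both choices are illustrative assumptions.

```python
import numpy as np

def leading_projection(Xw):
    """One-dimensional projection of a (variables x time) window onto its
    first principal direction; a stand-in for the stPCA latent variable z."""
    Xc = Xw - Xw.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return s[0] * Vt[0]

def early_warning_sd(X, win=20, step=5):
    """SD of the window-wise one-dimensional series; per CSD/DNB theory,
    an abrupt rise in this curve flags an upcoming critical transition."""
    sds = []
    for start in range(0, X.shape[1] - win + 1, step):
        z = leading_projection(X[:, start:start + win])
        sds.append(np.std(z))
    return np.array(sds)

# Toy data: the fluctuation level jumps in the second half of the record.
rng = np.random.default_rng(1)
T = 200
amp = np.where(np.arange(T) < T // 2, 0.1, 1.0)
X = amp * rng.standard_normal((10, T))
sds = early_warning_sd(X)      # the SD curve rises sharply after the midpoint
```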
The details are given in the Materials and Methods section.

Figure 3. Application of stPCA to tipping point detection in simulated datasets. The datasets are generated from multiple-node network models. A) A 16-node model of a 16-dimensional dynamical system with a Hopf bifurcation. B) The observed dynamics of each node/variable as the parameter τ varies for the 16-node model. To capture the early-warning signals of the tipping points before the critical transitions of the dynamical system, the original high-dimensional time series is partitioned into sliding windows, and the one-dimensional latent variable z is obtained by stPCA from each sliding window. C) In each sliding window, the standard deviation (SD) of the latent variable z is calculated as the parameter τ and the regularization parameter λ vary for the 16-node model. D) The red and gray curves represent the average embedding error and all embedding errors as λ varies, respectively. E) An 18-node model of an 18-dimensional dynamical system with a fold bifurcation. F) The observed dynamics of each node/variable as the parameter τ varies for the 18-node model; the one-dimensional latent variable z is obtained by stPCA from each sliding window. G) The SD of z is calculated within a sliding window as the parameters τ and λ vary for the 18-node model. H) The red curve represents the average embedding error and the gray curves represent all embedding errors as λ varies for the 18-node model.

In addition, when the regularization parameter λ varies from 0.78 to 0.98, stPCA can effectively detect the imminent critical transitions of the systems with the Hopf bifurcation (Figure 3C) and the fold bifurcation (Figure 3G) through the one-dimensional dynamics of z. However, the embedding error E_embedding = ‖Z − Ẑ‖_F increases accordingly as λ decreases (Figure 3D,H).
To meet the delay-embedding condition so that Ẑ can reconstruct the original high-dimensional dynamics, it is necessary to keep the embedding error below 2, which suggests that λ should typically range from 0.9 to 1.

2.4. ICU Decision-Making: Real-World Application of stPCA to MIMIC Datasets

The MIMIC-III database^[49] is a widely used resource for studying critical care patients.^[57, 58] It comprises deidentified health-related data associated with patients who stayed in the critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012 and contains comprehensive clinical information collected during the ICU stays. The percentage of patients who experienced death within a 5-year period was 2.3%, and unplanned 30-day readmissions were 12.9%; therefore, making accurate judgments about whether a patient is ready for discharge from the ICU is of utmost importance in optimizing life-saving measures under the constraints of available medical resources. By analyzing these heterogeneous, high-dimensional, noisy data, we aimed to gain insights into the complex nature of critical care conditions and to contribute to improving patient management and decision-making processes. In this study, we first applied stPCA to a subset of the MIMIC-III database. We randomly selected 10 000 patients from this database. After a rigorous preprocessing procedure shown in Figure S14 (Supporting Information), which involved data cleaning and quality control measures, a final dataset comprising 2908 patients was obtained (Figure 4A). The data of each patient were represented by a heterogeneous high-dimensional time series, i.e., a matrix consisting of at least 10 time points indicating the number of hours since admission and containing no fewer than five diagnosis-related indicators. Additional details of the preprocessing are provided in Figure S11 (Supporting Information).
Figure 4. Performance of stPCA applied to the MIMIC-III dataset with three groups of patients' ICU stays. A) Patient admission data were extracted from the ICU PDMS. According to the screening criteria, we obtained 2908 patient-specific matrices that represent the tested disease indicators at different hours. The patients were divided into three groups: B–D) patients who stayed in the ICU for 1–24 h, E–G) patients who stayed in the ICU for 25–48 h, and H–J) patients who stayed in the ICU for 49–96 h. For each group, the high-dimensional disease indicators of each patient were reduced to a single variable by stPCA; then, a DTW scheme was adopted to align these time series so that they have the same length. B) Each colored curve corresponds to the SD of the one-dimensional variable for each patient, and the black curve represents the average SD. Time-series clustering was performed on these curves, resulting in four clusters; notably, Cluster 3 exhibits strong fluctuations in the later stages. C) Survival analysis results for the four clusters with an ICU duration of 1–24 h. D) Proportion of patients in the four clusters with an ICU duration of 1–24 h who experienced the diseases with the highest mortality rates. E) Time-series clustering results for patients with an ICU duration of 25–48 h, with Cluster 3 exhibiting strong fluctuations in the later stages. F) Survival analysis results for the four clusters with ICU durations of 25–48 h. G) Proportion of patients in the four clusters with ICU durations of 25–48 h who experienced the diseases with the highest mortality rates. H) Time-series clustering results for patients with ICU durations of 49–96 h, with Cluster 2 exhibiting strong fluctuations in the later stages. I) Survival analysis results for the four clusters with ICU durations of 49–96 h.
J) Proportion of patients in the four clusters with ICU durations of 49–96 h who experienced diseases with the highest mortality rates. For each patient, the high‐dimensional time series was reduced to one variable z by stPCA with the sliding‐window scheme, and then the fluctuation of z, Fl^z , was calculated, i.e., [MATH: Fl^{z}_{(k)} = \mathrm{SD}\big(z^{1}_{(k)}, z^{2}_{(k)}, \ldots, z^{m}_{(k)}, z^{m+1}_{(k)}, \ldots, z^{m+L-1}_{(k)}\big) :MATH] for the kth sliding window (Figure [160]S12, Supporting Information). Based on the individual variations in ICU stay duration, we classified the curves of Fl^z into four groups: Group 1 (ICU stay duration 1–24 h), Group 2 (ICU stay duration 25–48 h), Group 3 (ICU stay duration 49–96 h), and Group 4 (ICU stay duration exceeding 4 days). To explore the underlying dynamic patterns among patients in each group, we first used the dynamic time warping (DTW) algorithm^[ [161]^59 , [162]^60 ^] to align the curves of Fl^z so that they have a consistent length. The details of the DTW algorithm are presented in Note [163]S8 (Supporting Information). Then, the SD curves were clustered^[ [164]^60 , [165]^61 ^] within each group based on the aligned time series, yielding four clusters per group (Figure [166]4B,E,H). Notably, there is a common pattern among Cluster 3 of Group 1 (Figure [167]4C), Cluster 3 of Group 2 (Figure [168]4F), and Cluster 2 of Group 3 (Figure [169]4I), all of which exhibit relatively strong fluctuations in z in the later stages, correlating with the poorest recorded prognosis within the corresponding cluster. In addition, there was a significant difference (p‐value < 0.05) in the prognoses of patients among the different clusters in each group. Furthermore, by examining the proportion of patients within each cluster who suffered from diseases/diagnoses with the highest mortality rates (Figure [170]4D,G,J), we found that in the clusters with poor prognoses, the proportions of patients afflicted by high‐mortality diagnoses were also higher.
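The fluctuation curve above is straightforward to compute once the latent series z is available. A minimal Python sketch, assuming z is a one‐dimensional NumPy array and using illustrative window parameters m and L (the function name and the synthetic test series are our own, not from the original pipeline):

```python
import numpy as np

def fluctuation_curve(z, m, L):
    """Sliding-window fluctuation of the latent variable z:
    Fl^z_(k) = SD of the k-th window of length m + L - 1."""
    z = np.asarray(z, dtype=float)
    w = m + L - 1                      # window length per the formula
    n_windows = len(z) - w + 1
    return np.array([z[k:k + w].std() for k in range(n_windows)])

# A series that is quiet early and volatile late yields a rising Fl^z,
# the pattern associated with the poor-prognosis clusters in the text.
rng = np.random.default_rng(0)
z = np.concatenate([rng.normal(0, 0.1, 50), rng.normal(0, 1.0, 50)])
fl = fluctuation_curve(z, m=5, L=6)    # windows of length 10
```

In practice the resulting Fl^z curves have patient‐specific lengths, which is why a DTW alignment step (Note S8, Supporting Information) precedes the clustering.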
Based on these observations, we speculate that for an individual patient, significant fluctuations of z from stPCA may indicate a critical condition, warranting close monitoring and ongoing treatment until the Fl^z index stabilizes. Furthermore, on the basis of Fl^z , i.e., the fluctuation of the latent variable z, and 2–5 diagnosis‐specific indicators, we proposed a new scheme to determine whether a patient should be discharged from the ICU at time point t, i.e., ICU decision‐making, as follows: Over the recent hours [MATH: \{t-wl+1, \ldots, t-1, t\} :MATH] , where wl represents the window length, if the average Fl^z for a patient is significantly lower than the highest value observed since admission, it indicates a relatively stable condition after treatment. An index idx_z (t) of z is defined as follows: [MATH: idx_z(t) = \max_j \mathrm{mean}\big(S_{\mathrm{past}}(j)\big) \big/ \mathrm{mean}\big(S_{\mathrm{recent}}\big) :MATH] (9) where S_past(j) represents a vector of standard deviations of z during a past period (time window j), i.e., [MATH: S_{\mathrm{past}}(j) = \big(Fl^{z}_{(j-wl+1)}, \ldots, Fl^{z}_{(j-1)}, Fl^{z}_{(j)}\big) :MATH] , with [MATH: j = wl, wl+1, \ldots, t-1 :MATH] . In addition, the vector S_recent represents the standard deviations of z over the most recent wl time points, i.e., [MATH: S_{\mathrm{recent}} = \big(Fl^{z}_{(t-wl+1)}, \ldots, Fl^{z}_{(t-1)}, Fl^{z}_{(t)}\big) :MATH] . During the recent period, the 2–5 most closely associated diagnosis‐specific items [MATH: x_i^t :MATH] should fall within the normal range (see Table [171]S4, Supporting Information). This can be evaluated using the following condition: [MATH: itmFlg(t) = \sum_{i=1}^{K} \#(x_i^t) :MATH] with item flags [MATH: \#(x_i^t) = \begin{cases} 1, & Lb_i \le x_i^t \le Ub_i, \\ 0, & \text{otherwise}, \end{cases} :MATH] (10) where K is the number of selected items, and Lb_i and Ub_i represent the lower and upper bounds of the normal range for [MATH: x_i^t :MATH] , respectively. Then, the decision for ICU discharge at time point t is described as follows: [MATH: \mathrm{Decision}(t) = \begin{cases} 1, & idx_z(t) \ge FC \ \text{and} \ itmFlg(t) = K, \\ 0, & \text{otherwise}, \end{cases} :MATH] (11) where FC represents the fold‐change threshold.
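Equations (9)–(11) combine into a single per‐time‐point rule. A hedged Python sketch of the scheme, assuming the Fl^z curve has already been computed up to time t (the function name and input conventions are illustrative; actual thresholds and item bounds come from the text and Table S4):

```python
import numpy as np

def icu_discharge_decision(fl, x_recent, bounds, wl, FC):
    """Sketch of the discharge rule (Eqs. 9-11).
    fl       : Fl^z values from admission up to the current time t
    x_recent : current values of the K diagnosis-specific items x_i^t
    bounds   : list of (Lb_i, Ub_i) normal ranges, one per item
    wl       : window length; FC : fold-change threshold
    Assumes len(fl) > wl so at least one past window exists."""
    fl = np.asarray(fl, dtype=float)
    t = len(fl) - 1
    # Eq. 9: idx_z(t) = max_j mean(S_past(j)) / mean(S_recent)
    s_recent = fl[t - wl + 1:t + 1].mean()
    past_means = [fl[j - wl + 1:j + 1].mean() for j in range(wl - 1, t)]
    idx_z = max(past_means) / s_recent
    # Eq. 10: count items currently within their normal ranges
    itm_flg = sum(1 for x, (lb, ub) in zip(x_recent, bounds) if lb <= x <= ub)
    # Eq. 11: positive decision only if both conditions hold
    return 1 if idx_z >= FC and itm_flg == len(bounds) else 0
```

For example, a patient whose Fl^z dropped from a high plateau to a low recent level, with all selected items in range, would receive Decision(t) = 1, while a flat Fl^z curve (idx_z near 1) keeps the patient in the ICU regardless of the item flags.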
If Decision(t) = 1, i.e., a positive decision at time t, then the patient's condition is stable at time point t, thereby indicating suitability for ICU discharge. Otherwise, if Decision(t) = 0, i.e., a negative decision at time t, the patient should stay in the ICU for further treatment and monitoring. An illustration of this discharge scheme is presented in Figure [172]S13 (Supporting Information). A set of standardized metrics is employed to evaluate the performance of the ICU discharge decision for a specific method, as follows. When Decision(t) = 1, if the patient is discharged within 5 h after t or all the relevant items thereafter remain within the normal range, then this decision is considered a true positive (TP); otherwise, it is a false positive (FP). When Decision(t) = 0, if the patient is currently receiving treatment in the ICU, then this decision is considered a true negative (TN); otherwise, it is a false negative (FN). The F1‐score, accuracy (ACC), true positive rate (TPR)/recall, false positive rate (FPR), and precision metrics are calculated based on TP, FP, TN, and FN. The detailed descriptions and calculation formulas for these metrics are provided in Note [173]S6 (Supporting Information). Based on the aforementioned scheme, the overall performance of stPCA‐based decision‐making was assessed over the course of time since admission (Figure [174]5A). The precision and F1‐score are ≈0.6 and ≈0.7, respectively, while the TPR/recall and ACC are both greater than 0.8 when t ≥ 30 h. When t < 30 h, the slightly lower F1‐score and TPR are due to the limited number of discharged patients. The performances across five age cohorts are demonstrated in Figure [175]5B. Moreover, Figure [176]5C presents the decision performance of stPCA for the top‐8 diagnoses with the highest mortality rates: sepsis, intracranial hemorrhage, pneumonia, congestive heart failure, altered mental status, abdominal pain, stroke, and chest pain.
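The metric definitions above reduce to the standard confusion‐matrix formulas. A small Python helper makes the relationships explicit (the example counts are synthetic, chosen only to land in the same regime as the numbers reported for t ≥ 30 h):

```python
def decision_metrics(tp, fp, tn, fn):
    """Standard metrics used to score the ICU-discharge decisions
    (full formulas in Note S6 of the paper)."""
    precision = tp / (tp + fp)
    tpr = tp / (tp + fn)                    # recall / sensitivity
    fpr = fp / (fp + tn)
    acc = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * tpr / (precision + tpr)
    return {"precision": precision, "TPR": tpr, "FPR": fpr,
            "ACC": acc, "F1": f1}

# Synthetic counts yielding precision = 0.6, F1 close to 0.7, and TPR and
# ACC above 0.8, mirroring the regime described in the text:
m = decision_metrics(tp=60, fp=40, tn=150, fn=10)
```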
Across various metrics, our decision scheme based on stPCA consistently demonstrates favorable results. Figure [177]5D illustrates the assessment of ICU discharge for four subjects diagnosed with sepsis, intracranial hemorrhage, pneumonia, and congestive heart failure. When t ≥ 40 h, 38 h, and 36 h for Subjects 1, 2, and 3, respectively, we determined that Decision(t) = 1; thus, Subjects 1–3 met the criteria for ICU discharge after treatment, which is consistent with their recorded positive prognoses. Conversely, Subject 4 exhibited an inadequate treatment response and experienced a deteriorating state over time, which is consistent with the recorded negative prognosis. Figure 5. Performance of the decision for ICU discharge. A) For all samples across diagnoses of 2908 patients, the performance of the stPCA‐based discharge decision was assessed for each hour since admission. B) TPR/Recall and Precision across five age cohorts are presented. C) For the top eight diagnoses with the highest mortality rates, we evaluated the performance of our discharge decision metrics based on the stPCA algorithm. D) We present four typical cases (patients suffering from sepsis, intracranial hemorrhage, pneumonia, and congestive heart failure). Combining the dimensionality reduction results of stPCA and 2–5 diagnosis‐related indicators, we can make decisions on whether patients should be discharged from the ICU. The red interval indicates that the patient should continue receiving ICU treatment or observation, while the blue interval suggests that the patient's condition is relatively stable and that they can be discharged from the ICU. 2.5.
Performance Comparisons of Dimensionality Reduction Methods on the MIMIC‐III and MIMIC‐IV Datasets To further evaluate the performance of the stPCA algorithm in ICU decision‐making, stPCA is compared with ten commonly used dimensionality reduction or data representation methods, including (1) PCA,^[ [179]^9 ^] (2) LLE,^[ [180]^7 ^] (3) t‐SNE,^[ [181]^4 ^] (4) MDS,^[ [182]^6 ^] (5) ISOMAP,^[ [183]^8 ^] (6) autoencoder (AE),^[ [184]^17 ^] (7) variational autoencoder (VAE),^[ [185]^62 ^] (8) MARBLE,^[ [186]^23 ^] (9) CANDyMan,^[ [187]^22 ^] and (10) HAVOK,^[ [188]^26 ^] all of which can potentially be applied to the decision‐making process of ICU patient discharge. The ten methods for comparison are listed below, with hyperparameters mainly adopted from the original references. Further