Abstract Importance Clinical decision-making and immunosuppression dosing in kidney transplantation rely on transplant biopsy tissue histology even though histology has low specificity, sensitivity, and reproducibility for rejection diagnosis. The inclusion of stable allografts in mechanistic and clinical studies is vital to provide a normal, noninjured comparative group for all interrogative studies on understanding allograft injury. Objective To refine the definition of a stable allograft as one that is clinically, histologically, and molecularly quiescent using publicly available transcriptomics data. Design, Setting, and Participants In this prognostic study, the National Center for Biotechnology Information Gene Expression Omnibus was used to search for microarray gene expression data from kidney transplant tissues, resulting in 38 studies from January 1, 2017, to December 31, 2018. The diagnostic annotations included 510 acute rejection (AR) samples, 1154 histologically stable (hSTA) samples, and 609 normal samples. Raw fluorescence intensity data were downloaded and preprocessed followed by data set merging and batch correction. Main Outcomes and Measures The primary measure was the area under the receiver operating characteristic curve from a set of feature-selected genes and cell types for distinguishing AR from normal kidney tissue. Results Within the 28 data sets, the feature selection procedure identified a set of 6 genes (KLF4, CENPJ, KLF2, PPP1R15A, FOSB, TNFAIP3) (area under the curve [AUC], 0.98) and 5 immune cell types (CD4^+ T-cell central memory [Tcm], CD4^+ T-cell effector memory [Tem], CD8^+ Tem, natural killer [NK] cells, and Type 1 T helper [T[H]1] cells) (AUC, 0.92) that were combined into 1 composite Instability Score (InstaScore) (AUC, 0.99). 
The InstaScore was applied to the hSTA samples: 626 of 1154 (54%) were found to be immune quiescent and redefined as histologically and molecularly stable (hSTA/mSTA); 528 of 1154 (46%) were found to have molecular evidence of rejection (hSTA/mAR) and should not have been classified as stable allografts. The validation on an independent cohort of 6-month protocol biopsy samples in December 2019 showed that hSTA/mAR samples had a significant change in graft function (r = 0.52, P < .001) and graft loss at 5-year follow-up (r = 0.17). A drop of 10 mL/min/1.73 m^2 in estimated glomerular filtration rate was estimated as the threshold for an allograft transitioning from hSTA/mSTA to hSTA/mAR. Conclusions and Relevance The results of this prognostic study suggest that the InstaScore could provide an important adjunct for comprehensive and highly quantitative phenotyping of protocol kidney transplant biopsy samples and could be integrated into clinical care for accurate estimation of subsequent patient clinical outcomes. Introduction Breakthroughs in surgical approaches and development of newer generations of immunosuppressive drugs have resulted in reduction of clinical allograft acute rejection (AR) and improvements in life expectancy and quality of life for kidney transplant recipients.^[31]1 Nevertheless, a burden of subclinical AR is present only at a molecular level, not associated with an alteration in graft function, and often not accompanied by changes in graft histology.^[32]2,[33]3,[34]4,[35]5,[36]6,[37]7,[38]8 In addition, the significant discrepancies (19%-55%) among pathologists for histologic phenotyping^[39]9,[40]10 result in a lack of consistency in interpreting an allograft as rejected,^[41]11,[42]12 not rejected, or stable,^[43]7,[44]9,[45]10,[46]13 thereby introducing bias in the interrogative mechanistic studies on allograft pathology. 
Furthermore, there is a failure to uncover the molecular biologic diversity in the histologic definition of a stable allograft. This bias is further amplified during interrogation of kidney transplant biopsy samples across different pathologists and investigators in public data sets. In this study, we have aggregated, to our knowledge, the largest public data set for human kidney transplantation to date: 2273 kidney tissue microarray samples from 28 publicly available normal and transplant kidney tissue data sets^[47]14 in Gene Expression Omnibus,^[48]15 a public genomics data repository, to investigate the molecular diversity of stable allografts.^[49]16,[50]17,[51]18,[52]19 We proposed that for accurate definition of a stable allograft, the sample must be associated with (1) stable clinical function, (2) normal kidney histology without AR (histologically stable [hSTA]), and (3) absence of a transcriptional signature of AR (molecularly stable [mSTA]). Recognizing the previously discussed variabilities in allograft histology interpretation, we expected that some of the labeled stable samples in these data sets (that only use the first 2 criteria listed above) would have inherent molecular variability. Our analysis has resulted in the generation of a histology-independent composite gene and cell-specific computational Instability Score (the InstaScore) to discern molecular rejection in hSTA allografts, classifying clinically stable (truth) samples as histologically and molecularly stable (hSTA/mSTA) or clinically and histologically stable (untruth) samples with molecular rejection (hSTA/mAR). Thus, our prognostic study proposes an approach to recognize immunologic heterogeneity in hSTA kidney allografts. 
Methods Data Collection For this prognostic study, we carried out a comprehensive search for publicly available microarray data at the National Center for Biotechnology Information Gene Expression Omnibus database^[53]15 for biopsy kidney transplant samples from January 1, 2017, to December 31, 2018. Any public, deidentified data available as open access were not subject to local institutional review board requirements or patient consent as allowed under the Common Rule. For any private data used, we obtained the approval of the institutional review board of the University of California, San Francisco, and written informed consent from all patients. After stringent data quality control procedures (eMethods in the [54]Supplement), the final data set consisted of 28 studies with 2273 samples. Their diagnostic annotations included 510 AR samples (including antibody-mediated rejection, T-cell–mediated rejection, AR, AR with chronic allograft nephropathy, borderline rejection, borderline rejection and chronic allograft nephropathy, and mixed rejection), 1154 stable samples, and 609 normal samples (ie, biopsy conducted before organ transplant). The summary for the collected studies is represented in eTable 1 in the [55]Supplement. This study adhered to the Preferred Reporting Items for Systematic Reviews and Meta-analyses ([56]PRISMA), Standards for Reporting of Diagnostic Accuracy ([57]STARD), and Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis ([58]TRIPOD) reporting guidelines. Data Processing and Normalization Raw fluorescence intensity data were downloaded and preprocessed depending on the microarray platform. The data processing included background correction, log2 transformation, quantile normalization, and probe to gene mapping using R language, version 3.5.1^[59]20 (R Foundation) (eMethods and eFigure 1A in the [60]Supplement). 
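The two normalization steps named above (log2 transformation followed by quantile normalization) can be illustrated with a minimal Python sketch. This is not the authors' R pipeline: the function names `log2_transform` and `quantile_normalize` and the toy intensity matrix are hypothetical, background correction and probe-to-gene mapping are omitted, and ties are broken naively by sort order.

```python
import math

def log2_transform(matrix):
    """Apply log2 to every positive intensity (rows = genes, columns = samples)."""
    return [[math.log2(v) for v in row] for row in matrix]

def quantile_normalize(matrix):
    """Force every sample (column) to share the same empirical distribution."""
    n_genes, n_samples = len(matrix), len(matrix[0])
    columns = [[matrix[g][s] for g in range(n_genes)] for s in range(n_samples)]
    # The mean of the k-th smallest value across samples is the reference distribution.
    sorted_cols = [sorted(col) for col in columns]
    reference = [sum(col[k] for col in sorted_cols) / n_samples for k in range(n_genes)]
    normalized = [[0.0] * n_samples for _ in range(n_genes)]
    for s, col in enumerate(columns):
        order = sorted(range(n_genes), key=lambda g: col[g])  # gene indices by rank
        for rank, g in enumerate(order):
            normalized[g][s] = reference[rank]
    return normalized

# Hypothetical raw intensities: 3 genes (rows) x 3 samples (columns).
raw = [[4.0, 8.0, 16.0],
       [2.0, 4.0, 64.0],
       [8.0, 2.0, 4.0]]
normalized = quantile_normalize(log2_transform(raw))
```

After this step every column contains the same set of values, so between-sample differences in overall intensity distribution are removed while the within-sample ranking of genes is preserved.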
To perform a meta-analysis, we merged all the studies and corrected for potential batch effects using the ComBat^[61]21 approach (eFigure 2 in the [62]Supplement); however, other approaches were evaluated (eMethods in the [63]Supplement). Statistical Analysis To identify differentially expressed genes, we used the Significance Analysis of Microarrays^[64]22 as implemented in the siggenes package.^[65]23 We used the false discovery rate^[66]24 with the Benjamini-Hochberg^[67]25 method for multiple testing correction (P < .05 and fold change [FC] > 1.5). Pathway Analysis We leveraged the Gene Ontology database using the gene set enrichment analysis with clusterProfiler^[68]26 to perform functional annotations for the significantly upregulated and downregulated genes with a false discovery rate less than 0.05. For the gene network analysis, we used the STRING protein-protein association networks database.^[69]27 Cell Type Enrichment Analysis To estimate the presence of certain cell types in biopsy samples, we used the cell type enrichment tool xCell.^[70]28 xCell leverages gene expression data from microarray or RNA-sequencing experiments to estimate the presence of up to 64 immune and stromal cell types in a mixture. We focused on 34 immune-related and 11 nonimmune cell types (eTable 3 in the [71]Supplement) that were manually selected as relevant to the transplant injury process. The enrichment scores for each cell type were used to compare AR and normal samples by performing the nonparametric 2-sample Mann-Whitney-Wilcoxon statistical test. The P values were adjusted using the Benjamini-Hochberg method (P < .05). Feature Selection Procedure To select the most important features for distinguishing AR from normal samples, the data were first split into training and testing sets in an 80:20 ratio. All feature selection steps were performed on the training set with benchmarking on the testing set. 
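The Benjamini-Hochberg adjustment applied above to both the gene-level and the cell type-level tests can be sketched in a few lines of Python. This is a generic illustration of the standard procedure rather than the authors' code; the function name `benjamini_hochberg` and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest P first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw P values for 4 features.
example_p = [0.001, 0.04, 0.03, 0.20]
adjusted_p = benjamini_hochberg(example_p)
significant = [p < 0.05 for p in adjusted_p]  # threshold used in the text
```

In this toy example only the first feature remains below .05 after adjustment, even though three of the four raw P values were below .05.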
Among the significant features, we searched for features correlated with the outcome (r > 0.75 × max[r]). Next, we applied the recursive feature elimination technique with the random forest (RF) model using caret.^[72]29 We used a 5-fold cross-validation technique with 100 repeats and benchmarked a model by computing the area under the receiver operating characteristic (AUROC) curve, and the results were reported with both AUROC and precision-recall area under the curve (AUCPR). To minimize possible bias from the random data split and to avoid model overfitting, a tolerance of 1% was introduced into the feature selection mechanism; ie, the algorithm chose the model with the smallest number of features whose performance was within 1% of that of the best model. A final set of selected features was benchmarked by applying the RF model to the testing set. The R package feseR^[73]30 was adopted and modified to support parallel computation. Instability Score and hSTA Subphenotyping The method of subphenotyping hSTA samples was based on selected features from the normal or AR analysis and applied for scoring the hSTA samples. The hSTA samples were then identified as mAR or mSTA. We denoted this split as hSTA/mAR and hSTA/mSTA, respectively. Based on gene expression and cell type enrichment data, the feature selection procedure was performed to find sets of genes and cell types highly associated with AR. 
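The 1% tolerance rule described above can be made concrete with a small Python sketch. It assumes cross-validated AUROC has already been computed for each candidate feature-set size; the function name `pick_size_with_tolerance` and the AUROC values are hypothetical and are not results from this study. The rule is similar in spirit to the tolerance-based size selection offered by caret, although the source does not state which helper the authors used.

```python
def pick_size_with_tolerance(auroc_by_size, tolerance_pct=1.0):
    """Pick the smallest feature count whose AUROC is within tolerance of the best."""
    best = max(auroc_by_size.values())
    floor = best * (1.0 - tolerance_pct / 100.0)  # eg, 99% of the best AUROC
    eligible = [size for size, auroc in auroc_by_size.items() if auroc >= floor]
    return min(eligible)

# Hypothetical cross-validated AUROC per number of retained features.
cv_auroc = {2: 0.930, 4: 0.962, 6: 0.978, 10: 0.981, 20: 0.984, 50: 0.983}
chosen = pick_size_with_tolerance(cv_auroc)
```

With these illustrative numbers the best model uses 20 features, but the 6-feature model is within 1% of it and is therefore preferred, which is how a tolerance trades a negligible loss in AUROC for a much smaller signature.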
Next, with Z-scaled features, we built a logistic regression model and, using model coefficients, created a linear score function, the InstaScore:
InstaScore = 0.596 + 2.096 × KLF4 + 2.534 × CENPJ + 0.311 × KLF2 + 1.447 × PPP1R15A − 1.633 × FOSB + 0.268 × TNFAIP3 + 2.249 × natural killer (NK) cells + 0.542 × CD4^+ T-cell central memory (T[cm]) cells + 0.833 × CD4^+ T-cell effector memory (T[em]) cells + 0.709 × CD8^+ T[em] cells + 0.146 × Type 1 T helper (T[H]1) cells
Therefore, positive InstaScore values separate AR samples from normal samples, which obtain negative values (eFigure 1B in the [74]Supplement). Using this definition, the InstaScores were computed for the hSTA samples, and a threshold of zero was applied to perform the split into mAR and mSTA subtypes (eFigure 1C in the [75]Supplement). All the code has been uploaded to GitHub.^[76]31 Results From the total 28 data sets, the feature selection procedure identified a set of 6 genes (KLF4, CENPJ, KLF2, PPP1R15A, FOSB, and TNFAIP3) (AUC, 0.98) and 5 immune cell types (CD4^+ Tcm, CD4^+ Tem, CD8^+ Tem, NK cells, and T[H]1 cells) (AUC, 0.92) that were combined into 1 composite InstaScore (AUC, 0.99). We leveraged all currently publicly available kidney biopsy microarray data (eFigure 1A in the [77]Supplement) from 28 studies with 2273 samples and performed a feature selection procedure based on the RF algorithm to identify a subset of genes and cell types that better distinguish AR and normal kidney samples. These features were combined into the InstaScore (eFigure 1B in the [78]Supplement) to reclassify all annotated stable samples (hSTA) and identify variances in the samples by recognizing similarities to either the molecular rejection signature (hSTA/mAR) or the molecular quiescence (hSTA/mSTA) (eFigure 1C in the [79]Supplement). 
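The InstaScore formula given in the Methods, together with the zero threshold, can be written as a short Python sketch. The intercept and coefficients are copied from the formula as printed in the text; everything else is an assumption for illustration: the dictionary keys, the function names `insta_score` and `molecular_label`, the handling of a score exactly equal to zero, and the example inputs, which are not patient data. Inputs are assumed to be Z-scaled in the same way as the authors' training data.

```python
# Intercept and coefficients as printed in the InstaScore formula in the text.
INTERCEPT = 0.596
COEFFICIENTS = {
    "KLF4": 2.096, "CENPJ": 2.534, "KLF2": 0.311, "PPP1R15A": 1.447,
    "FOSB": -1.633, "TNFAIP3": 0.268,
    "NK cells": 2.249, "CD4+ Tcm": 0.542, "CD4+ Tem": 0.833,
    "CD8+ Tem": 0.709, "Th1 cells": 0.146,
}

def insta_score(z_features):
    """Linear score from Z-scaled gene expression and cell type enrichment values."""
    return INTERCEPT + sum(COEFFICIENTS[name] * z_features[name] for name in COEFFICIENTS)

def molecular_label(score):
    """Threshold at zero: positive scores resemble AR, others resemble normal tissue.
    Treating a score of exactly 0 as mSTA is an assumption of this sketch."""
    return "mAR" if score > 0 else "mSTA"

# Hypothetical samples: every feature at the training mean, and every feature 1 SD below it.
at_mean = {name: 0.0 for name in COEFFICIENTS}
one_sd_below = {name: -1.0 for name in COEFFICIENTS}
score_mean = insta_score(at_mean)
score_low = insta_score(one_sd_below)
labels = [molecular_label(score_mean), molecular_label(score_low)]
```

Because the score is linear, a sample with all features at the training mean receives the intercept alone (0.596), and each feature shifts the score by its coefficient per standard deviation.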
The clinical validity and prediction performance of the InstaScore were demonstrated on independent data wherein falsely classified stable samples (hSTA/mAR) showed significant projected differences in reduced graft function and survival over the true stable samples (hSTA/mSTA). Differential Gene Expression Analysis for Upregulation of Immune-Related Pathways in Rejection We performed differential gene expression analysis comparing AR with normal samples and identified 1509 significantly differentially expressed genes including 848 upregulated and 661 downregulated genes (eTable 2 in the [80]Supplement). Further hierarchical clustering analysis on the significant genes based on the Ward clustering technique was performed, and a significant separation was found^[81]32 (1119 samples, 1509 genes; P < .001) between classes ([82]Figure 1A). Additionally, the principal component analysis and Uniform Manifold Approximation and Projection dimensionality reduction confirmed the class separation (eFigure 3 in the [83]Supplement). The functional annotation of the significant genes found that upregulated genes were enriched in the regulation of the immune response, cell aggregation and activation, and innate immunity (eFigure 4A in the [84]Supplement). The downregulated genes were enriched in metabolic processes (eFigure 4C in the [85]Supplement). The network analysis showed connectivity between the sets of genes (eFigure 4B and 4D in the [86]Supplement). Figure 1. Heat Map Plots for Differentially Expressed Genes and Significantly Enriched Cell Types. A, Heat map clustering plot for significant genes from Significance Analysis of Microarrays (SAM) of acute rejection (AR) vs normal samples. B, Heat map clustering plots for significant cell types from the nonparametric Wilcoxon statistical test (Benjamini-Hochberg, P < .05) in the analysis of AR vs normal samples. 
aDC indicates activated dendritic cell; cDC, conventional dendritic cell; DC, dendritic cell; FC, fold change; HSC, hematopoietic stem cell; M1, M1 macrophage; MSC, mesenchymal stem cell; NK, natural killer; NKG, X; pDC, plasmacytoid dendritic cell; Tcm, T-cell central memory; Tem, T-cell effector memory; Tgd, γδ T cell; Th1, Type 1 T helper cell; Th2, Type 2 T helper cell; T regs, regulatory T cell. Cell Type Enrichment Analysis for Immune Cell Types Associated With AR To highlight the biologic heterogeneity and to capture signals from infiltrating cell type–specific outcomes in injured and stable kidney transplants, we performed cell type enrichment analysis. We leveraged xCell^[88]28 to focus on 45 cell types (eTable 3 in the [89]Supplement) that are relevant for organ transplants. We found 25 cell types (mostly lymphoid and myeloid cells) that were significantly (Wilcoxon test, Benjamini-Hochberg; 1119 samples; P < .05) enriched in AR and 12 cell types (immune, stromal cells, and hematopoietic stem cells) that were enriched in normal kidneys ([90]Figure 1B). As seen on the heat map, the hierarchical clustering revealed 2 main AR subclusters (510 samples; P < .001): one was mostly enriched in lymphocytes, NK cells, and macrophages, and the other had minimal lymphocyte activation and may have represented temporal differences in rejection evolution or recovery. We observed that B cells, dendritic cells, macrophages, and T cells formed cell type–specific subclusters that suggested the coordinated activation of immune cells in the kidney tissues. These results are in agreement with previous observations^[91]33 that have shown AR subphenotypical splits by gene expression and cell type. Unsupervised clustering of hSTA along with AR and normal samples exposed their heterogeneity, hinting that some hSTA samples have a molecular signal closer to that of AR samples (eFigure 5 in the [92]Supplement). 
Machine Learning Feature Selection Procedure to Optimize AR Classification Following the feature selection procedure (eMethods in the [93]Supplement), we dramatically decreased the number of model features from all 1509 differentially expressed genes (1) to only 6 pivotal upregulated genes (KLF4, CENPJ, KLF2, PPP1R15A, FOSB, and TNFAIP3; AUROC, 0.98; AUCPR, 0.99) (eFigures 6A and 7A in the [94]Supplement); (2) to genes enriched as zinc finger proteins and expressed mostly in CD33^+ myeloid cells; and (3) to 5 cell types from the original set of 37 significantly enriched cell types: CD4^+ Tcm, CD4^+ Tem, CD8^+ Tem, NK cells, and T[H]1 cells, with CD4+ Tcm having the largest effect size in this model (AUROC, 0.92; AUCPR, 0.88) (eFigures 6B and 7B in the [95]Supplement). The feature selected cell types showed a predominant role for infiltration and activation of effector T cells and NK cells in AR, and the feature selected genes appeared to have broad cellular functions in AR, triggered by mononuclear activation and infiltration and collectively driving a variety of functions, such as DNA recognition, RNA packaging, transcriptional activation, and regulation of apoptosis. Interestingly, although the set of 11 genes in the common rejection module previously identified from a cross-organ (kidney, heart, liver, and lung) meta-analysis study of transplant rejection^[96]16 was enriched in this current analysis, none of them made it to this final minimal feature selection set. This finding suggests that the current 6-gene set might be more specific for the absence of AR in the kidney allograft, as the precise definition of a hSTA/mSTA allograft was not available in the earlier analysis. 
A generated RF classification model for these 6 genes and 5 cell types, internally validated using 5-fold cross-validation with 100 repeats, obtained an AUROC of 0.98 (sensitivity, 0.94; specificity, 0.94) for the genes alone and an AUROC of 0.92 (sensitivity, 0.85; specificity, 0.88) for the cell types for identification of a tissue sample with histologically confirmed AR ([97]Figure 2A). We further combined the feature selected genes and cell types into 1 score value, called the InstaScore (eMethods in the [98]Supplement), and were able to perform the split into AR and normal samples with a slightly improved AUROC of 0.99, with sensitivity of 0.95 and specificity of 0.94 ([99]Figures 2B and [100]2C), and an AUCPR of 0.99 (eFigure 7C in the [101]Supplement). Figure 2. Feature Selected Genes and Cell Types and the Instability Score as Their Combination. A, Hierarchical clusterings of acute rejection (AR) and normal samples. B, Combined selected features with AR and normal samples. C, Instability Score plot for AR and normal samples. NK indicates natural killer; Tcm, T-cell central memory; Tem, T-cell effector memory; T[H]1, Type 1 T helper cell. Selected Features to Create a Scoring Function to Carry Out Precision Subphenotyping of Stable Samples We then applied the InstaScore to the 1154 transplant samples that were identified by pathologists in each of the data sets as hSTA, classifying samples as more similar to normal kidneys or as more similar to the rejected kidney allograft group (mAR); hSTA/mSTA identified samples with molecular and histologic evidence of no rejection, and hSTA/mAR identified histologically stable allografts with transcriptional evidence of ongoing molecular rejection. 
The InstaScore identified 528 hSTA grafts (46%) in this study as having mAR ([103]Figure 3A), a misclassification rate in line with previously reported discrepancies in transplant phenotyping across different pathologists.^[104]9,[105]10 Figure 3. Plots of Acute Rejection (AR), Subphenotyped Histologically Stable (hSTA), and Normal Samples Based on Instability Score Results. A, Instability Score plots. B, Heat map of AR and normal samples. C, Uniform Manifold Approximation and Projection (UMAP) plot of AR and normal samples. mAR indicates molecular rejection; mSTA, molecularly stable; NK, natural killer; Tcm, T-cell central memory; Tem, T-cell effector memory; Th1, Type 1 T helper cell. We represented the scores for each sample as a scatterplot in [107]Figures 2C and [108]3A. The InstaScore was able to significantly distinguish AR and normal samples (1119 samples; P < .001; [109]Figure 2C) and distinguish hSTA/mAR and hSTA/mSTA samples (1154 samples; P < .001; [110]Figure 3A) by thresholding at zero. The hSTA/mAR samples clustered with AR and separately from hSTA/mSTA samples and had intermediate scores between normal and AR samples ([111]Figure 3B and [112]3C). Validation of hSTA Subphenotyping Using Clinical Follow-up Data To assess the functional relevance of the InstaScore by gene expression and cell types, we explored its clinical use in an independent microarray data set from 67 unique patients with hSTA grafts (stable clinical graft function, no donor-specific antibody, and no AR) from a randomized clinical trial^[113]34 with transcriptional data on serial protocol kidney transplant biopsy samples at 0, 3, 6, 12, and 24 months^[114]35,[115]36 and with longitudinal functional outcomes up to 5 years after initial engraftment. 
We tested the association of the locked InstaScore with the change in estimated glomerular filtration rate (eGFR) and graft loss events over this time period and found that the cell type infiltration and activation model correlated with delta eGFR ([116]Figure 4A) (r = 0.52; P < .001) and, more weakly, with graft loss events (r = 0.17; P = .26). Figure 4. Validation Plots on the Independent Clinical Data. A, Change in estimated glomerular filtration rate (eGFR) after biopsy by Instability Score (InstaScore) (r = −0.52, P < .001). B, Change in eGFR distributions for predicted histologically stable (hSTA) subpopulations by InstaScore (P < .001). Using the predicted hSTA subphenotypes, we estimated a delta eGFR separating threshold of −10 mL/min/1.73 m^2 at 5-year follow-up ([118]Figure 4B, P < .001). Given these results, it appears that the InstaScore on the 6-month protocol biopsy samples could differentiate patients more likely to have progressive graft injury and decline in graft function over time, even though the 6-month biopsy histology findings, serum creatinine, or donor-specific antibodies cannot provide the same discriminatory information. Discussion Tissue histology is indispensable for the diagnosis of allograft pathology, but its recognized limitations have resulted in the incorporation of data inputs from transcriptional and proteomic studies. Here, we present, to our knowledge, the first unsupervised transcriptional and cell-state framework to map and rephenotype human kidney allografts with undiagnosed graft dysfunction. Unlike other published studies, by others^[119]37,[120]38,[121]39,[122]40,[123]41 and by members of our group,^[124]2,[125]33,[126]42,[127]43 that have only focused on general transcriptional perturbations in rejection, the present study is, to our knowledge, the first development and validation of an approach that leverages the statistical power of a large public transcriptional data set. 
Along with the cell type enrichment analysis using xCell,^[128]28 we used logistic regression to build the InstaScore. By doing so, we reclassified kidney transplant biopsy samples, otherwise described as hSTA, into samples that have no molecular injury (hSTA/mSTA) and those that are most likely incorrectly annotated as stable but have molecular injury similar to AR (hSTA/mAR). Approximately half (46%) of the biopsy samples were wrongly annotated as stable and reclassified as hSTA/mAR by the InstaScore; these samples were found to be scattered across multiple data sets, supporting that their presence is not due to failure of histologic characterization at any particular transplant program, and highlighting the failure of histology to detect relevant molecular inflammation. The InstaScore was independently validated for functional relevance,^[129]35 as it identified hSTA/mAR 6-month protocol biopsy samples that had a higher risk of progressive graft injury and failure at 5 years posttransplant. The 6 feature selected genes, KLF4, CENPJ, KLF2, PPP1R15A, FOSB, and TNFAIP3, in the InstaScore are biologically relevant in the immune response and activation and innate immunity. KLF2, KLF4, and TNFAIP3 regulate kidney injury.^[130]44 KLF2 is vasoprotective, and KLF4 is renoprotective; both genes are highly expressed in the endothelium^[131]45,[132]46,[133]47 and are associated with endothelial ischemia reperfusion injury in AR.^[134]48 TNFAIP3 has antiapoptotic and anti-inflammatory functions and expression in endothelial, myeloid, and infiltrating T cells, which results in adverse clinical outcomes in AR.^[135]49,[136]50,[137]51 CENPJ functions as a transcriptional coactivator in STAT 5 signaling and tumor necrosis factor–induced NF-κB–mediated transcription,^[138]52,[139]53 both of which are central regulators of inflammation. 
The phosphatase PPP1R15A is only expressed in stressed cells and negatively regulates acute kidney injury via type 1 interferon;^[140]54 clonal expansion; and memory T-cell, plasma cell differentiation, and enhanced B-cell responses.^[141]55 FOSB expression is associated with the progression of kidney disease.^[142]56 Thus, all InstaScore genes are crucial for endothelial cell integrity, and T-cell activation, and have functional relevance to the kidney and rejection.^[143]57,[144]58 The 5 feature selected cell types, CD4^+ Tcm, CD4^+ Tem, CD8^+ Tem, NK, and T[H]1, also relate to rejection biology,^[145]59 together with macrophages, NK cells, and B cells.^[146]60,[147]61,[148]62,[149]63,[150]64 In the immunologic response to the allograft, T cells terminally differentiate and divide into Tcm cells and CD8^+ and CD4^+ Tem cells,^[151]65 which produce interferon-γ, IL-4, and IL-5 and cytotoxic molecules like granzyme, granulysin, and perforin.^[152]66,[153]67 CD4^+ Tcm had the largest effect size in the InstaScore, likely because Tcm cells are characterized by slow effector function and reactive memory and increased response to repeat antigenic stimulation.^[154]68 In the hSTA/mAR grafts, these cells are primed to differentiate into Tem with low levels of antigen recognition, such as with varying exposure to baseline immunosuppression.^[155]69 Hence, identification of the hSTA/mAR phenotype in an otherwise clinically and histologically stable allograft may be of critical importance to triage allografts at greater risk of accelerated temporal immune injury. Limitations Given the design of the study, there are a few inherent limitations. First, the publicly available data had limited access to clinical and demographic reports, which could potentially be valuable in incorporation with InstaScore. Second, batch effects had to be controlled for, for which we conducted comparisons of multiple normalizing methods. 
We chose the RF model to capture possible nonlinear feature interactions to identify the best feature set, although other more complex (eg, neural nets) or less complex (eg, elastic net) methods could also be considered. Although the study is based on bulk microarray data, more precise measurement techniques, eg, single-cell RNA sequencing, might be used to better capture finer changes in gene expression and cell composition, provide additional validation of the results and, in a future study, refine the InstaScore. This may give the InstaScore the ability to recognize different types of rejection, which can be identified at the molecular level long before they can be detected by histology. Conclusions This prognostic study leverages supervised machine learning on the largest bulk transcriptional human kidney and kidney transplant data set to improve kidney allograft sample phenotyping beyond the capabilities of tissue histology alone. In this study, the InstaScore revealed a level of biologic diversity within the classification of a stable graft not shown by histology alone; based on these findings, the InstaScore may provide an immune map to help refine our understanding of diverse graft functional states. The InstaScore provides a new tool to apply polymerase chain reaction–based analysis of the minimal gene set to formalin-fixed, paraffin-embedded kidney allograft biopsy samples to identify hSTA/mAR grafts at greater risk for subsequent overt rejection and allograft damage. These patients may benefit from proactive immunosuppression adjustments to reduce molecular inflammation, preserve allograft function, and improve allograft survival.
Supplement.
eMethods
eFigure 1. Flow Chart
eFigure 2. Scatterplots of Gene Expression Data After Data Sets Merging
eFigure 3. PCA Clustering Plot for Differentially Expressed Genes From Analysis of AR vs Normals
eFigure 4. Pathway Enrichment Analysis of DE Genes
eFigure 5. Heatmap of Enrichment Scores of Significant Cell Types From the AR vs Normal Comparison
eFigure 6. Plots of Feature Selected Genes and Cell Types for all AR and Normal Samples
eFigure 7. AUROC and AUCPR Plots of Feature Selected Genes, Cell Types and InstaScore
eFigure 8. Combined Benchmark Based on P-Value, Delta Statistic and the Percentage of Variability for Batch Correction Methods Tested
eTable 1. Datasets Collected From Gene Expression Omnibus (GEO)
eTable 2a. Upregulated Differentially Expressed Genes From SAM Analysis of Comparison of Acute Rejection to Normal Kidney Tissues
eTable 2b. Downregulated Differentially Expressed Genes From SAM Analysis of Comparison of Acute Rejection to Normal Kidney Tissues
eTable 3. Cell Types Considered in Cell Type Enrichment Analysis With xCell
eReferences