Graphical abstract graphic file with name fx1.jpg [37]Open in a new tab Highlights * • The KRT6, KRT16, and KRT17 keratin genes are highly induced in stressed and diseased skin * • Single-cell transcriptomic data were queried to learn about stress keratin expression * • Stress keratin expression reflects altered differentiation and active innate immunity * • Regulation of stress keratin genes differ from the keratins expressed in healthy skin __________________________________________________________________ Dermatology; Biological sciences; Immunology; Transcriptomics Introduction Keratin genes and proteins serve as highly informative markers when probing the identity and status of epithelial cells throughout the body, a reality that has been exploited for >40 years.[38]^1^,[39]^2 In humans, 28 type I and 26 type II intermediate filament (IF) genes code for keratin proteins of sizes ranging between 40 and 70 kDa.[40]^3 The dual nature of keratin IF genes[41]^4 reflects a strict requirement for heteropolymerization of keratin proteins into 10 nm IFs.[42]^5 Accordingly, epithelial cells must coordinate the expression of (at least) one type I and (at least) one type II keratin gene to assemble an IF network in their cytoplasm. Mutations that alter the coding sequence of many keratins are causative for a large number of individually rare diseases.[43]^6 Besides, many prevalent chronic disorders (e.g., psoriasis in skin) entail a departure from normal differentiation and homeostasis and thus are typified by altered regulation of keratin genes,[44]^7 placing keratin genes and proteins, and their associates, as focal points in most epithelial disorders. The full significance of the latter and how it relates to tissue homeostasis remain partially understood. In a recently published study, we mined existing single-cell RNA sequencing (scRNA-seq) datasets with the goal of re-examining long-held views regarding the significance of keratin expression in a classical healthy surface epithelium. The effort was focused on the epidermis of thin skin in human and mouse, in which six keratin genes, the type I KRT15, KRT14, and KRT10, and the type II KRT5, KRT1, and KRT2,[45]^8 are regulated in differentiation-dependent manner. We confirmed that these keratins are among the most highly expressed genes in keratinocytes of thin epidermis, and that specific combinations of type I and II keratin genes, e.g., KRT5-KRT14 and KRT1-KRT10, maintain tight pairwise regulation across a broad range of expression levels. This effort also provided strong support for the notion that differentiation is initiated in keratinocytes that are still mitotically active and that the differentiation-dependent switch in keratin expression occurs gradually.[46]^8 Here, we examine the significance of stress-responsive keratin genes in chronic skin diseases focusing on the type II paralogs KRT6A, KRT6B, and KRT6C and their type I partner genes KRT16 and KRT17 genes (see refs.[47]^9^,[48]^10^,[49]^11 and “[50]supplemental information” for a recognition of the importance of examining regulation at the protein level as well). Monitoring the expression of the stress keratin genes and their protein products is widely used by both researchers and clinicians to differentiate between lesional and healthy tissue in skin disorders and models thereof. Our analyses not only support a core role for stress keratins in regulating differentiation and inflammatory/immune responses in a keratinocyte-autonomous fashion, but also point to distinct associations of individual stress keratins with specific signaling pathways. Results Scope of the analysis, and strategy Single-cell RNA-seq datasets were queried in specific ways to address four topics relevant to the significance of stress keratin expression in normal and diseased human skin. First, we assessed the abundance of stress keratin mRNA transcripts (KRT6A, KRT6B, KRT6C, KRT16, and KRT17) in keratinocytes of normal human trunk and scalp skin vs. chronic skin diseases, and related them to the normal differentiation-related KRT5-KRT14 and KRT1-KRT10 combinations. Second, we analyzed the preferred pairings for KRT16 and KRT17 as type I keratins and KRT6A-C as type II keratins in these settings. Third, we probed the biological significance associated with stress keratins by identifying tightly co-regulated genes and identified themes and relationships that transcend specific contexts or pathophysiologies. We also revisited the widely held beliefs that expression of stress keratins in diseased epidermis (i) reflects a state of enhanced proliferation and (ii) occurs at the expense of keratins expressed during normal differentiation (e.g., KRT1-KRT10). Fourth, we probed for similarities and differences in the signatures associated with KRT16 and KRT17, specifically. The diseases included in these analyses are psoriasis (PSOR), atopic dermatitis (AD), cutaneous lupus erythematosus (CLE), and hidradenitis suppurativa (HS). While heterogeneous in terms of etiology and pathophysiology, these disorders all feature substantial expression of the stress keratins, creating an opportunity to probe for context-independent and unique cellular functions associated with stress keratin expression. The datasets included in the study are listed in [51]Table 1. Three datasets (healthy trunk and scalp skin, and lesional trunk skin of individuals with PSOR) were reported by Cheng et al.[52]^12 and used in our recent effort focused on healthy human thin skin.[53]^8 Additional datasets from individuals with plaque psoriasis (PSOR2),[54]^13 atopic dermatitis,[55]^14 CLE,[56]^15 and HS[57]^16 were included as well as their matching non-lesional datasets were available ([58]Table 1). The raw datasets were imported and re-analyzed using a SCTransform + CCA Seurat pipeline[59]^17 followed by dimensional reduction (principal component analysis). The total number of cells that met inclusion criteria (see [60]STAR Methods) are listed in [61]Table 1, and the resulting UMAPs are presented in [62]Figure S1. Cell clusters were devised using the Louvain algorithm with the resolution parameter set at 0.6. Keratinocytes (as identified by expression of a unique set of genes) were retained for (i) analyses based on Seurat clusters, utilizing the principal component analysis of the pipeline to infer unique cell populations; and (ii) direct analysis of single-cell expression values within specific datasets, disregarding pre-defined clusters and focusing on properties relevant to biological questions of interest in our study. The latter two strategies are complementary in that they allow for analyses of genes across entire datasets or within specific cell populations of interest. All analyses were conducted within datasets with subsequent comparisons rather than combining the data under a single uniform manifold approximation and projection analysis, primarily because of methodological differences in their genesis. Table 1. scRNA-seq datasets used in study Abbreviation Description Dataset Total Cells Keratinocytes PMID SCALP Healthy scalp skin Non-lesional 24,414 22,637 [63]30355494 TRUNK Healthy trunk skin Non-lesional 27,452 26,063 PSOR Psoriasis (type undisclosed) Lesional 30,164 27,052 PSOR2 Plaque Psoriasis Lesional 22,819 15,261 [64]37308489 Non-lesional 20,658 13,691 AD Atopic dermatitis Lesional 26,480 13,972 [65]37506977 Non-lesional 22,371 7,817 HS Hidradenitis suppurativa Lesional 27,500 20,020 [66]32853177 Peri-lesional 1,186 1,021 CLE Cutaneous lupus erythematosus Lesional 7,085 3,545 [67]35290245 Non-lesional 14,685 9,368 Healthy 15,434 5,939 [68]Open in a new tab Abundancy of stress keratin transcripts in chronic skin diseases We previously reported that the type I KRT14 and KRT10, and the type II KRT5 and KRT1, are among the most abundant mRNA transcripts in keratinocytes of thin epidermis in human trunk and mouse ear skin.[69]^8 High transcript levels for these genes also occur in human scalp skin and in PSOR, AD, CLE, and HS skin ([70]Figures 1A and 1D; [71]Figures S2A–S2G). Transcript abundance (i.e., number of reads) for KRT5, KRT14, KRT1, and KRT10 vary across clusters, primarily reflecting the progenitor vs. differentiated state of keratinocytes ([72]Figures S2A–S2G). By comparison, the number of reads for KRT16 and KRT17 mRNAs ([73]Figures 1A–1C), and KRT6 paralog mRNAs ([74]Figures 1D–1F) are markedly lower in healthy scalp, healthy trunk, and disease datasets (see [75]STAR Methods regarding these comparisons). This reality is also reflected in read counts within the Seurat clusters that show highest expression of these genes ([76]Figures S2A–S2G). Read counts for KRT16, KRT17, and the KRT6 paralogs vary significantly across clusters with each dataset, as expected ([77]Figures S2A–S2G). Consistent with previous reports on human KRT6 paralogs expression,[78]^18^,[79]^19 KRT6A is consistently the most abundantly expressed KRT6 paralog while KRT6B and especially KRT6C exhibit less abundant expression ([80]Figure 1D). Figure 1. [81]Figure 1 [82]Open in a new tab Abundancy and pairwise regulation of stress-responsive keratin transcripts in healthy and diseased skin (A–F) Abundance of (A–C) type I and (D–F) type II keratin transcripts in scRNA-seq datasets obtained from healthy (trunk and scalp skin) and lesional human skin (PSOR, PSOR2, HS, AD, and CLE). (A and D) Mean expression of (A) type I and (D) type II stress-responsive keratins and differentiation-related keratins. (B and E) Histogram of (B) KRT16 and (E) KRT6A-expressing cells in the PSOR dataset, illustrating the broad distribution of expression levels for these keratins. Dashed lines represent the mean and threshold for top-keratin expressing cells in PSOR. (C and F) Expression level for (C) type I or (F) type II keratins within the top decile of cells expressing the keratin of interest. (G–M′) Transcript pairwise associations for type I and type II keratin permutations across datasets. (G–I) Pearson correlations between different keratin pairs show similar trends in pairwise associations across healthy and disease datasets. (J–M′) Pairwise associations in healthy tissue (scalp) and diseased (PSOR) for keratin pairs (J–J′) KRT1-KRT10, (K–K′) KRT6A-KRT16, (L–L′) KRT6A-KRT17, and (M–M′) KRT6 paralogs A and B. A large number of keratinocytes lack expression of stress keratins, even in clinically significant skin diseases (e.g., [83]Figures 1B and 1E). This reality can misrepresent stress keratin levels when looking across an entire tissue, as is done in [84]Figures 1A and 1D, or when using bulk RNA-seq or target-specific RT-PCR assays. We capitalized on the information gained by single-cell RNA-seq to further analyze stress keratin levels in select subsets of keratinocytes, independent of cluster assignment, in two complementary ways. First, we selected the top 10% of cells exhibiting the highest number of reads for individual keratins, in each dataset, and compared expression values ([85]Figures 1C and 1F). This analysis showed that, among type I keratins, read number maxima are higher for KRT14 and KRT10 relative to KRT16 and KRT17 across all datasets. In the highest expressing cells for each keratin, KRT14 and KRT10 read counts are relatively constant across all datasets, healthy and diseased skin analyzed ([86]Figure 1C). For KRT16 and KRT17, significantly higher expression levels are attained in diseased skin compared to healthy trunk and scalp skin ([87]Figure 1C). For type II keratins, read number maxima are higher for KRT5 and KRT1 compared to KRT6 paralogs in all datasets with one exception, KRT6A in HS ([88]Figure 1F). KRT6A occurs at markedly higher levels relative to the KRT6B and especially KRT6C paralogs ([89]Figure 1F). Among the disorders surveyed, HS shows the highest levels of KRT16 and KRT17, and of KRT6A ([90]Figures 1A–1F). Second, we determined the ratio between differentiation-related and stress-induced keratins in single cells, biasing the analysis on the 10% of keratinocytes expressing stress-induced keratins at the highest levels ([91]Figures S2H–S2K). While KRT16 and KRT6A represent the most abundant type I and type II stress-responsive keratins in a significant fraction of such keratinocytes, high KRT16-expressing cells maintain appreciable levels of differentiation-related keratin transcripts ([92]Figures S2H and S2J). Again, here, read counts for KRT6B and KRT6C are much lower than those of KRT6A in these keratinocyte subpopulations, with the exception of HS ([93]Figures S2I and S2K). These analyses establish that the stress keratin genes are generally expressed at lower levels relative to KRT1-KRT10 and especially KRT5-KRT14, not only in keratinocytes from healthy human trunk and scalp but also from PSOR, AD, CLE, and HS disease samples. This reality applies even when singling out cells expressing stress keratins at the highest levels, with the exception of KRT6A in HS. The KRT16 and KRT17 mRNAs occur in keratinocytes that also exhibit high levels of KRT14 and KRT10 reads. Likewise, the most highly expressed of the KRT6 paralogs, KRT6A, occurs in keratinocytes that express KRT5 and KRT1. Finally, our analyses confirm the existence of quantitative differences in the expression of KRT6 paralogs and point to intriguing differences in KRT16 and KRT17 expression, which are further discussed in the following section. Assessing the pairwise regulation of type I and type II stress keratin genes We previously reported that remarkably high Pearson correlation coefficients occur in the reads for KRT5 and KRT14 (R^2 = 0.79), and for KRT1 and KRT10 (R^2 = 0.84), in the epidermis of healthy human trunk skin,[94]^8 with tight pairwise regulation prevailing over the entire range of expression levels. Here, we find that this property also applies to healthy human scalp skin ([95]Figure 1G). In diseased skin (PSOR, HS, CLE, and AD), remarkably, KRT1-KRT10 reads are co-regulated to a nearly equally impressive extent ([96]Figures 1H–1J′, and [97]Figures S3A–S3C) while KRT5-KRT14 reads show a lesser but still highly significant correlation ([98]Figures 1H and 1I; [99]Figures S3A–S3D′). In part, the decrease observed for KRT5-KRT14 co-regulation reflects higher expression levels for KRT15, another type I keratin expressed in progenitor keratinocytes.[100]^20^,[101]^21 These findings are significant as they suggest that the keratin pairings expressed in healthy epidermis maintain a high level of pairwise regulation in disease settings, in which differentiation and homeostasis are perturbed. By comparison, pairwise regulation of type I and II stress keratins is variable, generally lesser, and shows a clear dependence on tissue context. In healthy scalp skin (where stress keratins are primarily expressed in pilosebaceous units), correlations between any of the KRT6 paralogs and either KRT16 or KRT17 are lower. The stronger correlations observed, e.g., KRT6B-KRT17 (R^2 = 0.22), KRT6A-KRT17 (R^2 = 0.18), and KRT6A-KRT16 (R^2 = 0.16), show significantly lesser R^2 values relative to KRT1-KRT10 and KRT5-KRT14 ([102]Figures 1G, 1J, and [103]S3D–S3E′). The reasons for this are unclear – it may be that stress keratin genes show a more complex transcriptional regulation, a non-linear mode of co-regulation, and/or that the signaling pathways at play in epithelial appendages of healthy skin are more complex than in epidermis. Within the stress keratin group, KRT6A and KRT16 show markedly better correlations across the disease datasets examined ([104]Figures 1H–I, 1K-1K′, and [105]S3F–S3F′). Indeed, the KRT6A-KRT16 pairing exhibits R^2 = 0.44 and R^2 = 0.47, respectively, in the PSOR and PSOR2 datasets ([106]Figures 1H, 1I, and 1K′). Despite their high homology (>98%) and adjacent positions on the type II keratin gene cluster, weak pairwise correlations occur among the KRT6 paralogs, highlighting their differential regulation in both healthy and diseased skin ([107]Figures 1G–1I, and 1M-1M′; [108]Figures S3A–S3C and S3G–S3G′). Assessing the biological significance associated with expression of stress keratins The contextual regulation of stress-responsive keratins has contributed much to their use as biomarkers in pathophysiological settings including PSOR and HS.[109]^22^,[110]^23 Early studies involving cell culture models or human lesional skin samples led to three prevalent notions regarding their expression: (i) that stress keratins (KRT16 and KRT6, specifically) reflect hyperproliferation[111]^24^,[112]^25^,[113]^26^,[114]^27^,[115]^28^,[11 6]^29; (ii) that, when induced, stress keratins replace or displace differentiation-related keratin[117]^30^,[118]^31; and, finally, (iii) that expression of KRT6 and KRT16, in particular, reflects an alternative pathway of differentiation.[119]^32^,[120]^33^,[121]^34^,[122]^35^,[123]^36 In this section, we conduct several analyses to put these notions to a test and identify core vs. unique signatures of stress keratin expression in skin diseases. Stress keratins and keratinocyte proliferation We first tested whether the expression of stress keratins in interfollicular epidermis is associated with enhanced proliferation at the tissue and/or at the single-cell level. To identify mitotic cells, we used a composite score (G2M Score) comprised of five genes specific for the G2M stage of the cell cycle (MKI67, TOP2A, CDC20, BUB3, and CCNB2[124]^8). We applied a stringent threshold (mean ±2 SD) to identify the top G2M-score expressing cells in each dataset (G2M^high; [125]Figures 2A and 2B). This strategy highlighted an increase in the fraction (%) of proliferating cells in many of the diseased samples relative to their non-lesional control tissues ([126]Figure 2B). G2M^high cells occur at a higher frequency in PSOR and PSOR2 (9% vs. 2.6% and 5.9% vs. 0.2%, respectively) ([127]Figure 2B), as expected.[128]^37 AD and CLE exhibit smaller increases in G2M^high cells relative to PSOR, again matching literature-based expectations.[129]^38^,[130]^39 Since mitosis is a relatively short part of the cell cycle, we repeated the analysis utilizing five genes specific to S-phase (RPA2, UHRF1, ATAD2, RFC2, and RRM2[131]^8) and a similar outcome occurred ([132]Figures 2B′, [133]S4A–S4C, and S4E). Together, these findings support the validity of our strategy to computationally single out cells that are mitotically active. Figure 2. [134]Figure 2 [135]Open in a new tab Expression of stress-responsive keratins correlates with enhanced proliferation at the tissue-wide but not single-cell level in diseased skin (A) Distribution of composite G2M transcript scores in lesion and matching non-lesional skin datasets across diseases. Red line depicts mean expression; dashed green line depicts the threshold, set at the mean + 2SD, for G2M^high expression in each disease. (B) Composite scores were used to compare the fraction (%) of keratinocytes in (B) G2M or (B′) S-Phase in lesional and matched non-lesional skin datasets. (C–C″) Correlation between expression (abundance>0) of stress-responsive keratins (C) KRT16, (C′) KRT17, and (C″) KRT6A against the G2M score across disease datasets (normalized to matching non-lesional data). (D) Relationship between G2M scores (y axes) and expression of specific keratin genes (x axes) for all keratinocytes in the PSOR dataset. In this analysis, high KRT14 levels are more likely to exhibit a high G2M score while high KRT2 levels associate with low G2M scores, as expected. (E and F) Comparing the fraction of keratinocytes with a dual “high-G2M score” and “high expression level” for (E) type I keratin genes and (F) type II keratin genes across skin disease datasets. See [136]STAR Methods for an explanation of this analysis. We next plotted the relationship between individual stress keratin expression and G2M scores. At a whole tissue level and across all samples analyzed, a linear relationship is observed between KRT16, KRT17, and KRT6A reads and higher proliferation activity (R^2 = 0.74, 0.61, and 0.81, respectively; [137]Figures 2C–2C″). Such an association between inflammation, keratin expression, and hyperproliferation aligns well with myriad studies utilizing stress keratins as markers of chronic inflammatory diseases. To examine whether such an association holds true at the single-cell level, we tracked fluctuations in G2M scores vs. the number of reads for individual keratins, in single cells, over the entire keratinocyte population in each dataset ([138]Figures 2D and [139]S4D). To render this analysis effective for any keratin in any dataset, we devised a method to quantify the percentage of high G2M-score cells (mean + 2SD) that exhibit high expression of a keratin of interest ([140]Figures S4F–S4F″). From this, we determined that, across all datasets, cells with high G2M scores co-segregate with high KRT14 expression (see “right shifted” distribution in [141]Figures 2D–2F and [142]S4F′) but segregate away from high KRT2 expression (“left shifted” distribution in [143]Figures 2D–2F and [144]S4F‴), while the relationship between KRT10 and G2M proved bimodal (“mixed” distribution in [145]Figures 2D, 2E, and [146]S4F″). The latter case, KRT10, was expected based on several recent studies.[147]^8^,[148]^40^,[149]^41^,[150]^42^,[151]^43 Relative to those “reference keratins,” high expression of KRT16, KRT17, KRT6B, or KRT6C does not clearly associate with high G2M scores ([152]Figures 2D–2F). The relationship between high G2M score and high expression of the stress keratins proved more similar to KRT2 and KRT10 relative to KRT14 ([153]Figures 2D–2F), suggesting that stress keratins relate better to differentiation than to proliferation. As a potential exception, high levels of KRT6A reads show a stronger association with high G2M scores across the datasets analyzed, though again clearly less so than KRT14 ([154]Figures 2D and 2F). Such findings, overall, do not support the notion that stress keratins are related to an enhanced proliferative character at a cellular level and caution against “automatically” relating tissue-level association with causation. Do stress-responsive keratins “replace” normal differentiation-related keratins? Based on protein-level analyses (e.g., immunostainings), several laboratories have reported that induction of K16 and K17 after skin injury occurs at the expense of K10 in keratinocytes located proximal to the wound edge.[155]^31^,[156]^44 To assess whether keratin replacement occurs at the mRNA level, we first tested whether an increase in stress keratin expression correlates with a decrease in KRT10 expression at the single-cell level. Instead, we found a weak but positive association (R^2 < 0.3) between KRT16 or KRT6 paralogs and KRT10 across all disease datasets ([157]Figures 1H, 1I, [158]S2A–S2C, and [159]S4G). Second, we tested whether expression of KRT10 is lower in cells expressing low or high transcript levels for each of KRT16 and KRT17 (in the 25^th & 75^th percentile subgroups; see [160]Figure S4). We found that high levels of KRT10 transcripts prevail in high KRT6- and KRT16-expressing cells when compared to low-expressing cells. Such an increase is most apparent in PSOR ([161]Figure S4H) and HS and does not occur in CLE (data not shown). We conclude from this that the previously documented replacement relationship between KRT16 and KRT10, if it occurs, may reflect post-transcriptional regulation[162]^18 or differential epitope accessibility,[163]^45 or is specific to acute stress responses in healthy skin.[164]^31^,[165]^32 Stress keratins and keratinocyte differentiation The finding that differentiation-related KRT10 and KRT1 remain tightly regulated yet show co-expression with stress keratin genes in all the diseases surveyed here ([166]Figures 1G–1J′, [167]S3A–S3C, [168]S4G, and S4H) led us to probe for an association between stress keratin expression and epithelial differentiation. There is strong evidence for such an association in plantar skin and in cornea.[169]^33^,[170]^36^,[171]^46 Besides, mutations in stress keratins can cause pachyonychia congenita (PC), a debilitating disease typified by abnormal differentiation in palmar-plantar epidermis and in epithelial appendages including glands, hair, and nail.[172]^47^,[173]^48^,[174]^49 Bulk RNA sequencing of lesional plantar skin biopsies from PC patients[175]^50 and of Krt16 null mice with PC-like PPK[176]^36 showed that many of the differentially expressed genes have a known role in differentiation and map to the epidermal differentiation complex (EDC) in the human genome (chr. 1).[177]^51 To examine this issue at the single-cell level, we devised a strategy that organizes keratinocytes relative to their differentiation state in a way that is applicable across all datasets. To do so, we related a custom basal score (composite of COL17A1, DST, ITGB1, LAMA5, POSTN) capturing a progenitor signature to a custom differentiation score (“diff”; composite of DMKN, DSG1, DSP, KRTDAP, SBSN) capturing terminal differentiation ([178]Figure 3A; see [179]STAR Methods; [180]^8). Keratin genes were intentionally excluded when devising the basal and differentiation scores because they represent the focal point of our analyses. Applying this strategy resulted in a highly concordant partitioning of individual cell clusters in all datasets analyzed ([181]Figures 3B and 3C). For instance, clusters that are “basal score^high” and “diff score^low” feature high levels of KRT15 transcripts while clusters that are “basal score^low” and “diff score^high” feature high KRT2 transcript levels. Clusters showing intermediate basal and differentiation scores correspond to transitional cells with significant levels of both KRT14 and KRT10 reads and a strong G2M^high character ([182]Figures 3B and 3C, [183]STAR Methods).[184]^8 These findings, and others not related here, establish that integrating the custom “basal” and “differentiation” scores can be used effectively to achieve a sound and consistent organization of keratinocyte clusters across all healthy and disease datasets analyzed ([185]Figures S2A–S2G). Figure 3. [186]Figure 3 [187]Open in a new tab Stress-responsive keratins are associated with alternate differentiation paths (A) Schematic accounting for how composite scores were devised to identify keratinocytes with a basal progenitor vs. a suprabasal differentiating character. (B and C) Plot of the “basal” and “differentiation” scores for Seurat keratinocyte clusters in (B) healthy human trunk skin and (C) site-matched psoriatic skin. Gray dashed box identifies the cluster showing highest levels of G2M cells. (D and E) UMAP of psoriasis cells (D) re-clustered using only keratinocytes and then (E) color-coded according to the ratio of “basal” and “differentiation” scores (blue conveys a high basal score; yellow conveys a high differentiation score). (F) Filtered PSOR UMAP (see D) to show the highest expressing KRT6A cells (top quartile). (G) PCA plot using PC_1 and PC_2 from the PSOR keratinocyte datasets, color-coded to reflect the basal score to differentiation score ratio (see E). The plot is overlayed with unsupervised slingshot trajectory analysis that compute two trajectories based on 4 principal components (see [188]STAR Methods). Keratinocytes with (H) low-to-medium KRT6A expression and (I) high KRT6A expression preferentially map to different clusters and slingshot trajectories (see G). (J and K) Differential gene expression analysis comparing cluster 1 (red) vs. clusters 4 + 6 (green), followed by Reactome pathway analysis (K). The table highlights pathways with multiple components enriched in the analysis. In putting this strategy to use, we first computed the ratio between differential and basal scores and applied it to all keratinocytes in the PSOR dataset ([189]Figures 3D and 3E). We next singled out the top quartile of KRT6A-expressing cells ([190]Figure 3F), focusing on this gene because it features high levels of reads across a broader fraction of cells relative to the other stress keratins ([191]Figures 1A–1F). As it turns out, KRT6A reads are highest in cells that also score high for differentiation (see [192]Figures 3E and 3F). The top 4 principal components were next used as input for unsupervised pseudotime trajectory analysis (slingshot analysis) with no predetermined start or endpoint input provided.[193]^52 The computational algorithm identified two potential trajectories that are readily apparent on a 2-dimensional principal component analysis plot ([194]Figure 3G; see [195]STAR Methods). Applying this strategy reveals an intriguing divergence when comparing the top quartile (76–100^th percentile) vs. the rest (0–75^th) of KRT6A-expressing cells in the PSOR dataset ([196]Figures 3H and 3I). Indeed, the top quartile is mainly populated by keratinocytes from cluster 1 in the Seurat analysis, while keratinocytes showing lower KRT6A expression and a high differentiation score map primarily to clusters 4 and 6 ([197]Figure 3J). We next identified differentially expressed genes (>1.5-fold change, adj.p < 0.05) in keratinocytes from cluster 1 vs. cluster 4/6 combined and conducted pathway analysis using the Panther/Reactome enrichment tool.[198]^53 Pathways enriched in KRT6A^high differentiated keratinocytes (cluster 1) relate to innate immunity, cellular respiration, and P53 responses ([199]Figure 3K). Such relationships were previously noticed when studying models involving null mutations or expression of Krt6a/Krt6b, Krt16, and Krt17 mutants.[200]^54^,[201]^55^,[202]^56^,[203]^57^,[204]^58^,[205]^59 Reactome pathways previously implicated in the pathophysiology of chronic inflammatory diseases of the skin, such as aberrant interferon and cytokine signaling,[206]^60^,[207]^61^,[208]^62^,[209]^63^,[210]^64 also emerged from these analyses and should be pursued in future studies. The pathways enriched in KRT6A^low keratinocytes (clusters 4 and 6) also showed relevance to keratinization and differentiation, suggesting that distinct sets of genes control the paths between normal and alternative differentiation. Core transcriptional signatures associated with stress keratin expression To help identify cellular processes promoted by stress keratin gene expression, we compared gene expression in cells that express them at high (top quartile) vs. low levels (bottom quartile) in all datasets included in this study. Differentially expressed genes with a fold-difference >1.5 in either direction (adj. p < 0.05) between the top and bottom subgroups were identified ([211]Figures 4A–4C). When focusing on the KRT16 gene in PSOR, as an example, 913 genes showed differential expression between KRT16^high (top quartile) and KRT16^low- (bottom quartile) expressing cells ([212]Figures 4A and 4B). Further filtering by fold-change yielded 174 genes in total (88 up, 96 down), which were next subjected to reactome pathway enrichment analysis. Such a strategy was applied to all datasets ([213]Figure 4C) and the resulting reactome pathways of significance in at least 4 out of 5 disease datasets are reported for KRT16 ([214]Figure 4D), KRT17 ([215]Figures 4E and [216]S5A), and KRT6A ([217]Figures 4F and [218]S5A). From this, we find an enrichment for genes involved in “keratinization” (R-HSA-6805567) and “metal sequestration by antimicrobial proteins” (R-HSA-6799990) in cells that express KRT16, KRT17, and KRT6A at high levels in all disease datasets ([219]Figures 4D–4F). In line with our previous analysis connecting stress keratins with differentiation, we note that multiple components of the EDC are differentially expressed in KRT16^high, KRT17^high, and KRT6A^high cells ([220]Figures S5C–S5E). Additionally, cells expressing any of the stress keratins also show an enrichment for “innate immune system genes” (R-HSA-168249). On the other hand, cells that express KRT6 or KRT16, but not those expressing KRT17, exhibit differential “expression of cytokine signaling” (R-HSA-1280215) across 4 out of 5 datasets ([221]Figures 4D and 4F). These findings are of great interest given the known role of cytokines in the pathophysiology of PSOR, HS, CLE, and AD, along with studies showing that loss of either Krt16 or Krt17 in mice leads to an altered immune response to various challenges.[222]^36^,[223]^54^,[224]^55^,[225]^65 Figure 4. [226]Figure 4 [227]Open in a new tab Expression of stress keratins is associated with molecular pathways involved in immunity and differentiation (A) Distribution of KRT16 expression in keratinocytes from PSOR dataset. Red lines represent the top and bottom quartiles denoting KRT16^High and KRT16^low expression, respectively. (B) Differential gene expression between KRT16^high and KRT16^low cells in PSOR dataset yielded 914 significantly expressed genes (excluding KRT16 itself). Genes are plotted based on their average log2 fold change (x axis) and pct1-pct2 differential (y axis). Dashed line represents set threshold used in following enrichment analysis (Fold-change>1.5). (C) Differentially expressed genes (adj. p < 0.05) between low (bottom quartile) and high (top quartile) KRT16-expressing cells across all disease datasets. Dashed line represents set threshold used (Fold-change>1.5). (D–F) Differentially expressed genes between low (bottom quartile) vs. high (top quartile) expressing keratinocytes for (D) KRT16, (E) KRT17, and (F) KRT6A were analyzed for reactome pathway enrichment across all disease datasets. Pathways enriched (FDR adj. p < 0.05) in all five (left), and in at least four of the five, disease datasets (right) are reported. Error bars represent mean +/− SEM To extend these analyses, we performed correlation analyses between individual stress keratins and all other genes in each dataset ([228]Figures S5F–S5I). For any given stress keratin, a normal distribution of correlations, centered around 0, is observed ([229]Figures S5F and S5G). Then, we selected genes exhibiting the highest correlation with a given stress keratin (top 100 correlations with r > 0.2; [230]Figure S5G) and scanned for genes that overlap across datasets. As expected (see [231]Figures 1G–1I), stress keratin genes correlate with one another in this analysis. Remarkably so, however, non-keratin genes showing high correlations also emerged, for each keratin, across all the disease datasets. The latter included genes mapping to the EDC, again, further supporting a role for KRT16 and KRT6 in keratinocyte differentiation ([232]Figures S5H and S5I). Additionally, KRT6A and KRT6B showed correlations to genes coding for proteins with roles in EGFR (SH3BGRL3[233]^66) and retinoic acid signaling (CRABP2, FABP5[234]^67). Both these pathways have been implicated in keratin biology and are of interest owing to their role and therapeutic value for inflammatory diseases and for pachyonychia congenita.[235]^68^,[236]^69^,[237]^70^,[238]^71 KRT16 and KRT17 partake in partially distinct gene expression signatures There are striking differences in the phenotype exhibited by Krt16 null[239]^72 and Krt17 null[240]^73 mice and also between cases of PC involving dominantly acting alleles in KRT16 vs. KRT17.[241]^49^,[242]^74 This is in spite of the high degree of sequence conservation between K16 and K17 proteins.[243]^75 The current effort uncovered differences in high correlating genes for KRT17 and KRT16 in the chronic skin disease datasets ([244]Figures 4 and [245]S5). Given these observations, we next aimed to probe for genetic and functional associations that are unique to each of KRT16 and KRT17. To do so, we first tracked the KRT16:KRT17 expression ratio in all Seurat clusters within each datasets. Clusters that had at least 2-fold increase in one stress keratin over the other were retained for further analysis ([246]Figure 5A). In KRT16>KRT17 expressing keratinocytes, we identified multiple genes related to epidermal differentiation, extending findings reported previously ([247]Figure 5B). This effort yielded additional genes of interest, including SFN (14-3-3σ), a known binding partner of K17,[248]^76 K14,[249]^77 and K8/K18[250]^78 but not yet studied in the context of KRT16/K16. In the subset of KRT17>KRT16-expressing keratinocytes, we noticed a strong enrichment for genes whose protein product are involved in translation elongation ([251]Figure S5J). These findings are likely relevant to the role of K17 protein in regulating mTOR and Akt signaling[252]^76^,[253]^79^,[254]^80^,[255]^81 ([256]Figure S5J). Figure 5. [257]Figure 5 [258]Open in a new tab Gene regulatory networks in KRT16^high and KRT17^high across disease datasets Identification of cell clusters showing high (2:1) and low (1:2) KRT16:KRT17 ratios, on average, in all disease datasets. Clusters with a ratio >2 are marked by asterisk and were used for further analysis. (B) Identification of genes shared across datasets for all keratinocyte clusters showing greater expression of KRT16 relative to KRT17 (KRT16:KRT17 >2). This analysis includes 11 keratinocyte clusters across the five diseases analyzed (see A). Remarkably 14 genes, listed at right, are shared across all disease clusters. (C) Distribution of the KRT16-KRT17 ratio across all keratinocytes in the PSOR dataset. Red and green dashed lines represent the 5^th and 95^th percentiles in the distribution. (D) Relating KRT16 (X axis) and KRT17 (Y axis) expression levels for all keratinocytes within the PSOR dataset (see [259]Table 1). Cells within the 5^th or 95^th percentile of this ratio were labeled in red and green, respectively. (E) Violin plots showing the distribution of KRT16-KRT17 ratios across cells in all datasets analyzed. Red line represents cells within the 5^th percentile of the ratio (KRT17>KRT16) and green line represent cells within the 95^th percentile the ratio (KRT16>KRT17). (F and G) STRING-DB network analysis of differentially expressed genes in (F) KRT16^high-KRT17^low and (G) KRT16^low-KRT17^high keratinocytes. Circles denote the differentially enriched genes in the condition, and blue color intensity represents how many disease datasets (out of 5) feature this gene in the differential analysis. Group of genes are labeled by their association to known molecular pathways. We supplemented this analysis by next looking at KRT16 and KRT17 expression in individual keratinocytes making up all datasets. This allowed us to focus on keratinocytes showing the highest expression differences between KRT16 and KRT17, regardless of cluster assignment. To do so, we calculated the ratio of KRT16:KRT17 expression across each dataset, focusing on the top 95^th percentile subsets in the KRT16>KRT17 and KRT17>KRT16 ([260]Figures 5C–5E). We then performed differential expression between the groups and used STRING-DB (see[261]^82 and [262]STAR Methods) to query whether these genes are integrated in specific functional networks ([263]Figures 5F and 5G). For KRT16, we observed associations with genes and pathways previously connected to K16 through biochemical studies in mouse models and cultured cells, including differentiation,[264]^35^,[265]^36 innate immunity,[266]^55 cellular respiration,[267]^57^,[268]^58 and oxidative stress.[269]^83^,[270]^84 Besides these, new associations previously unconnected to KRT16, e.g., interferon signaling,[271]^85 emerged from this analysis. For KRT17, strong associations were observed with the ERK/MAPK,[272]^81 TNF-α,[273]^86 and integrin/β-catenin[274]^87 pathways, again building on previous studies. The reproducibility of both KRT16 and KRT17-related pathways across pathophysiologies provides a compelling roadmap to study the unique roles of these type I keratins in inflammatory diseases. Discussion We reanalyzed single-cell RNA-seq data obtained from the skin of individuals suffering from four significant chronic inflammatory diseases with the simple goal of gaining deeper insight into the properties, significance, and cellular roles of stress keratin genes. Although each of the diseases considered (PSOR, AD, HS, and CLE; [275]Table 1) has a distinct etiology and pathophysiology, all four exhibit the hallmarks of disrupted tissue homeostasis including robust levels of KRT6A-C, KRT16, and KRT17 transcripts. Our analysis uncovered biological themes that recur across disease settings and reinforce and extend previous laboratory studies involving experimental models, in particular, transgenic mouse strains. Three general conclusions come to the fore regarding the significance of stress keratin expression in chronic skin diseases. First, the regulation of stress keratin genes significantly differs from that of keratins expressed in healthy skin, in several respects. Second, expression of KRT6A-C, KRT16, and/or KRT17 convey(s) an altered state of keratinocyte differentiation, along with a clear engagement in inflammatory and innate immune responses, in the skin diseases surveyed. Third, individual stress keratin genes are associated with partially distinct gene networks, and accordingly they (or their protein products) can be expected to exert a differential impact at the cell and tissue levels. At a more granular level, additional conclusions from the analyses reported here are as follows. * (1) Stress keratin transcripts occur at lower levels relative to keratins expressed as part of normal epidermal physiology, namely, KRT5-KRT14 and KRT1-KRT10. Besides, stress keratins do not “replace” the differentiation-related keratins expressed as part of normal epidermal physiology, at least at the mRNA transcript level. * (2) Type I and II stress keratins show a significantly lesser degree of pairwise regulation compared to KRT5-KRT14 and especially KRT1-KRT10. The reasons for this difference are unclear, and it may be that this conclusion applies to skin in particular. More generally, the molecular basis for pairwise regulation of any type I and II keratin gene combination remains a widely open issue. * (3) The connection between stress keratins and enhanced proliferation applies at the tissue but not at the single-cell level, i.e., it is not a cell-autonomous property. Examples of experimental evidence supporting this notion can be found in studies of wound re-epithelialization, where induction of stress keratin expression is spatially segregated from enhanced proliferation,[276]^32^,[277]^88^,[278]^89 and of the response of keratinocytes to retinoids.[279]^90 * (4) Stress keratin genes are consistently co-regulated with genes involved in keratinocyte differentiation, including a sizable number of genes located within the EDC on human chromosome 1.[280]^51 Trajectory analyses suggest that higher level expression of specific stress keratins, e.g., KRT6A, reflect or may even drive an alternative path of differentiation. This possibility was raised in previous studies focused on stress keratin proteins in healing skin post-injury[281]^32 and in the cornea,[282]^33 as well as in the palmoplantar keratoderma-like lesions that arise spontaneously in footpad skin of Krt16 null mice.[283]^36 Finally, our analyses suggest that individual stress keratins may exert a differential impact on pathways activated in stressed keratinocytes. For instance, keratinocytes showing a high KRT16:KRT17 ratio express high levels of genes involved in innate immunity (including type I interferon signaling), keratinization, desmosome cell-cell adhesion, oxidative stress, and glycolysis across several skin diseases settings. There is experimental evidence supporting many of these preferential associations.[284]^34^,[285]^36^,[286]^55^,[287]^83 Conversely, stressed keratinocytes showing a high KRT17:KRT16 ratio exhibit high expression of genes involved in inflammatory responses, e.g., pathways anchored by AP-1/TNF-α/MAPK and by NF-κB, ribosomes, and translation, and reflecting a distinct type of cell junctions. Again, there is experimental evidence supporting these preferential associations.[288]^54^,[289]^65^,[290]^76^,[291]^79 Finally, and consistent with previous reports,[292]^18^,[293]^91^,[294]^92 our analyses confirmed that individual KRT6 paralogs show a distinct regulation. Limitations of the study The findings reported here are derived from analyses of single-cell, genome-wide transcriptomic datasets obtained from human skin. There are limitations associated with any analysis of high-throughput transcriptomics data, at the single-cell level.[295]^93 For this study, however, such limitations were mitigated by the inclusion of several independent and robust single-cell datasets processed and analyzed in parallel, along with separate consideration of five stress-responsive genes. Indeed, the overlapping conclusions we reached from findings derived from the study of distinct genes, datasets, and diseases support the validity of our findings and conclusions regarding stress keratin genes. A potential limitation, still, relates to cell representation in the samples analyzed, in terms of both diseased vs. more normal tissue, as well as the efficiency of collecting various types of keratinocytes as single cells, for instance, terminally differentiated vs. progenitor keratinocytes. Besides, these analyses were strictly focused on mRNA transcripts, and need to be followed up at the protein level. Our findings do not factor in the important role of protein-protein interactions and where they take place within keratinocytes (e.g., nucleus vs. general cytoplasm vs. plasma membrane), and of post-translational modifications. Each of these elements has been shown to impact stress keratin function in vivo.[296]^94^,[297]^95^,[298]^96^,[299]^97 Regarding the associations highlighted by our analyses, we cannot distinguish between correlation and causation nor can we distinguish between positive vs. negative influences. In spite of these limitations, our findings validate and extend previous experimental findings and are rich in leads for future laboratory studies. STAR★Methods Key resources table REAGENT or RESOURCE SOURCE IDENTIFIER Deposited data __________________________________________________________________ Trunk, scalp and psoriasis European Genome-phenome Archive EGAS00001002927; PMID: [300]30355494 Psoriasis (2) GEO DataSets [301]GSE173706; PMID: [302]37308489 Atopic dermatitis GEO DataSets N/A; PMID: [303]37506977 Hidradenitis suppurativa GEO DataSets [304]GSE154775, [305]GSE154773; PMID: [306]32853177 Cutaneous lupus erythematosus GEO DataSets [307]GSE186476; PMID: [308]35290245 __________________________________________________________________ Software and algorithms __________________________________________________________________ Panther over-representation tools [309]https://www.pantherdb.org/ PMID: [310]34717010 PMID: [311]30804569 Reactome knowledgebase [312]https://reactome.org/ PMID: [313]34788843 String-DB API [314]https://string-db.org/ PMID: [315]30476243 R version 4.2.2 [316]https://www.R-project.org/ Seurat version 4.1.1 [317]https://github.com/satijalab/seurat/releases PMID: [318]34062119 Slingshot version 2.2.1 [319]https://bioconductor.org/packages/release/bioc/html/slingshot.html PMID: [320]29914354 Graphpad Prism 9.1.0 GraphPad Software, Boston, Massachusetts USA, [321]www.graphpad.com N/A [322]Open in a new tab Resource availability Lead contact Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Pierre A. Coulombe (coulombe@umich.edu). Materials availability This study did not generate new unique reagents. See “[323]data and code availability” for more information regarding sequencing datasets. Data and code availability The data analyzed in the manuscript has been previously made publicly available and deposited through GEO as of the date of publication. Accession numbers, as well as original publication identifiers are listed in the [324]key resources table. Methodology used to analyze the datasets is further described in the [325]method details section below and is also listed in the [326]key resources table. This article does not report any additional code. Any additional information required to reanalyze the data reported in this article is available from the [327]lead contact upon request. Experimental model and study participant details This study is a reanalysis of previously generated single-cell RNAseq datasets obtained from biopsies of human skin. All relevant peer-reviewed publications are listed in the “[328]key resources table”. All biopsy materials for those studies were obtained through an informed consent process with approval and oversight from Institutional Review Boards (IRB). Consideration was given to age, sex and/or gender, ancestry, race or ethnicity. Method details Description of scRNAseq datasets and their analysis Standard SCTransform + CCA Seurat pipelines were used to analyze, integrate, and cluster the data. Due to variation in pathophysiologies and methodologies when obtaining the data (e.g., body site, representative controls, number of cells in lesional vs. non-lesional sets; see [329]Table 1), each dataset was clustered and analyzed separately. The PSOR and matching non-lesional TRUNK and SCALP datasets were originally published by Cheng et al.[330]^12 The raw data was filtered so as to retain cells with more than 500 features, less than 7,500 features, and mitochondrial content less than 15%. Seurat’s SCTransform “v2” ([331]https://doi.org/10.1186/s13059-021-02584-9) was used for normalization with percent mitochondrial regressed out. The top 30 principal components were used to construct a shared nearest neighbor (SNN) graph, and subsequently the Louvain algorithm was used with resolution parameter of 0.5 for scalp and trunk, and 0.7 for psoriasis, to generate clusters. UMAP was used for visualization and Wilcoxon Rank Sum was used to identify differentially expressed genes between clusters. Clusters with keratin markers KRT10, KRT15, or KRT2 were labeled as keratinocytes and maintained for in-depth analyses. Excluded clusters and their primary markers included Melanocytes (PMEL+), T cells (CD3D+), Langerhans Cells (LYZ+) and an unknown cluster (KRT23+, KRT18+, MMP7+). Values in plots are log-normalized, scaled, and centered values using the following R code (`Counts.matrix |> NormalizeData() |> ScaleData()`) and not using SCTransform residuals. The AD[332]^14 and PSOR2[333]^13 datasets, lesional and non-lesional, were processed using a pipeline highly similar to Cheng et al.[334]^12 Adjustments were made to some of the parameters to better fit the data characteristics resulting from different lab methodologies. Cells with more than 500 features, less than 50,000 counts, and mitochondrial content less than 15% were kept for AD while a cutoff of 60,000 counts used for PSOR2. Sample replicates were integrated using canonical correlation analysis (CCA) from the top 3,000 highly variable genes (see [335]https://doi.org/10.1038/nbt.4096). For cluster identification, 20 principal components and a resolution parameter of 0.6 were used. The same cell types were excluded with the addition of fibroblasts (DCN+) and endothelial (PECAM1+) cells. The HS dataset[336]^16 was filtered to include cells with more than 500 features, less than 50,000 counts, and mitochondrial content less than 15%. 15 principal components and 0.6 resolution were used for clustering. The same cell types were used to exclude clusters with the addition of a plasma cell (JCHAIN+) cluster. This dataset did not include a healthy or non-lesional control. The CLE dataset[337]^15 was analyzed similarly to others but samples integrated with SCTranform + CCA instead of Harmony. Consistent with the original data treatment,[338]^15 we retained cells with more than 200 features, less than 3,500 features, and mitochondrial content less than 15%. The top 20 principal components and 0.6 resolution were used for clustering. For the analyses reported here, smooth muscle cells (ACTA2+), mast cells (TPSB2+) and neurons (MPZ+) were excluded along with the non-keratinocyte cell types mentioned above. Due to the low number of cells available relative to the other datasets, clusters were generated using the compilation of healthy controls, non-lesion and lesion cells as was done in the original publication.[339]^15 Analyses focused on all keratinocytes across clusters in each dataset The clusters resulting from Seurat processing were visualized via UMAPs, as shown in [340]Figure S1. Seurat clustering was primarily used to delineate keratinocytes for additional follow-up analyses. Analyses across all keratinocyte clusters (see [341]Figures 1, [342]2, [343]4, and [344]5C–5G) was performed by evaluating keratin expression values in single cells within all keratinocyte clusters. Mean expression, expression quartiles and percentiles were calculated based on gene expression in keratinocyte clusters. Generation of transcription signature scores Composite scores were generated using the arithmetic mean of log expression values for a set of markers. For the “G2/M score” ([345]Figure 2) the gene markers used were MKI67, TOP2A, CDC20, BUB3, and CCNB2, as previously described.[346]^8 For the “S Phase” score ([347]Figure 2), the gene markers used were RPA2, UHRF1, ATAD2, RFC2, and RRM2, again as previously reported.[348]^8 Tissue-level analysis of proliferation ([349]Figures 2C–2C″) was done by relating KRT16 expression in keratinocytes (% cells with expression value > 0) to the average of the composite ‘G2M score’ per disease dataset. For each dataset, the ‘G2M score’ was normalized to the matching non-lesion dataset. To create a composite describing the association of individual keratin genes with G2M scores, we devised a method to reduce the dimensionality of G2M vs. keratin association. First, G2M^high cells were selected in each disease dataset using a threshold of Mean+2SD. Then, for each keratin, we identified cells with higher-than-average expression of keratin (>mean of all expressing cells) per dataset, denoted as Keratin^high. Utilizing these metrics, we calculated the fraction of Keratin^high G2M^high double-positive cells from all G2M^high cells in each dataset (see [350]Figures 2F and 2G; additional examples are provided in [351]Figures S4F–S4F‴). Finally, the genes used for calculating the “Basal score” ([352]Figure 3) were POSTN, ITGB1, LAMA5, COL17A1, DST, and the genes used to calculate the “Differentiation score” ([353]Figure 3) were DSP, DMKN, KRTDAP, DSG1, SBSN, as described before.[354]^8 Keratinocyte trajectory and PCA analysis Trajectory analysis was performed using the R/Bioconductor slingshot package.[355]^52 After the keratinocytes were filtered in as described in the “analysis of scRNAseq data across datasets” section, principal component analysis was performed on the log-normalized, scaled, and centered data. Embeddings from the first 4 principal components were input into slingshot without supplying a start or end cluster. The parameter `allow.breaks` was set to FALSE to prevent branching of the trajectory and the parameter `stretch` was set to 0 to prevent extrapolation beyond the determined endpoints. Quantification and statistical analysis The number of cells analyzed per dataset are reported in [356]Table 1. Differential gene expression was determined using pre-set thresholds as described for each analysis related in the main text and in figure legends (adj.p < 0.05, fold-change >1.5). Differential expression was performed using Wilcoxon RankSum tests through the FindMarkers Seurat function. Resulting genes were processed for further analysis using Panther API[357]^53 using over-representation tools[358]^98 with reactome pathway knowledgebase.[359]^99 All results reported were filtered for FDR adj.p < 0.05. Results from differential gene expression between type II stress keratins ([360]Figures 5C–5G) were analyzed and plotted using String-DB API.[361]^82 To generate the map, full string network is shown, with lines indicating confidence (with minimum confidence set at >0.4). Text-mining connections and disconnected nodes were removed from the map and bubbles were colored to reflect the number of datasets (out of 5) a gene was found significant for during differential analysis. Correlation analysis ([362]Figures S6D–S6G) was performed by relating linear correlation for each gene in the dataset against the selected keratin. Pearson correlations (r) are provided within figures and in the “[363]results” section of the manuscript. Data was analyzed using R version 4.2.2, Seurat version 4.1.1 and slingshot version 2.2.1. All graphs were made using Graphpad Prism 9.1.0 with the exception of [364]Figures S2A–S2G (R 4.2.2 ggplot2), and Venn diagrams (Ghent University Venn Diagram API at [365]https://bioinformatics.psb.ugent.be/webtools/Venn). Linear correlations (r and R^2) and statistical analysis reported within figure panels and in the “[366]results” section were calculated using Graphpad Prism. Acknowledgments