Abstract

   Neutrophils are frequently studied in mouse models, but the extent to
   which findings translate to humans remains poorly defined. In an
   integrative analysis of 11 mouse and 13 human datasets, we find a
   strong correlation of neutrophil gene expression across species. In
   inflammation, neutrophils display substantial transcriptional diversity
   but share a core inflammation program. This program includes genes
   encoding IL-1 family members, CD14, IL-4R, CD69, and PD-L1. Chromatin
   accessibility of core inflammation genes increases in blood compared to
   bone marrow and further in tissue. Transcription factor enrichment
   analysis implicates members of the NF-κB family and AP-1 complex as
   important drivers, and HoxB8 neutrophils with JunB knockout show a
   reduced expression of core inflammation genes in resting and activated
   cells. In independent single-cell validation data, neutrophil
   activation by type I or type II interferon, G-CSF, and E. coli leads to
   upregulation in core inflammation genes. In COVID-19 patients, higher
   expression of core inflammation genes in neutrophils is associated with
   more severe disease. In vitro treatment with GM-CSF, LPS, and type II
   interferon induces surface protein upregulation of core inflammation
   members. Together, we demonstrate transcriptional conservation in
   neutrophils in homeostasis and identify a core inflammation program
   shared across heterogeneous inflammatory conditions.

   Subject terms: Neutrophils, Immunogenetics, Immunogenetics, Innate
   immunity
     __________________________________________________________________

   Difficulties can be encountered when translating research between cells
   from animals and humans because of gene expression differences. Here
   the authors perform an integrative transcriptomic analysis from human
   and mouse neutrophils and identify a core inflammation program shared
   across inflamed contexts.

Introduction

   Neutrophils mediate homeostatic and inflammatory processes and display
   substantial phenotypic and functional heterogeneity. While animal
   models fuel fundamental discoveries in immunology, differences between
   humans and mice can impair the translation of findings^[38]1. To
   maximize impact on human health, life sciences increasingly benefit
   from seamless transitions between the mouse and human system. However,
   due to structural and functional differences in genomes, it is often
   unclear which aspects reflect conserved biology. Therefore, integrative
   analyses of cellular systems across species are important for the
   success of translational research.

   Structurally, the mouse and human genomes are closely related. They
   harbor ~16,000 protein-coding genes considered to be one-to-one
   orthologs with high confidence^[39]2. However, structural orthology
   does not equal functional similarity since expression patterns of
   orthologous genes can deviate substantially across organs and
   development^[40]3. In leukocytes, expression of most orthologous genes
   and lineage-specific genes, in particular, is well-conserved between
   humans and mice^[41]4. Despite this overall similarity, different
   species can display substantial differences in ortholog expression
   between tissues^[42]5. For example, human neutrophils are highly
   abundant in defensins, yet their mouse orthologs are expressed in gut
   epithelial cells, not in neutrophils. Furthermore, neutrophils display
   high phenotypic and functional heterogeneity as a function of organ,
   maturation, and inflammatory condition^[43]6–[44]9, but whether a core
   inflammation program consisting of genes that become induced across a
   range of inflammatory conditions exists is not known. It is thus
   unclear how similarities and differences between human and mouse
   transcriptomes should be interpreted, particularly in the context of
   different inflammatory conditions.

   To address these gaps in knowledge, we perform an integrative analysis
   of resting and inflamed leukocytes from humans and mice and assess the
   degree of conservation of gene expression. We find that human and mouse
   transcriptomes can be analyzed together and that lineage-specific gene
   expression was closely related between humans and mice. We further
   study how the neutrophil transcriptome changes in inflammation, using a
   wide range of studies covering in vitro and in vivo inflammation as
   well as resting conditions in human^[45]10–[46]21 and
   mouse^[47]12,[48]22–[49]31. While transcriptional responses to
   different activating stimuli are heterogeneous, we identify a core
   inflammation program in neutrophils conserved across species and
   conditions. We predict upstream regulators and find increasing
   accessibility of core inflammation program members in ATAC-seq.
   JunB^−/− HoxB8 cells display a lower upregulation of core inflammation
   genes when stimulated with zymosan compared to wild-type cells. In
   single-cell RNA-seq data from resting and activated neutrophils,
   stimulation with type I and II interferon, G-CSF, E. coli is associated
   with higher expression of core inflammation genes. Further, neutrophils
   from COVID-19 patients with more severe disease display higher
   expression of core inflammation genes. Finally, we validated members of
   the core inflammation program using flow cytometry of stimulated human
   and mouse neutrophils and identified an interplay between tissue of
   origin and stimulation in driving the phenotype of the neutrophil
   inflammatory response. Our approach illustrates that multiple datasets
   of mouse and human gene expression data can be effectively combined to
   identify patterns shared across conditions and conserved across
   species. This approach can be transferred to other cell types and
   organisms to facilitate studies comparing gene expression across
   species.

Results

Integrative analysis of leukocyte gene expression across species

   To assess gene expression similarities and differences between human
   and mouse immune cells, we obtained bulk RNA-seq data from six sorted
   leukocyte lineages from the Haemopedia atlas^[50]12,[51]32
   (Supplementary Fig. [52]1). This dataset consisted of a total of 76
   samples of T cells, B cells, dendritic cells, monocytes, NK cells, and
   neutrophils (Supplementary Fig. [53]1). Sequencing depths for samples
   across all lineages are shown in Supplementary Fig. [54]2a, b, and
   detailed quality control metrics are summarized in Supplementary
   Data [55]1. We then integrated gene expression matrices by mapping
   protein-coding, one-to-one orthologous genes with high confidence,
   according to ENSEMBL^[56]33.

   To evaluate the robustness of this approach, we performed a principal
   component analysis on the integrated expression matrix. For each
   lineage, up to 200 lineage-associated genes were selected. Here, sample
   distribution was driven predominantly by lineage, followed by species
   (Fig. [57]1a). As envisioned, lineage-associated gene expression was
   highest in each respective lineage and occurred across species in all
   lineages (Fig. [58]1b). Similarly, clustering of sample-wise Pearson
   correlation coefficients based on these genes was driven predominantly
   by lineage, confirming that in our analytical approach, lineage
   identity dominates species differences (Fig. [59]1c).

Fig. 1. Integrative analysis of leukocyte gene expression across species.

   [60]Fig. 1
   [61]Open in a new tab

   a Principal component analysis based on per-species mean-centered
   log[2](CPM) of lineage-specific genes shows a distribution driven
   predominantly by lineage. b Concordant expression of the top
   lineage-specific genes for each lineage in each species. Shown is the
   average of log[2](CPM) centered expression values in each lineage. The
   number of lineage-specific genes shown is indicated on the left, up to
   200 genes are shown per lineage. For each lineage, 8 genes with the
   highest expression in the respective cell line are labeled. c
   Lineage-specific gene expression dominates the species effect. Shown is
   the clustering of Pearson’s R based on the centered expression of top
   lineage-specific genes. d Neutrophils display the strongest correlation
   of lineage-specific gene expression across humans and mice compared to
   other leukocyte lineages. Gene expression (mean-centered log[2](CPM))
   of lineage-specific genes was defined as above. Pearson correlation
   coefficient and P-value (two-sided) between human (x) and mouse (y)
   gene expression are shown on the top left (B cells: r = 0.65,
   P < 2.2e-16; Dendritic cells: r = 0.22, P = 0.3; Monocytes: r = 0.56,
   P < 2.2e-16; Neutrophils: r = 0.79, P < 2.2e-16; NK cells: r = 0.24,
   P = 0.005; T cells: r = 0.65, P < 2.2e-16). The top 10 most abundantly
   expressed genes are labeled. Source data are provided as a Source Data
   file.

   Correspondingly, expression of key lineage-associated genes was highly
   conserved between humans and mice (Fig. [62]1d), such as CSF3R and
   CHI3L1 in neutrophils, CD19 and CD22 in B cells, CD3 molecules and CD28
   in T cells, NKG7 and GZMA in NK cells, MSR1 and SERPINB2 in monocytes
   and FLT3 and MYCL in dendritic cells. The highest correlation between
   human and mouse gene expression was observed in neutrophils (r = 0.79),
   followed by T cells (0.65), B cells (0.65), Monocytes (0.56), and a
   weaker correlation in NK cells (0.24) and dendritic cells (0.22)
   (Fig. [63]1d).

   This analysis demonstrates that mapping one-to-one orthologs allows an
   integrated analysis of leukocyte transcriptomes across species to
   identify conserved and divergent expression patterns of structurally
   related genes. Of note, although these data indicate a higher
   correlation in neutrophils compared to other lineages, this effect may
   have been influenced by smaller library complexities in neutrophils.

Transcriptional conservation in resting neutrophils

   To systematically analyze which genes display similar and divergent
   expression across species, we integrated transcriptional profiles of
   resting (not activated) neutrophils available through the Sequence Read
   Archive (SRA). In this context, resting neutrophils were defined as
   those isolated from blood or tissue in the absence of disease or
   experimental manipulation. In a total of 84 human and 39 mouse samples,
   we observed a high correlation in overall gene expression,
   transcription factor expression, and lineage-associated gene expression
   across humans and mice (Pearson’s r between 0.78–0.87,
   P < 2.2 × 10^−16) (Fig. [64]2a). These results were remarkably similar
   to those obtained from the more homogenous Haemopedia dataset, further
   illustrating the robustness of this approach even when integrating
   multiple datasets from different sources.

Fig. 2. Conservation of neutrophil gene expression in homeostasis.

   [65]Fig. 2
   [66]Open in a new tab

   a Strong correlation of gene expression between resting human (x) and
   mouse (y) blood neutrophils. Left: all genes (r = 0.84, P < 2.2e-16).
   Middle: transcription factors, highlighted in green (r = 0.87,
   P < 2.2e-16). Transcription factors were retrieved from a curated set
   of transcription factors in ChEA3. The top 5 TFs (based on the sum of
   the average expression in human and mouse) were labeled and highlighted
   in red. Additionally, we manually labeled and highlighted the genes
   JUND, KLF2, ATF3, CEBPA, CEBPB, and CEBPE. Right: lineage-specific
   genes as depicted and defined in Fig. [67]1, highlighted in green
   (r = 0.78, P < 2.2e-16). Neutrophil genes were labeled as in
   Fig. [68]1c and highlighted in red. Shown are (log[2](TPM+1))
   expression values. Pearson correlation coefficients between human and
   mouse gene expression for the three groups as well as P-values
   (two-sided) are shown in the upper left of each panel. b Neutrophil
   lineage-associated genes with orthologs can show concordant or
   discordant expression across species. Gene expression heatmap
   (log[2](TPM+1)) of neutrophil lineage-associated genes that were
   assigned to five different expression profile groups: high expression
   in both species, high expression in human/mouse and low in the other
   species, high in human/mouse and no high-confidence ortholog; see
   “Methods”. Gene-gene pairs of particular importance in neutrophils are
   highlighted (HUMAN SYMBOL: Mouse Symbol). Annotated are Orthology
   relationships between the respective genes as well as species in which
   the gene was detected as lineage-associated. Right, violin plots of
   selected gene-gene-pairs show their expression in individual samples
   for each species. Benjamini-Hochberg adjusted P-values derived from a
   gene-wise likelihood ratio test between two linear mixed models with
   and without species as fixed effect are shown for each highlighted
   gene-gene pair; see “Methods”. Source data are provided as a Source
   Data file.

   We next focused on neutrophil lineage-associated genes and defined five
   GENE: Gene (HUMAN: Mouse) pairs based on their expression patterns. In
   addition to one-to-one orthologs, we considered high-confidence
   one-to-many and many-to-many orthologs.

   Orthologs with high expression in both humans and mice included the key
   neutrophil genes CSF3R (encoding the G-CSF receptor), CXCR2, NCF4
   (neutrophil cytosolic factor 4), the transcription factors MCL1, SPI1
   (encoding PU.1, an essential transcription factor for terminal
   granulopoiesis^[69]34,[70]35) and JUNB, a transcription factor
   prominently expressed in late neutrotime which plays a vital role in
   the inflammatory response of neutrophils^[71]9,[72]36 (Fig. [73]2b). As
   CSF3R, CXCR2 and JUNB expression changes along neutrophil development,
   their concordance in expression might suggest that the analyzed
   neutrophils from humans and mice were of comparable developmental
   stage.

   Orthologs with higher expression in human neutrophils included FCGR3A
   and FCGR3B (encoding CD16A and CD16B, respectively), which both are
   one-to-many orthologs of mouse Fcgr4. This group also included the
   receptor for activated complement (C5AR1) and CXCR1, the receptor for
   CXCL8 (human)/KC (mouse). Genes with higher expression in mouse
   neutrophils included the protease Mmp9, Camp (encoding Cathelicidin
   Antimicrobial Peptide), Il1b, and Retnlg (encoding Resistin-like gamma)
   (Fig. [74]2b).

   Of note, most genes in categories 1–3 were one-to-one orthologs,
   although 13/133 (9.8 %) were one-to-many orthologs. However, well-known
   neutrophil genes without one-to-one orthologs were also identified
   (categories 4 and 5) and included CXCL8 in humans, a cytokine
   abundantly expressed in blood neutrophils, and Ccl6, one of the most
   abundant chemokines in mouse neutrophils (Fig. [75]2b). Enrichment for
   neutrophil-related GO terms was found across all five groups of genes
   (Supplementary Fig. [76]3).

   Thus, while resting human and mouse neutrophils display conserved
   expression of many key neutrophil genes and transcription factors, gene
   expression can deviate substantially in the same lineage between
   species, even for structurally highly related genes.

A core inflammation program is shared across conditions and conserved across
species

   We next assessed how the expression of one-to-one structural orthologs
   changes in different inflammatory contexts. Neutrophils display varied
   phenotypes in homeostasis and inflammation^[77]6,[78]7,[79]9,[80]37,
   but it is unknown if a proportion of the transcriptional
   characteristics of different neutrophil states is shared across
   different inflammatory conditions^[81]9. Here, resting neutrophils were
   defined as above and compared with their respective inflammatory
   condition.

   To identify changes in inflammation, we analyzed 11 studies
   encompassing a total of 46 resting and 66 activated neutrophil samples
   across different conditions (Fig. [82]3a, Supplementary Data [83]2). We
   tested for differential expression of genes with high-confidence
   one-to-one orthologs according to ENSEMBL separately within each study,
   comparing all reported conditions against their own resting controls to
   reduce the effect of technical variation between studies.

Fig. 3. A core inflammation program is conserved across mouse and human
neutrophils.

   [84]Fig. 3
   [85]Open in a new tab

   a Overview of the 11 studies integrated for analysis. Differential
   expression testing was performed independently for each study, resting
   neutrophils within each study were used as control. b A combined
   analysis of the neutrophil response to activation/inflammation
   identifies 179 consistently upregulated (core inflammation program) and
   42 downregulated genes in inflammation. Shown are all (N = 9697) tested
   genes ranked by their −log[10] Benjamini-Hochberg adjusted Fisher
   P-value (adjusted Fisher’s combined test on two-sided P-values from
   individual differential expression analyses for each comparison). The
   500 genes with the lowest P-values were subjected to an additional
   filtering step based on a log[2] fold change cutoff ≥0.5 and ≤−0.5 for
   upregulated and downregulated genes. Highlighted (IL4R, CD14, CD69,
   CD274, CD40) upregulated genes were validated experimentally
   (Figs. [86]6 and [87]7). c 42 genes downregulated in inflammation and
   179 core inflammation genes are shared across studies. Shown are the
   log[2] fold changes across comparisons of genes up- and downregulated
   in inflammation. Rows represent a comparison, columns represent genes
   that passed our meta-analysis thresholds. Columns are arranged by the
   mean log[2] fold change across all comparisons. For each direction, the
   15 genes with the highest absolute log[2] fold change are labeled, as
   well as genes encoding for proteins validated in Figs. [88]6 and [89]7.
   d Core inflammation genes are not expressed in resting neutrophils in
   both species and are induced upon activation. Shown is a heatmap with
   relative expression values (z-score for each gene across samples) of
   the core inflammation genes. Each column represents a gene, and each
   row a sample. P-values: Results of a Benjamini-Hochberg adjusted
   Fisher’s combined test. We labeled the top 20 genes with the lowest
   P-values, genes that were also labeled in (c), and manually labeled
   TRAF1 and JUNB. e Conserved Gene Set Enrichment Analysis based on
   rankings derived from each comparison’s log[2] fold changes. Heatmap
   showing normalized enrichment scores. Only pathways that have been
   significant in more than 50% of comparisons are depicted. Gray fields
   indicate nonsignificant NES values. Source data are provided as a
   Source Data file.

   Compared to controls, inflamed neutrophils displayed 975 (median)
   differentially expressed genes (adjusted P < 0.05, absolute log[2] fold
   change ≥ 0.5). These comprised 621 (median) significantly increased and
   205 (median) significantly decreased genes (Supplementary Fig. [90]4a).
   Both the number of differentially expressed genes and the genes
   themselves were heterogeneous—concordant with the diverse
   transcriptional responses neutrophils can undergo in inflammation.

   We next searched for potential overlap in the inflammatory response
   shared across conditions. Such an overlap may represent a “core
   inflammation program”, from which neutrophils preferentially upregulate
   genes across a broad range of activating conditions.

   We used Fisher’s combined test to obtain a combined test statistic for
   each gene, summarizing individual comparisons from all datasets
   (Supplementary Data [91]3). Based on the elbow of the P-value-rank
   plot, we selected from the top 500 genes with the lowest P-value those
   with absolute log[2] fold change ≥0.5 (Fig. [92]3b).

   A total of 221 genes displayed consistent changes in inflammation
   across studies: 179 genes were upregulated across comparisons (the
   “core inflammation program”), and 42 genes were downregulated
   (Fig. [93]3c). Effect sizes of those 221 up- and downregulated genes
   agreed well across all tested comparisons and across species
   (Fig. [94]3c, Supplementary Fig. [95]4b).

   Core inflammation genes included the IL-1 molecules IL1A and IL1B, the
   LPS co-receptor CD14, the adhesion molecule ICAM1, the lectin receptor
   CD69, CD40, IL4R and CD274 (encoding PD-L1) (Fig. [96]3c, [97]d).
   Downregulated genes in inflammation included the cyclin-dependent
   kinase CDK5R1, TLR5 (encoding Toll Like Receptor 5, an essential
   pathogen recognition receptor^[98]38), CXCR4, CD101, and the member of
   the mitogen-activated protein kinase family MAP3K15 (Fig. [99]3c, d).

   As expression of CD101 and CXCR4 changes throughout neutrophil
   maturation and aging, we compared the fold change of these markers
   between neutrophils activated in vitro and those activated in vivo to
   rule out the effects of differential release from the bone marrow under
   stress. No differences were observed in either marker (Supplementary
   Fig. [100]4c), suggesting that the transcriptional downregulation of
   CXCR4 and CD101 observed during neutrophil activation are
   cell-intrinsic and do not reflect a different maturation stage of
   neutrophils captured in the in vivo studies.

   On the level of individual samples, we could confirm that the group of
   179 core inflammation genes had either weak or absent expression in
   healthy neutrophils and were induced in inflamed neutrophils
   (Fig. [101]3d).

   Gene set enrichment analysis identified a conserved enrichment of
   pathways related to apoptosis, inflammatory response, IL-2 and IL-6
   signaling, IFN-γ response, and TNF signaling via NFKB and KRAS
   signaling (Fig. [102]3e).

   Taken together, this integrative analysis of resting and activated
   neutrophils nominated a core inflammation program in neutrophils which
   is shared across inflammatory conditions and across species.

The core inflammation program is detectable using different analytical
strategies and in single-cell data

   To further test the robustness of the core inflammation program, we
   performed two independent analyses. Using a linear mixed model, we
   observed high replicability of our results, with differentially
   expressed genes (absolute β ≥ 1, P[adj] < 0.05) identified by the
   linear mixed model showing a strong skewing toward low Fisher P-values
   and a π[1]-statistic of 0.71 (Supplementary Fig. [103]5).

   We additionally assessed the replicability of differentially expressed
   genes between all tested comparisons. Median values of the
   π[1]-statistic ranged from 0.06 to 0.60, depending on the study, and,
   importantly, did not show systematic species-driven differences
   (Supplementary Fig. [104]6a). Normalized enrichment scores for
   differentially expressed gene sets were in concordance with
   up-/downregulation of the tested sets across all studies, supporting
   the existence of a shared core inflammation program. Of note, the
   downregulation of specific genes in inflammation was more variable
   across studies and hence less informative (Supplementary Fig. [105]6b).
   Pearson correlation coefficients of log[2] fold change values showed
   strong positive skewing, again pointing toward a core inflammatory
   response across conditions and species (Supplementary Fig. [106]6c).

   As an additional analytical approach, we performed a weighted
   correlation network analysis (WGCNA)^[107]39. WGCNA constructs
   correlation networks and can help to identify clusters of genes
   (“modules”) that are co-expressed across different conditions. It
   identified four modules (19, 5, 8, and 4) with significant enrichment
   for core inflammatory response genes (Fisher’s exact test,
   P[adj] < 0.05). Gene expression within those four modules increased in
   inflammation and contained several members of the core inflammation
   program (Supplementary Fig. [108]7).

   For validation purposes, we analyzed four recent single-cell
   RNA-sequencing datasets that had not been used to derive the core
   inflammation program. These included neutrophils from healthy control
   individuals and those with mild to moderate or severe COVID-19 (Combes
   et al., dataset 1)^[109]40, human neutrophils stimulated with G-CSF,
   IFN-β or IFN-γ (Montaldo et al., datasets 2+3)^[110]41 and mouse
   neutrophils infected with E. coli (Xie et al., dataset 4)^[111]7.

   Expression of most of the 179 core inflammation genes increased in
   inflamed neutrophils (Fig. [112]4a). A gene set was created based on
   the 179 core inflammation genes, and changes in expression were tested
   compared to random background genes with the same expression abundance.
   A significant increase in the core inflammation genes was detected in
   all conditions and was higher in patients with severe compared to mild
   to moderate COVID-19 (Fig. [113]4b). However, examination of the
   expression of the core inflammation program on a single cell level
   indicated heterogeneity within the population of neutrophils, which was
   characterized by the presence of groups of cells with exceptionally
   high or low expression of the defined gene set in inflamed states
   (Fig. [114]4c).

Fig. 4. Validation of the enrichment of core inflammation genes in
single-cell RNA sequencing.

   [115]Fig. 4
   [116]Open in a new tab

   The first dataset (Combes) includes neutrophils of patients without
   (control), mild (M/M), and severe cases (severe) of COVID-19. The
   second dataset (Montaldo I) was derived from in vivo neutrophils from
   healthy controls (control) and patients treated with G-CSF (G-CSF). The
   third dataset (Montaldo II) used in vitro stimulation of umbilical cord
   blood neutrophils with G-CSF, interferon-β, or interferon-γ. The fourth
   dataset (Xie) consisted of neutrophils isolated from different organs
   of mice challenged with E. coli. a Core inflammation genes show a
   higher expression in inflamed samples compared to controls. Shown is a
   heatmap with mean scaled expression values of the core inflammation
   genes across all cells per condition and sample. Rows represent genes,
   columns samples. b Inflamed cells show a higher gene module score for
   the core inflammation program. Each cell has been scored for the
   enrichment of the core inflammation program genes compared to a random
   reference gene set of similar expression. Shown are control samples in
   gray alongside inflamed cells in orange. P-values were derived from a
   maximum likelihood ratio test of linear mixed models. Box plots: Median
   between the 25th and 75th percentile, whiskers extend to 10% and 90%.
   Outliers are shown as dots. Top plot: Statistics are derived from a
   total of 10,782 cells (Control: 1026, M/M: 5732, Severe: 4024) cells
   from N = 56 independent samples (P[Control:M/M] = 0.0109,
   P[Control:Severe] = 0.000013). Top mid: The experiment included a total
   of 48,875 cells (healthy: 12,338, G-CSF: 36,537) from N = 6 independent
   samples (P[Control:G-CSF] = 0.01416). Bottom mid: A total of 26,312
   cells (Control: 4296, G-CSF: 5049, IFN-beta: 8472, IFN-gamma: 8495)
   from N = 4 independent samples were analyzed
   (P[Control:G-CSF] = 9.42 × 10^−5, P[Control:IFN-beta] = 2.67 × 10^−5,
   P[Control:IFN-gamma] = 2.71 × 10^−5) Bottom plot: Calculation was
   performed on 26,239 cells (Control: 8990, E. coli: 17,249) from N = 10
   independent samples (P[Control:E. coli] = 0.0376). c UMAP embedding.
   Coloring by gene set score (top panel) and the cell’s inflammatory
   state (bottom panel) shows an increase in the core inflammation program
   expression in inflamed samples compared to their control. Source data
   are provided as a Source Data file.

The core inflammation program shows conserved transcriptional regulation
across species

   To identify putative regulators of neutrophil activation in
   inflammation, we applied transcription factor (TF) enrichment analysis
   individually to up- and downregulated genes in each study. TF
   enrichment across mouse and human inflamed neutrophils was highly
   consistent in TFs with decreasing (Supplementary Fig. [117]8a) and
   increasing (Supplementary Fig. [118]8b) activity.

   Transcription factors that we found to be enriched in genes expressed
   in resting neutrophils include AKNA, PU.1 (encoded by SPI1), FOXO3,
   FOXO1, TFEB, RARA, and STAT5B (Supplementary Fig. [119]8a).
   Transcription factors that we found to be enriched in genes associated
   with inflamed neutrophils included CSRNP1, PLSCR1, FOS, FOSB, the NF-κB
   components NFKB1/NFKB2, the emergency granulopoiesis transcription
   factor CEBPB and JUNB (Supplementary Fig. [120]8b).

   To reduce this selection of transcription factors to those with the
   highest changes in inflammation, we compared the predicted regulatory
   activity of transcription factors and their respective gene expression
   in inflammation. This analysis highlighted that the genes encoding for
   CSRNP1, JUNB, CEBPB, XBP1, and ETS2 were strongly upregulated in
   inflamed neutrophils while also displaying strongly increased
   regulatory activity (Fig. [121]5a).

Fig. 5. Genes in the core inflammation program are predisposed to be
upregulated with maturation and activation.

   [122]Fig. 5
   [123]Open in a new tab

   a Shown is the mean log[2] fold change of gene expressions across all
   comparisons versus the regulatory activity (inverse logarithm of ChEA3
   Scores per species). Colors and sizes indicate −log[10](P), using the
   Benjamini-Hochberg adjusted P-values from a Fisher’s combined test
   (Fig. [124]3). b–d ATAC-sequencing from the air pouch model^[125]36. b
   Members of the core inflammation program show increasing chromatin
   accessibility (blood vs. bone marrow; membrane and air pouch vs. bone
   marrow or blood). Genes part of the inflammatory response program were
   compared with the list of genes that showed increased/decreased
   accessibility in the depicted comparisons. Asterisks indicate
   significance versus 1000 repeats of an accessibility analysis of random
   RNA-expression-matched background genes (two-sided studentized
   bootstrap: P[1]: 7.26 × 10^−8; P[2]: 6.86 × 10^−17; P[3]:
   2.09 × 10^−21; P[4]: 1.56 × 10^−18; P[5]: 3.07 × 10^−22). Bottom-right:
   Two-sided Fisher’s exact test on the number of core inflammatory genes
   with increased accessibility versus no increase. FDR-adjusted P-values.
   c Core inflammation program gene accessibility for each comparison.
   Rows ordered by nested decreasing rank of peaks associated with an
   increase, with both, with none, and with a decrease. d A subset of
   transcription factors shows increased motif enrichment and increased
   gene expression in inflamed neutrophils. Motif enrichment analysis was
   performed using HOMER and is compared to the mean log[2] fold change
   across species. A one-sided motif enrichment analysis was run
   separately for increased and decreased accessibility to calculate
   [MATH: <mi>l</mi><mi
   mathvariant="normal">n</mi><mrow><mo>(</mo><mrow><mi>P</mi></mrow><mo>)
   </mo></mrow> :MATH]
   . ρ, Spearman’s rank correlation coefficient with its respective
   P-value (two-sided). e Analysis of HoxB8 cells that were treated with
   zymosan^[126]36. Core inflammation genes show upregulation versus
   random genes in zymosan-stimulated HoxB8 cells and are downregulated in
   JunB^−/−. Core inflammation genes in HoxB8 neutrophil RNA-Seq
   data^[127]36. One row per comparison. Left, experimental conditions for
   each comparison and respective statistics. Middle, volcano plots for
   each comparison. Red, Core inflammation program members. Members with
   the highest combined significance and effect sizes are labeled. Right,
   histograms showing the percentage of expression-matched background
   genes (equally sized gene sets, 1000 simulations) up/downregulated (see
   x-axes) in each comparison. Red arrow indicates observed percentage for
   core inflammation program members, annotated with its
   P[overrepresentation] (one-sided, Wallenius method^[128]99). Source
   data are provided as a Source Data file.

   On the level of individual studies, we also found high consistency in
   the transcription factors predicted to be enriched in genes upregulated
   and downregulated in activated neutrophils (Supplementary
   Fig. [129]8c). These results were consistent with an independent
   enrichment analysis performed separately for each species
   (Supplementary Fig. [130]8d, e).

Migration into tissue and activation significantly enhance chromatin
accessibility and expression of core inflammation genes

   If genes in the core inflammation program are predisposed to be
   upregulated, then chromatin accessibility for these genes should
   increase upon neutrophil maturation, migration into tissues, and
   exposure to inflammatory stimuli.

   To test this hypothesis, we analyzed chromatin accessibility data
   derived from bone marrow, blood, and an air pouch model of acute
   inflammation. These data were generated using Assay for
   Transposase-Accessible Chromatin using sequencing (ATAC-Seq), a method
   that tests genome-wide chromatin accessibility. Briefly, ATAC-seq
   allows the analysis of chromatin accessibility by sequencing DNA
   fragments that are bound by a hyperactive Tn5 transposase, which
   preferentially inserts sequencing adapters into open chromatin
   regions^[131]42. In the air pouch model (executed on C57BL/6J mice),
   blood neutrophils first migrate into a sterile membrane in the skin
   before being activated by zymosan in the air pouch^[132]36. Of the 179
   core inflammation program genes, 29 displayed increasing accessibility
   in blood vs. bone marrow, compared to only 10 genes with decreased
   accessibility (Fig. [133]5b). Neutrophils that had transmigrated from
   the blood into the membrane displayed enhanced accessibility of 78
   genes. This increase was significantly (P = 5.1 × 10^−9) higher than
   the increase of 29 genes for neutrophils in blood compared to bone
   marrow.

   Similar skewing toward enhanced accessibility of core inflammation
   genes was observed when membrane vs. bone marrow (89 up,
   P = 1.2 × 10^−13), inflamed air pouch vs. blood (85 up,
   P = 1.5 × 10^−11), and inflamed air pouch vs. bone marrow (100 up,
   P = 2.1 × 10^−17) were compared to blood vs. bone marrow.
   (Fig. [134]5b). The number of genes with increased accessibility was
   significantly higher than expected by chance, as compared to the
   accessibility of randomly selected background genes. Importantly, the
   genes with increased and decreased accessibility were highly consistent
   across the comparisons (Fig. [135]5c).

   After finding that core inflammation genes have increased chromatin
   accessibility even before the onset of inflammation, we searched for
   potential driver transcription factors displaying increasing expression
   and regulatory activity in inflammation. Comparing motif enrichment
   (HOMER) with actual expression change in air pouch vs. blood, we
   observed an increase in both measures for a remarkably restricted set
   of transcription factors, namely ATF3, BATF, FOSL1, JUNB, and JUN
   (Fig. [136]5d).

   We next investigated whether the core inflammation program represents a
   group of genes from which neutrophils preferentially draw upon exposure
   to inflammatory stimuli. If this were the case, then it would be more
   likely for core inflammation genes to be upregulated in inflammation
   compared to all other genes. We analyzed RNA-seq data from
   differentiated HoxB8 neutrophils stimulated with or without zymosan for
   2 h^[137]36. We observed that in activated neutrophils, a significantly
   higher proportion of core inflammation genes (107/179 ≈ 60 %) was
   upregulated than expected by chance (36–74 genes in 1000 simulations
   using expression-matched background genes;
   P[overrepresentation] = 6.5 × 10^−41) (Fig. [138]5e).

   When evaluating predicted conserved regulatory activity and change in
   chromatin accessibility together, JUNB emerged as a prominently
   affected transcription factor and has previously been shown to control
   neutrophil activation^[139]36 and to be highly expressed upon
   neutrophil activation^[140]43. On the other hand, CEBPB has previously
   been shown to be a key transcription factor mediating emergency
   granulopoiesis^[141]44 and showed a high predicted regulatory activity
   in our analysis with limited changes in chromatin accessibility. To
   assess the impact of two transcription factors identified in our
   enrichment analysis on the expression of core inflammation program
   genes, we repeated the same analysis in differentiated HoxB8
   neutrophils carrying a genetic knockout of either JunB or Cebpβ. CEBPB
   showed upregulation in inflamed neutrophils as well as increased
   regulatory activity. In addition, JUNB, which plays an important role
   in the inflammatory response of neutrophils^[142]9,[143]36, also had
   increased motif enrichment in the air pouch vs. blood comparison.

   Based on these analyses, we expected a modest reduction in the
   expression of core inflammation genes in Cebpβ^−/− cells and a stronger
   reduction in JunB^−/− cells. Indeed, this was the case: In a direct
   comparison of resting knockout (JunB^−/− and Cebpβ^−/−) versus
   wild-type cells, we observed a significantly stronger downregulation of
   the core inflammation program in JunB^−/− cells (69 genes;
   P = 1.5 × 10^−9) than in Cebpβ^−/− cells (43 genes; P = 0.0011)
   (Fig. [144]5e and Supplementary Fig. [145]9).

   Comparing zymosan-stimulated knockout cells versus wild-type cells, we
   again saw a significant downregulation of core inflammation genes in
   the JunB^−/− condition (51 genes; P = 0.0025) but not in the Cebpβ^−/−
   condition (25 genes; P = 0.79) (Fig. [146]5e and Supplementary
   Fig. [147]9).

   Together, these results indicate that maturation and migration into an
   inflamed tissue site predispose neutrophils to upregulate genes of the
   core inflammation program and that knockout of Cebpβ and especially
   JunB leads to a weaker induction of core inflammation genes compared to
   WT cells.

Members of the core inflammation program can be validated on the protein
level in activated human and mouse neutrophils

   To validate members of the core inflammation program experimentally, we
   filtered the list of genes by surface proteins, yielding 36 markers
   (Fig. [148]6a)^[149]45. Based on antibody availability, we developed a
   flow cytometry panel including canonical lineage markers (human: CD15,
   mouse: Ly6G) and five proteins predicted to be part of the core
   inflammation program: CD14, CD69, CD40, CD274 (PD-L1) and IL-4R
   (Supplementary Tables [150]1 and [151]2).

Fig. 6. Experimental validation of the core inflammation program on the
protein level.

   [152]Fig. 6
   [153]Open in a new tab

   a Inflammation-specific protein-coding genes were identified by
   filtering the genes depicted in Fig. [154]3c for the surfaceome as
   previously described^[155]45. Genes that encode proteins selected for
   validation (based on antibody availability and panel design) are
   labeled in red. b Experimental overview. Human and mouse neutrophils
   were isolated from peripheral blood or bone marrow, respectively,
   cultured 48 h, and analyzed in flow cytometry. c Flow cytometry
   analysis of resting and activated mouse and human neutrophils. The
   gating strategy is depicted in Supplementary Fig. [156]11. Significance
   indicates adjusted P-values of a Dunn’s test (two-sided) that followed
   a Kruskal-Wallis H test (significant for all markers). Exact P-values
   are provided in the Source Data file. d Top, diffusion map embedding of
   neutrophils from humans and mice cultured for 48 h with or without
   GM-CSF + LPS and GM-CSF + IFN-γ. Diffusion map embedding calculated
   based on CD69, CD14, IL-4R, and PD-L1. Bottom, diffusion map embedding
   of human and mouse neutrophils, colored by species. e Diffusion map
   embedding colored by marker expression highlights a continuum driven by
   increasing expression of the activation markers. Source data are
   provided as a Source Data file.

   We isolated human neutrophils from peripheral blood and mouse
   neutrophils from bone marrow and cultured them over 48 h with or
   without the addition of GM-CSF + LPS and GM-CSF + IFN-γ (Fig. [157]6b).

   Prolonged cell culture without activation led to an increase in CXCR4
   and loss of CD62L and CD101 in human cells, while mouse cells showed a
   reversed phenotype with upregulation of CD62L and CD101 as well as a
   downregulation of CXCR4, suggesting continued maturation of bone marrow
   neutrophils in vitro and not classical neutrophil aging (Supplementary
   Fig. [158]10). Compared to unstimulated cells, activated mouse
   neutrophils significantly upregulated the predicted core inflammation
   program markers CD14, CD40, CD69, PD-L1, and IL-4R in the condition
   containing LPS and all but CD69 in the condition containing IFN-γ
   (Fig. [159]6c). Human neutrophils displayed a highly concordant
   increase in those markers. CD69 and PD-L1 increased with similar
   magnitude, while upregulation of CD14 was stronger in mouse neutrophils
   compared to human neutrophils. In human neutrophils, upregulation of
   CD40 was restricted to a small (~2%) population of neutrophils (in line
   with previous findings^[160]46) but reached significance on the bulk
   level for both stimulations (Fig. [161]6c).

   Differences were also noticeable between inflammatory conditions. In
   mouse neutrophils, the combination of GM-CSF and LPS led to a stronger
   increase in the expression of CD14, CD69, IL-4R, and CD40 compared to
   GM-CSF and IFN-γ. The reverse was true for PD-L1, which is driven
   substantially by IFN-γ signaling^[162]8. Further, IFN-γ stimulation
   reduced CD69 expression, while LPS increased it. In human neutrophils,
   the combination of GM-CSF and IFN-γ leads to stronger increases in
   CD14, CD69, IL-4R, and PD-L1 than the combination of GM-CSF and LPS.

   A combined diffusion map analysis revealed a high degree of overlap
   between mouse and human neutrophils, while cell distribution was driven
   predominantly by experimental conditions (Fig. [163]6d).
   Correspondingly, activated neutrophils of both species displayed a
   continuous upregulation of the inflammatory response markers
   (Fig. [164]6e).

   These findings confirm the predicted activation markers, further
   substantiating the conservation of inflammatory response programs in
   neutrophils while also revealing differences between species and
   inflammatory conditions.

Neutrophil origin and inflammatory condition influence the expression of the
core inflammation program

   Neutrophil heterogeneity is influenced by the tissue
   microenvironment^[165]6,[166]9. To evaluate the impact of tissue origin
   on the phenotype of neutrophils in inflammation, we performed
   stimulation experiments with paired leukocyte preparations from blood,
   bone marrow, and spleen of wild-type BL6 mice. In a principal component
   analysis of flow cytometry data, resting neutrophils clustered closely
   together, but each tissue remained distinguishable based on subtle
   baseline expression differences in IL-4R, CD69, and CD40
   (Fig. [167]7a). Inflamed neutrophils deviated markedly from their
   resting counterparts and reached distinct states as a function of
   tissue and inflammatory condition (Fig. [168]7a).

Fig. 7. Neutrophil origin and cytokine stimulation determine the
manifestation of the core inflammation program.

   [169]Fig. 7
   [170]Open in a new tab

   a Left: Principal component analysis performed on the median marker
   expressions per sample reveals distinct phenotypes of neutrophils
   driven by organ of isolation (color coded) as well as cytokine
   stimulation (line-segment coded). Right: PCA plots colored by the
   scaled expression for each of the quantified markers. b Diffusion Map
   embedding of neutrophils from bone marrow, blood, and spleen cultured
   for 8 h with or without GM-CSF + LPS and GM-CSF + IFN-γ. Embeddings
   were calculated based on the expression levels of CD69, CD14, IL-4R,
   PD-L1, and CD40. c Diffusion Map embedding of resting and
   cytokine-stimulated neutrophils (as in b) from bone marrow, blood, and
   spleen, colored by marker expression. Source data are provided as a
   Source Data file.

   Neutrophils from all tissues upregulated CD69 and IL-4R, suggesting
   that these markers can be utilized as neutrophil activation markers
   across a variety of conditions (Fig. [171]7a). In contrast, expression
   of CD40, CD14, and PD-L1 showed greater tissue dependence. CD40
   (evident most prominently in mouse neutrophils) was robustly
   upregulated in splenic neutrophils and less prominently in blood
   neutrophils. Conversely, CD14 and PD-L1 expression was inducible to a
   greater extent in blood neutrophils and bone marrow neutrophils but
   less in splenic neutrophils.

   We also noted differences related to activating stimuli, for example,
   through more prominent PD-L1 induction by IFN-γ compared to LPS. The
   single-cell analysis highlighted a continuum of states in all organs
   (Fig. [172]7b), driven by increasing expression of the core
   inflammation markers (Fig. [173]7c). Importantly, the core program was
   already inducible in bone marrow neutrophils, suggesting that in vitro
   and adoptive transfer experiments performed with bone marrow
   neutrophils can recapitulate important features of neutrophil biology
   in inflammation.

Discussion

   Neutrophils are important mediators of immune defense and protagonists
   in immune-mediated diseases. Mouse and human neutrophils differ in
   morphology, frequency in blood (humans ~50–70%, mice ~10–25%), and
   expression of marker proteins. For example, mouse neutrophils are
   defined by surface expression of Ly6G, not present in the human genome,
   whereas mouse neutrophils lack expression of defensins^[174]47.

   Both in humans and mice, neutrophils are phenotypically heterogeneous
   across different tissues and inflammatory
   conditions^[175]37,[176]48,[177]49. Recent studies suggest that
   neutrophil heterogeneity in homeostasis is driven by a chronological
   sequence of maturation and activation termed neutrotime, whereas the
   combination of aging, tissue factors, environmental features, and
   inflammatory signals promote their polarization toward distinct
   states^[178]6,[179]7,[180]9.

   While the neutrotime signature can be detected in both species and this
   overarching principle of neutrophil ontogeny is likely conserved across
   humans and mice, it is poorly understood which features of the
   neutrophil inflammatory response are shared across species.
   Furthermore, it is unclear which aspects of the neutrophil inflammatory
   response reflect a general inflammatory response program shared across
   multiple inflammatory conditions and which features are highly specific
   to certain triggers or sites of inflammation.

   To address these gaps in knowledge, we performed an integrative
   analysis of resting and inflamed RNA-seq samples from humans and mice.
   We validated our computational approach by comparing gene expression
   conservation across six immune cell lineages: T cells, B cells,
   monocytes, dendritic cells, NK cells, and neutrophils. Expression of
   lineage-specific genes was generally well-conserved across humans and
   mice. Intriguingly, neutrophils displayed both the greatest number of
   lineage-specific genes and the highest correlation of gene expression
   between mice and humans, suggesting a higher degree of conservation in
   this phagocytic cell compared to other lineages.

   While different inflammatory conditions induced highly heterogeneous
   responses in neutrophils, our combined analysis allowed us to predict a
   core inflammation program conserved across mice and humans. The
   robustness of this program was underscored by the high concordance
   between the gene set derived from Fisher’s combined test and
   complementary approaches based on a linear mixed model as well as
   weighted correlation network analysis (WGCNA^[181]39,[182]50,[183]51).
   It is important to note that different analytical strategies may be
   used to derive this core inflammation program, each detecting a varying
   number of genes. The situation is similar for differential gene
   expression in general, which depends on the chosen method, as has been
   reviewed extensively^[184]52. Nevertheless, our analysis indicates that
   a group of genes exists from which neutrophils preferentially draw when
   they become activated across humans and mice and across a large range
   of conditions and disease states. The conservation of a small set of
   transcription factors predicted to regulate a broad variety of
   conditions across humans and mice highlights the conserved nature of
   gene expression in neutrophils.

   To validate the predicted core inflammation program in different
   models, we analyzed differential gene accessibility in ATAC-sequencing
   data from a mouse air pouch model of inflammation. We found a
   significant proportion of core inflammation program genes to be more
   accessible with maturation and in pro-inflammatory conditions. While
   this model is very specific, it covered neutrophils from different
   maturation stages and presented the opportunity to study transmigrated
   and activated neutrophils separately. Further, analysis of the
   transcriptome on a single cell level in both in vivo and in vitro
   inflamed neutrophils of both species allowed us to validate the core
   inflammation program. While the overall enrichment of the proposed gene
   set on a pseudo-bulk level was clearly evident, our analyses also
   suggested significant heterogeneity within the population of inflamed
   neutrophils, consistent with recent
   analyses^[185]7,[186]9,[187]40,[188]41. These analyses further
   highlight the predictive value of the program in a method not used in
   its generation.

   HoxB8-derived neutrophils are a powerful tool to model neutrophil
   function. We assessed the differential expression of zymosan-activated
   HoxB8-derived neutrophils versus control, showing a significant
   overrepresentation of core inflammation genes in activated neutrophils.
   Zymosan-activated myeloid cells through TLR2 and is a commonly used
   pro-inflammatory trigger. The core inflammation program was reduced in
   resting cells carrying a knockout of key regulators of this program
   (JunB^−/− and Cebpβ^−/−). Core inflammation genes were also
   significantly less upregulated in zymosan-stimulated JunB^−/− cells,
   indicating an impaired neutrophil inflammatory response in these cell
   lines. Concordant with previous reports of a more limited impact of the
   Cebpβ knockout on inflammatory neutrophil functions compared to the
   JunB knockout^[189]36, the underrepresentation of core inflammation
   genes was nonsignificant in our analysis.

   Finally, we validated key components of the predicted core inflammation
   program experimentally. Using primary human and mouse neutrophils, we
   showed that the surface proteins CD14, CD69, IL-4R, CD40, and PD-L1 are
   induced by in vitro cytokine stimulation, and this upregulation is
   observable in both species, although CD40 was restricted to a small
   subset of neutrophils in humans, as expected^[190]46.

   This finding further underlines the conserved character of the
   inflammation program as presented in this study. Interestingly, while
   neutrophils from different mouse tissues upregulated the inflammatory
   response markers, the magnitude of upregulation differed across bone
   marrow, spleen, and blood, suggesting that the tissue origin of
   neutrophils is an important consideration in experimental studies.

   Recently, Jin et al. identified a distinct neutrophil population termed
   “antigen-presenting aged neutrophils (APANs)”^[191]53. In humans, this
   population was characterized as CD66b^+CXCR4^+CD62L^loCD40^+CD86^+,
   while in mice, they were identified as
   Ly6G^+CXCR4^+CD62L^-/loMHCII^+CD40^+CD86^+. APANs were capable of
   inducing CD4 T cell proliferation via IL-12 and exhibited a
   hyper-NETosis phenotype. The presence of these neutrophils in patients
   with sepsis was associated with increased mortality. While we also
   observed the upregulation of key marker genes like CD40 in our study’s
   core inflammation program, APANs displayed distinct features, such as
   elevated levels of CXCR4 and coexpression with CD74, suggesting a
   unique neutrophil polarization state discriminable from both neutrophil
   aging and canonical activation. The phenotype observed by the authors
   suggests the importance of further studying APANs, their features and
   their role in antigen presentation in humans and mice.

   The upregulation of IL-4R we observed is concordant with reports of
   IL-4R upregulation during sterile information in mice, with
   implications for diseases that are IL-4 mediated^[192]54. CD14 has
   recently been shown to be an important, highly cell-specific mediator
   of TNF response in a mouse sepsis model^[193]55. Interestingly, CD14^+
   macrophages and neutrophils were found to be key players leading to
   lethality in response to TNF (with improved survival in CD14-deficient
   mice), which provides a model for the cytokine storm seen in severe
   sepsis and provides evidence for the complexity of CD14-mediated
   inflammatory response beyond TLR-signaling. These examples highlight
   the importance of core inflammation program members and stress the need
   to study them in a broad variety of inflammatory contexts.

   The question of how well mouse models mimic human immunology is an area
   of ongoing debate. Even the same data can support different
   conclusions^[194]56,[195]57, highlighting the impact of analytical
   decisions. Furthermore, it is important to compare suitable datasets,
   control for batch effects, and make comparisons to varying controls to
   avoid a shared denominator effect^[196]56,[197]58,[198]59.

   In the context of neutrophils, fundamental differences between humans
   and mice exist^[199]60,[200]61. Those differences must be considered
   when using the mouse as a model to study neutrophil function,
   especially in disease, as previously discussed^[201]62. Granule
   proteins found in neutrophils play a key role in defense against
   infection. An important difference in the granule protein repertoire
   includes α-defensins, which exercise antimicrobe^[202]63,[203]64 and
   chemotactic^[204]65 activity and are absent in mouse neutrophils. It is
   also known that mouse neutrophils express less MPO, leading to a more
   limited capability to produce hypochlorous acid compared to their human
   counterpart^[205]66. The importance of cytokine production by
   neutrophils has been increasingly recognized^[206]67,[207]68, with some
   cytokines such as IFN-β and IL-17 apparently expressed in mouse and not
   human neutrophils. The different immunoreceptor reservoir^[208]69 is,
   in part, a result of pathogen responses that are exclusive to the human
   species. For example, human neutrophils express specific CEACAMs that
   mediate uptake of the human-specific pathogen Neisseria
   gonorrhea.^[209]70, which must be taken into account when modeling
   neutrophil responses to this pathogen^[210]71. Taken together, these
   studies provide important context to be taken into account when
   interpreting the core inflammation program identified.

   The derivation of the core inflammation program was limited to bulk
   RNA-sequencing samples since a similar analysis using single-cell
   studies requires datasets that are only now beginning to emerge. To
   circumvent potential batch effects, we focused our analysis on studies
   with internal controls of resting neutrophils, excluding other
   potentially interesting studies containing only neutrophils harvested
   from inflamed sites. Analyzed samples were also limited by technical
   factors, including the known intron retention in neutrophils^[211]72,
   as well as the less complex transcriptome associated with low RNA and
   high RNase content. Furthermore, analysis of single-cell RNA sequencing
   data, the ATAC-seq data from the air pouch model of inflammation, and
   RNA-seq data from zymosan-activated HoxB8 samples represent only
   selected validation strategies in specific modalities of inflammation,
   which might limit the generalizability of some of the findings.

   Nevertheless, our combined analysis of 11 human and mouse neutrophil
   transcriptomic datasets identified a largely conserved transcriptomic
   landscape across species, supporting the use of mouse neutrophils to
   illuminate human biology. We furthermore predicted and experimentally
   confirmed the existence of a core inflammation program conserved across
   human and mouse inflamed neutrophils. This study sets the stage for
   more fine-grained analyses of the epigenome, transcriptome, and
   proteome of neutrophils across varying conditions, which together will
   paint a clearer picture of the neutrophil response to different
   environments. Going forward, genetic perturbations and pharmacological
   interventions to interfere with pathologic neutrophil activation will
   be particularly informative if focused on programs conserved across
   species. The systems biology approach presented here can be transferred
   to other cell types and organisms to facilitate further studies
   comparing gene expression across species.

Methods

Ethics approval

   Research with healthy human participants followed the declaration of
   Helsinki. Blood of healthy donors was collected under IRB-approved
   protocols (Heidelberg S-272/2021 and Heidelberg S-285-2015) approved by
   the ethics committee of the University of Heidelberg, Heidelberg,
   Germany. Informed consent was obtained from all participants.

   Experiments involving animals were conducted under the approval of the
   Animal Care Facility Heidelberg and the Animal welfare officers
   (approval #T66/21) at the University of Heidelberg, Heidelberg,
   Germany.

   We obtained publicly available RNA sequencing data from mouse and human
   leukocytes through GEO (262 samples from 24 studies) and integrated
   these data by mapping orthologous genes. Differential expression
   analysis between resting and inflamed neutrophils was performed
   separately for each dataset, and the core inflammation program was
   derived using Fisher’s combined test. Transcription factor enrichment
   analysis was performed using ChEA3 and DoRothEA and compared to
   chromatin accessibility data from ATAC-seq ([212]GSE161765). The impact
   of Cebpβ and JunB knockout on the core inflammation program was studied
   using RNA-seq data from HoxB8 cells ([213]GSE161765). The core
   inflammation program was validated in stimulated mouse and human
   neutrophils by flow cytometry.

Datasets

   For all analyses, we used the following datasets:

RNA sequencing

   Datasets of interest were identified through a literature search on
   PubMed and the NCBI Gene Expression Omnibus. In total, 262 publicly
   available RNA sequencing samples from 24 studies were included.
     * Lineage atlas dataset (Table [214]1): 76 samples, 40 human samples,
       36 mouse samples. This dataset is a curated subset of the
       Haemopedia RNA-Seq atlas. Human cells were from buffy coats of
       healthy donors, and mouse cells were from blood, bone marrow,
       spleen, and lymph nodes.
     * Neutrophil dataset (Table [215]2): 195 samples (including the 9
       Haemopedia neutrophil samples, Choi J 2019), 136 human samples from
       13 studies, 59 mouse samples from 11 studies. All studies in this
       dataset were selected only to contain neutrophils. A subset of this
       dataset from studies with inflamed samples as well as healthy
       controls was used for differential expression testing and
       inflammatory core signature construction. Other subsets of samples
       from studies not selected for differential expression analysis have
       been used in analyses focusing on healthy control samples
       (Fig. [216]2).

Table 1.

   Lineage atlas dataset
       Lineage     Species N samples
   B cells         Human   8
   Dendritic cells Human   7
   Monocytes       Human   7
   Neutrophils     Human   3
   NK cells        Human   5
   T cells         Human   10
   B cells         Mouse   2
   Dendritic cells Mouse   4
   Monocytes       Mouse   6
   Neutrophils     Mouse   6
   NK cells        Mouse   2
   T cells         Mouse   16
   [217]Open in a new tab

Table 2.

   Neutrophil dataset
   Species         Study name        Used for signature construction N samples
   Human   Adrover JM 2020^[218]10   FALSE                           6
   Human   Catapano M 2020^[219]11   FALSE                           11
   Human   Choi J 2019^[220]12       FALSE                           3
   Human   Franco LM 2019^[221]13    FALSE                           2
   Human   Grabowski P 2019^[222]14  FALSE                           22
   Human   Vecchio F 2018^[223]19    FALSE                           8
   Human   McCreary M 2019           TRUE                            10
   Human   Miralda I 2020^[224]16    TRUE                            16
   Human   Mistry P 2019^[225]17     TRUE                            28
   Human   Ter Haar NM 2018^[226]107 TRUE                            6
   Human   Thomas HB 2015^[227]18    TRUE                            12
   Human   Wright HL 2013^[228]21    TRUE                            6
   Human   Wright HL 2020^[229]20    TRUE                            6
   Mouse   Bhalla M 2021^[230]22     FALSE                           7
   Mouse   Casulli J 2019^[231]23    FALSE                           3
   Mouse   Coffelt SB 2015^[232]24   FALSE                           4
   Mouse   Choi J 2019^[233]12       FALSE                           6
   Mouse   Germann M 2020^[234]26    FALSE                           4
   Mouse   Hsu BE 2019^[235]27       FALSE                           4
   Mouse   Zhu YP 2018^[236]31       FALSE                           3
   Mouse   Gal-Oz ST 2019^[237]25    TRUE                            12
   Mouse   Hutchins AP 2015^[238]28  TRUE                            4
   Mouse   Stasulli NM 2015^[239]29  TRUE                            6
   Mouse   Yan Z 2019^[240]30        TRUE                            6
   [241]Open in a new tab

RNA-Seq of HoxB8 cells

     * Khoyratty TE 2021^[242]36 18 samples

ATAC-Seq

     * Khoyratty TE 2021^[243]36 5 peak annotations

Flow cytometry

     * This study samples from 8 human donors and 9 mice
     * This study samples from different organs of 6 mice

Single-cell RNA sequencing

     * Xie et al., 2020
     * Montaldo et al., 2022
     * Combes et al., 2021

Data retrieval and processing

   We downloaded raw sequencing reads for the selected studies to the
   MLS&WISO bwForCluster using release 1.5 of the nf-core^[244]73 fetchngs
   pipeline and quantified them using release 3.6 of the nf-core rnaseq
   pipeline. The pipelines were launched using nextflow^[245]74
   (v22.04.0). To ensure high reproducibility, all pipeline processes were
   run inside singularity (v3.9.2) containers. For bulk RNA-seq samples,
   we mapped all downloaded samples using salmon^[246]75 (v1.5.2) with the
   parameters libType set to ‘A’ and indexing the reference genomes with
   21 base k-mers. Quantified transcripts were summarized to the gene
   level using bioconductor-tximeta^[247]76 (v1.8.0). All human samples
   were mapped to the GRCh38 genome. All mouse samples were mapped to the
   GRCm39 genome unless stated otherwise. Quality control was conducted
   using FastQC (Supplementary Data [248]1). Author-supplied metadata was
   queried using GEOquery^[249]77 (v2.64.0) and integrated manually to
   ensure consistency across studies (Supplementary Data [250]2). R
   (v4.2.0) was used for downstream analyses. Bioconductor (v3.15) and
   additional packages were used for downstream analyses and
   visualizations^[251]51,[252]78–[253]80. Sequencing depths (total amount
   of mapped reads) for human samples in the lineage atlas dataset ranged
   between 11,970,326 and 16,060,356 (median 13,267,958), for mouse
   samples between 16,076,257 and 33,595,867 (median 18,689,418;
   Supplementary Fig. [254]2a, b). In the neutrophil dataset, sequencing
   depths ranged between 195,712 and 57,144,004 (median 29,741,078) for
   human and between 1,022,486 and 42,005,334 (median 12,429,230) for
   mouse samples. The subset of samples that was selected for differential
   expression testing and inflammatory core program calculation was
   sequenced between 8,135,455 and 57,144,004 (median 33,522,466) for
   human and between 1,022,486 and 36,405,228 (median 6,388,735;
   Supplementary Fig. [255]2c, d) for mouse samples.

   For single-cell RNA sequencing data, the raw sequencing reads were
   downloaded as described above and aligned using cellranger (v7.1.0)
   using the GRCh38 genome for human datasets and mm10 genome for mouse
   samples, respectively. Downstream analysis was carried out in Python
   (v3.10) using the scanpy^[256]81,[257]82 API (v1.9.3) for data analysis
   and visualization.

Orthology analyses and mapping

   For downstream analyses, genes were mapped using ENSEMBL Version
   107^[258]2. We restricted all our composite cross-species analyses to
   protein-coding genes with a high-confidence orthology relationship and
   available gene symbols in both species. Mouse and human gene expression
   datasets were combined based on these orthologs.

Identification of lineage-associated genes

   Lineage-associated genes were identified using a linear model-based
   differential expression test, implemented in limma^[259]83 (v3.52.0)
   and edgeR^[260]84–[261]86 (v3.38.0). Differential expression testing
   was restricted to protein-coding genes that could be assigned
   high-confidence orthologs between human and mouse samples. We
   constructed a cross-species count matrix based on those mappings and
   referred to each mapped gene by its human gene symbol. Counts were
   filtered using edgeR’s filterByExpr filtering approach. We applied TMM
   normalization to account for differences in library composition. We
   then transformed counts to log[2](CPM) values and estimated weights for
   each observation using voom. We applied limma to fit a linear model to
   our data and calculated differential expression for a given lineage
   against all remaining lineages. Lineage-associated genes were defined
   as genes that were differentially expressed in each lineage against all
   other lineages at a Benjamini-Hochberg corrected P-value of ≤0.05 and a
   log[2] FC >0. Genes were ranked according to their F statistic, and up
   to 200 genes were selected per lineage.

Lineage PCA, correlation analysis, and clustering

   We used these balanced lineage-associated gene sets to perform PCA as
   well as correlation and clustering analysis on all samples. Human and
   mouse samples were combined as described above. To emphasize our focus
   on comparisons between lineages, we mean-centered log[2](CPM) for each
   species prior to combining the count matrices. A PCA was computed for
   all integrated samples, taking the concatenated lineage-associated gene
   sets as input features. Correlation of expression analyses was
   performed based on the same features, calculating Pearson’s r
   correlation coefficient for each inter-sample combination. We
   subsequently performed a hierarchical clustering analysis on the
   obtained correlation coefficients.

Comparison of neutrophil lineage gene expression profiles in resting
neutrophils

   To compare expression patterns of neutrophil lineage-associated genes
   in resting human and mouse neutrophils, we first defined
   lineage-associated genes for human and mouse samples separately. We
   defined those genes as lineage-associated that were upregulated
   (Benjamini-Hochberg corrected P-value of ≤0.05 and a log[2] FC >1) in
   neutrophils against all other lineages. We next mapped those gene sets
   to their human and mouse counterparts, considering only high-confidence
   one-to-one, one-to-many, and many-to-many orthology relationships.
   Based on those mappings, we merged all genes detected as
   lineage-specific in either of the considered species. We also included
   genes detected as lineage-specific in either species but could not be
   mapped to a high-confidence ortholog. The obtained genes were subset
   only to include genes that showed evidence of expression in the
   inflammatory dataset.

   Taking the computed mappings and log[2](TPM+1) expression values of
   mapped gene-gene pairs, we tested for differential expression of those
   pairs between species using a linear mixed model^[262]87 (lme4 v1.1-29)
   accounting for study-related batch effects by including the study
   annotation as a random effect:

   Full model: log[2](TPM+1) ~ species + (1|study)

   Null model: log[2](TPM+1) ~ (1|study)

   P-values were computed by performing a likelihood ratio test between
   these models. We subsequently adjusted those values using the
   Benjamini-Hochberg correction method based on the total number of
   tested gene-gene pairs (genes that appeared as lineage-specific in
   either species were expressed in the inflammatory dataset and could be
   mapped to one or more counterparts with high confidence).

   Using the average expression of mapped gene-gene pairs and differential
   expression P-value, we defined 5 different expression profiles: Genes
   that showed high (>95th percentile of all genes that were detected as
   lineage-specific in either species) average expression levels in both
   species and did not exhibit differential expression between species
   (Benjamini-Hochberg corrected P-value ≥ 0.05, absolute beta < 1).
   Additionally, we defined 4 divergent clusters of genes that had high
   expression levels in only one of both species and showed evidence of
   differential expression (Benjamini-Hochberg corrected P-value < 0.05,
   absolute β ≥ 1) or were abundantly expressed but could not be assigned
   an orthologous gene in the other species respectively.

Differential expression testing

   We performed differential expression analyses between inflamed and
   resting conditions on a total of 112 samples from N = 11 (human: 7,
   mouse: 4) studies. To account for potential batch effects between
   studies, we used DESeq2^[263]88 (v1.36.0) in each of the studies
   individually to identify differentially expressed genes in inflamed
   compared to healthy control samples. Each study’s gene list was
   pre-filtered to only include genes with counts >1 in at least 1 sample
   before differential expression analysis, based on the negative binomial
   distribution. To remove noise while preserving significant differences,
   log[2] fold change results were then shrunk using the apeglm
   package^[264]89. Differential gene expression results were additionally
   filtered through DESeq2’s default independent filtering approach, as
   well as its count outlier filtering.

Identification of a core inflammation program

   To assess which inflammation-driven changes in the neutrophil
   transcriptome are shared across conditions and conserved across
   species, we applied a Fisher’s combined test to the adjusted P-values
   of each gene in each study, restricting the analysis to genes that
   passed our expression filter as well as DESeq2 filters in ≥80% of
   studies. This analysis provided a Benjamini-Hochberg-corrected
   composite P-value for all genes and allowed us to rank genes by their
   likelihood of dysregulation in inflammation. Additionally, we
   calculated a mean log[2] fold change for each gene across all studies.

   Based on a rank-P-value plot (Fig. [265]3b), we determined a P-value
   cutoff at a rank equaling 500, corresponding to an adjusted Fisher
   P ≤ 6.164117 × 10^−40. Genes with an absolute log[2] fold change
   greater than or equal to 0.5 and an adjusted P-value below our defined
   threshold were considered conserved in inflammation. We defined the
   upregulated subset (log[2] FC ≥ 0.5) of those conserved genes as the
   core inflammation program.

Pathway enrichment analysis

   Inflammatory pathway enrichment analysis was performed for each study
   individually using the fGSEA implementation of the Gene Set Enrichment
   Analysis method. For each study, the differential expression analysis
   results were ranked by log[2] fold change. Enrichment was calculated
   for hallmark gene sets that were retrieved from the Molecular
   Signatures Database^[266]90 (v7.5.1).

Data preprocessing for single-cell RNA-seq

   Datasets were imported using the raw count matrices from cellranger.
   First, empty droplets were determined by estimating the profile of
   ambient mRNA and testing deviations from this profile using a
   Dirichlet-multinomial model of UMI count sampling as implemented in the
   EmptyDrops method^[267]91 (implemented in the DropletUtils package,
   v1.18.1). Ambient RNA correction was applied using the
   soupX-algorithm^[268]92 (v1.6.2), and doublets were determined using a
   computational doublet detection tool that uses artificially created
   cell doublets to identify real cell doublets by
   nearest-neighbor-analysis in gene expression space^[269]93 (v1.12.0).
   Cells expressing hemoglobin-related genes in a proportion above 0.02
   were excluded, as well as cells containing less than 250 (Xie et al.,
   Montaldo et al.) or 100 (Combes et al.) unique genes per cell. Cells
   with a content above 5% (Xie et al.) or 10% (Combes et al., Montaldo et
   al.) of mitochondrial genes were also excluded. Cell types were
   identified using the SingleR package^[270]94 (v2.0.0) in R with
   BlueprintEncodeData and MonacoImmuneData as reference datasets for
   human datasets and ImmGenData as reference dataset for mouse data, as
   provided by the package celldex^[271]94 (v1.8.0). Data were
   log-normalized, and neutrophils were selected. For UMAP visualization,
   the 2000 genes containing the highest variance were selected and UMAP
   was computed using the scanpy API with default settings.

Gene set enrichment in single-cell RNA-seq

   The gene counts were normalized and log1p transformed using scanpy. The
   enrichment of genes belonging to the core program was quantified using
   the difference between the average expression of the core inflammation
   program genes and the average expression of a random reference set of
   genes^[272]95 that have been sampled to match the expression
   distribution of the core inflammation program (scanpy score_genes
   function with default settings). We tested for enrichment in inflamed
   conditions using a linear mixed model with the sample (and organ for
   the dataset from Xie et al.) as a random effect and the respective
   treatment as a fixed effect. For datasets with multiple inflamed
   conditions, a respective model was created for each comparison
   separately. P-values were calculated by performing a maximum likelihood
   test between both models as described above. For visualization, the
   gene set scores were quantile-capped to the 5^th and 95^th percentile.

Transcription factor enrichment analysis

   In order to identify regulators associated with genes induced or
   downregulated in inflammation, we used ChEA3^[273]96 with default
   settings as described^[274]9, using the 250 most significantly up- and
   downregulated genes, respectively, for each condition, ranked by their
   adjusted P-value. We then calculated the arithmetic mean of the
   negative logarithms of the ChEA3 scores per species and transcription
   factor to compare average TF activity across species.

   We used a paired t-test to assess significant differences between a TFs
   ChEA3-enrichment in up- against downregulated genes across all
   comparisons. The resulting P-values were corrected using a
   Benjamini-Hochberg correction for all tested TFs. We used these
   corrected P-values to determine if a TF was significantly more enriched
   in genes upregulated in inflammation or vice-versa. We subsequently
   inferred transcription factor activity using DoRothEA^[275]97 (v1.8.0)
   and decoupleR^[276]98 (v2.2.2), taking advantage of the
   species-specific transcription factor databases. Here, log[2] fold
   change matrices per species served as input, leading to enrichment
   scores with their respective P-values. For downstream analyses, we
   calculated the mean enrichment scores per species and preserved the
   highest observed P-value for each transcription factor. For data
   visualization (Fig. [277]5a), only transcription factors where the
   respective gene had an assigned P-value in Fisher’s combined test are
   shown (N = 680). Genes encoding transcription factors with a mean
   log[2]fold change ≥0 were merged with transcription factor scores that
   were derived from the upregulated score set (as in Supplementary
   Fig. [278]8b), and genes with a mean log2 fold change <0 were merged
   with transcription factor scores that were derived from the
   downregulated score set (as in Supplementary Fig. [279]8a). Labeling
   was selectively applied for genes encoding transcription factors with
   the 10 highest absolute log[2] fold changes per direction which fall
   under the adjusted Fisher’s combined P-rank cutoff of 500 genes, as
   well as the 5 transcription factors with the lowest P-values and CEBPB.

ATAC-sequencing analysis

   We retrieved ATAC-sequencing data from mice that were subjected to the
   air pouch model of acute inflammation (GEO: [280]GSE161765, mapped to
   the GRCm38 genome). Genes annotated based on differentially accessible
   peaks as defined in the study (P[adj] < 0.05, log[2] fold change > 1.5)
   were compared with the conserved upregulated genes as defined in the
   core inflammation program. The ratio and number of core inflammation
   program genes that were associated with projected increased
   accessibility served as an input for pairwise Fisher’s exact tests, and
   P-values were adjusted using the Benjamini-Hochberg method. For each
   comparison, the significance of the number of core inflammation genes
   with increased accessibility was retrieved by comparing the number with
   the results of this analysis using a 1000-fold repeated random
   selection of expression-matched background genes (as described below
   for RNA sequencing).

RNA-sequencing analysis of zymosan-treated HoxB8 cells

   We retrieved featureCounts (per ENSEMBL-ID) from HoxB8 cells that were
   subjected to differentiation and zymosan treatment (GEO:
   [281]GSE161765, mapped to the GRCm38 genome). Differential expression
   analysis was performed as described in the respective section above. We
   restricted the analysis to HoxB8 cells that were differentiated for 5
   days and then compared (1) wild-type cells that were treated for 2 h
   with zymosan (50 µg/ml) or DMSO (control), (2) resting stable knockout
   HoxB8 cell lines versus wild type, (3) zymosan-treated stable knockout
   HoxB8 cell lines versus zymosan-treated wild-type HoxB8 cells. A
   significant up- or downregulation of core inflammation program genes
   was then assessed by performing pairwise overrepresentation analyses.
   For each overrepresentation analysis, we defined differentially
   expressed genes as genes with an FDR ≤ 0.05 and a |log2 fold change| ≥
   1. We then used goseq (v1.48.0) to calculate a Probability Weighting
   Function for the given set of genes, and calculated P-values by
   approximating the true distribution by the Wallenius non-central
   hypergeometric distribution as previously described^[282]99.

   The control expression was calculated as previously described^[283]95.
   For each comparison, the gene expression was distributed in 25 bins.
   Then, each core inflammation program member was assigned to its
   respective bin. The randomized sets were then sampled according to the
   distribution of core inflammation program gene expressions. This
   sampling was repeated 1000 times.

Experimental validation

   The list of ranked conserved inflammatory response genes was filtered
   to include genes encoding surface proteins using the surfaceome
   resource^[284]45. The remaining N = 69 surface protein-encoding genes
   were then filtered by available human and mouse antibodies (BioLegend),
   and a panel consisting of CD14, CD69, CD40, IL-4R, and PD-L1 was
   selected for validation.

Human samples

   Neutrophils were isolated using density gradient centrifugation with
   Polymorphprep as previously described^[285]8: 30 ml whole blood was
   layered onto 20 ml Polymorphprep (Progen #1114683) and centrifuged at
   535 × g for 35 min. The PBMC-containing layer was discarded by suction,
   and neutrophils were recovered and subjected to hypotonic lysis using
   0.2% NaCl. The cells were subsequently washed with cell culture medium
   (RPMI 1640 (Gibco #21875-034)) supplemented with 10% heat-inactivated
   FBS (PAN Biotech #3302/P101102) and 1% GlutaMAX (Gibco #35050-061) and
   seeded at 5 million cells per 6 wells in a total volume of 5 ml at a
   humidified atmosphere at 37 °C with 5% CO[2]. The cells were cultured
   over 48 h either in the absence of cytokines (vehicle control), with
   GM-CSF + IFN-γ or GM-CSF + LPS. GM-CSF was used at a final
   concentration of 100 U/ml (R&D #215GM), IFN-γ at 10 ng/ml (BioLegend
   #570208), and LPS at 100 ng/ml (Invivogen #tlrl-3pelps). After 48 h, 1
   million cells were collected and stained using the Zombie Yellow
   Fixable Viability Kit (BioLegend #423103) for live/dead discrimination,
   followed by an antibody panel (Supplementary Table [286]1) in 50 µl of
   FACS buffer (2% FBS, 5 mM EDTA and 0.1 sodium azide in PBS) for 25 min.

Mouse samples

   Mice were housed under SPF conditions with a 12 h light/dark cycle, a
   humidity of 50–60%, a temperature of 22 ± 2 °C and food and water
   available ad libitum. Male and female C57BL/6J mice were sacrificed by
   cervical dislocation, and bone marrow was extracted by flushing with
   RPMI. Neutrophils were enriched by density centrifugation using
   Histopaque 1077 (Sigma-Aldrich #10771) and Histopaque 1119
   (Sigma-Aldrich #11191). Cells were recovered from the interphase of
   both Histopaque layers and centrifuged. Cells were washed with RPMI
   containing 10% FBS and 1% Glutamax and seeded at 10^6 cells/ml in 48
   well plates in a total volume of 500 µl. Mouse GM-CSF (Peprotech
   #315-03, 100 U/ml), mouse IFN-γ (Peprotech #315-05, 10 ng/ml), and LPS
   (Invivogen #tlrl-3pelps, 100 ng/ml) were added to the medium for 24 h
   and 48 h in combination as indicated in the respective figures. Cells
   cultured in the absence of cytokines were used as controls. After the
   indicated times, cells were collected and stained with the antibody
   panel (Supplementary Table [287]2) in 50 µl of FACS buffer containing
   2% FBS, 5 mM EDTA and 0.1% sodium azide.

   To assess neutrophils from different organs, mice were sacrificed by
   cardiac puncture under generalized anesthesia. Subsequently, the femora
   and tibiae were flushed with PBS to obtain bone marrow. Any remaining
   fat was removed from spleens, and splenic tissue was mechanically
   disintegrated using the back of a syringe. Cells were pelleted at
   400×g, and erythrocytes were lysed using ACK Lysing Buffer (Lonza
   #10-548E) for 5 min at 4 °C. Cells were seeded at 10^6 cells/ml in 48
   well plates in a total volume of 500 µl. Cytokines were added as
   described above for a total of 8 h.

Flow cytometry

   Flow cytometry was performed on a BD LSRII flow cytometer. At least
   50,000 events were recorded per sample. FCS files were exported by
   FACSDiva and subsequently gated and compensated in FlowJo (v10.8.0) for
   single, living, and CD15^+ (human) and Ly6G^+ (mouse) cells.
   Eosinophils were excluded based on high autofluorescence in the
   live/dead (Pacific Orange) channel (Supplementary Fig. [288]11). Gated
   events and their median fluorescence intensity values were exported and
   concatenated into a single-cell experiment using CATALYST^[289]100
   (v1.16.2) in R (v4.2.0). The dataset was arcsinh transformed using
   manually determined cofactors^[290]8,[291]101 and clustered by FlowSOM
   clustering and Consensus-Plus-Metaclustering. For combined analysis of
   human and mouse cells, both datasets were mean-centered, scaled, and
   combined into one SingleCellExperiment. Dimensionality reduction was
   performed using the DiffusionMap algorithm as implemented in the
   CATALYST package with standard settings. For visualization, a random
   subset of 1500 cells per sample was plotted using ggplot2^[292]102
   (v3.3.5). Principal components were calculated based on the median
   fluorescence values of the respective marker proteins per sample and
   plotted using matplotlib (v3.5.1) in Python (v3.9.1).

Gene ontology enrichment analysis

   We used EnrichR^[293]103–[294]105 to assess enriched gene sets in a
   given list of genes. The analysis was restricted to terms annotated in
   GO_Biological_Process_2021.

Linear mixed-effect model

   To validate the core inflammation program derived from Fisher’s
   combined test, we globally tested for differential expression between
   resting and inflamed cells including all samples used in the Fisher’s
   combined testing approach. We accounted for batch effects by correcting
   gene counts for the study using ComBat-Seq^[295]106 (sva v3.44.0). From
   batch-corrected counts, we calculated TMM-normalized log[2] counts per
   million that were subsequently quantile normalized and then used as
   input for linear modeling. Modeling was implemented using lme4^[296]87
   (v1.1-29) to fit a linear mixed-effects model (LMM) to normalized
   counts. The linear formulae we fit for each gene were defined as
   [MATH:
   <mi>f</mi><mi>u</mi><mi>l</mi><mi>l</mi><mo>:</mo><mi>e</mi><mi>x</mi><
   mi>p</mi><mi>r</mi><mi>e</mi><mi>s</mi><mi>s</mi><mi>i</mi><mi>o</mi><m
   i>n</mi><mo>~</mo><mi>c</mi><mi>o</mi><mi>n</mi><mi>d</mi><mi>i</mi><mi
   >t</mi><mi>i</mi><mi>o</mi><mi>n</mi><mo>+</mo><mn>1</mn><mo>∣</mo><mi>
   s</mi><mi>t</mi><mi>u</mi><mi>d</mi><mi>y</mi> :MATH]
   and
   [MATH:
   <mi>r</mi><mi>e</mi><mi>d</mi><mi>u</mi><mi>c</mi><mi>e</mi><mi>d</mi><
   mo>:</mo><mi>e</mi><mi>x</mi><mi>p</mi><mi>r</mi><mi>e</mi><mi>s</mi><m
   i>s</mi><mi>i</mi><mi>o</mi><mi>n</mi><mo>~</mo><mn>1</mn><mo>∣</mo><mi
   >s</mi><mi>t</mi><mi>u</mi><mi>d</mi><mi>y</mi> :MATH]
   , where the variable to test for was condition, and the study was used
   as the covariate that was considered to be the random effect. We
   retrieved β as an estimate for the log[2](FC) from the full model and
   subsequently performed a likelihood ratio test to compare the full with
   the reduced model and to retrieve the respective P-values. P-values
   were then adjusted using the Benjamini-Hochberg procedure. We accounted
   for batch effects by correcting gene counts using ComBat-Seq^[297]106
   (sva v3.44.0).

π[1]-statistic

   Using the qvalue-package^[298]39,[299]50,[300]51 (v2.28.0), we
   calculated the π[1]-statistic (1- π[0]) as an estimated proportion of
   truly significantly differentially expressed genes for a given set of
   P-values. To account for a selection of genes potentially biased toward
   low P-values when testing for the replicability of DEGs between
   studies, qvalue-calculation was implemented via the qvalue_truncp
   function.

Gene expression modules using WGCNA

   For WGCNA^[301]39 analysis, we selected the same samples that were used
   for differential expression testing. We accounted for batch effects by
   correcting gene counts using ComBat-Seq^[302]106 (sva v3.44.0). From
   batch-corrected counts, we calculated TMM-normalized log[2] counts per
   million that were subsequently quantile normalized and then used as
   input for WGCNA. The network was constructed as a signed network, using
   a soft thresholding power of 13, a minimum module size of 30, and a
   merge cut height of 0.25. Modules with more than 1000 genes were
   removed from subsequent analyses.

Statistical analyses

   Correlations indicated on scatter plots represent Pearson’s R
   (Pearson’s correlation coefficient) with their respective P-value. For
   comparisons of the mean, we used the Mann-Whitney U test (two groups)
   or Kruskal-Wallis H test (three groups) if the Shapiro-Wilk test
   indicated non-normality in at least one group. When multiple
   comparisons were performed, P-values and/or asterisks indicate adjusted
   P-values using the Holm-Bonferroni method unless stated otherwise. For
   pairwise comparisons, the Mann-Whitney U test was used post-hoc, taking
   multiple comparisons into account using the Holm-Bonferroni method. To
   test for categorical associations, we used Fisher’s exact test.
   Asterisks represent the following P-value ranges: P > 0.05, ns.
   0.01 < P ≤ 0.05, *. 0.001 < P ≤ 0.01, **. 0.0001 < P ≤ 0.001, ***.
   P ≤ 0.0001, ****.

Reporting summary

   Further information on research design is available in the [303]Nature
   Portfolio Reporting Summary linked to this article.

Supplementary information

   [304]Supplementary Information^ (4.7MB, pdf)
   [305]Peer Review File^ (698.9KB, pdf)
   [306]41467_2023_43573_MOESM3_ESM.pdf^ (84KB, pdf)

   Description of Additional Supplementary Files
   [307]Supplementary Data 1^ (43.2KB, xlsx)
   [308]Supplementary Data 2^ (32.5KB, xlsx)
   [309]Supplementary Data 3^ (443.2KB, txt)
   [310]Reporting Summary^ (444.3KB, pdf)

Source data

   [311]Source Data^ (33.1MB, zip)

Acknowledgements