Abstract

   Bovine respiratory disease (BRD), the leading disease complex in beef
   cattle production systems, remains highly elusive regarding diagnostics
   and disease prediction. Previous research has employed cellular and
   molecular techniques to describe hematological and gene expression
   variation that coincides with BRD development. Here, we utilized
   weighted gene co-expression network analysis (WGCNA) to leverage total
   gene expression patterns from cattle at arrival and generate
   hematological and clinical trait associations to describe mechanisms
   that may predict BRD development. Gene expression counts of previously
   published RNA-Seq data from 23 cattle (2017; n = 11 Healthy, n = 12
   BRD) were used to construct gene co-expression modules and correlation
   patterns with complete blood count (CBC) and clinical datasets. Modules
   were further evaluated for cross-populational preservation of
   expression with RNA-Seq data from 24 cattle in an independent
   population (2019; n = 12 Healthy, n = 12 BRD). Genes within
   well-preserved modules were subject to functional enrichment analysis
   for significant Gene Ontology terms and pathways. Genes which possessed
   high module membership and association with BRD development, regardless
   of module preservation (“hub genes”), were utilized for protein-protein
   physical interaction network and clustering analyses. Five
   well-preserved modules of co-expressed genes were identified. One
   module (“steelblue”), involved in alpha-beta T-cell complexes and
   Th2-type immunity, possessed significant correlation with increased
   erythrocytes, platelets, and BRD development. One module (“purple”),
   involved in mitochondrial metabolism and rRNA maturation, possessed
   significant correlation with increased eosinophils, fecal egg count per
   gram, and weight gain over time. Fifty-two interacting hub genes,
   stratified into 11 clusters, may possess transient function involved in
   BRD development not previously described in literature. This study
   identifies co-expressed genes and coordinated mechanisms associated
   with BRD, which necessitates further investigation in BRD-prediction
   research.

Introduction

   Despite decades of research involved in discovering novel management
   tools, developing interventional systems, and advancing antimicrobial
   therapeutics, bovine respiratory disease (BRD) remains the leading
   cause of morbidity and mortality in beef cattle operations across North
   America [[36]1–[37]3]. Due to its widespread prevalence, BRD is
   considered one of the most economically devastating components of beef
   cattle production systems [[38]2–[39]4]. BRD is a polymicrobial,
   multifactorial disease complex, incorporating infectious agents, host
   immunity, and environmental elements as predisposing factors
   [[40]5–[41]7]. Previous research over the past several decades has
   greatly detailed these factors and risks associated with BRD, yet there
   is minimal evidence that overall rates of disease have improved [[42]5,
   [43]8–[44]10]. Furthermore, diagnostic evaluation of BRD often relies
   on visual signs attributed to the disease complex, which are commonly
   non-specific to airway and lung disease, and lack clinical sensitivity
   [[45]11, [46]12]. Therefore, data driven approaches which capture the
   biological intricacies associated with clinical BRD development and
   provide candidate molecular targets capable of stratifying or
   predicting risk of disease and/or production loss would offer a more
   precise method of managing BRD.

   Clinical BRD progression and severity often presents as an acute
   inflammatory disease [[47]13]. However, molecular and cellular changes
   precede physiological changes in terms of disease development. As such,
   identifying consistent molecular and/or cellular components that relate
   to BRD development would allow for the development of rapid diagnostics
   capable of being performed with cattle at the time of facility arrival.
   Such a tool could facilitate precision medicine practices in stocker
   and feedlot operations and improve both speed and success of targeted
   therapy. Accordingly, hematological samples are ideal, as they
   represent a relatively noninvasive, cost effective, and readily
   obtainable source that reflects dynamic biological processes throughout
   the body [[48]14, [49]15].

   Previous research has investigated cellular and molecular components
   that may indicate or predict clinical BRD. Richeson and colleagues,
   utilizing complete blood count (CBC) variables and castration status at
   facility arrival, identified significant associations with BRD in
   calves with comparatively decreased numbers of eosinophils and
   increased numbers of erythrocytes [[50]16]. When evaluating the
   relationships between cytokine gene expression and CBC data in cattle
   with concurrent BRD, Lindholm-Perry and colleagues discovered that
   cattle with BRD possessed a comparative increase in numbers of
   neutrophils, decrease in numbers of basophils, and increased expression
   of CCL16, CXCR1, and CCR1 [[51]17]. Recent RNA sequencing studies,
   performed by both our group and others, have identified mechanisms and
   candidate biomarkers in whole blood associated with BRD development
   [[52]18–[53]20]. However, these studies primarily sought to identify
   differentially expressed genes (DEGs) between cattle that were or were
   not treated for BRD based on clinical signs. Focus on identifying DEGs
   meant that much of the data generated by these studies was neglected.
   Therefore, we aimed to leverage global gene expression patterns across
   high-risk cattle, and incorporate available cellular-level
   hematological data from the same cattle, to infer mechanisms associated
   with BRD development with a more holistic approach.

   As gene expression operates in tandem with biological regulatory
   networks and complexes, investigation of gene co-expression levels may
   reveal transcriptional coordination, distinguish protein production
   relationships, and measure cellular composition and function relevant
   to specific disease states such as BRD [[54]21, [55]22]. This analysis
   approach falls into the field of systems biology, where, in contrast to
   reductionist biology, molecular components are pieced and scaled
   together to better understand disease and generate novel hypotheses
   [[56]23, [57]24]. In this respect, we sought to build networks of
   co-expressed genes, utilizing the full structure of previously
   published gene expression data [[58]20], and discover relationships
   between gene expression and cellular hematological components, which
   may elucidate and/or further confirm genes and mechanisms related to
   BRD development or resistance.

Materials and methods

Animal enrollment

   All animal use and procedures were approved by the Mississippi State
   University Animal Care and Use Committee (IACUC protocol #17–120) and
   carried out in accordance with relevant IACUC and agency guidelines and
   regulations. This study was carried out in accordance with Animal
   Research: Reporting of In Vivo Experiments (ARRIVE) guidelines
   ([59]https://arriveguidelines.org). This study was conducted in
   accompaniment with previous work focused on differential gene
   expression analysis and candidate biomarker validation [[60]20]; the
   RNA-Seq data of these animals were previously deposited in the National
   Center for Biotechnology Information (NCBI) Gene Expression Omnibus
   (GEO) database under accession number [61]GSE161396. Briefly, 24
   samples (n = 12 BRD, n = 12 Healthy) from the 2017 study were
   previously selected based on randomized stratification of vaccine and
   oral anthelminthic administration upon facility arrival (d0), and 24
   samples from the 2019 study randomly selected with equal distribution
   of clinical BRD development within 28 days of arrival [[62]20]. All
   cattle within each population (year) were of proportional arrival
   weight ([63]S1 Table) and age (estimated 4–6 months). All animals
   enrolled in these two groups were commercial cattle, with unknown
   genetic characteristics and background; this is a typical attribute of
   newly received stocker cattle in commercial production systems. Of the
   24 cattle from the 2017 population having RNA-Seq data, one individual
   (ID: 162–2017_S24; [64]GSM4906455) was not incorporated into the
   network analysis due to missing CBC data. The following clinical data
   were recorded for each animal: at-arrival fecal egg counts per gram via
   modified-Wisconsin procedure (FEC-d0), body weight in pounds (WT) at
   arrival, Day 12, Day 26, and Day 82, average daily weight gain at each
   time point (ADG), growth rate (slope of weight over days recorded; GR),
   at-arrival castration status (Sex), at-arrival rectal temperature
   (Temp-d0), development of clinical BRD within 28 days post-arrival
   (BRD), number of clinical BRD treatments (Treat_Freq), and timing to
   first BRD treatment (Risk_Days). Ages (not recorded) were estimated to
   be similar upon facility arrival. Clinical data for these cattle are
   found in [65]S1 Table.

Hematology analysis

   Approximately 6 mL of whole blood was collected at arrival into
   K[3]-EDTA glass blood tubes (BD Vacutainer; Franklin Lakes, NJ, USA)
   via jugular venipuncture. Blood samples were stored at 4°C and analyzed
   the same day of collection with the flow cytometry-based Advia 2120i
   hematology analyzer (Siemens Healthcare Diagnostics Inc., Tarrytown,
   NY, USA), testing for the following parameters: white blood cells (WBC;
   K/μL), erythrocytes (RBC; M/μL), hemoglobin (HGB; g/dL), hematocrit
   (HCT; %), mean corpuscular volume (MCV; fL), mean corpuscular
   hemoglobin (MCH; pg), mean corpuscular hemoglobin concentration (MCHC;
   g/dL), red blood cell distribution width (RDW; %), and platelets (PLT;
   K/μL). Blood smear staining was performed with a Hematek 3000 Slide
   Stainer (Siemens Healthcare Diagnostics Inc., Tarrytown, NY, USA) via
   Wright-Giemsa stain reagents. Stained blood smears were evaluated for
   leukocyte distribution via a manual 300-count white blood cell
   differential by trained clinical pathology technical staff at
   Mississippi State University College of Veterinary Medicine.
   Neutrophil, eosinophil, basophil, monocyte, and lymphocyte percentages
   were recorded, with accompanying neutrophil-to-lymphocyte ratios (NL
   Ratio). Hematology data for these cattle are found in [66]S2 Table.

RNA-Seq data processing and normalization

   The gene-level raw count matrix generated from our previous research
   was utilized for this study [[67]20]. Briefly, RNA was isolated via
   Tempus Spin RNA Isolation Kits (Thermo Fisher Scientific; Waltham, MA,
   USA), following manufacturer’s protocol. TruSeq RNA Library Kit v2
   (Illumina; San Diego, CA, USA) was utilized for mRNA sequencing library
   preparation, following manufacturer’s protocol. Single-lane,
   high-throughput RNA sequencing was performed with NovaSeq 6000 S4
   reagent kit and flow cell (Illumina). Sequence read files were quality
   assessed and trimmed with FastQC v0.11.9 [[68]25] and Trimmomatic v0.39
   [[69]26], respectively. Reference-guided (Bos taurus; ARS-UCD1.2) read
   mapping, indexing, and gene-level assembly were performed with HISAT2
   v2.2.1 [[70]27, [71]28] and StringTie v2.1.2 [[72]29, [73]30],
   respectively. The python program prepDE.py [[74]31] was utilized for
   gene-level count matrix construction.

   Raw gene counts were imported to R v4.0.4 and processed with the
   filterByExpr toolkit [[75]32], removing genes with a minimum total
   count of less than 200 and counts-per-million (CPM) below 1.0 across a
   minimum of 12 libraries. Libraries were normalized with the trimmed
   mean of M-values method (TMM) [[76]33, [77]34] and converted into
   log2-counts per million values (log2CPM). A total of 12,795 genes were
   identified after count processing and were utilized for weighted
   network analysis.

Weighted gene co-expression network analysis (WGCNA)

   Weighted network analysis was performed with the R package WGCNA
   v1.70.3 [[78]35]. Clinical and hematology trait data were compiled and
   aligned to each respective sample library. To remove any outlier
   sample, canonical Euclidean distance-based network adjacency matrices
   were estimated and used to identify outliers based on standardized
   connectivity. Estimated adjacency matrices had network connectivity
   standardized with the provided equation, where the z-score normalized
   network connectivity (Z.k[μ]) for each sample is calculated from
   mean-center scaling of the raw network adjacencies (k) [[79]36]:
   [MATH:
   <msub><mrow><mi>Z</mi><mo>.</mo><mi>k</mi></mrow><mrow><mi>μ</mi></mrow
   ></msub><mo>=</mo><msub><mrow><mi>s</mi><mi>c</mi><mi>a</mi><mi>l</mi><
   mi>e</mi><mo>(</mo><mi>k</mi><mo>)</mo></mrow><mrow><mi>μ</mi></mrow></
   msub><mo>=</mo><mfrac><mrow><msub><mrow><mi>k</mi></mrow><mrow><mi>μ</m
   i></mrow></msub><mo>−</mo><mi>m</mi><mi>e</mi><mi>a</mi><mi>n</mi><mo>(
   </mo><mi>k</mi><mo>)</mo></mrow><mrow><mo>√</mo><mi>v</mi><mi>a</mi><mi
   >r</mi><mo>(</mo><mi>k</mi><mo>)</mo></mrow></mfrac><mo>.</mo> :MATH]

   Samples with a standardized connectivity < -5.0 were considered
   outliers and to be removed from further analysis; no samples were
   considered outliers in this study ([80]S1 Fig). An adjacency matrix was
   constructed from the calculated signed Pearson coefficients between all
   genes across all samples. We utilized signed networks as they better
   capture gene expression trends (up- and down-regulation) and classify
   co-expressed gene modules which improve the ability to identify
   functional enrichment, when compared to unsigned networks [[81]24,
   [82]35–[83]37]. Soft thresholding was used to calculate the power
   parameter (β) required to exponentially raise the adjacency matrix, to
   reach a scale-free topology fitting index (R^2) of >80%; β = 8 was
   selected for this study. The relationship between each unit β and R^2
   is seen in [84]S2 Fig. Co-expression modules were constructed with the
   automatic, one-step blockwiseModules function within the WGCNA R
   package, using the following parameters: power = 8, corType =
   “pearson,” TOMType = “signed,” networkType = “signed,” maxBlockSize =
   12795, minModuleSize = 30, mergeCutHeight = 0.25, and pamRespectsDendro
   = FALSE; all other parameters were set to default. Constructed
   co-expression modules were assigned a color by the WGCNA R package,
   with any gene not assembling into a specific module placed in the
   “grey” module. Module-trait associations were identified with Pearson
   correlation between module eigengene (ME; first principal component of
   the co-expression matrix [[85]38] and clinical and hematology data).
   Modules were considered weakly or strongly correlated with each trait
   having a p-value ≤ 0.10 and |R| ≥ 0.3 or p-value ≤ 0.05 and |R| ≥ 0.4,
   respectively. Color scaling was performed with the Bioconductor package
   viridis v0.6.1 [[86]39] to allow ease of visual interpretation for
   individuals with color blindness.

Cross-population module preservation analysis

   Based on our previous work, it can be inferred that host gene
   expression captured at facility arrival is variable across BRD severity
   cohorts [[87]20, [88]40, [89]41]. Therefore, we assessed whether the
   at-arrival co-expression patterns and modules found in this study were
   well preserved across an RNA-Seq data set from an independent
   population of cattle. We investigated cross-populational module
   preservation across the whole blood transcriptomes of cattle previously
   assessed for differential gene expression ([90]GSE161396; 2019
   population (n = 24)) with the modulePreservation function found within
   the WGCNA R package. The gene-level raw count matrix from previous
   analysis [[91]20] was utilized and processed, filtered, and normalized
   in identical procedures as the 2017 RNA-Seq data set (see RNA-Seq data
   processing and normalization section); a total of 12,803 genes were
   identified in the 2019 data set after count processing and
   normalization. Permutation testing (n = 200 permutations) was conducted
   to assess the significance of module preservation across the 2017 and
   2019 RNA-Seq data sets, utilizing the two composite statistical
   measurements Zsummary and medianRank scores [[92]36, [93]42]. Briefly,
   the identified modules within the test network are randomly permuted n
   times, where, for each permuted index, the mean and standard deviation
   is calculated for defining the corresponding Z statistic [[94]42,
   [95]43]. Through the combination of additional preservation statistics
   (average of Zdensity and Zconnectivity), the calculated Zsummary
   statistic determines the level of mean connectivity among all genes
   within a module (i.e., network density) across the two data sets
   [[96]24, [97]42]. Higher Zsummary values indicate a stronger level of
   module preservation between data sets but is dependent on the number of
   genes within the module (i.e., module size) [[98]42]. To further
   evaluate preservation in a module size-independent manner, medianRank
   scores are calculated from the mean connectivity and density
   measurements observed from each module and assigned a rank score
   [[99]42]. Lower medianRank values indicate a stronger level of module
   preservation between data sets. For this study, any module possessing
   Zsummary ≥ 10 and medianRank ≤ 5 was considered highly preserved.

Functional enrichment analysis of preserved modules

   WebGestalt 2019 [[100]44] (WEB-based Gene SeT AnaLysis Toolkit;
   accessed September 13, 2021) was utilized for over-representation
   analysis to identify enriched Gene Ontology (GO) biological processes,
   cellular components, molecular functions, and pathways from genes found
   in each module considered well preserved. Pathway enrichment analysis
   was performed with the pathway database Reactome [[101]45]. Human (Homo
   sapiens) gene orthologs and functional databases were utilized for GO
   term and pathway enrichment analyses. Over-representation analysis
   parameters within WebGestalt 2019 included between 3 and 3000 genes per
   category, Benjamini-Hochberg (BH) procedure for multiple hypothesis
   correction, adjusted p-value (FDR) cutoff of 0.05 for significance, and
   a total of 10 expected reduced sets of the weighted set cover algorithm
   for redundancy reduction.

BRD-associated hub gene identification and network analyses

   Hub genes are those genes found within a module (eigengenes) that
   possess high connectivity which may exhibit a greater degree of
   biological significance in respect to significantly associated clinical
   traits, when compared to all other eigengenes [[102]38, [103]46,
   [104]47]. Here, we sought to identify hub genes found from modules
   which are significantly associated with any of the clinical BRD
   categories (BRD, Treat_Freq, and Risk_Days). This was performed in the
   WGCNA R package with two procedures. First, Pearson correlation between
   gene expression and module eigengenes was calculated, resulting in the
   level of module membership (k[ME]) for each gene. Second, the Pearson
   correlation between individual gene expression level and clinical trait
   was calculated, resulting in the level of gene significance (GS) for
   each gene. Any gene possessing k[ME] and GS values ≥ 0.7 and ≥ 0.3,
   respectively, were considered hub genes for clinical traits [[105]36].
   All BRD-associated hub genes were used for network construction of
   known and predicted protein-protein interactions with the Search Tool
   for the Retrieval of Interacting Genes (STRING) database v11.5
   [[106]48], utilizing bovine (Bos taurus) annotations. STRING analysis
   was performed with the physical subnetwork setting, where edges only
   display protein interactions that have evidence of binding to or
   forming a physical complex. Any interaction above a combined score
   (confidence) of 0.200 was incorporated into the complete network prior
   to network clustering; disconnected nodes were removed from the
   network. The Markov Cluster (MCL) algorithm was utilized for network
   clustering due to its superior performance in complex extraction
   without the need of additional parameter tuning [[107]49]. Hub genes
   within the interaction network were placed into distinct clusters based
   on MCL clustering of the distance matrix acquired from the combined
   interaction scores, using a MCL inflation parameter of 1.4.

Statistical analysis

   Clinical and hematology data (described in animal enrollment and
   hematology analysis) were compared between cattle treated for
   naturally-acquired clinical BRD within the first 28 days following
   facility arrival (BRD) and those never being diagnosed nor treated
   (Healthy). Residual normality was assessed in R v4.0.4 with the
   Shapiro-Wilk test [[108]50], with an a priori level of significance set
   at 0.10; neutrophil percentage (Neu%), eosinophil percentage (Eos%),
   basophil percentage (Baso%), lymphocyte percentage (Lymph%),
   neutrophil-to-lymphocyte ratio (NL ratio), FEC-d0, MCHC, RDW, and Sex
   were considered non-normally distributed. Differences in normally
   distributed variables between BRD and Healthy cattle were assessed with
   the Student’s t-test. Differences in non-normally distributed variables
   were assessed with the Welch’s t-test; differences between the two
   groups with respect to Sex was assessed with Pearson’s chi-square test
   with Yates’ continuity correction. Differences between BRD and Healthy
   cattle were considered significant having a p-value ≤ 0.05.

Results

Statistical analysis of clinical and hematological parameters

   Descriptive statistics for the clinical and hematological data are
   provided in [109]Table 1. Regarding the hematological parameters,
   average values of Lymph%, RDW, and PLT were outside of the internal
   reference intervals for both BRD and Healthy cattle. In this study, RBC
   was considered significantly higher at arrival in BRD cattle compared
   to Healthy cattle; no other parameter was considered significantly
   different between the two groups. Regarding clinical data, BRD cattle
   possessed significantly lower weight gain by end of study (ADG-d82;
   2.273 lbs/day in BRD and 2.946 lbs/day in Healthy) and lower calculated
   slopes of weight gain over time (Growth Rate; 2.370 in BRD and 2.995 in
   Healthy); no other clinical parameter was considered significantly
   different between the two groups. We did not include the at arrival
   weight (WT-d0) as an analysis variable because there was no significant
   difference between the Healthy (mean: 477.0, s.d.:24.8) and BRD (mean:
   475.3, s.d.:26.9) cohorts.

Table 1. Statistical analysis of hematological and clinical traits between
BRD and healthy groups.

    Variable   Internal Reference  BRD mean (s.d.)   Healthy mean (s.d.) p-value
      Neu%       37.000–80.000      35.917 (5.547)     37.213 (9.748)     0.717
      Eos%        0.000–12.000      3.944 (3.237)       2.635 (1.616)     0.251
     Baso%        0.000–2.500       0.193 (0.213)       0.151 (0.218)     0.658
     Mono%        0.000–12.000      8.862 (4.603)       8.363 (4.507)     0.805
     Lymph%      10.000–50.000      51.083 (4.756)     51.635 (11.928)    0.893
    NL Ratio          N/A           0.711 (0.141)       0.859 (0.660)     0.504
   WBC (K/μL)     4.000–12.000      7.430 (2.722)       7.320 (1.292)     0.913
   RBC (M/μL)     5.000–9.990       9.605 (0.568)       9.032 (0.676)     0.047
   HGB (g/dL)     7.700–15.000      13.075 (1.071)     12.491 (0.906)     0.194
    HCT (%)      25.000–45.000      36.125 (3.269)     35.000 (2.534)     0.391
    MCV (fL)     36.000–55.000      37.725 (3.843)     38.845 (2.851)     0.460
    MCH (pg)     12.000–22.000      13.625 (1.112)     13.855 (0.806)     0.597
  MCHC (g/dL)    32.000–40.000      36.225 (1.190)     35.691 (0.977)     0.272
    RDW (%)      11.600–14.800      29.258 (2.362)     27.564 (3.023)     0.171
   PLT (K/μL)   200.000–900.000   1413.083 (506.885) 1149.000 (401.516)   0.203
     FEC-d0           N/A         761.250 (768.795)   618.364 (408.492)   0.597
    ADG-d12           N/A           0.667 (1.604)       2.167 (1.838)     0.059
    ADG-d26           N/A           1.917 (1.204)       2.710 (0.948)     0.110
    ADG-d82           N/A           2.273 (0.599)       2.946 (0.432)     0.008
  Growth Rate         N/A           2.370 (0.554)       2.995 (0.435)     0.009
  Temp-d0 (F°)        N/A          103.333 (0.712)     103.291 (0.667)    0.890
      Sex             N/A         10 bulls, 2 steers  10 bulls, 1 steer   1.000
   [110]Open in a new tab

   Means, standard deviations (in parentheses), and statistical
   probability values of differences in hematological and clinical
   parameters between BRD (n = 12) and Healthy (n = 11) cattle. Parameters
   were considered significantly different with p-values ≤ 0.05.

Weighted gene co-expression network construction

   The remaining filtered genes (n = 12,795) were used for WGCNA network
   and module construction. The resulting network identified a total of 41
   color-coded modules of co-expressed genes, excluding the grey module
   which incorporates uncorrelated genes (n = 1,235) ([111]Fig 1). Across
   the 41 assigned modules, the turquoise module possessed the largest
   number of co-expressed genes (n = 2,503) and the lightsteelblue1 module
   possessed the smallest number of co-expressed genes (n = 38); the
   average size of each module was approximately 282 genes. The complete
   list of genes and module assignment is found in [112]S3 Table.

Fig 1. Cluster dendrogram of 12,795 genes generated through dissimilarity
metrics (1-TOM) and hierarchical clustering.

   [113]Fig 1
   [114]Open in a new tab

   Automated block-wise module detection of interconnected genes were
   grouped into 41 unique color-coded modules, excluding the grey module
   (uncorrelated genes). The x-axis corresponds to the gene-module
   assignment and the y-axis (Height) depicts the calculated distance
   between co-expressed genes from hierarchical average linkage
   clustering.

Module-trait relationship with hematological and clinical datasets

   Pearson correlation heatmaps were generated to assess the relationship
   between all modules and hematological clinical datasets. Regarding
   hematological data, several significant relationships of interest exist
   ([115]Fig 2). The tan module possessed the highest number of
   significant correlations with the hematological data (8), followed by
   turquoise, pink, lightgreen, and lightcyan modules (7). With respect to
   RBC, considered significantly higher at arrival in BRD cattle compared
   to Healthy cattle, six modules were strongly correlated: paleturquoise
   (R = 0.44, p = 0.03), lightcyan (R = 0.51, p = 0.01), green (R = 0.41,
   p = 0.05), steelblue (R = 0.50, p = 0.01), brown (R = -0.45, p = 0.03),
   turquoise (R = 0.49, p = 0.02). Additionally, seven modules were
   considered weakly correlated with RBC: magenta (R = 0.32, p = 0.10),
   darkgreen (R = 0.36, p = 0.09), lightsteelblue1 (R = -0.36, p = 0.09),
   blue (R = 0.36, p = 0.10), saddlebrown (R = 0.36, p = 0.09), orangered4
   (R = -0.33, p = 0.10), tan (R = -0.36, p = 0.09). Regarding modules
   correlating with RBC, three modules possessed significant associations
   with multiple related red cell indices (HGB, HCT, MCV, MCH, MCHC, and
   RDW): saddlebrown, steelblue, and lightcyan. Saddlebrown was strongly
   associated with MCV (R = -0.63, p = 0.001) and MCH (R = -0.62, p =
   0.001), and weakly associated with HCT (R = -0.31, p = 0.10) and MCHC
   (R = 0.36, p = 0.10). Steelblue was strongly associated with RDW (R =
   0.70, p = 2e-04) and weakly associated with HGB (R = 0.35, p = 0.10)
   and MCHC (R = 0.40, p = 0.06). Lightcyan was strongly associated with
   HGB (R = 0.47, p = 0.02) and RDW (R = 0.51, p = 0.01) and weakly
   associated with HCT (R = 0.38, p = 0.08).

Fig 2. Module-trait relationships between co-expression modules and
hematological traits.

   [116]Fig 2
   [117]Open in a new tab

   Pearson correlations between each of the unique color-coordinated
   modules and hematological traits are visualized and represented as a
   heatmap. Each row represents a distinct co-expression module, and each
   column represents hematological traits as follows: white blood cells
   (WBC; K/μL), erythrocytes (RBC; M/μL), hemoglobin (HGB; g/dL),
   hematocrit (HCT; %), mean corpuscular volume (MCV; fL), mean
   corpuscular hemoglobin (MCH; pg), mean corpuscular hemoglobin
   concentration (MCHC; g/dL), red blood cell distribution width (RDW; %),
   and platelets (PLT; K/μL). Cells are represented by how positive
   (yellow/white) or negative (purple/black) the correlation is between
   module and hematological trait, respectively.

   The relationships between modules and clinical data are found in
   [118]Fig 3. With respect to all clinical disease associations (BRD,
   Treat_Freq, and Risk_Days), five modules possessed significant
   correlations: steelblue, mediumpurple3, royalblue, orange, and violet.
   Steelblue was strongly associated with BRD (R = 0.41, p = 0.05) and
   Risk_Days (R = -0.41, p = 0.05). Mediumpurple3 was weakly associated
   with Treat_Freq (R = -0.40, p = 0.06). Royalblue was weakly associated
   with Treat_Freq (R = -0.40, p = 0.06). Orange was weakly associated
   with BRD (R = -0.39, p = 0.07), Treat_Freq (R = -0.34, p = 0.10), and
   Risk_Days (R = 0.38, p = 0.07). Violet was weakly associated with
   Treat_Freq (R = -0.38, p = 0.07). Regarding production traits (ADG-d12,
   ADG-d26, ADG-d82, and GR), ten modules possessed significant
   correlations: darkgreen, skyblue, darkturquoise, darkmagenta, purple,
   yellowgreen, orange, orangered4, darkred, and lightyellow. However, to
   mitigate unexplained variation which may confound differences in
   ADG-d12 and ADG-d26, coupled with the lack of significance between
   disease cohorts, eight modules correlating with ADG-d82 and GR were
   prioritized. Darkred was strongly associated with ADG-d82 (R = 0.50, p
   = 0.02) and GR (R = 0.50, p = 0.02). Orangered4 was strongly associated
   with ADG-d82 (R = 0.41, p = 0.05) and weakly associated with GR (R =
   0.38, p = 0.07). Orange was strongly associated with ADG-d82 (R = 0.43,
   p = 0.04) and GR (R = 0,41, p = 0.05). Yellowgreen was strongly
   associated with ADG-d82 (R = 0.41, p = 0.05) and GR (R = 0.42, p =
   0.05). Purple was weakly associated with GR (R = 0.32, p = 0.10).
   Darkmagenta was weakly associated with ADG-d82 (R = -0.34, p = 0.10).
   Skyblue was weakly associated with GR (R = -0.36, p = 0.10). Darkgreen
   was weakly associated with ADG-d82 (R = -0.32, p = 0.10). Notably,
   orange was the only module which possessed significant correlations
   with both disease-associated and weight gain traits. However, orange
   did not possess any significant correlations with hematological traits.

Fig 3. Module-trait relationships between co-expression modules and clinical
traits.

   [119]Fig 3
   [120]Open in a new tab

   Pearson correlations between each of the unique color-coordinated
   modules and clinical traits are visualized and represented as a
   heatmap. Each row represents a distinct co-expression module, and each
   column represents clinical traits as follows: at-arrival fecal egg
   counts per gram via modified-Wisconsin procedure (FEC-d0), body weight
   in pounds (WT) at arrival, Day 12, Day 26, and Day 82, calculated
   average daily weight gain at each time point (ADG), growth rate (slope
   of weight over days recorded; GR), at-arrival castration status (Sex),
   at-arrival rectal temperature (Temp-d0), development of clinical BRD
   within 28 days post-arrival (BRD), number of clinical BRD treatments
   (Treat_Freq), and timing to first BRD treatment (Risk_Days). Cells are
   represented by how positive (yellow/white) or negative (purple/black)
   the correlation is between module and clinical trait, respectively.

Cross-populational network preservation analysis

   Module preservation analysis identified five modules considered well
   preserved across the 2017 and 2019 populations: black (size = 432;
   Zsummary = 39.772; medianRank = 4), purple (size = 296; Zsummary =
   34.773; medianRank = 2), lightgreen (size = 123; Zsummary = 23.291;
   medianRank = 1), tan (size = 222; Zsummary = 17.559; medianRank = 5),
   and steelblue (size = 59; Zsummary = 11.555; medianRank = 3) ([121]Fig
   4). Notably, steelblue was the only well-preserved module which
   possessed significant association with BRD-related clinical traits.

Fig 4. Cross-populational module preservation analysis.

   [122]Fig 4
   [123]Open in a new tab

   The medianRank and Zsummary values across all modules are depicted
   through the scatterplot x- and y-axes, respectively. Zsummary values ≥
   10.0 and medianRank values ≤ 5.0, indicated by dashed lines, denote
   that a module identified with the 2017 gene expression data is well
   preserved across the 2019 gene expression data.

Functional enrichment analysis of well-preserved modules

   To explore the functionality and biological relevance of the five well
   preserved modules, we performed over-representation analysis with all
   genes from each module (black, purple, lightgreen, tan, and steelblue;
   [124]S4 Table). Analysis of genes from the black module revealed 47
   biological process terms, 49 cellular component terms, 17 molecular
   function terms, and five significantly enriched pathways. Biological
   processes identified from genes within the black module were related to
   neutrophil activity and degranulation, aldehyde metabolism, nitrogen
   compound response and catabolism, and cellular transport. Cellular
   components identified from genes within the black module involved
   intracellular and extracellular vesicles, secretory granules, cellular
   junctions, and lysosomes. Molecular functions identified from genes
   within the black module involve cytokine, enzyme, and calcium-dependent
   protein binding, aldehyde dehydrogenase (NAD) activity, and
   interleukin-1 receptor activity. Enriched pathways identified from
   genes within the black module involved neutrophil degranulation,
   metabolic disease, and signaling via tyrosine kinase receptor.

   Analysis of genes from the purple module revealed 54 biological process
   terms, 46 cellular component terms, 16 molecular function terms, and 40
   significantly enriched pathways. Biological processes identified from
   genes within the purple module involved mitochondrial processes
   (cristae formation, respiratory chain complex assembly), non-coding RNA
   processing and maturation, cellular protein transport, and metabolic
   processes and biosynthesis. Cellular components identified from genes
   within the purple module involved cell substrate and adhesion junction,
   ribosomes, cytoplasmic side of endoplasmic reticulum, mitochondrial
   inner membrane and envelope, and the 48S preinitiation complex.
   Molecular functions identified from genes within the purple module
   involved mRNA/rRNA binding, ubiquitin ligase inhibition, ATP synthase
   activity, and NADH dehydrogenase. Enriched pathways identified from
   genes within the purple module involved infectious disease/viral
   infection, amino acid metabolism, translation initiation/termination,
   rRNA processing, and ATP synthesis and respiratory electron transport.

   Analysis of genes from the lightgreen module revealed 38 biological
   process terms, 49 cellular component terms, three molecular function
   terms, and one significantly enriched pathway. Biological processes
   identified from genes within the lightgreen module involved
   leukocyte/neutrophil differentiation, activation, and degranulation,
   tissue remodeling, cell secretion and exocytosis, phagocytosis and
   micropinocytosis, dendritic cell activation, and interleukin-8
   secretion. Cellular components identified from genes within the
   lightgreen module involved lysosome, secretory/azurophil granule,
   vesicular/vacuolar membrane, granule lumen, and macropinosome.
   Molecular functions identified from genes within the lightgreen module
   involved symporter activity, potassium-chloride symporter activity, and
   phosphatidylinositol binding. The single enriched pathway identified
   from genes within the lightgreen module was neutrophil degranulation.

   Analysis of genes from the tan module revealed 35 biological process
   terms, 32 cellular component terms, four molecular function terms, and
   two significantly enriched pathways. Biological processes identified
   from genes within the tan module involved B-cell activation, receptor
   signaling, and regulation, immunoglobulin production, cytokine
   production, positive regulation of interferon-gamma production, and
   mononuclear cell proliferation. Cellular components identified from
   genes within the tan module involved MHC class II protein complex,
   lytic vacuole membrane, clathrin-coated endocytic vesicle, endosomal
   membrane, and B-cell receptor complex. Molecular functions identified
   from genes within the tan module involved MHC class II receptor
   activity, MHC class II protein complex binding, and peptide antigen
   binding. Enriched pathways identified from genes within the tan module
   were antigen activates B-cell receptor (BCR) leading to generation of
   second messengers and CD22-mediated BCR regulation.

   Analysis of genes from the steelblue module revealed three biological
   process terms, three cellular component terms, no molecular function
   terms, and no significantly enriched pathways. Biological processes
   identified from genes within the steelblue module were cell surface
   receptor signaling pathway, negative regulation of fibroblast growth
   factor receptor signaling pathway, and antigen receptor-mediated
   signaling pathway. Cellular components identified from genes within the
   steelblue module involved side of membrane, plasma membrane part, and
   alpha-beta T cell receptor complex.

BRD-associated hub gene identification and in silico protein-protein
interaction and clustering analyses

   Hub gene identification analysis included co-expressed genes from the
   following modules: violet (54), orange (68), royalblue (100),
   mediumpurple3 (41), and steelblue (59). The k[ME] and GS value cutoffs
   within each module resulted in 24, 46, 30, 22, and 32 BRD-associated
   hub genes from the violet, orange, royalblue, mediumpurple3, and
   steelblue modules, respectively ([125]S5 Table). These resulting hub
   genes were further utilized for physical subnetwork protein-protein
   interactions and network clustering. After removal of all disconnected
   nodes, the interaction network demonstrated significant connectivity
   between 52 proteins across 11 distinct clusters with high inter-nodal
   connectivity ([126]Fig 5); these gene products and their combined
   interaction scores are found in [127]S6 Table. These connected gene
   products demonstrate possible at-arrival biomolecular complexes
   associated with BRD development and severity.

Fig 5. Protein-protein interaction network of interconnected BRD-associated
hub genes.

   [128]Fig 5
   [129]Open in a new tab

   Interaction score analysis reveals 52 genes, with high intramodular and
   BRD-trait relationship, which possess high connectivity. Interconnected
   gene products (nodes) were further grouped into distinct clusters based
   on their interaction scores (edges). Edge thickness represents the
   level of interaction confidence between nodes.

Discussion

   While at-arrival management practices are somewhat dependent upon
   anticipated risk of BRD development, both inter- and intra-herd level
   disease prevalence is highly variable [[130]5, [131]51]. To counter
   this variability, beef production systems will often administer
   antimicrobials and/or immunostimulants at arrival to reduce the risk of
   clinical BRD development and associated production losses [[132]52,
   [133]53]. However, immunostimulant administration alone as a
   metaphylactic protocol for controlling BRD appears to have minimal
   impact on rates of morbidity [[134]54–[135]56]. Metaphylactic use of
   antimicrobials at arrival reduces risk of morbidity and mortality
   across beef production systems, however this management practice is
   variable in efficacy, in both rates of disease across cattle
   populations and in pharmacological choice, and the practice is
   suspected to drive expansion of antimicrobial resistance, a growing
   societal concern [[136]52, [137]57, [138]58]. Given this background,
   our research group and others have focused on evaluating host
   transcriptomes at arrival, to better characterize host-driven
   mechanisms and develop candidate mRNA biomarkers associated with
   clinical BRD outcomes [[139]18, [140]19, [141]20]. These studies have
   provided valuable information regarding cattle treated based on
   clinical signs of BRD, but these studies heavily rely on semi-objective
   evaluation of BRD cases and may miss underlying subclinical or
   misdiagnosed disease. As such, the underlying host mechanisms involved
   in BRD development remain disputed. Therefore, to identify at facility
   arrival genes and mechanisms which may represent the variable
   development of BRD cases and leverage the total expression profile of
   individual cattle, we employed a systems biology approach with weighted
   co-expression network analysis. This methodology allows us to identify
   networks of genes exclusively co-expressed, and to evaluate said
   networks in a reduced manner in order to identify molecules and
   mechanisms of interest for future BRD prediction studies. Importantly,
   co-expression network analysis serves as a complementary, yet distinct,
   approach to identifying genes and mechanisms associated with disease
   status, when compared to differential expression analyses. The network
   approach performed in this study evaluates and identifies genes that
   are strongly coordinated in terms of expression, and determines
   correlation with overlapping metadata (clinical data), whereas
   differential expression analyses typically follow a pairwise approach
   to determine level of effect and probability of gene differences
   between groups. Co-expression network analyses consider greater
   biological context when evaluating gene expression differences,
   compared to more traditional pairwise approaches. Additionally, through
   utilization of hematological parameters, we could capture changes in
   the cellular composition of whole blood as they may relate to cellular
   and gross pathophysiology across individuals.

   While we recognize that dynamic changes captured in whole blood may not
   completely encompass biomolecular characteristics seen within lung
   tissue, whole blood serves as a practical and easily obtainable sample
   for respiratory and inflammatory disease diagnostics [[142]14,
   [143]59]. After initial statistical assessment of CBC data, we
   identified that both BRD and Healthy cattle possessed comparable
   lymphocytosis, thrombocytosis, and erythrocytic macrocytosis; the
   distribution of these values were not considered significantly
   different between the two groups. Notably, mild to moderate levels of
   dehydration, a common condition in newly arrived post-weaned beef
   animals, may cause elevated changes in hematocrit levels and
   lymphocytes [[144]60, [145]61]. Lymphocytosis and thrombocytosis may
   also result from host responses to infection or inflammation.
   Additionally, reticulocytosis (i.e., immature erythrocytosis) is the
   most common cause of erythrocytic macrocytosis [[146]60] and was noted
   as a common feature found across all blood samples submitted for
   analysis. While these cattle did not possess physiological nor
   hematological evidence of hemolysis or blood loss upon facility
   arrival, this finding may be associated with early regenerative anemia,
   systemic inflammation, or mineral deficiencies [[147]60–[148]62].
   Furthermore, blood-borne pathogens were not reported from blood smear
   assessment. Nevertheless, it does not rule out the possibility of
   mild/subclinical intraerythrocytic pathology or asymptomatic
   convalescence that may result in these increased hematological changes.
   Such pathology is often caused by parasitic diseases such as
   anaplasmosis, a common infectious disease of cattle across the United
   States [[149]63, [150]64]. It is plausible that these findings indicate
   that cattle at facility entry are undergoing similar physiological
   changes as it relates to stressful and/or pathogenic events
   (long-distance transportation, co-mingling, etc.) and underlying
   genomic mechanisms serve to resolve or prolong deleterious
   physiological conditions that result in BRD.

   With respect to distributions, we found that the majority of variables
   tested for module correlations were normally distributed. Of the nine
   (of the 26 total) non-parametric testing variables, six demonstrated
   relative linearity upon visual distribution assessment (data not shown;
   Neu%, Lymph%, NL Ratio, MCHC, RDW, and FEC-d0). Moreover, the
   non-parametric nature of Eos%, Baso%, and Sex is perceived to be due to
   data sparsity and relative rarity of the expected cell counts
   attributed to eosinophils and basophils in cattle ([151]Table 1)
   [[152]65]. We elected to utilize Pearson correlation models as they can
   measure discrete and continuous datasets without need for
   transformation, and preserve linearity from the raw data structure when
   assessing these variables with gene co-expression modules.
   Additionally, calculated Pearson correlations from WGCNA can better
   handle datasets with missing or censored data and is highly
   computationally efficient [[153]66]. We identified that RBCs were
   significantly increased in cattle that would go on to develop BRD
   versus those that did not. Although this result was identified in a
   relatively small number of cattle, it corresponds with the work of
   Richeson and colleagues [[154]16]. As discussed within their prior
   research, elevated RBCs may indicate dehydration and subsequent
   predisposition with BRD development [[155]5, [156]16]. Interestingly,
   we were able to identify one well-preserved co-expression module which
   possessed significant correlations with RBCs, RDW, PLT, BRD, and
   Risk_Days (i.e., shorter time to first treatment): steelblue. Upon
   further investigation, we discovered that the genes within this module
   were related to antigen receptor-mediated signaling (BLK, CD247, CD276,
   CD3G, GATA3, and PLEKHA1) and negative regulation of fibroblast growth
   factor receptor signaling (CREB3L1, GATA3, and WNT5A), and specifically
   components of alpha-beta T-cell receptor complexes (CD247 and CD3G).
   The upregulation of IL-7R and associated signaling molecules, which
   include CD3G and CD247, initiate NOTCH-dependent proliferation of
   T-cell precursors [[157]67]. Furthermore, elevated levels of BLK and
   GATA3 tend to skew the immune response towards Th2-type immunity
   [[158]68–[159]70]. In terms of RBC relationship, previous research has
   demonstrated that Th2-stimulated bone marrow T-cells promote erythroid
   differentiation and lead to the development of erythroblasts [[160]71].
   Additionally, CXCL12, also identified within the steelblue module and
   previously identified as a differentially expressed gene associated
   with BRD development [[161]20], is involved in Th2-cell migration and
   immune response [[162]71, [163]72]. HNRNPH3, found within the steelblue
   module, has previously been identified as a key transcription factor
   associated with clinical BRD [[164]18]. Lastly, several genes
   identified in the steelblue module were also found in the “turquoise”
   module identified by Hasankhani and colleagues [[165]24], which
   enriched positive regulation of activated T-cell proliferation and
   Th1/Th2-cell differentiation pathways. While this study cannot
   elucidate the exact mechanistic components nor temporality of molecular
   events, it suggests that promotion of Th2-mediated T-cells at arrival
   shares a common mechanism with RBC elevation and risk of BRD
   development. Our previous research has indicated that genes elevated at
   arrival in cattle that eventually develop BRD interact, and may
   enhance, TLR-4 and IL-6 responses [[166]20, [167]40, [168]73], which
   may contribute to the co-expressed pattern related to Th2-mediated
   T-cell development [[169]74]. Overall, this pattern of Th2-mediated
   immunity is strongly associated with clinical BRD development and
   timing to first treatment, and may further strengthen the depiction
   that early Th2 responses indicate clinical disease development and lung
   pathology [[170]75, [171]76].

   While steelblue was the only well-preserved BRD-associated module
   detected, four other modules were determined to be well-preserved
   across populations and warranted specific functional enrichment
   investigation: black, purple, lightgreen, and tan. Genes within the
   black module, largely involved with neutrophil activation and
   degranulation, IL-1 activity, and metabolic disease, was only
   significantly associated with hemoglobin and erythrocyte parameters
   (HGB, HCT, MCV, and MCH); notably, the black module did not possess any
   significant associations with clinical variables. This may indicate
   that neutrophilic and IL-1 activity was not indicative of BRD within
   this population of cattle, and/or additional disease-associated
   variables were not recorded in this study. Genes within the purple
   module, associated with increased eosinophil percentage, decreased
   neutrophil-lymphocyte ratio, decreased MCV and MCH, increased
   at-arrival fecal parasitic egg count, and increased growth rate (weight
   gain over 82 days), largely enriched for mitochondrial function and
   aerobic metabolism and RNA processing. Importantly, this module
   possessed positive association to weight gain independent of BRD
   development. Previous research has investigated many of these ribosomal
   protein-encoding genes for their potential for immune effector capacity
   [[172]77] and cell regulation [[173]78], however this marks the first
   time, to our knowledge, that they have been implicated in contributing
   to weight gain potential in high-risk cattle. Notably, one gene (RPS26)
   has been previously identified as a differentially decreased marker in
   the diseased lungs of cattle experimentally challenged with
   BRD-associated pathogens [[174]79, [175]80]. Similar to the black
   module, genes identified within the lightgreen module were associated
   with hemoglobin and erythrocyte parameters, but additionally positively
   correlated with neutrophil percentage and neutrophil-lymphocyte ratio,
   and negatively correlated with basophil percentage; likewise, the
   lightgreen module did not possess significant associations with
   clinical variables. Lastly, the tan module, possessing several
   significant hematological associations, and was negatively correlated
   with castration status at arrival, possessed genes which enriched for
   B-cell receptor complexes and regulation and interferon-gamma
   production. Unfortunately, the underlying physiological impact of
   co-expressed genes identified within the black, lightgreen, and tan
   modules were not captured in this study. As this study was primarily
   focused on BRD development and severity, the genes within these three
   modules may possess a role in other disease complexes or
   immune-mediated events, such as gastrointestinal or apoptotic/necrotic
   diseases.

   Utilizing hub gene and interaction network analyses, we further
   identified genes related to BRD development and severity. Here, we
   detected and mapped 52 genes into a protein-protein interaction
   network, further stratified into 11 distinct clusters based on their
   combined interaction scores. This procedure helps describe the physical
   relationship that multiple BRD-associated gene products possess with
   one another in a more holistic approach. Here, we may infer that these
   interactions possess accompanying transient functions involved in BRD
   development not previously described in literature. As such, these
   predicted protein-protein network interactions may infer potential
   modular units which participate in BRD development or resistance
   [[176]81, [177]82]. Further evidence of the associative importance
   related to BRD development exists with these genes, as CXCL12
   [[178]20], TLL2 [[179]20], ALOX15 [[180]18, [181]20, [182]40], and
   LOC100298356 [[183]73, [184]79, [185]80, [186]83] have been previously
   identified as differentially expressed when comparing cattle with and
   without BRD development. Specifically, CXCL12, TLL2, and LOC100298356
   are considered drivers of innate surveillance and inflammation
   associated with BRD development, whereas ALOX15 encodes for an enzyme
   involved in specialized proresolving mediator biosynthesis and
   associated with cattle that do not develop clinical BRD in high-risk
   systems [[187]18, [188]20, [189]40, [190]79, [191]83]. Collectively, we
   detected these previously identified differentially expressed genes,
   associated to BRD development, with an independent approach. This
   overlap emphasizes the potential capability of these genes and
   mechanisms to serve a predictive role for BRD. Proteomic approaches
   have detailed that proteins infrequently operate as single biological
   entities and, when involved in similar biological functions, interact
   in dynamic, yet organized complexes [[192]84–[193]87]. As such, these
   findings provide candidate protein complexes related to BRD development
   and severity, which warrants further investigation for avenues of
   confirmation in larger populations of cattle and novel therapeutic
   target development.

Conclusions

   This study was conducted to utilize systems biology methodology to
   further establish genes, mechanisms, and coordinated biological
   complexes associated with dynamic hematological changes and BRD
   development. Utilizing our previously published RNA-Seq data and WGCNA,
   we identified five well-preserved modules of highly co-expressed genes
   with significant associations with hematological and clinical traits in
   cattle at facility arrival. The “steelblue” module, containing genes
   involved in alpha-beta T cell receptor complex and negative regulation
   of fibroblast growth factor receptor signaling, possessed significant
   positive correlations with erythrocyte count, platelet count, red cell
   width, and BRD diagnosis, and negative correlation with days at risk
   for BRD. The “purple” module, containing genes involved in
   mitochondrial processes and non-coding RNA processing and maturation,
   possessed significant correlation with increased eosinophil percentage,
   decreased neutrophil-lymphocyte ratio, and increased growth rate
   (weight gain over time). Protein-protein interaction network and
   clustering analyses of BRD-related hub genes identified possible
   at-arrival biological complexes strongly associated with BRD
   development; many of these hub genes have been described as
   differentially expressed genes in previous BRD research. Through this
   holistic molecular approach, we provide genes, mechanisms, and
   predicted protein complexes associated with BRD development and
   performance which are warranted for future analyses targeted in
   predicting BRD at facility arrival.

Supporting information

   S1 Table. Clinical metadata of cattle selected for WGCNA analysis.

   (XLSX)
   [194]Click here for additional data file.^ (15.1KB, xlsx)
   S2 Table. CBC and leukocyte distribution data of cattle selected for
   WGCNA analysis.

   (XLSX)
   [195]Click here for additional data file.^ (12.7KB, xlsx)
   S3 Table. Full gene list and weighted module assignment.

   (CSV)
   [196]Click here for additional data file.^ (222.4KB, csv)
   S4 Table. Functional enrichment analysis of the five well-preserved
   modules.

   (XLSX)
   [197]Click here for additional data file.^ (44.4KB, xlsx)
   S5 Table. Hub gene analysis of all five BRD-associated modules.

   (XLSX)
   [198]Click here for additional data file.^ (34.7KB, xlsx)
   S6 Table. STRING identifiers and physical interaction scores
   (combined_score ≥ 0.200).

   (XLSX)
   [199]Click here for additional data file.^ (13.4KB, xlsx)
   S1 Fig. Heatmap and hierarchical clustering of clinical and
   hematological data across the 23 cattle utilized for transcriptome
   network analysis.

   Standardized connectivity was calculated from network adjacency
   matrices and used to classify potential outliers (Z.k < -5); no animal
   was identified as an outlier. The remaining rows represent the
   numerical values of all clinical and hematological traits across each
   animal. Colors indicate an increase (yellow/white) or decreased
   (purple/black) value for each trait; Sex and BRD are both represented
   as a value of 1 for bulls and Yes, and 0 for steers and No,
   respectively.

   (TIF)
   [200]Click here for additional data file.^ (1.3MB, tif)
   S2 Fig. Soft threshold (β) selection for signed weighed correlation
   network construction through scale free topology (SFT) plot analysis.

   A) SFT index R^2 (y-axis) at increasing soft threshold powers (β;
   x-axis). The value β = 8 was selected, seen where the saturation curve
   is above 0.8 (orange horizontal line). B) Increasing soft threshold
   powers (β; x-axis) with respect to decreasing mean connectivity
   (y-axis). The goal of selecting a value β is to maximize scale
   independence (i.e., suppress low correlations) while simultaneously
   minimizing loss in mean connectivity.

   (TIF)
   [201]Click here for additional data file.^ (542.6KB, tif)

Acknowledgments