Abstract

Complexities in degenerative disorders, such as osteoarthritis, arise from multiscale biological, environmental, and temporal perturbations. Animal models serve to provide controlled representations of the natural history of degenerative disorders, but in themselves represent an additional layer of complexity. Comparing transcriptomic networks arising from gene co-expression data across species can facilitate an understanding of the preservation of functional gene modules and establish associations with disease phenotypes. This study demonstrates the preservation of osteoarthritis-associated gene modules, described by immune system and system development processes, across human and rat studies. Class prediction analysis establishes a minimal gene signature, including the expression of the Rho GDP dissociation inhibitor ARHGDIB, which consistently distinguished healthy human cartilage from osteoarthritic cartilage in an independent data set. The age of human clinical samples remains a strong confounder in defining the underlying gene regulatory mechanisms in osteoarthritis; however, defining preserved gene models across species may facilitate standardization of animal models of osteoarthritis to better represent human disease and control for ageing phenomena.

Joint degeneration: linking animal and human disease

Cartilage, and other tissues in our joints, begins to degenerate with age, resulting in pain and reduced mobility; this is termed osteoarthritis (OA). To understand OA better, researchers have often used animal models to represent this disease; however, these models have never been fully evaluated against human cartilage. This study considered the messages produced by cartilage cells in both humans and rats. Using a method that creates a network of messages, the study was able to define “blocks” of cell messages that were associated with diseased cartilage in both the rat and human.
As part of this study, the authors also defined a set of messages that could be used to distinguish healthy and diseased cartilage. In this way it may be possible to identify patients with early OA who may benefit from therapeutic interventions.

Introduction

Disorders of cartilage and joints account for a high incidence of disability^1 and are prevalent co-morbidities of the ageing population.^2 Debilitating in their own right, musculoskeletal disorders contribute significantly to the global burden of disease, being the fourth most prevalent disorder,^3 and have a wider impact on the rehabilitation of co-occurring pathologies (obesity, stroke, and cardiovascular disease), thereby representing a major health policy issue. Osteoarthritis (OA), considered a chronic, degenerative condition of multiple tissues that comprise a joint,^4 results in the destruction of cartilage, the friction-free interface, leading to considerable functional impairment. The major cell population of cartilage, chondrocytes, accounts for the unique extracellular matrix (ECM), which confers the compression resistance and gliding characteristics of normal cartilage.^5 Despite considerable efforts to characterize the nature of degenerate cartilage, the pathophysiological process is not fully understood, disease-associated genetic variants are limited, and there are no disease-modifying therapeutics available.^6

Disease complexity, arising from multiscale perturbations, makes a mechanistic understanding of OA difficult. OA is a complex disease because it involves multiple tissues, environmental factors, behaviors, signaling pathways and genes. For example, numerous genetic risk loci, epigenetic effects, inflammation associated with ageing^7 and obesity,^8 and biomechanical factors contribute to joint degeneration.
Although heritable factors account for 50% of an individual’s risk of developing OA, only 16 disease risk loci have been consistently identified,^9 with candidate genes such as GDF5 and SMAD3 harboring the most promising risk alleles;^10 overall, multiple risk alleles are likely to contribute to OA susceptibility. Additionally, OA is dynamic, being progressive and chronic, and so is likely to involve the dysregulation of several biological systems over multiple timescales. As with other multifactorial diseases (e.g., neurological disorders), analysis of individual components cannot adequately explain the properties of the whole system (the contributing tissues), as novel properties emerge with increasing complexity of the system.^11

Animal models of multifactorial disorders are used to provide a controlled representation of subsets of human disease and aim to reproduce the natural history and progression. Rodent models of cartilage pathophysiology are frequently employed and include surgically induced (destabilization of the medial meniscus) and chemically induced (monoiodoacetate joint injection) OA. The rat is frequently used in the study of OA; however, there is no single standardized in vivo model.^12 Gene expression studies arising from these models are often poorly controlled, underpowered, combine joint tissues, and use multiple different gene expression analysis platforms, making comparison across studies problematic. Overall, animal models that better represent human OA are required.^13

Weighted gene co-expression network analysis^14 is a systems biology methodology that considers the connectivity between genes based upon their co-expression.
This method facilitates investigation of the global network properties of a transcriptome and provides functional insights into the organization of a co-expression network by utilizing the concept of scale-free networks.^15 As the co-expression of genes encodes the downstream protein interactions, the study of transcriptional co-expression patterns can reveal emergent functional properties of the cellular system under investigation. Weighted gene co-expression network analysis (WGCNA) has been used widely to define candidate genes for human disorders, including prognostic signatures for cancers, and has demonstrated preservation of functional gene modules between human and mouse brains.^16 In this context, network nodes represent genes that are expressed in a sample. Edges connect nodes based upon their weighted co-expression across samples. WGCNA assumes that all nodes are connected and that the connections have different strengths; highly connected genes within a network can be gathered as modules, with “hubs” being the most highly connected genes within a module. The modularity of networks is inherent to cell biology,^17 and biological phenomena arise from molecular interactions organized into functional modules. The network topology (or architecture of these module structures) can be compared across networks to assess conservation of modules in different conditions or between species.

The system under consideration in this study was the chondrocyte, either as whole cartilage or isolated cells. Transcriptomic profiling from different environments and conditions provided information on perturbations to that system. The study sought to establish, from publicly available gene expression data, a comprehensive analysis of the gene–gene co-expression networks from transcriptomic profiles of different chondrocyte phenotypes in human and rat.
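The WGCNA concepts described above (weighted edges, topological overlap, hubs, and a per-module summary profile) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study: the function names, the simulated expression matrix, and the soft-threshold power `beta = 6` are all assumptions chosen for illustration, and the sketch stops short of hierarchical clustering and tree-cutting.

```python
import numpy as np

def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency a_ij = |cor(gene_i, gene_j)| ** beta.

    expr is a samples x genes matrix; beta = 6 is an illustrative choice,
    not a value reported in the study.
    """
    cor = np.corrcoef(expr, rowvar=False)
    adj = np.abs(cor) ** beta
    np.fill_diagonal(adj, 0.0)  # no self-connections
    return adj

def topological_overlap(adj):
    """TO_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    k = adj.sum(axis=1)                      # connectivity of each gene
    shared = adj @ adj                       # strength of shared neighbours
    denom = np.minimum.outer(k, k) + 1.0 - adj
    tom = (shared + adj) / denom
    np.fill_diagonal(tom, 1.0)
    return tom

def module_eigengene(expr, gene_idx):
    """First principal component of the standardized module expression."""
    sub = expr[:, gene_idx]
    sub = (sub - sub.mean(axis=0)) / sub.std(axis=0)
    u, _, _ = np.linalg.svd(sub, full_matrices=False)
    return u[:, 0]

# Simulated data (hypothetical): 40 samples, two modules of 5 genes each,
# every gene being a noisy copy of its module's latent factor.
rng = np.random.default_rng(0)
factors = rng.normal(size=(40, 2))
expr = np.repeat(factors, 5, axis=1) + 0.3 * rng.normal(size=(40, 10))

adj = soft_threshold_adjacency(expr)
tom = topological_overlap(adj)
dissim = 1.0 - tom                           # co-expression distance (1 - TO)
hub = int(np.argmax(adj[:5, :5].sum(axis=1)))  # most connected gene in module 1
eigengene = module_eigengene(expr, np.arange(5))
```

In a full analysis, the distance matrix `dissim` would be passed to hierarchical clustering and a tree-cutting step to define modules, and each module's eigengene would then be compared across sample groups.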
By performing this analysis on human and rat data, an understanding of the preservation of network module topology would inform the validity of rodent in vivo models of OA. Additionally, by establishing a subnetwork of genes associated with the phenotype of interest, osteoarthritic cartilage, rational therapeutic and diagnostic targets may be proposed for future study. This study demonstrates high preservation of modules across species associated with physiological functions, in addition to modules associated with inflammatory mediators and system development that are characteristic of a subset of human osteoarthritic cartilage samples. Importantly, genes with class discrimination potential were established that may serve to define early cartilage degeneration.

Results

Construction of rat and human co-expression networks

Global co-expression networks were constructed by WGCNA from rat (115 arrays) and human (129 arrays) gene expression data using 5982 genes with common annotation. Overall, 12 modules were defined for the human network (Fig. 1a, b) and 20 modules for the rat (Fig. 2a, b), inclusive of a module of unassigned genes for each species. All further characterizations were undertaken on these modules. An alphanumeric code for each module (H-human, R-rat) is provided for reference (Supplementary Figs. 4a and 5a). The genes assigned to each module are listed in Supplementary Data SD1 and SD2.

Fig. 1. Definition of co-expression modules in the human. a Hierarchical cluster dendrogram derived from merged human gene expression data (n = 129 arrays and 5982 genes) defines 12 modules. Branches of the dendrogram represent groups of genes. Dynamic tree-cutting was used to define modules; where these had significant overlap, they were assigned the same label (arbitrary module color).
The co-expression distance (1 − topological overlap (TO)) between the genes is defined by the y-axis; the genes are plotted along the x-axis; b top band: gene modules (clusters of highly co-expressed genes) coded by color; unassigned genes are colored “gray”. Key modules of interest (H2 and H4) are annotated; the associated consensus modules (C1 to C5), modules found in both rat and human networks, are also defined above the module bar. Alphanumeric module codes are provided in Supplementary Fig. 4a; bands 2–4: selected samples showing positive or negative correlations with genes enriched in each module (see figure key; red corresponds to positive correlation). Clinical samples represent whole-cartilage gene expression derived from human donors; gene expression profiles from samples were allocated to one of three clinical sample groups (“Clinical Groups 1–3”); a fourth (“Articular cartilage”) represents ostensibly normal cartilage. The overlap between human and consensus modules is provided in Supplementary Fig. 4b. “Clinical Group 2” shows a positive correlation with H2 and H4 modules, which overlap with the C4 and C5 consensus modules; c module eigengene expression (y-axis) is defined for two consensus modules (C4 and C5) across selected samples (x-axis) relative to all samples contributing to the human network (“Other”). For both consensus modules there is a significant difference in the expression of the corresponding module eigengenes across samples using a non-parametric Kruskal–Wallis one-way analysis of variance (C4, brown: p = 1.6e−09; C5, yellow: p = 2.7e−11), with high expression found in “Clinical Group 2” relative to normal articular cartilage. Overall, whole-cartilage samples demonstrate heterogeneous gene expression and differ in their association with network modules.

Fig. 2. Definition of co-expression modules in the rat.
a Hierarchical cluster dendrograms in the rat, derived from gene expression data (n = 115 arrays and 5982 genes) show 20 modules. An alphanumeric code is provided to clarify references to specific modules (Supplementary