Abstract

The human biological system uses ‘inter-organ’ communication to achieve a state of homeostasis. This communication occurs through the response of receptors, located on target organs, to the binding of ligands secreted from source organs. Despite years of research, the roles these receptors play in tissues are only partially understood. This work presents a new methodology in which enrichment analysis scores of co-expression networks are fed into support vector machine (SVM) and k-nearest neighbour (k-NN) classifiers to predict the tissue-specific metabolic roles of receptors. The approach is primarily based on the detection of coordination patterns in receptor expression. These patterns and the enrichment analysis scores of their co-expression networks were used to analyse ~700 receptors and predict the metabolic roles of receptors in subcutaneous adipose. To facilitate supervised learning, a list of known metabolic and non-metabolic receptors was constructed using a semi-supervised approach and literature-based verification. Our approach confirms that pathway enrichment scores are good signatures for correctly classifying the metabolic receptors in adipose. We also show that the k-NN method outperforms the SVM method in classifying metabolic receptors. Finally, we predict novel metabolic roles of receptors. These predictions can enhance biological understanding and support the development of new receptor-targeting metabolic drugs.

Subject terms: Systems biology, Computational biology and bioinformatics, Data processing, Functional clustering, Machine learning

Introduction

The human system, like any other biological system, continually aims to achieve a state of homeostasis and responds to changing conditions by activating feedback control loops between its sub-systems, organs and tissues. For example, to ensure survival of the whole organism, the endocrine system maintains long feedback loops of ligand secretion and receptor binding that preserve glucose or energetic balance.
Ligand secretion and receptor binding are accomplished by molecules, i.e., ligands, that are secreted into the bloodstream from source organs and bind to receptors located on the cell surface or within the cells of target organs. This complex network of whole-body ligand–receptor interactions serves as the information transducer of these feedback loops. Understanding these receptor roles is pivotal in modern medicine. Receptor dysregulation underlies the etiology of many human diseases (e.g., diabetes^1), and prescription drugs are designed to affect the regulation of receptors, e.g., by disrupting their interaction with the ligand, and so produce therapeutic changes in the function of related biological systems^1,2. Moreover, receptors serve as targets for virus invasion of cells, e.g., the ACE-2 receptor is responsible for the entry of the virus that causes COVID-19 into the lungs^3. Despite years of research, our present-day understanding of the tissue-specific functions of many receptors and their ligand intercellular signalling networks is still incomplete. Developing drugs continues to be a challenge, as advances in scientific knowledge of receptors have been relatively slow, being based on laborious experimentation that typically proceeds by testing one or two receptors at a time in one or two tissues. The advent of ultrahigh-throughput sequencing technologies and algorithmic advancements now enables us to investigate hundreds of receptor-coding genes systematically and simultaneously. A recent computational work^4 defined cross-tissue expression of ligand–receptor pairs by merely measuring the expression levels of ligands and receptors across 144 cell types.

A common task in the analysis of gene expression data is to detect gene–gene co-expression networks. These networks are based on the “guilt by association” concept, i.e., the observation that functionally related genes tend to be co-expressed^5.
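The “guilt by association” idea can be illustrated with a minimal sketch that links two genes whenever the Pearson correlation of their expression profiles across samples is strong. This is a simple hard-threshold network under stated assumptions, not the weighted method used in this work; the gene names, expression values, function names and the 0.8 threshold are all hypothetical and chosen only for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles.

    Assumes neither profile is constant (non-zero variance).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def coexpression_edges(expr, threshold=0.8):
    """Return (gene1, gene2, r) for every gene pair with |r| >= threshold.

    expr maps a gene name to its expression values across the same samples.
    """
    genes = sorted(expr)
    edges = []
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            r = pearson(expr[g1], expr[g2])
            if abs(r) >= threshold:
                edges.append((g1, g2, r))
    return edges

# Invented toy data: gene_a and gene_b rise together across five samples,
# while gene_c follows an unrelated pattern.
expr = {
    "gene_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_b": [2.1, 3.9, 6.2, 8.0, 9.9],
    "gene_c": [3.0, 1.0, 4.0, 1.0, 3.0],
}
edges = coexpression_edges(expr)
for g1, g2, r in edges:
    print(f"{g1} -- {g2}: r = {r:.3f}")
```

In this toy example only the gene_a and gene_b pair passes the threshold, so a gene of unknown function that is linked in this way to well-annotated genes would inherit a candidate annotation from them.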
Such networks are used to identify the functional roles of genes whose function is unknown by relating their co-expression networks to known biological processes. For example, Horan et al. annotated genes of known and unknown function by large-scale co-expression analysis^6. Weighted Gene Co-expression Network Analysis (WGCNA)^7 is the most popular algorithm for specifying co-expression networks. The algorithm groups related genes into gene modules (clusters) based on their co-expression patterns and topological similarity to neighbour genes in the network.

Machine learning approaches are gaining popularity for gene expression analysis^8,9, and support vector machines (SVMs) are one of the most widely used types of machine learning algorithm for solving binary classification problems^10. SVMs have successfully classified functional modules and protein interaction networks from gene expression data^8,9. The binary SVM classifier defines a hyperplane that separates the positively labeled data (e.g., metabolic receptors) from the negatively labeled data (e.g., non-metabolic receptors) in the feature space, i.e., the space spanned by the properties of the data. The k-NN (k-nearest neighbours) algorithm is a distance-based approach that classifies data points based on the known classification of their neighbours^11.

The GTEx project^20 includes a unique collection of thousands of samples of RNA-seq gene expression data across multiple tissues collected from hundreds of donors. Using these data and focusing on metabolic receptors and adipose tissue, we ask two questions: (1) Is the expression of receptor-coding genes widely correlated within tissues, and in adipose specifically? (2) How can we use these data to infer the metabolic roles of receptors in tissues and to detect new metabolic receptors, not thought of as being members of a specific classically defined metabolic system?
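The distance-based classification described above can be sketched in a few lines. The example below is a minimal k-NN classifier using Euclidean distance and majority voting; it assumes each receptor is represented by a vector of pathway enrichment scores. The feature values, labels, function name and the choice k = 3 are hypothetical, and the SVM counterpart is deliberately omitted to keep the sketch short.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every labeled training vector
    dists = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    dists.sort(key=lambda pair: pair[0])
    # Count the labels of the k closest neighbours and return the most common
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Invented feature vectors: each receptor is described by enrichment scores
# for two made-up pathways (illustrative values only, not real data).
train_X = [
    (5.1, 0.2), (4.8, 0.5), (5.5, 0.1),   # labeled metabolic
    (0.3, 4.9), (0.6, 5.2), (0.2, 4.4),   # labeled non-metabolic
]
train_y = ["metabolic"] * 3 + ["non-metabolic"] * 3

pred_a = knn_predict(train_X, train_y, (4.9, 0.4))
pred_b = knn_predict(train_X, train_y, (0.5, 5.0))
print(pred_a, pred_b)
```

With two classes, an odd k avoids tied votes. In practice k and the distance metric would be chosen by cross-validation on the labeled receptors.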
Together, answers to these questions can begin to delineate a comprehensive view of metabolic network signalling.

Here we present a new computational methodology to predict tissue-specific receptor metabolic functionality, which we applied to subcutaneous adipose. The methodology incorporates three steps, A, B and C (see Fig. 1), and is based on our new finding that metabolic receptors are co-expressed, both among themselves and with other genes. In Step A, an annotated list of metabolic and non-metabolic receptors in adipose was constructed using a semi-supervised approach and literature-based validation. In Step B, we used the WGCNA algorithm^7 for co-expression network analysis to generate gene modules (clusters) in subcutaneous adipose, followed by pathway enrichment analysis of these modules. In Step C, we used the enrichment scores to train SVM and k-nearest neighbour (k-NN) classifiers and compared their performance. Finally, we used the classifiers to predict new metabolic receptors, with previously unknown metabolic functions, in adipose. We used an extensive list of ~700 receptors for the full analyses and predictions.

Figure 1. Schematic view of the new computational methodology.

Results

The new computational methodology predicts tissue-specific roles of metabolic receptors in subcutaneous adipose and comprises the following steps.

Step A: Subcutaneous adipose receptor labeled list

Supervised learning requires an initial labeled list of known metabolic (positive examples) and non-metabolic (negative examples) receptors in a tissue for the training, performance evaluation and construction of the classifier. We chose to study adipose tissue^13 since it is a highly active endocrine and metabolically important organ, with the ability to modulate glucose homeostasis, energy expenditure, lipid metabolism, and peripheral inflammation.
In addition, existing knowledge about the roles of its metabolic receptors is extensive and, compared with other tissues, has been robustly tested experimentally. One main challenge was to distinguish the receptors that exhibit metabolic roles in adipose from those that do not. We note that we use the term metabolic receptors to include receptors related to the metabolic/endocytosis/growth regulation system^14–16. This knowledge is not readily available since public databases, such as KEGG, do not include a metabolic receptor classification in general or a tissue-specific metabolic receptor classification in particular. For example, the KEGG database includes the “Neuroactive ligand-receptor interaction” pathway, which consists of a combination of metabolic and non-metabolic receptors. The insulin receptor is included in its own pathway, the KEGG insulin signalling pathway. In addition to the "pure" metabolic receptors, a receptor may exhibit ubiquitous roles across the whole body; e.g., a known inflammation-related cytokine receptor, which we might label as a non-metabolic (negative) example, may exhibit metabolic roles in adipose. An example is the cytokine receptor TNFRSF21 (tumor necrosis factor receptor superfamily member 21), which is included in the KEGG “Cytokine-cytokine receptor interaction” pathway but is also related to the “regulation of lipid metabolic process” term in GO (Gene Ontology)^17,18.

To construct the initial positively labeled receptor list, we gathered 33 metabolic receptors known from the literature to be related to the regulation of growth, endocytosis and metabolism^14–16. The reader is directed to Supplemental Table S1 for this list and additional references for the metabolic