Abstract

Data analysis is one of the most critical and challenging steps in drug discovery and disease biology. A user-friendly resource to visualize and analyse high-throughput data provides a powerful medium for both experimental and computational biologists to understand vastly different biological data types and obtain a concise, simplified and meaningful output for better knowledge discovery. We have previously developed TargetMine, an integrated data warehouse optimized for target prioritization. Here we describe how upgraded and newly modelled data types in TargetMine can now survey the wider biological and chemical data space relevant to drug discovery and development. To enhance the scope of TargetMine from target prioritization to broad-based knowledge discovery, we have also developed a new auxiliary toolkit to assist with data analysis and visualization in TargetMine. This toolkit features interactive data analysis tools to query and analyse the biological data compiled within the TargetMine data warehouse. The enhanced system enables users to discover new hypotheses interactively by performing complicated searches with no programming and obtaining the results in an easy-to-comprehend output format.

Database URL: http://targetmine.mizuguchilab.org

Introduction

The proliferation of high-throughput ‘omics’ experiments has led to a surge in the availability of biomedical data that need to be properly analysed. Leveraging biological information from different data types yields deeper insights into gene function and provides a better understanding of the biological process under study, which can be further transformed into actionable research. For instance, drug repositioning (i.e. new uses for existing drugs) (1) and combinatorial drug treatments (2) necessitate a systems-level mapping of drug–target interactions and their influence on cellular networks.
Cellular networks themselves comprise multiple organizational layers made up of different types of biomolecular interactions such as microRNA (miRNA)–target interactions (MTIs), protein–protein interactions (PPIs) and transcription factor (TF)–target gene interactions, which together modulate the functioning of living systems. The ability to correlate these data with gene expression patterns and a priori knowledge of the genetic determinants of various diseases is key to a deeper understanding of disease mechanisms and the development of better therapeutic strategies. However, integration of the vast and scattered array of biological information is a multifarious scientific challenge, for reasons ranging from inconsistencies in data gathering to heterogeneous and often incompatible formats used to store and manage biological data in different repositories. Despite these obstacles, the immense benefits of an integrative approach in disease biology and drug discovery have spawned numerous efforts to develop different types of frameworks and tools for integrating diverse biological data types (3–10) and for functional analysis of large gene sets. Gene set functional enrichment relies on a statistical analysis of the relative abundance of biological themes associated with a given gene list and identifies themes (and associated genes) that are overrepresented and therefore likely to be more relevant to the biological conditions under study. For instance, the DAVID gene functional classification resource employs a heuristic approach to grouping genes into modules based on similarities in the biological annotations and provides a set of tools for functional analysis of user-supplied gene lists (11). On the other hand, Enrichr employs pre-defined gene set libraries to assist functional enrichment analysis of large gene lists (12).
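To make the notion of overrepresentation concrete, the following is a minimal sketch of one common way to score a single biological theme, assuming a one-sided hypergeometric test. The choice of test, the function name `enrichment_p_value` and the example counts are illustrative assumptions; this is not presented as the method used by DAVID, Enrichr or TargetMine.

```python
from math import comb


def enrichment_p_value(k: int, M: int, n: int, N: int) -> float:
    """Upper-tail hypergeometric probability P(X >= k).

    Hypothetical helper for illustration only.
    M: number of genes in the background set
    n: number of background genes annotated with the theme
    N: number of genes in the user-supplied list
    k: number of genes in the list annotated with the theme
    """
    if not (0 <= n <= M and 0 <= N <= M):
        raise ValueError("n and N must lie between 0 and M")
    # The overlap can never exceed the list size or the theme size.
    upper = min(n, N)
    total = comb(M, N)  # all ways to draw a list of N genes from the background
    # Sum the probabilities of observing k, k+1, ..., upper annotated genes.
    tail = sum(comb(n, i) * comb(M - n, N - i) for i in range(max(k, 0), upper + 1))
    return tail / total


# Made-up counts: 10 of 100 listed genes carry a theme that annotates
# 200 of 20000 background genes.
p = enrichment_p_value(k=10, M=20000, n=200, N=100)
print(f"p = {p:.3g}")
```

A small p-value indicates that the theme occurs in the list more often than expected by chance. In practice many themes are tested at once, so the resulting p-values would normally be adjusted for multiple testing.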
However, most of the available tools that facilitate gene set functional enrichment provide a standalone web interface and have not been integrated into a more general data-mining platform; such a platform is often crucial to refine and validate gene set functional enrichment results and for further characterization of gene sets in drug discovery and related applications. We have previously developed TargetMine, an integrated data warehouse based on the versatile InterMine framework (8, 10, 13), which models biological entities (such as genes and proteins) as ‘objects’ and their relationships as ‘references’ to other objects.
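The object-and-reference idea can be illustrated with a minimal sketch. The class names, field names, the `link` helper and the placeholder identifiers below are hypothetical and chosen for illustration; they are not the actual InterMine or TargetMine data model.

```python
from __future__ import annotations

from dataclasses import dataclass, field


# eq=False keeps identity-based comparison, and repr=False on the reference
# lists avoids infinite recursion through the circular gene <-> protein links.
@dataclass(eq=False)
class Protein:
    accession: str
    genes: list[Gene] = field(default_factory=list, repr=False)


@dataclass(eq=False)
class Gene:
    symbol: str
    proteins: list[Protein] = field(default_factory=list, repr=False)


def link(gene: Gene, protein: Protein) -> None:
    """Record the relationship as references held by both objects."""
    gene.proteins.append(protein)
    protein.genes.append(gene)


# Placeholder identifiers, not real database entries.
gene = Gene("GENE_A")
protein = Protein("PROT_1")
link(gene, protein)

# A query then amounts to following references from one object to another.
print([p.accession for p in gene.proteins])
print([g.symbol for g in protein.genes])
```

Because relationships are stored as references rather than as copied values, a search can start from any object type and walk to related objects in either direction.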