ABSTRACT

As for many model organisms, the amount of Listeria omics data produced has recently increased exponentially. There are now >80 published complete Listeria genomes, around 350 different transcriptomic data sets, and 25 proteomic data sets available. Analyzing these data sets through a systems biology approach, and building tools that let biologists browse them, is a challenge for bioinformaticians. We have developed a web-based platform, named Listeriomics, that integrates different tools for omics data analyses, i.e., (i) an interactive genome viewer to display gene expression arrays, tiling arrays, and sequencing data sets along with proteomics and genomics data sets; (ii) an expression and protein atlas that connects every gene, small RNA, antisense RNA, or protein with the most relevant omics data; (iii) a specific tool for exploring protein conservation through the Listeria phylogenomic tree; and (iv) a coexpression network tool for the discovery of potential new regulations. Our platform integrates all the complete Listeria species genomes, transcriptomes, and proteomes published to date. This website allows navigation among all these data sets with enriched metadata in a user-friendly format and can be used as a central database for systems biology analysis.

IMPORTANCE

In the last decades, Listeria has become a key model organism for the study of host-pathogen interactions, noncoding RNA regulation, and bacterial adaptation to stress. To study these mechanisms, several genomics, transcriptomics, and proteomics data sets have been produced. We have developed Listeriomics, an interactive web platform to browse and correlate these heterogeneous sources of information. Our website will allow listeriologists and microbiologists to decipher key regulation mechanisms by using a systems biology approach.

INTRODUCTION

Listeria monocytogenes is a foodborne pathogen that causes infections with a mortality rate of 25%.
This pathogen is responsible for gastroenteritis, sepsis, and meningitis and can cross three host barriers, the intestinal, placental, and blood-brain barriers. It is a major concern for pregnant women, as it induces abortions (1). L. monocytogenes can enter, replicate in, and survive in a wide range of human cell types, such as macrophages, epithelial cells, and endothelial cells. Moreover, Listeria has emerged as a model organism for the study of host-pathogen interactions (1–3).

Listeria belongs to the Firmicutes phylum. The Listeria genus is made up of the widely studied pathogenic species L. monocytogenes; another pathogenic species, Listeria ivanovii, that mostly affects ruminants; and 15 nonpathogenic species (4–10). In 2001, the genomes of L. monocytogenes strain EGD-e and one Listeria innocua strain were sequenced (11). Since then, many other Listeria genomes, covering all the lineages, have been sequenced (12–17). Currently, the NCBI RefSeq database contains 83 complete Listeria genomes, including 70 L. monocytogenes genomes. The number of Listeria strains sequenced will probably grow exponentially in the coming years. Efforts have been made to summarize all these genomes in specific databases like ListiList (11), GenoList (18), GECO-LisDB server (16), and ListeriaBase (19) to find common gene features and to develop pangenome studies of Listeria species.

The first Listeria transcriptomic data set was published in 2007 (20). Since that report, 64 ArrayExpress studies, corresponding to 362 different biological conditions, have been produced (21). Only seven of them are transcriptome sequencing (RNA-Seq) studies; all the others correspond to transcription profiling by microarrays, with EGD-e being the most frequently used strain.

Listeria is also a key organism in the study of bacterial regulatory small noncoding RNAs (sRNAs).
Despite the high number of studies on Listeria noncoding RNAs, only two websites with Listeria-related data sets have been published. The first one is a genome viewer published along with a transcription start site (TSS) study of Listeria (22). The second is the sRNAdb database (23), which provides tools to visualize the conservation of gene loci surrounding noncoding RNAs in different Gram-positive bacteria.

The ability of L. monocytogenes to enter various types of cells is due to the variety of proteins it secretes or anchors to its cell wall and external membrane. Consequently, many proteomic studies have been performed to analyze the exoproteome of Listeria (24–35). Other studies have focused on cytoplasmic proteins (27, 33, 35–44). To our knowledge, 74 proteome studies have been conducted to decipher the production and localization of Listeria proteins. Nevertheless, no database exists that combines all these proteomics data sets into a single, user-friendly resource.

The number of omics data sets produced has increased exponentially. The number of tools to analyze these data, as well as the diversity of databases to store them, has also burgeoned. In parallel with this increase, many efforts have been made to develop accurate web-based tools to integrate diverse omics data for each model organism. One of the most complete resources is certainly the University of California Santa Cruz Encyclopedia of DNA Elements (ENCODE at UCSC) Genome Browser (45), which allows the visualization of a large variety of human and mouse omics data sets. For prokaryotic organisms, the BioCyc (46, 47) and Pathosystems Resource Integration Center (48) websites have been created. These websites connect all the published genomic and transcriptomic data sets for prokaryotic organisms to metabolic pathways.
Such wide-ranging web resources are useful for microbiologists, but for in-depth analyses, the development of individual web resources with curated metadata per model organism is also required. In the case of bacteria, few heterogeneous omics data sets are available (49) and few model organisms have dedicated web resources, including Escherichia coli, with RegulonDB (50) and PortEco (51), and Bacillus subtilis, with SubtiWiki (52). As yet, resources for Listeria species are limited.

Here, we present Listeriomics (http://listeriomics.pasteur.fr/), a highly interactive web resource summarizing many omics data sets related to the genus Listeria. We have curated and integrated all the Listeria transcriptomic, proteomic, and genomic data sets available to date. The Listeriomics platform was developed not only to integrate these diverse data sets but also to display them in a single viewer. To interactively explore these data sets, our website also provides different tools, i.e., (i) a genome viewer for displaying gene expression arrays, tiling arrays, and sequencing data, along with proteomic and genomic data sets; (ii) an expression atlas and protein atlas, inspired by the EBI Expression Atlas, that connect genomic elements (genes, small RNAs, antisense RNAs [asRNAs]) to the most relevant omics data; (iii) a specific tool for exploring protein conservation through the Listeria phylogenomic tree; and (iv) a coexpression network analysis tool for the discovery of potential new regulations.

RESULTS

The Listeriomics web interface. Genomic, transcriptomic, or proteomic data can be browsed from the main page of the Listeriomics website (http://listeriomics.pasteur.fr/) (Fig. 1; Table 1; see Fig. S1 in the supplemental material). For each type of data, we designed a summary panel to navigate through the different data sets. The top banner of the website gives direct access to these panels.
As summarized in Table 1, users can search 83 complete Listeria genomes and browse 362 transcriptome and 74 proteome data sets. Listeriomics integrates four tools for omics data management, i.e., (i) a genome viewer for displaying gene expression array, tiling array, and sequencing data along with proteomics and genomics data; (ii) an expression atlas and protein atlas that connect every genomic element (genes, small RNAs, asRNAs) to the most relevant omics data; (iii) a protein conservation tool for the direct visualization of the presence or absence of a protein in a specific Listeria strain; and (iv) a coexpression network analysis tool for the visualization of genome features with the same expression profile.

FIG 1. Overview of the Listeriomics platform. (Center) The four major tools of Listeriomics, i.e., gene conservation and synteny, coexpression network, genome viewer, and expression and protein atlas. (Left) Summary of all the genomic information available on the website. (Right) List of all the transcriptomic information available in Listeriomics. (Bottom) View of all the proteomic information that can be accessed.

TABLE 1. Summary of omics data sets included in the Listeriomics database
Category | Data sizes | Data type(s) | Tools available
Genomics | 83 complete genomes (NCBI), all protein coding genes and noncoding RNAs, 304 small RNAs | Genome, phylogeny, genome elements, homologs | Genome summary, gene panel, small RNA panel, genome viewer
Transcriptomics | 362 biological conditions, 8 Listeria strains, 342 comparisons | Gene expression array, tiling array, TSS, RNA-Seq | Transcriptome summary, expression atlas, heat map, genome viewer
Proteomics | 74 biological conditions, 4 Listeria strains, 28 comparisons | Mass spectrometry | Proteome summary, protein atlas, heat map, genome viewer

FIG S1. Flowchart of omics data set integration in the Listeriomics database.
(A) Complete genome sequences from the RefSeq and GenBank databases were downloaded and integrated into Listeriomics, along with pathway information and small RNAs. (B) MAGE-TAB data sets were downloaded from ArrayExpress. Metadata on the data sets were manually curated, and processed gene expression array tables were added. Raw RNA-Seq data were downloaded and mapped to a reference genome. After log fold change calculation, all of the data sets were normalized with variance normalization to fix the standard deviation at 1 and ensure comparability. (C) Proteomics data sets were manually curated from the core articles and related supplementary data. Copyright © 2017 Bécavin et al. This content is distributed under the terms of the Creative Commons Attribution 4.0 International license.

The genomic interface is designed to browse every complete genome of the Listeriomics resource. Users can access strain name, serotype, lineage, and isolation information, along with a complete phylogenomic tree of Listeria strains (Fig. 1). From this table, scientists can access all the annotated genes of a specific strain. For each Listeria gene, five different information panels are available. The first panel shows general information about the gene, such as its position and its predicted function. DNA and amino acid sequences can be accessed and saved as FASTA files or sent directly for a BLASTn or BLASTp search (53). The predicted subcellular localization (cytoplasm, cytoplasmic membrane, cell wall, cell surface, and extracellular milieu [27]) of each protein is also displayed, along with information about the secretion pathway possibly used by the protein. The second panel provides an instant view of the conservation of a specific protein in other Listeria strains. This panel dynamically displays the homologs found in each Listeria strain on the Listeria reference tree.
It also displays a summary table of all the homologous proteins with their similarity percentages and amino acid sequences. Users can also create a multiple-alignment file of the homologous proteins. With the third panel, the user can visualize the protein locus synteny in all Listeria strains. We built an external synteny website by using the SynTView architecture (54). The fourth panel uses the expression atlas to show in which transcriptomics data sets the selected gene is differentially expressed. The fifth panel displays every proteomics data set in which the protein encoded by the selected gene has been detected. Finally, from the home webpage, a summary panel with all the small RNAs in L. monocytogenes EGD-e can be accessed (Fig. 1). For each noncoding RNA element, one can display its position, its nucleotide sequence, its predicted secondary structure at 37°C, and a table displaying all supplementary information provided in source references (22, 55–58).
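To make the data flow behind the expression atlas more concrete, the following minimal Python sketch illustrates the two steps described above: log fold change calculation followed by scaling each data set to a standard deviation of 1, and then a lookup of the comparisons in which a given gene is differentially expressed. It is an illustration only, not the Listeriomics implementation. The function names, the gene and comparison identifiers, the use of the population standard deviation without centering, and the cutoff of 1.5 are all assumptions made for this example.

```python
import math
from statistics import pstdev


def log2_fold_change(treated, control):
    """Per-gene log2(treated / control) for two dicts of positive intensities."""
    return {
        gene: math.log2(treated[gene] / control[gene])
        for gene in treated
        if gene in control
    }


def scale_to_unit_sd(logfc):
    """Divide every value by the data set's standard deviation (SD becomes 1).

    Assumption: values are scaled but not centered, so the sign of each
    log fold change is preserved.
    """
    sd = pstdev(logfc.values())
    if sd == 0:
        raise ValueError("cannot scale a data set with zero variance")
    return {gene: value / sd for gene, value in logfc.items()}


def atlas_hits(gene, datasets, cutoff=1.5):
    """Names of comparisons where |scaled log fold change| of `gene` > cutoff.

    `cutoff` is a hypothetical threshold chosen for this sketch.
    """
    return sorted(
        name
        for name, values in datasets.items()
        if gene in values and abs(values[gene]) > cutoff
    )


if __name__ == "__main__":
    # Hypothetical intensities for four genes in one comparison.
    control = {"gene_a": 100, "gene_b": 100, "gene_c": 100, "gene_d": 100}
    treated = {"gene_a": 800, "gene_b": 100, "gene_c": 100, "gene_d": 100}

    scaled = scale_to_unit_sd(log2_fold_change(treated, control))
    datasets = {"hypothetical_stress_vs_control": scaled}

    print(atlas_hits("gene_a", datasets))
    print(atlas_hits("gene_b", datasets))
```

Scaling every comparison to the same standard deviation is what allows a single cutoff to be applied across data sets produced on different platforms; any real cutoff would have to be chosen per analysis.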