Source: https://github.com/markziemann/background

Intro

I previously wrote about the importance of a background gene list for accurate and correct enrichment analysis.

Here I will demonstrate how important this is, as well as providing a guide for how to make your own background gene list.

For this guide I will be using data from a previous analysis, which is deposited at NCBI GEO and SRA under accession numbers: GSE109140 and SRP128998.

The data was presented in a paper by Felisbino et al

library("getDEE2")
library("DESeq2")
library("kableExtra")
library("vioplot")
library("clusterProfiler")
library("eulerr")

Fetch data

Using RNAseq data from DEE2, which can be accessed from dee2.io

mdat <- getDEE2Metadata("hsapiens")

# get sample sheet
ss <- subset(mdat,SRP_accession=="SRP128998")

# fetch the whole set of RNA-seq data
x <- getDEE2("hsapiens",ss$SRR_accession , metadata=mdat, legacy=TRUE)
## For more information about DEE2 QC metrics, visit
##     https://github.com/markziemann/dee2/blob/master/qc/qc_metrics.md
mx <- x$GeneCounts
rownames(mx) <- paste(rownames(mx),x$GeneInfo$GeneSymbol) 

# vpa subset no filtering
ss1 <- ss[grep("low",ss$Experiment_title),]
ss1$vpa <- grepl("VPA",ss1$Experiment_title)

ss1 %>% 
  kbl(caption="sample sheet for VPA treatment in hepatocytes") %>%
  kable_paper("hover", full_width = F)
sample sheet for VPA treatment in hepatocytes
SRR_accession QC_summary SRX_accession SRS_accession SRP_accession Experiment_title GEO_series vpa
384654 SRR6467485 PASS SRX3557434 SRS2830733 SRP128998 GSM2932797: low glucose replicate 1; Homo sapiens; RNA-Seq FALSE
384655 SRR6467486 PASS SRX3557435 SRS2830734 SRP128998 GSM2932798: low glucose replicate 2; Homo sapiens; RNA-Seq FALSE
384656 SRR6467487 PASS SRX3557436 SRS2830735 SRP128998 GSM2932799: low glucose replicate 3; Homo sapiens; RNA-Seq FALSE
384657 SRR6467488 PASS SRX3557437 SRS2830736 SRP128998 GSM2932800: low glucose VPA replicate 1; Homo sapiens; RNA-Seq TRUE
384658 SRR6467489 PASS SRX3557438 SRS2830737 SRP128998 GSM2932801: low glucose VPA replicate 2; Homo sapiens; RNA-Seq TRUE
384659 SRR6467490 PASS SRX3557439 SRS2830738 SRP128998 GSM2932802: low glucose VPA replicate 3; Homo sapiens; RNA-Seq TRUE
mx1 <- mx[,which(colnames(mx) %in% ss1$SRR_accession)] 
dim(mx1)
## [1] 58302     6

Analysis of read counts

Although I know this data set is good as I was involved in the project, it is a good idea to show at least the number of reads for each sample.

par(mar=c(5,7,5,1))
barplot(rev(colSums(mx1)),horiz=TRUE,las=1,main="number of reads per sample in SRP128998")

Now we can have a look at the read count distribution for one of the samples.

# look at the distribution of counts
l1 <- lapply(1:ncol(mx1),function(i) log10( mx1[,i] +1 ))
names(l1) <- colnames(mx1)
l1 <- rev(l1)

hist(l1[[6]],main="Distribution of read counts in SRR6467485",
  xlab="log10(readcounts+1)",breaks=30)

abline(v=1,col="red",lty=2)

We can see that most genes have a very low read count, consistent with most genes being undetectable in any sample.

Based on this typical distribution of read counts, I would consider any gene with fewer than 10 reads as not detected. This could be considered a strict detection threshold. Setting a strict threshold like this could be helpful in selecting genes that are likely to validate with qRT-PCR.

I will run DESeq2 with and without this threshold to see how it impacts the results.

Differential expression analysis with and without filtering

To see whether this has any effect on downstream analysis, DESeq2 will be run on the same dataset without any filtering, with light filtering and with strict filtering.

Here I define light filter threshold as the lowest baseline expression level that DE genes start to appear. I know this contrast will yield thousands of DE genes, so this is a reasonable way to set the detection threshold. The DE genes with lowest baseline expression showed mean read counts of about 2.

# DESeq2 without any fitering
dds <- DESeqDataSetFromMatrix(countData = mx1 , colData = ss1, design = ~ vpa )
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
z<- results(res)
vsd <- vst(dds, blind=FALSE)
zz<-cbind(as.data.frame(z),mx1)
dge1<-as.data.frame(zz[order(zz$pvalue),])

head(dge1,10) %>% 
  kbl(caption="Top DE genes for VPA treatment (unfiltered)") %>%
  kable_paper("hover", full_width = F)
Top DE genes for VPA treatment (unfiltered)
baseMean log2FoldChange lfcSE stat pvalue padj SRR6467485 SRR6467486 SRR6467487 SRR6467488 SRR6467489 SRR6467490
ENSG00000116852 KIF21B 758.8708 4.235925 0.1637805 25.86342 0 0 65 81 91 1370 1232 1720
ENSG00000103034 NDRG4 809.8276 4.068972 0.1680121 24.21833 0 0 81 93 107 1699 1264 1605
ENSG00000101210 EEF1A2 623.2070 4.319276 0.1812953 23.82453 0 0 52 71 61 1305 1067 1158
ENSG00000205978 NYNRIN 922.4737 4.414294 0.1955579 22.57282 0 0 79 72 102 1531 1788 1917
ENSG00000168280 KIF5C 420.9624 5.242347 0.2324458 22.55298 0 0 21 20 26 858 675 924
ENSG00000187764 SEMA4D 469.4849 3.365642 0.1621712 20.75364 0 0 71 86 100 830 784 939
ENSG00000025039 RRAGD 544.0335 3.627284 0.1778770 20.39209 0 0 71 108 73 1046 921 1029
ENSG00000039068 CDH1 1204.3439 2.464853 0.1226862 20.09071 0 0 317 404 419 1985 1865 2233
ENSG00000182575 NXPH3 406.9713 5.120785 0.2615973 19.57507 0 0 11 41 20 697 776 886
ENSG00000100321 SYNGR1 324.7426 4.448159 0.2280782 19.50278 0 0 32 25 30 645 556 648
sig1 <- subset(dge1,padj<=0.05)
message("number of significant genes")
## number of significant genes
nrow(sig1)
## [1] 7089
plot(rowMeans( log10(dge1[,7:12]+1) ),dge1$log2FoldChange,
  cex=0.6,pch=19,col="darkgray",
  xlab="log10(mean read count)",ylab="log2(fold change)")

points(rowMeans( log10(sig1[,7:12]+1) ),sig1$log2FoldChange,
  cex=0.6,pch=19,col="red")

abline(v=1,lty=2)

dge1up <- rownames(subset(dge1,padj<=0.05 & log2FoldChange>0))
dge1up <- unique(sapply(strsplit(dge1up," "),"[[",2))

dge1dn <- rownames(subset(dge1,padj<=0.05 & log2FoldChange<0))
dge1dn <- unique(sapply(strsplit(dge1dn," "),"[[",2))

dge1bg <- rownames(dge1)
dge1bg <- unique(sapply(strsplit(dge1bg," "),"[[",2))

look at the lowest basemean genes

sig1 <- sig1[order(sig1$baseMean),]
head(sig1,10) %>%
  kbl(caption="lowest basemean genes") %>%
  kable_paper("hover", full_width = F)
lowest basemean genes
baseMean log2FoldChange lfcSE stat pvalue padj SRR6467485 SRR6467486 SRR6467487 SRR6467488 SRR6467489 SRR6467490
ENSG00000095917 TPSD1 1.955539 4.446961 1.824035 2.437981 0.0147696 0.0472248 0 0 0 4 2 6
ENSG00000144230 GPR17 1.970787 4.456430 1.817191 2.452373 0.0141917 0.0456696 0 0 0 5 2 5
ENSG00000066382 MPPED2 1.989149 4.467431 1.791035 2.494329 0.0126195 0.0414081 0 0 0 4 3 5
ENSG00000204475 NCR3 1.992263 4.468833 1.821833 2.452932 0.0141697 0.0456337 0 0 0 2 4 6
ENSG00000273294 C1QTNF3-AMACR 1.992837 -4.416860 1.791380 -2.465619 0.0136777 0.0443251 4 4 4 0 0 0
ENSG00000134533 RERG 2.034892 4.495207 1.838488 2.445056 0.0144830 0.0464671 0 0 0 7 3 2
ENSG00000140009 ESR2 2.034892 4.495207 1.838488 2.445056 0.0144830 0.0464671 0 0 0 7 3 2
ENSG00000137674 MMP20 2.038006 4.496467 1.788898 2.513540 0.0119526 0.0395104 0 0 0 5 4 3
ENSG00000106633 GCK 2.056368 4.507043 1.788079 2.520606 0.0117153 0.0388516 0 0 0 4 5 3
ENSG00000232618 AL355304.1 2.077321 4.537447 1.858690 2.441207 0.0146382 0.0468782 0 0 0 2 2 9

The counts look quite low. Not sure that these DEGs could be validated with qRT-PCR.

Now run DESeq2 with filtering

mx2 <- mx1[which(rowMeans(mx1)>=10),]
dim(mx2)
## [1] 15852     6
dds <- DESeqDataSetFromMatrix(countData = mx2 , colData = ss1, design = ~ vpa )
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
z<- results(res)
vsd <- vst(dds, blind=FALSE)
zz<-cbind(as.data.frame(z),mx2)
dge2<-as.data.frame(zz[order(zz$pvalue),])

head(dge2,10) %>%
  kbl(caption="Top DE genes for VPA treatment (filtered)") %>%
  kable_paper("hover", full_width = F)
Top DE genes for VPA treatment (filtered)
baseMean log2FoldChange lfcSE stat pvalue padj SRR6467485 SRR6467486 SRR6467487 SRR6467488 SRR6467489 SRR6467490
ENSG00000116852 KIF21B 763.8163 4.248737 0.1718694 24.72073 0 0 65 81 91 1370 1232 1720
ENSG00000205978 NYNRIN 928.0397 4.427457 0.1887125 23.46139 0 0 79 72 102 1531 1788 1917
ENSG00000103034 NDRG4 815.8752 4.083139 0.1748377 23.35388 0 0 81 93 107 1699 1264 1605
ENSG00000101210 EEF1A2 627.8539 4.333653 0.1884855 22.99197 0 0 52 71 61 1305 1067 1158
ENSG00000168280 KIF5C 424.0618 5.255390 0.2402672 21.87311 0 0 21 20 26 858 675 924
ENSG00000182575 NXPH3 409.5097 5.132293 0.2538263 20.21970 0 0 11 41 20 697 776 886
ENSG00000072071 ADGRL1 1409.6105 3.197538 0.1606012 19.90980 0 0 247 317 290 2078 2596 2839
ENSG00000156515 HK1 1062.6425 3.220757 0.1622888 19.84583 0 0 159 263 222 1557 1898 2235
ENSG00000025039 RRAGD 547.7283 3.641436 0.1840194 19.78833 0 0 71 108 73 1046 921 1029
ENSG00000187764 SEMA4D 472.4305 3.378284 0.1754224 19.25800 0 0 71 86 100 830 784 939
sig2 <- subset(dge2,padj<=0.05)
nrow(sig2)
## [1] 6762
plot(log10(dge2$baseMean),dge2$log2FoldChange,
  cex=0.6,pch=19,col="darkgray",
  xlab="log10(base mean)",ylab="log2 fold change",
  main="strict filter")

points(log10(sig2$baseMean),sig2$log2FoldChange,
  cex=0.6,pch=19,col="red")

dge2up <- rownames(subset(dge2,padj<=0.05 & log2FoldChange>0))
dge2up <- unique(sapply(strsplit(dge2up," "),"[[",2))

dge2dn <- rownames(subset(dge2,padj<=0.05 & log2FoldChange<0))
dge2dn <- unique(sapply(strsplit(dge2dn," "),"[[",2))

dge2bg <- rownames(dge2)
dge2bg <- unique(sapply(strsplit(dge2bg," "),"[[",2))

Now run ORA based enrichment analysis

I will run enrichment analysis using the correct correct background list, and compare the results to the incorrect whole genome background.

if (! file.exists("ReactomePathways.gmt")) {
  download.file("https://reactome.org/download/current/ReactomePathways.gmt.zip", 
    destfile="ReactomePathways.gmt.zip")
  unzip("ReactomePathways.gmt.zip")
}

genesets2 <- read.gmt("ReactomePathways.gmt")

ORA with correct background list.

Firstly with the upregulated genes

ora2_up <- as.data.frame(enricher(gene = dge2up , 
  universe = dge2bg,  maxGSSize = 5000, TERM2GENE = genesets2,
  pAdjustMethod="fdr",  pvalueCutoff = 1, qvalueCutoff = 1  ))

ora2_up$geneID <- NULL
ora2_ups <- rownames(subset(ora2_up,p.adjust<0.05))

gr <- as.numeric(sapply(strsplit(ora2_up$GeneRatio,"/"),"[[",1)) /
  as.numeric(sapply(strsplit(ora2_up$GeneRatio,"/"),"[[",2))

br <- as.numeric(sapply(strsplit(ora2_up$BgRatio,"/"),"[[",1)) /
  as.numeric(sapply(strsplit(ora2_up$BgRatio,"/"),"[[",2))

ora2_up$es <- gr/br
ora2_up <- ora2_up[order(-ora2_up$es),]
head(ora2_up)
##                                                                        ID
## Adenylate cyclase inhibitory pathway Adenylate cyclase inhibitory pathway
## PKA activation                                             PKA activation
## Long-term potentiation                             Long-term potentiation
## PKA-mediated phosphorylation of CREB PKA-mediated phosphorylation of CREB
## Activation of GABAB receptors               Activation of GABAB receptors
## GABA B receptor activation                     GABA B receptor activation
##                                                               Description
## Adenylate cyclase inhibitory pathway Adenylate cyclase inhibitory pathway
## PKA activation                                             PKA activation
## Long-term potentiation                             Long-term potentiation
## PKA-mediated phosphorylation of CREB PKA-mediated phosphorylation of CREB
## Activation of GABAB receptors               Activation of GABAB receptors
## GABA B receptor activation                     GABA B receptor activation
##                                      GeneRatio BgRatio       pvalue    p.adjust
## Adenylate cyclase inhibitory pathway    9/1806 10/7666 1.736468e-05 0.001290774
## PKA activation                         11/1806 14/7666 2.137723e-05 0.001449423
## Long-term potentiation                  9/1806 12/7666 2.381870e-04 0.007587956
## PKA-mediated phosphorylation of CREB   11/1806 15/7666 6.302994e-05 0.002811135
## Activation of GABAB receptors          14/1806 20/7666 1.384093e-05 0.001157447
## GABA B receptor activation             14/1806 20/7666 1.384093e-05 0.001157447
##                                           qvalue Count       es
## Adenylate cyclase inhibitory pathway 0.001169831     9 3.820266
## PKA activation                       0.001313614    11 3.335153
## Long-term potentiation               0.006876977     9 3.183555
## PKA-mediated phosphorylation of CREB 0.002547737    11 3.112809
## Activation of GABAB receptors        0.001048997    14 2.971318
## GABA B receptor activation           0.001048997    14 2.971318
topup2 <- rev(head(ora2_up$es,20))
names(topup2) <- rev(head(ora2_up$ID,20))

Now with the downregulated genes

ora2_dn <- as.data.frame(enricher(gene = dge2dn ,
  universe = dge2bg,  maxGSSize = 5000, TERM2GENE = genesets2,
  pAdjustMethod="fdr",  pvalueCutoff = 1, qvalueCutoff = 1  ))

ora2_dn$geneID <- NULL
ora2_dns <- rownames(subset(ora2_dn,p.adjust<0.05))

gr <- as.numeric(sapply(strsplit(ora2_dn$GeneRatio,"/"),"[[",1)) /
  as.numeric(sapply(strsplit(ora2_dn$GeneRatio,"/"),"[[",2))

br <- as.numeric(sapply(strsplit(ora2_dn$BgRatio,"/"),"[[",1)) /
  as.numeric(sapply(strsplit(ora2_dn$BgRatio,"/"),"[[",2))

ora2_dn$es <- gr/br
ora2_dn <- ora2_dn[order(-ora2_dn$es),]
head(ora2_dn)
##                                                                                                                                ID
## Defects in vitamin and cofactor metabolism                                             Defects in vitamin and cofactor metabolism
## Defects in cobalamin (B12) metabolism                                                       Defects in cobalamin (B12) metabolism
## Synthesis of bile acids and bile salts via 27-hydroxycholesterol Synthesis of bile acids and bile salts via 27-hydroxycholesterol
## RMTs methylate histone arginines                                                                 RMTs methylate histone arginines
## Biotin transport and metabolism                                                                   Biotin transport and metabolism
## Methylation                                                                                                           Methylation
##                                                                                                                       Description
## Defects in vitamin and cofactor metabolism                                             Defects in vitamin and cofactor metabolism
## Defects in cobalamin (B12) metabolism                                                       Defects in cobalamin (B12) metabolism
## Synthesis of bile acids and bile salts via 27-hydroxycholesterol Synthesis of bile acids and bile salts via 27-hydroxycholesterol
## RMTs methylate histone arginines                                                                 RMTs methylate histone arginines
## Biotin transport and metabolism                                                                   Biotin transport and metabolism
## Methylation                                                                                                           Methylation
##                                                                  GeneRatio
## Defects in vitamin and cofactor metabolism                         15/1855
## Defects in cobalamin (B12) metabolism                               9/1855
## Synthesis of bile acids and bile salts via 27-hydroxycholesterol    9/1855
## RMTs methylate histone arginines                                   20/1855
## Biotin transport and metabolism                                     7/1855
## Methylation                                                         7/1855
##                                                                  BgRatio
## Defects in vitamin and cofactor metabolism                       20/7666
## Defects in cobalamin (B12) metabolism                            12/7666
## Synthesis of bile acids and bile salts via 27-hydroxycholesterol 13/7666
## RMTs methylate histone arginines                                 30/7666
## Biotin transport and metabolism                                  11/7666
## Methylation                                                      11/7666
##                                                                        pvalue
## Defects in vitamin and cofactor metabolism                       2.370976e-06
## Defects in cobalamin (B12) metabolism                            2.966890e-04
## Synthesis of bile acids and bile salts via 27-hydroxycholesterol 7.567769e-04
## RMTs methylate histone arginines                                 9.904943e-07
## Biotin transport and metabolism                                  6.197420e-03
## Methylation                                                      6.197420e-03
##                                                                      p.adjust
## Defects in vitamin and cofactor metabolism                       0.0004051406
## Defects in cobalamin (B12) metabolism                            0.0213459940
## Synthesis of bile acids and bile salts via 27-hydroxycholesterol 0.0416170751
## RMTs methylate histone arginines                                 0.0001934294
## Biotin transport and metabolism                                  0.2017112715
## Methylation                                                      0.2017112715
##                                                                        qvalue
## Defects in vitamin and cofactor metabolism                       0.0003993223
## Defects in cobalamin (B12) metabolism                            0.0210394428
## Synthesis of bile acids and bile salts via 27-hydroxycholesterol 0.0410194095
## RMTs methylate histone arginines                                 0.0001906515
## Biotin transport and metabolism                                  0.1988144824
## Methylation                                                      0.1988144824
##                                                                  Count       es
## Defects in vitamin and cofactor metabolism                          15 3.099461
## Defects in cobalamin (B12) metabolism                                9 3.099461
## Synthesis of bile acids and bile salts via 27-hydroxycholesterol     9 2.861041
## RMTs methylate histone arginines                                    20 2.755076
## Biotin transport and metabolism                                      7 2.629846
## Methylation                                                          7 2.629846
topdn2 <- head(ora2_dn$es,20)
names(topdn2) <- head(ora2_dn$ID,20)

Make a barplot

par(mar=c(5,20,5,1))

cols <- c(rep("blue",20),rep("red",20))

barplot(c(topdn2,topup2),
  horiz=TRUE,las=1,cex.names=0.65,col=cols,
  main="top DE Reactomes",
  xlab="ES")

mtext("correct background")

ORA with whole genome background list.

Firstly with the upregulated genes

ora1_up <- as.data.frame(enricher(gene = dge1up ,
  universe = dge1bg,  maxGSSize = 5000, TERM2GENE = genesets2,
  pAdjustMethod="fdr",  pvalueCutoff = 1, qvalueCutoff = 1  ))

ora1_up$geneID <- NULL
ora1_ups <- rownames(subset(ora1_up,p.adjust<0.05))

gr <- as.numeric(sapply(strsplit(ora1_up$GeneRatio,"/"),"[[",1)) /
  as.numeric(sapply(strsplit(ora1_up$GeneRatio,"/"),"[[",2))

br <- as.numeric(sapply(strsplit(ora1_up$BgRatio,"/"),"[[",1)) /
  as.numeric(sapply(strsplit(ora1_up$BgRatio,"/"),"[[",2))

ora1_up$es <- gr/br
ora1_up <- ora1_up[order(-ora1_up$es),]
head(ora1_up)
##                                                                                                  ID
## Adenylate cyclase activating pathway                           Adenylate cyclase activating pathway
## Adenylate cyclase inhibitory pathway                           Adenylate cyclase inhibitory pathway
## Ephrin signaling                                                                   Ephrin signaling
## PKA activation                                                                       PKA activation
## Defective EXT1 causes exostoses 1, TRPS2 and CHDS Defective EXT1 causes exostoses 1, TRPS2 and CHDS
## Defective EXT2 causes exostoses 2                                 Defective EXT2 causes exostoses 2
##                                                                                         Description
## Adenylate cyclase activating pathway                           Adenylate cyclase activating pathway
## Adenylate cyclase inhibitory pathway                           Adenylate cyclase inhibitory pathway
## Ephrin signaling                                                                   Ephrin signaling
## PKA activation                                                                       PKA activation
## Defective EXT1 causes exostoses 1, TRPS2 and CHDS Defective EXT1 causes exostoses 1, TRPS2 and CHDS
## Defective EXT2 causes exostoses 2                                 Defective EXT2 causes exostoses 2
##                                                   GeneRatio  BgRatio
## Adenylate cyclase activating pathway                 7/1907 10/10922
## Adenylate cyclase inhibitory pathway                 9/1907 14/10922
## Ephrin signaling                                    12/1907 19/10922
## PKA activation                                      11/1907 18/10922
## Defective EXT1 causes exostoses 1, TRPS2 and CHDS    8/1907 14/10922
## Defective EXT2 causes exostoses 2                    8/1907 14/10922
##                                                         pvalue    p.adjust
## Adenylate cyclase activating pathway              3.588639e-04 0.012217685
## Adenylate cyclase inhibitory pathway              1.273383e-04 0.007064916
## Ephrin signaling                                  1.161813e-05 0.001186937
## PKA activation                                    4.262818e-05 0.002902591
## Defective EXT1 causes exostoses 1, TRPS2 and CHDS 9.409871e-04 0.023108176
## Defective EXT2 causes exostoses 2                 9.409871e-04 0.023108176
##                                                        qvalue Count       es
## Adenylate cyclase activating pathway              0.010826014     7 4.009124
## Adenylate cyclase inhibitory pathway              0.006260178     9 3.681849
## Ephrin signaling                                  0.001051738    12 3.617255
## PKA activation                                    0.002571968    11 3.500029
## Defective EXT1 causes exostoses 1, TRPS2 and CHDS 0.020476010     8 3.272755
## Defective EXT2 causes exostoses 2                 0.020476010     8 3.272755
topup1 <- rev(head(ora1_up$es,20))
names(topup1) <- rev(head(ora1_up$ID,20))

Now with the downregulated genes

ora1_dn <- as.data.frame(enricher(gene = dge1dn ,
  universe = dge1bg,  maxGSSize = 5000, TERM2GENE = genesets2,
  pAdjustMethod="fdr",  pvalueCutoff = 1, qvalueCutoff = 1  ))

ora1_dn$geneID <- NULL
ora1_dns <- rownames(subset(ora1_dn,p.adjust<0.05))

gr <- as.numeric(sapply(strsplit(ora1_dn$GeneRatio,"/"),"[[",1)) /
  as.numeric(sapply(strsplit(ora1_dn$GeneRatio,"/"),"[[",2))

br <- as.numeric(sapply(strsplit(ora1_dn$BgRatio,"/"),"[[",1)) /
  as.numeric(sapply(strsplit(ora1_dn$BgRatio,"/"),"[[",2))

ora1_dn$es <- gr/br
ora1_dn <- ora1_dn[order(-ora1_dn$es),]
head(ora1_dn)
##                                                                                                                                ID
## Defects in vitamin and cofactor metabolism                                             Defects in vitamin and cofactor metabolism
## Defects in cobalamin (B12) metabolism                                                       Defects in cobalamin (B12) metabolism
## Beta-oxidation of very long chain fatty acids                                       Beta-oxidation of very long chain fatty acids
## Biotin transport and metabolism                                                                   Biotin transport and metabolism
## RMTs methylate histone arginines                                                                 RMTs methylate histone arginines
## Synthesis of bile acids and bile salts via 27-hydroxycholesterol Synthesis of bile acids and bile salts via 27-hydroxycholesterol
##                                                                                                                       Description
## Defects in vitamin and cofactor metabolism                                             Defects in vitamin and cofactor metabolism
## Defects in cobalamin (B12) metabolism                                                       Defects in cobalamin (B12) metabolism
## Beta-oxidation of very long chain fatty acids                                       Beta-oxidation of very long chain fatty acids
## Biotin transport and metabolism                                                                   Biotin transport and metabolism
## RMTs methylate histone arginines                                                                 RMTs methylate histone arginines
## Synthesis of bile acids and bile salts via 27-hydroxycholesterol Synthesis of bile acids and bile salts via 27-hydroxycholesterol
##                                                                  GeneRatio
## Defects in vitamin and cofactor metabolism                         15/1860
## Defects in cobalamin (B12) metabolism                               9/1860
## Beta-oxidation of very long chain fatty acids                       8/1860
## Biotin transport and metabolism                                     7/1860
## RMTs methylate histone arginines                                   19/1860
## Synthesis of bile acids and bile salts via 27-hydroxycholesterol    9/1860
##                                                                   BgRatio
## Defects in vitamin and cofactor metabolism                       20/10922
## Defects in cobalamin (B12) metabolism                            12/10922
## Beta-oxidation of very long chain fatty acids                    11/10922
## Biotin transport and metabolism                                  11/10922
## RMTs methylate histone arginines                                 31/10922
## Synthesis of bile acids and bile salts via 27-hydroxycholesterol 15/10922
##                                                                        pvalue
## Defects in vitamin and cofactor metabolism                       1.835693e-08
## Defects in cobalamin (B12) metabolism                            1.588286e-05
## Beta-oxidation of very long chain fatty acids                    7.066624e-05
## Biotin transport and metabolism                                  7.158270e-04
## RMTs methylate histone arginines                                 3.985178e-08
## Synthesis of bile acids and bile salts via 27-hydroxycholesterol 2.206981e-04
##                                                                      p.adjust
## Defects in vitamin and cofactor metabolism                       1.687690e-06
## Defects in cobalamin (B12) metabolism                            8.653219e-04
## Beta-oxidation of very long chain fatty acids                    3.353227e-03
## Biotin transport and metabolism                                  2.024964e-02
## RMTs methylate histone arginines                                 3.256776e-06
## Synthesis of bile acids and bile salts via 27-hydroxycholesterol 8.324278e-03
##                                                                        qvalue
## Defects in vitamin and cofactor metabolism                       1.464931e-06
## Defects in cobalamin (B12) metabolism                            7.511077e-04
## Beta-oxidation of very long chain fatty acids                    2.910633e-03
## Biotin transport and metabolism                                  1.757689e-02
## RMTs methylate histone arginines                                 2.826913e-06
## Synthesis of bile acids and bile salts via 27-hydroxycholesterol 7.225553e-03
##                                                                  Count       es
## Defects in vitamin and cofactor metabolism                          15 4.404032
## Defects in cobalamin (B12) metabolism                                9 4.404032
## Beta-oxidation of very long chain fatty acids                        8 4.270577
## Biotin transport and metabolism                                      7 3.736755
## RMTs methylate histone arginines                                    19 3.598994
## Synthesis of bile acids and bile salts via 27-hydroxycholesterol     9 3.523226
topdn1 <- head(ora1_dn$es,20)
names(topdn1) <- head(ora1_dn$ID,20)

Make a barplot

par(mar=c(5,20,5,1))

cols <- c(rep("blue",20),rep("red",20))

barplot(c(topdn1,topup1),
  horiz=TRUE,las=1,cex.names=0.65,col=cols,
  main="top DE Reactomes",
  xlab="ES")

mtext("whole genome background")

Compare pathway results

v1 <- list("correct up"=ora2_ups, "wg up"=ora1_ups)

v2 <- list("correct dn"=ora2_dns, "wg dn"=ora1_dns)

par(mar=c(10,10,10,10))

par(mfrow=c(2,1))
plot(euler(v1),quantities = TRUE)

plot(euler(v2),quantities = TRUE)

dev.off()
## X11cairo 
##        2

Session information

sessionInfo()
## R version 4.2.2 Patched (2022-11-10 r83330)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.1 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] eulerr_7.0.0                clusterProfiler_4.4.4      
##  [3] vioplot_0.4.0               zoo_1.8-11                 
##  [5] sm_2.2-5.7.1                kableExtra_1.3.4           
##  [7] DESeq2_1.36.0               SummarizedExperiment_1.26.1
##  [9] Biobase_2.56.0              MatrixGenerics_1.8.1       
## [11] matrixStats_0.63.0          GenomicRanges_1.48.0       
## [13] GenomeInfoDb_1.32.2         IRanges_2.30.0             
## [15] S4Vectors_0.34.0            BiocGenerics_0.42.0        
## [17] getDEE2_1.6.0              
## 
## loaded via a namespace (and not attached):
##   [1] shadowtext_0.1.2       fastmatch_1.1-3        systemfonts_1.0.4     
##   [4] plyr_1.8.8             igraph_1.3.5           lazyeval_0.2.2        
##   [7] polylabelr_0.2.0       splines_4.2.2          BiocParallel_1.30.3   
##  [10] ggplot2_3.4.0          digest_0.6.31          yulab.utils_0.0.5     
##  [13] htmltools_0.5.4        GOSemSim_2.22.0        viridis_0.6.2         
##  [16] GO.db_3.15.0           fansi_1.0.3            magrittr_2.0.3        
##  [19] memoise_2.0.1          Biostrings_2.64.0      annotate_1.74.0       
##  [22] graphlayouts_0.8.4     svglite_2.1.0          enrichplot_1.16.1     
##  [25] colorspace_2.0-3       blob_1.2.3             rvest_1.0.3           
##  [28] ggrepel_0.9.2          xfun_0.35              dplyr_1.0.10          
##  [31] crayon_1.5.2           RCurl_1.98-1.9         jsonlite_1.8.4        
##  [34] scatterpie_0.1.8       genefilter_1.78.0      survival_3.4-0        
##  [37] ape_5.6-2              glue_1.6.2             polyclip_1.10-4       
##  [40] gtable_0.3.1           zlibbioc_1.42.0        XVector_0.36.0        
##  [43] webshot_0.5.4          htm2txt_2.2.2          DelayedArray_0.22.0   
##  [46] scales_1.2.1           DOSE_3.22.0            DBI_1.1.3             
##  [49] Rcpp_1.0.9             viridisLite_0.4.1      xtable_1.8-4          
##  [52] gridGraphics_0.5-1     tidytree_0.4.1         bit_4.0.5             
##  [55] httr_1.4.4             fgsea_1.22.0           RColorBrewer_1.1-3    
##  [58] pkgconfig_2.0.3        XML_3.99-0.13          farver_2.1.1          
##  [61] sass_0.4.4             locfit_1.5-9.6         utf8_1.2.2            
##  [64] ggplotify_0.1.0        tidyselect_1.2.0       rlang_1.0.6           
##  [67] reshape2_1.4.4         AnnotationDbi_1.58.0   munsell_0.5.0         
##  [70] tools_4.2.2            cachem_1.0.6           downloader_0.4        
##  [73] cli_3.4.1              generics_0.1.3         RSQLite_2.2.19        
##  [76] evaluate_0.19          stringr_1.5.0          fastmap_1.1.0         
##  [79] yaml_2.3.6             ggtree_3.4.1           knitr_1.41            
##  [82] bit64_4.0.5            tidygraph_1.2.2        purrr_0.3.5           
##  [85] KEGGREST_1.36.3        ggraph_2.1.0           nlme_3.1-161          
##  [88] aplot_0.1.9            DO.db_2.9              xml2_1.3.3            
##  [91] compiler_4.2.2         rstudioapi_0.14        png_0.1-8             
##  [94] treeio_1.20.1          tibble_3.1.8           tweenr_2.0.2          
##  [97] geneplotter_1.74.0     bslib_0.4.1            stringi_1.7.8         
## [100] highr_0.9              lattice_0.20-45        Matrix_1.5-3          
## [103] vctrs_0.5.1            pillar_1.8.1           lifecycle_1.0.3       
## [106] jquerylib_0.1.4        data.table_1.14.6      bitops_1.0-7          
## [109] patchwork_1.1.2        qvalue_2.28.0          R6_2.5.1              
## [112] gridExtra_2.3          codetools_0.2-19       MASS_7.3-58.2         
## [115] assertthat_0.2.1       withr_2.5.0            GenomeInfoDbData_1.2.8
## [118] parallel_4.2.2         grid_4.2.2             ggfun_0.0.9           
## [121] tidyr_1.2.1            rmarkdown_2.19         ggforce_0.4.1