Transcriptome sequencing of whole cell and mito fractions of sedentary and trained rats

Methods
Read counts
Samplesheet
Overall clustering with multidimensional scaling
Functions
Set up contrasts
DE
Session info

Methods

Rats were kept in sedentary conditions or were trained. RNA was isolated from whole cell and mito fractions. Libraries were generated by the Deakin Genomics Facility and sequenced on the MiSeq system generating paired end 150 bp reads. Reads underwent adapter and quality (q>=20) trimming with Skewer version 0.2.2.

Read 1: AGATCGGAAGAGCACACGTCTGAACTCCAGTCA

Read 2: AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT

Reads were then were mapped to the rat genome (Ensembl version 99) with STAR aligner version 2.7.3a. I was having issues with a low number of mapped reads, despite BLAST searches showing that the sequences were good, and I did a bit of troubleshooting. In the end I found that read 2 doesn't Map very well, which could be related to something about the adapter sequence of the lit that was used. To get around this, I only used read 1 in this analysis. I also ran MultiQC to summarise the QC checks that were done. STAR generated counts were imported into R for analysis.

Read counts

Import the STAR counts. We can also include some info out of the Ensembl GTF file including gene name and gene class.

# libraries
library("reshape2")
library("DESeq2")
library("mitch")
library("gplots")

# import the 3 column table
tmp<-read.table("3col.tsv",header=F)

# convert the 3 col table into a standard count matrix 
x<-as.matrix(acast(tmp, V2~V1, value.var="V3"))
# tidy up the column headers
colnames(x)<-sapply(strsplit(colnames(x),"_"),"[[",1)
#dont forget gene names
g<-read.table("../ref/Rattus_norvegicus.Rnor_6.0.99.genenames.tsv",row.names=1)
x<-merge(g,x,by=0)
# remove underscores from the gene biotype because it might cause issues later
x$V3<-gsub("_","",x$V3)
# gene accession numbers also have gene symbol and gene biotype
rownames(x)=paste(x$Row.names,x$V2,x$V3,sep="_")
# get rid of the columns that aren't required anymore
x$Row.names=x$V2=x$V3=NULL
# show the top 6 lines of data
head(x)

##                                                1  10  11  12  13  14  15  16
## ENSRNOG00000000001_AABR07013255.1_pseudogene   5  12   3   7   8   6   3   6
## ENSRNOG00000000007_Gad1_proteincoding          8   5  10   6  10   2   6   2
## ENSRNOG00000000008_Alx4_proteincoding          4  43  44  12  26  41  38  22
## ENSRNOG00000000009_Tmco5b_proteincoding        1   1   0   0   0   1   0   0
## ENSRNOG00000000010_Cbln1_proteincoding         0   0   1   1   0   1   0   1
## ENSRNOG00000000012_Tcf15_proteincoding       266 679 325 522 699 439 300 340
##                                               17  18  19   2  20  21  22  23
## ENSRNOG00000000001_AABR07013255.1_pseudogene  30   6   5   4   5   7   6   5
## ENSRNOG00000000007_Gad1_proteincoding          8   8  18   7   5  21   4   6
## ENSRNOG00000000008_Alx4_proteincoding         66  31  49  20  30  63  11  34
## ENSRNOG00000000009_Tmco5b_proteincoding        0   0   0   0   0   0   0   0
## ENSRNOG00000000010_Cbln1_proteincoding         1   0   1   4   2   1   6   3
## ENSRNOG00000000012_Tcf15_proteincoding       373 469 502 317 315 873 596 851
##                                               24  25  26  27  28  29   3  30
## ENSRNOG00000000001_AABR07013255.1_pseudogene  11   7   4  12   7   7   8   7
## ENSRNOG00000000007_Gad1_proteincoding          2  12   5  10  10   5   8   7
## ENSRNOG00000000008_Alx4_proteincoding         14  26  43  56  29  30  32  35
## ENSRNOG00000000009_Tmco5b_proteincoding        0   0   0   0   0   0   0   0
## ENSRNOG00000000010_Cbln1_proteincoding         3   0   0   0   2   2   1   0
## ENSRNOG00000000012_Tcf15_proteincoding       625 617 650 847 749 464 492 290
##                                               31  32 33 34 35 36 37 38 39   4
## ENSRNOG00000000001_AABR07013255.1_pseudogene   5  11  7  8 22  9 10 16 13   9
## ENSRNOG00000000007_Gad1_proteincoding          9   6 23 10 33 29 28 30 27   6
## ENSRNOG00000000008_Alx4_proteincoding         26  22  8 10 14  4 14 18  8  13
## ENSRNOG00000000009_Tmco5b_proteincoding        0   0 12  6 14 10 13 21 19   0
## ENSRNOG00000000010_Cbln1_proteincoding         4   0 20 10 21 15 30 17 10   0
## ENSRNOG00000000012_Tcf15_proteincoding       457 282 19  7  9  5  5 13  1 241
##                                              40 41 42 43 44 45 46 47 48 49   5
## ENSRNOG00000000001_AABR07013255.1_pseudogene 11  9  7 15 10 16  7  5 15 17   6
## ENSRNOG00000000007_Gad1_proteincoding        12 22 15 31 23 29 29 36 23 26   8
## ENSRNOG00000000008_Alx4_proteincoding        15  2  7 24 12 14  7 25  4 15  23
## ENSRNOG00000000009_Tmco5b_proteincoding      15  3  4  7  6 17  9  9  6 16   0
## ENSRNOG00000000010_Cbln1_proteincoding       19 12 13 17 12 18 22 16 11 14   0
## ENSRNOG00000000012_Tcf15_proteincoding        7  4  3 14 12 12 10  6  8  4 826
##                                              50 51 52 53 54 55 56 57 58 59   6
## ENSRNOG00000000001_AABR07013255.1_pseudogene 12 18 13 12  5  7  5 11 16  7  10
## ENSRNOG00000000007_Gad1_proteincoding        43 33 33 25 39 23 23 39 32 27  19
## ENSRNOG00000000008_Alx4_proteincoding        16 19  9 13 12 19  2 13  9  9  23
## ENSRNOG00000000009_Tmco5b_proteincoding      11 26 20 13 18 15 12 16  5 11   0
## ENSRNOG00000000010_Cbln1_proteincoding       22 38 13 23 19 22 25 15 19 22   0
## ENSRNOG00000000012_Tcf15_proteincoding       14  6  7  7 11  5  6  9  8  2 635
##                                              60 61 62 63 64   7   8   9 95 96
## ENSRNOG00000000001_AABR07013255.1_pseudogene 12 14 17 17 16   2   8   4  0  0
## ENSRNOG00000000007_Gad1_proteincoding        26 31 37 50 23   4   3   5  0  0
## ENSRNOG00000000008_Alx4_proteincoding        16 24 12 13 15  14  12  24  0  0
## ENSRNOG00000000009_Tmco5b_proteincoding       9 15 13 14  9   0   0   0  0  0
## ENSRNOG00000000010_Cbln1_proteincoding       25 26 20 29 31   0   2   0  0  0
## ENSRNOG00000000012_Tcf15_proteincoding        7  9  7 10  1 347 253 427  6 33

Samplesheet

samplesheet <- read.table("samplesheet.tsv",header=TRUE)
samplesheet$UDI <- gsub("UDI_0","",samplesheet$UDI)
samplesheet$UDI <- gsub("UDI_","",samplesheet$UDI)
samplesheet

##    UDI    fraction Group RatID
## 1    1 wholetissue     T     1
## 2    2 wholetissue     T     2
## 3    3 wholetissue     S     3
## 4    4 wholetissue     S     4
## 5    5 wholetissue     T     5
## 6    6 wholetissue     T     6
## 7    7 wholetissue     S     7
## 8    8 wholetissue     S     8
## 9    9 wholetissue     T     9
## 10  10 wholetissue     T    10
## 11  11 wholetissue     S    11
## 12  12 wholetissue     S    12
## 13  13 wholetissue     T    13
## 14  14 wholetissue     T    14
## 15  15 wholetissue     S    15
## 16  16 wholetissue     S    16
## 17  17 wholetissue     T    17
## 18  18 wholetissue     T    18
## 19  19 wholetissue     S    19
## 20  20 wholetissue     S    20
## 21  21 wholetissue     T    21
## 22  22 wholetissue     T    22
## 23  23 wholetissue     S    23
## 24  24 wholetissue     S    24
## 25  25 wholetissue     T    25
## 26  26 wholetissue     T    26
## 27  27 wholetissue     T    27
## 28  28 wholetissue     T    28
## 29  29 wholetissue     S    29
## 30  30 wholetissue     S    30
## 31  31 wholetissue     S    31
## 32  32 wholetissue     S    32
## 33  33        mito     T     1
## 34  34        mito     T     2
## 35  35        mito     S     3
## 36  36        mito     S     4
## 37  37        mito     T     5
## 38  38        mito     T     6
## 39  39        mito     S     7
## 40  40        mito     S     8
## 41  41        mito     T     9
## 42  42        mito     T    10
## 43  43        mito     S    11
## 44  44        mito     S    12
## 45  45        mito     T    13
## 46  46        mito     T    14
## 47  47        mito     S    15
## 48  48        mito     S    16
## 49  49        mito     T    17
## 50  50        mito     T    18
## 51  51        mito     S    19
## 52  52        mito     S    20
## 53  53        mito     T    21
## 54  54        mito     T    22
## 55  55        mito     S    23
## 56  56        mito     S    24
## 57  57        mito     T    25
## 58  58        mito     T    26
## 59  59        mito     T    27
## 60  60        mito     T    28
## 61  61        mito     S    29
## 62  62        mito     S    30
## 63  63        mito     S    31
## 64  64        mito     S    32

Overall clustering with multidimensional scaling

Sample 95 is a HUMAN mito control sample. Sample 96 is a RAT mito control sample. These should not be inluded in the main results.

mds <- cmdscale(dist(t(x)))
XMAX=max(mds[,1])*1.1
XMIN=min(mds[,1])*1.1
plot( mds , xlab="Coordinate 1", ylab="Coordinate 2",
  type = "n" , xlim=c(XMIN,XMAX),main="MDS plot",bty="n")
text(mds, labels=colnames(x) )

x <- x[,which(! colnames(x) %in% c("95","96"))]

mds <- cmdscale(dist(t(x)))
XMAX=max(mds[,1])*1.1
XMIN=min(mds[,1])*1.1
plot( mds , xlab="Coordinate 1", ylab="Coordinate 2",
  type = "n" , xlim=c(XMIN,XMAX),main="MDS plot",bty="n")
text(mds, labels=colnames(x) )

par(mar=c(5,10,5,3))
barplot(colSums(x),horiz=TRUE,las=2,main="number of reads per sample",cex.names=0.5)

Functions

run_de <- function(ss,xx){

y <- round(xx)

# MDS
mds <- cmdscale(dist(t(y)))
XMAX=max(mds[,1])*1.1
XMIN=min(mds[,1])*1.1
plot( mds , xlab="Coordinate 1", ylab="Coordinate 2",
  type = "n" , xlim=c(XMIN,XMAX),main="MDS plot",bty="n")
text(mds, labels=colnames(y) )

# DE
dds <- DESeqDataSetFromMatrix(countData=y, colData = ss, design = ~ trt)
dds <- DESeq(dds)
de <- DESeq2::results(dds)
de <- de[order(de$pvalue),]
up <- rownames(subset(de, log2FoldChange>0 & padj<0.05 ))
dn <- rownames(subset(de, log2FoldChange<0 & padj<0.05 ))
str(up)
str(dn)

# MA plot
sig <-subset(de, padj < 0.05 )
GENESUP <- length(up)
GENESDN <- length(dn)
SUBHEADER = paste(GENESUP, "up, ", GENESDN, "down")
ns <-subset(de, padj > 0.05 )
plot(log2(de$baseMean),de$log2FoldChange,
     xlab="log2 basemean", ylab="log2 foldchange",
     pch=19, cex=0.5, col="dark gray",
     main="smear plot")
points(log2(sig$baseMean),sig$log2FoldChange,
       pch=19, cex=0.5, col="red")
mtext(SUBHEADER)

# heatmap
yn <- y/colSums(y)*1000000
yf <- yn[which(rownames(yn) %in% rownames(de)[1:50]),]
mycols <- gsub("0","yellow",ss$trt)
mycols <- gsub("1","orange",mycols)
colfunc <- colorRampPalette(c("blue", "white", "red"))
heatmap.2(  as.matrix(yf), col=colfunc(25),scale="row",
    ColSideColors =mycols ,trace="none",
    margin = c(10,15), cexRow=0.6, cexCol=0.8 , main="Top 50 genes by p-val")
mtext("yellow=ctrl, orange=trt")

return(de)
}

Set up contrasts

Whole tissue S versus T
Mito fraction S versus T
Whole versus mito fraction in S
Whole versus mito fraction in T

ss1 <- subset(samplesheet,fraction=="wholetissue")
ss1$trt <- grepl("T",ss1$Group)*1
x1 <- x[,which(colnames(x) %in% rownames(ss1))]

ss2 <- subset(samplesheet,fraction=="mito")
ss2$trt <- grepl("T",ss2$Group)*1
x2 <- x[,which(colnames(x) %in% rownames(ss2))]

ss3 <- subset(samplesheet,Group=="S")
ss3$trt <- grepl("mito",ss3$fraction)*1
x3 <- x[,which(colnames(x) %in% rownames(ss3))]

ss4 <- subset(samplesheet,Group=="T")
ss4$trt <- grepl("mito",ss4$fraction)*1
x4 <- x[,which(colnames(x) %in% rownames(ss4))]

DE

Here, were using DESeq2 to perform differential expression analysis for the specified contrasts. The run_de function does the analysis and generate the charts. Here we actually run the analysis.

ss1

##    UDI    fraction Group RatID trt
## 1    1 wholetissue     T     1   1
## 2    2 wholetissue     T     2   1
## 3    3 wholetissue     S     3   0
## 4    4 wholetissue     S     4   0
## 5    5 wholetissue     T     5   1
## 6    6 wholetissue     T     6   1
## 7    7 wholetissue     S     7   0
## 8    8 wholetissue     S     8   0
## 9    9 wholetissue     T     9   1
## 10  10 wholetissue     T    10   1
## 11  11 wholetissue     S    11   0
## 12  12 wholetissue     S    12   0
## 13  13 wholetissue     T    13   1
## 14  14 wholetissue     T    14   1
## 15  15 wholetissue     S    15   0
## 16  16 wholetissue     S    16   0
## 17  17 wholetissue     T    17   1
## 18  18 wholetissue     T    18   1
## 19  19 wholetissue     S    19   0
## 20  20 wholetissue     S    20   0
## 21  21 wholetissue     T    21   1
## 22  22 wholetissue     T    22   1
## 23  23 wholetissue     S    23   0
## 24  24 wholetissue     S    24   0
## 25  25 wholetissue     T    25   1
## 26  26 wholetissue     T    26   1
## 27  27 wholetissue     T    27   1
## 28  28 wholetissue     T    28   1
## 29  29 wholetissue     S    29   0
## 30  30 wholetissue     S    30   0
## 31  31 wholetissue     S    31   0
## 32  32 wholetissue     S    32   0

de1 <- run_de(ss1,x1)

## converting counts to integer mode

##   the design formula contains one or more numeric variables with integer values,
##   specifying a model with increasing fold change for higher values.
##   did you mean for this to be a factor? if so, first convert
##   this variable to a factor using the factor() function

## estimating size factors

## estimating dispersions

## gene-wise dispersion estimates

## mean-dispersion relationship

## final dispersion estimates

## fitting model and testing

## -- replacing outliers and refitting for 186 genes
## -- DESeq argument 'minReplicatesForReplace' = 7 
## -- original counts are preserved in counts(dds)

## estimating dispersions

## fitting model and testing

##  chr(0) 
##  chr(0)

head(de1,20)

## log2 fold change (MLE): trt 
## Wald test p-value: trt 
## DataFrame with 20 rows and 6 columns
##                                                baseMean log2FoldChange
##                                               <numeric>      <numeric>
## ENSRNOG00000055229_LOC108348118_proteincoding   4.71871       1.278137
## ENSRNOG00000016275_Ttr_proteincoding           10.53329      -3.813266
## ENSRNOG00000029784_Pak1_proteincoding         252.68533      -0.301761
## ENSRNOG00000047230_Zfp551_proteincoding         2.48548       1.588395
## ENSRNOG00000041433_AABR07027009.1_miRNA         3.36231      -1.343134
## ...                                                 ...            ...
## ENSRNOG00000010002_Stpg3_proteincoding          1.64489       -1.99194
## ENSRNOG00000017223_Plg_proteincoding            7.81840       -2.14148
## ENSRNOG00000017381_Itih4_proteincoding          7.06781       -2.47796
## ENSRNOG00000015086_Plin1_proteincoding         28.94512       -1.80048
## ENSRNOG00000000053_Crp_proteincoding           10.12515       -2.48762
##                                                   lfcSE      stat      pvalue
##                                               <numeric> <numeric>   <numeric>
## ENSRNOG00000055229_LOC108348118_proteincoding 0.3363002   3.80058 0.000144357
## ENSRNOG00000016275_Ttr_proteincoding          1.0194842  -3.74039 0.000183737
## ENSRNOG00000029784_Pak1_proteincoding         0.0853485  -3.53564 0.000406792
## ENSRNOG00000047230_Zfp551_proteincoding       0.4661059   3.40780 0.000654894
## ENSRNOG00000041433_AABR07027009.1_miRNA       0.4014508  -3.34570 0.000820753
## ...                                                 ...       ...         ...
## ENSRNOG00000010002_Stpg3_proteincoding         0.648503  -3.07160  0.00212918
## ENSRNOG00000017223_Plg_proteincoding           0.704960  -3.03773  0.00238369
## ENSRNOG00000017381_Itih4_proteincoding         0.825695  -3.00106  0.00269044
## ENSRNOG00000015086_Plin1_proteincoding         0.602033  -2.99067  0.00278364
## ENSRNOG00000000053_Crp_proteincoding           0.834424  -2.98124  0.00287081
##                                                    padj
##                                               <numeric>
## ENSRNOG00000055229_LOC108348118_proteincoding  0.997511
## ENSRNOG00000016275_Ttr_proteincoding           0.997511
## ENSRNOG00000029784_Pak1_proteincoding          0.997511
## ENSRNOG00000047230_Zfp551_proteincoding        0.997511
## ENSRNOG00000041433_AABR07027009.1_miRNA        0.997511
## ...                                                 ...
## ENSRNOG00000010002_Stpg3_proteincoding         0.997511
## ENSRNOG00000017223_Plg_proteincoding           0.997511
## ENSRNOG00000017381_Itih4_proteincoding         0.997511
## ENSRNOG00000015086_Plin1_proteincoding         0.997511
## ENSRNOG00000000053_Crp_proteincoding           0.997511

write.table(de1,file="de1.tsv",quote=FALSE,sep="\t")

ss2

##    UDI fraction Group RatID trt
## 33  33     mito     T     1   1
## 34  34     mito     T     2   1
## 35  35     mito     S     3   0
## 36  36     mito     S     4   0
## 37  37     mito     T     5   1
## 38  38     mito     T     6   1
## 39  39     mito     S     7   0
## 40  40     mito     S     8   0
## 41  41     mito     T     9   1
## 42  42     mito     T    10   1
## 43  43     mito     S    11   0
## 44  44     mito     S    12   0
## 45  45     mito     T    13   1
## 46  46     mito     T    14   1
## 47  47     mito     S    15   0
## 48  48     mito     S    16   0
## 49  49     mito     T    17   1
## 50  50     mito     T    18   1
## 51  51     mito     S    19   0
## 52  52     mito     S    20   0
## 53  53     mito     T    21   1
## 54  54     mito     T    22   1
## 55  55     mito     S    23   0
## 56  56     mito     S    24   0
## 57  57     mito     T    25   1
## 58  58     mito     T    26   1
## 59  59     mito     T    27   1
## 60  60     mito     T    28   1
## 61  61     mito     S    29   0
## 62  62     mito     S    30   0
## 63  63     mito     S    31   0
## 64  64     mito     S    32   0

de2 <- run_de(ss2,x2)

## converting counts to integer mode

##   the design formula contains one or more numeric variables with integer values,
##   specifying a model with increasing fold change for higher values.
##   did you mean for this to be a factor? if so, first convert
##   this variable to a factor using the factor() function

## estimating size factors

## estimating dispersions

## gene-wise dispersion estimates

## mean-dispersion relationship

## final dispersion estimates

## fitting model and testing

## -- replacing outliers and refitting for 7 genes
## -- DESeq argument 'minReplicatesForReplace' = 7 
## -- original counts are preserved in counts(dds)

## estimating dispersions

## fitting model and testing

##  chr(0) 
##  chr(0)

head(de2,20)

## log2 fold change (MLE): trt 
## Wald test p-value: trt 
## DataFrame with 20 rows and 6 columns
##                                             baseMean log2FoldChange     lfcSE
##                                            <numeric>      <numeric> <numeric>
## ENSRNOG00000039876_LOC681410_proteincoding   8.79384       0.858222 0.2274371
## ENSRNOG00000006428_Olr1121_proteincoding     4.11959       1.123601 0.3008346
## ENSRNOG00000003821_Rptor_proteincoding      67.33842      -0.308260 0.0835924
## ENSRNOG00000045552_AC123316.2_miRNA          1.52349       2.287857 0.6244564
## ENSRNOG00000001129_Fbxo21_proteincoding     17.68718      -0.549951 0.1512828
## ...                                              ...            ...       ...
## ENSRNOG00000030721_Fbrsl1_proteincoding     21.00905      -0.466096  0.136953
## ENSRNOG00000017272_Mrps27_proteincoding     21.62768      -0.446042  0.131260
## ENSRNOG00000055377_AABR07010550.2_lincRNA    1.42731      -1.896113  0.568957
## ENSRNOG00000057666_AC120486.8_lincRNA        3.70419      -1.131126  0.339542
## ENSRNOG00000058653_Tmem52b_proteincoding    18.77783       0.496591  0.149408
##                                                 stat      pvalue      padj
##                                            <numeric>   <numeric> <numeric>
## ENSRNOG00000039876_LOC681410_proteincoding   3.77345 0.000161007  0.999907
## ENSRNOG00000006428_Olr1121_proteincoding     3.73495 0.000187756  0.999907
## ENSRNOG00000003821_Rptor_proteincoding      -3.68766 0.000226325  0.999907
## ENSRNOG00000045552_AC123316.2_miRNA          3.66376 0.000248542  0.999907
## ENSRNOG00000001129_Fbxo21_proteincoding     -3.63525 0.000277709  0.999907
## ...                                              ...         ...       ...
## ENSRNOG00000030721_Fbrsl1_proteincoding     -3.40333 0.000665689  0.999907
## ENSRNOG00000017272_Mrps27_proteincoding     -3.39816 0.000678417  0.999907
## ENSRNOG00000055377_AABR07010550.2_lincRNA   -3.33262 0.000860339  0.999907
## ENSRNOG00000057666_AC120486.8_lincRNA       -3.33133 0.000864325  0.999907
## ENSRNOG00000058653_Tmem52b_proteincoding     3.32372 0.000888267  0.999907

write.table(de2,file="de2.tsv",quote=FALSE,sep="\t")

ss3

##    UDI    fraction Group RatID trt
## 3    3 wholetissue     S     3   0
## 4    4 wholetissue     S     4   0
## 7    7 wholetissue     S     7   0
## 8    8 wholetissue     S     8   0
## 11  11 wholetissue     S    11   0
## 12  12 wholetissue     S    12   0
## 15  15 wholetissue     S    15   0
## 16  16 wholetissue     S    16   0
## 19  19 wholetissue     S    19   0
## 20  20 wholetissue     S    20   0
## 23  23 wholetissue     S    23   0
## 24  24 wholetissue     S    24   0
## 29  29 wholetissue     S    29   0
## 30  30 wholetissue     S    30   0
## 31  31 wholetissue     S    31   0
## 32  32 wholetissue     S    32   0
## 35  35        mito     S     3   1
## 36  36        mito     S     4   1
## 39  39        mito     S     7   1
## 40  40        mito     S     8   1
## 43  43        mito     S    11   1
## 44  44        mito     S    12   1
## 47  47        mito     S    15   1
## 48  48        mito     S    16   1
## 51  51        mito     S    19   1
## 52  52        mito     S    20   1
## 55  55        mito     S    23   1
## 56  56        mito     S    24   1
## 61  61        mito     S    29   1
## 62  62        mito     S    30   1
## 63  63        mito     S    31   1
## 64  64        mito     S    32   1

de3 <- run_de(ss3,x3)

## converting counts to integer mode

##   the design formula contains one or more numeric variables with integer values,
##   specifying a model with increasing fold change for higher values.
##   did you mean for this to be a factor? if so, first convert
##   this variable to a factor using the factor() function

## estimating size factors

## estimating dispersions

## gene-wise dispersion estimates

## mean-dispersion relationship

## final dispersion estimates

## fitting model and testing

## -- replacing outliers and refitting for 1251 genes
## -- DESeq argument 'minReplicatesForReplace' = 7 
## -- original counts are preserved in counts(dds)

## estimating dispersions

## fitting model and testing

##  chr [1:13784] "ENSRNOG00000022225_Lrrc19_proteincoding" ...
##  chr [1:5154] "ENSRNOG00000028837_AC128207.1_proteincoding" ...

head(de3,20)

## log2 fold change (MLE): trt 
## Wald test p-value: trt 
## DataFrame with 20 rows and 6 columns
##                                              baseMean log2FoldChange     lfcSE
##                                             <numeric>      <numeric> <numeric>
## ENSRNOG00000028837_AC128207.1_proteincoding 337.14753       -3.58784  0.442079
## ENSRNOG00000032994_Myom3_proteincoding      512.56938       -2.01408  0.265160
## ENSRNOG00000022225_Lrrc19_proteincoding       6.64842        7.05924  0.983164
## ENSRNOG00000039530_Ccl27_proteincoding       19.44680        2.51241  0.350338
## ENSRNOG00000061428_Pasd1_proteincoding        6.52198        6.47254  0.906512
## ...                                               ...            ...       ...
## ENSRNOG00000021366_Tas2r116_proteincoding     4.86768        6.81497  0.982972
## ENSRNOG00000013840_Ankrd2_proteincoding     679.69684       -3.00865  0.437474
## ENSRNOG00000052427_AC109660.1_lincRNA         6.13738        6.93105  1.008122
## ENSRNOG00000054547_AABR07070532.1_lincRNA     5.72950        6.84749  0.998736
## ENSRNOG00000037455_AABR07039303.2_lincRNA     4.72197        6.55220  0.957271
##                                                  stat      pvalue        padj
##                                             <numeric>   <numeric>   <numeric>
## ENSRNOG00000028837_AC128207.1_proteincoding  -8.11584 4.82434e-16 1.46009e-11
## ENSRNOG00000032994_Myom3_proteincoding       -7.59569 3.06145e-14 4.63275e-10
## ENSRNOG00000022225_Lrrc19_proteincoding       7.18012 6.96497e-13 5.53438e-09
## ENSRNOG00000039530_Ccl27_proteincoding        7.17138 7.42437e-13 5.53438e-09
## ENSRNOG00000061428_Pasd1_proteincoding        7.14006 9.32929e-13 5.53438e-09
## ...                                               ...         ...         ...
## ENSRNOG00000021366_Tas2r116_proteincoding     6.93303 4.11927e-12 7.79186e-09
## ENSRNOG00000013840_Ankrd2_proteincoding      -6.87732 6.09910e-12 1.04078e-08
## ENSRNOG00000052427_AC109660.1_lincRNA         6.87521 6.19003e-12 1.04078e-08
## ENSRNOG00000054547_AABR07070532.1_lincRNA     6.85616 7.07362e-12 1.12675e-08
## ENSRNOG00000037455_AABR07039303.2_lincRNA     6.84467 7.66524e-12 1.14626e-08

write.table(de3,file="de3.tsv",quote=FALSE,sep="\t")

ss4

##    UDI    fraction Group RatID trt
## 1    1 wholetissue     T     1   0
## 2    2 wholetissue     T     2   0
## 5    5 wholetissue     T     5   0
## 6    6 wholetissue     T     6   0
## 9    9 wholetissue     T     9   0
## 10  10 wholetissue     T    10   0
## 13  13 wholetissue     T    13   0
## 14  14 wholetissue     T    14   0
## 17  17 wholetissue     T    17   0
## 18  18 wholetissue     T    18   0
## 21  21 wholetissue     T    21   0
## 22  22 wholetissue     T    22   0
## 25  25 wholetissue     T    25   0
## 26  26 wholetissue     T    26   0
## 27  27 wholetissue     T    27   0
## 28  28 wholetissue     T    28   0
## 33  33        mito     T     1   1
## 34  34        mito     T     2   1
## 37  37        mito     T     5   1
## 38  38        mito     T     6   1
## 41  41        mito     T     9   1
## 42  42        mito     T    10   1
## 45  45        mito     T    13   1
## 46  46        mito     T    14   1
## 49  49        mito     T    17   1
## 50  50        mito     T    18   1
## 53  53        mito     T    21   1
## 54  54        mito     T    22   1
## 57  57        mito     T    25   1
## 58  58        mito     T    26   1
## 59  59        mito     T    27   1
## 60  60        mito     T    28   1

de4 <- run_de(ss4,x4)

## converting counts to integer mode

##   the design formula contains one or more numeric variables with integer values,
##   specifying a model with increasing fold change for higher values.
##   did you mean for this to be a factor? if so, first convert
##   this variable to a factor using the factor() function

## estimating size factors

## estimating dispersions

## gene-wise dispersion estimates

## mean-dispersion relationship

## final dispersion estimates

## fitting model and testing

## -- replacing outliers and refitting for 1695 genes
## -- DESeq argument 'minReplicatesForReplace' = 7 
## -- original counts are preserved in counts(dds)

## estimating dispersions

## fitting model and testing

##  chr [1:13583] "ENSRNOG00000019427_Tescl_proteincoding" ...
##  chr [1:5090] "ENSRNOG00000001052_Slc25a30_proteincoding" ...

head(de4,20)

## log2 fold change (MLE): trt 
## Wald test p-value: trt 
## DataFrame with 20 rows and 6 columns
##                                                  baseMean log2FoldChange
##                                                 <numeric>      <numeric>
## ENSRNOG00000019427_Tescl_proteincoding            9.95968        7.73129
## ENSRNOG00000012022_RGD1562914_proteincoding       8.94441        7.56678
## ENSRNOG00000054565_AABR07028792.1_lincRNA         5.95716        6.78165
## ENSRNOG00000061481_AC141168.1_lincRNA             8.41463        7.48675
## ENSRNOG00000045570_LOC288978_proteincoding       21.88827        2.91417
## ...                                                   ...            ...
## ENSRNOG00000059861_AC128394.3_lincRNA             6.30683        7.25553
## ENSRNOG00000012581_RGD1311933_proteincoding       6.82569        7.16511
## ENSRNOG00000060901_AABR07007875.1_lincRNA         4.75365        6.83457
## ENSRNOG00000031317_AABR07064224.1_proteincoding   5.23226        6.76955
## ENSRNOG00000049311_AABR07066826.2_proteincoding   7.32528        6.87672
##                                                     lfcSE      stat      pvalue
##                                                 <numeric> <numeric>   <numeric>
## ENSRNOG00000019427_Tescl_proteincoding           0.990375   7.80643 5.88302e-15
## ENSRNOG00000012022_RGD1562914_proteincoding      0.992933   7.62063 2.52434e-14
## ENSRNOG00000054565_AABR07028792.1_lincRNA        0.897023   7.56017 4.02556e-14
## ENSRNOG00000061481_AC141168.1_lincRNA            0.991603   7.55015 4.34743e-14
## ENSRNOG00000045570_LOC288978_proteincoding       0.387694   7.51667 5.61895e-14
## ...                                                   ...       ...         ...
## ENSRNOG00000059861_AC128394.3_lincRNA            1.017983   7.12736 1.02309e-12
## ENSRNOG00000012581_RGD1311933_proteincoding      1.005779   7.12394 1.04882e-12
## ENSRNOG00000060901_AABR07007875.1_lincRNA        0.959425   7.12361 1.05137e-12
## ENSRNOG00000031317_AABR07064224.1_proteincoding  0.951151   7.11722 1.10126e-12
## ENSRNOG00000049311_AABR07066826.2_proteincoding  0.966393   7.11586 1.11218e-12
##                                                        padj
##                                                   <numeric>
## ENSRNOG00000019427_Tescl_proteincoding          1.77991e-10
## ENSRNOG00000012022_RGD1562914_proteincoding     3.28829e-10
## ENSRNOG00000054565_AABR07028792.1_lincRNA       3.28829e-10
## ENSRNOG00000061481_AC141168.1_lincRNA           3.28829e-10
## ENSRNOG00000045570_LOC288978_proteincoding      3.40003e-10
## ...                                                     ...
## ENSRNOG00000059861_AC128394.3_lincRNA           1.68244e-09
## ENSRNOG00000012581_RGD1311933_proteincoding     1.68244e-09
## ENSRNOG00000060901_AABR07007875.1_lincRNA       1.68244e-09
## ENSRNOG00000031317_AABR07064224.1_proteincoding 1.68244e-09
## ENSRNOG00000049311_AABR07066826.2_proteincoding 1.68244e-09

write.table(de4,file="de4.tsv",quote=FALSE,sep="\t")

Session info

sessionInfo()

## R version 4.0.3 (2020-10-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.5 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] gplots_3.1.0                mitch_1.0.10               
##  [3] DESeq2_1.28.1               SummarizedExperiment_1.18.2
##  [5] DelayedArray_0.14.1         matrixStats_0.57.0         
##  [7] Biobase_2.48.0              GenomicRanges_1.40.0       
##  [9] GenomeInfoDb_1.24.2         IRanges_2.22.2             
## [11] S4Vectors_0.26.1            BiocGenerics_0.34.0        
## [13] reshape2_1.4.4             
## 
## loaded via a namespace (and not attached):
##  [1] bit64_4.0.5            splines_4.0.3          gtools_3.8.2          
##  [4] shiny_1.5.0            blob_1.2.1             GenomeInfoDbData_1.2.3
##  [7] yaml_2.2.1             pillar_1.4.6           RSQLite_2.2.1         
## [10] lattice_0.20-41        glue_1.4.2             digest_0.6.27         
## [13] RColorBrewer_1.1-2     promises_1.1.1         XVector_0.28.0        
## [16] colorspace_1.4-1       htmltools_0.5.0        httpuv_1.5.4          
## [19] Matrix_1.2-18          plyr_1.8.6             XML_3.99-0.5          
## [22] pkgconfig_2.0.3        genefilter_1.70.0      zlibbioc_1.34.0       
## [25] purrr_0.3.4            xtable_1.8-4           scales_1.1.1          
## [28] later_1.1.0.1          BiocParallel_1.22.0    tibble_3.0.4          
## [31] annotate_1.66.0        echarts4r_0.3.3        generics_0.1.0        
## [34] ggplot2_3.3.2          ellipsis_0.3.1         survival_3.2-7        
## [37] magrittr_1.5           crayon_1.3.4           mime_0.9              
## [40] evaluate_0.14          memoise_1.1.0          GGally_2.0.0          
## [43] MASS_7.3-53            beeswarm_0.2.3         tools_4.0.3           
## [46] lifecycle_0.2.0        stringr_1.4.0          munsell_0.5.0         
## [49] locfit_1.5-9.4         AnnotationDbi_1.50.3   compiler_4.0.3        
## [52] caTools_1.18.0         rlang_0.4.8            grid_4.0.3            
## [55] RCurl_1.98-1.2         htmlwidgets_1.5.2      rmarkdown_2.5         
## [58] bitops_1.0-6           gtable_0.3.0           DBI_1.1.0             
## [61] reshape_0.8.8          R6_2.5.0               gridExtra_2.3         
## [64] knitr_1.30             dplyr_1.0.2            fastmap_1.0.1         
## [67] bit_4.0.4              KernSmooth_2.23-18     stringi_1.5.3         
## [70] Rcpp_1.0.5             vctrs_0.3.4            geneplotter_1.66.0    
## [73] tidyselect_1.1.0       xfun_0.19