In this report we perform QC and differential expression analysis. We begin with QC of the data.
suppressPackageStartupMessages({
library("gplots")
library("reshape2")
library("WGCNA")
library("dplyr")
library("DESeq2")
library("mitch")
library("MASS")
library("eulerr")
library("beeswarm")
})
Please have a look at the multiQC report. Here are a few key points:
Skewer trimming resulted in loss of only a tiny number of bases. This indicates the sequence quality is very high.
FastQC counts of unique and duplicate reads indicate that a few samples have <10M unique reads.
Per-sequence GC content showed an unusual profile for two samples. PG1423-EOS R1 and R2 had a GC profile maximum at 40%, which differs from the mean profile. PG2090-EOS also showed an unusual pattern, with underrepresented low GC%.
Sequence duplication levels were elevated for some fastq files. Here are the files of concern, with <20% unique reads: PG3627-POD1_S86_R1_001, PG3627-POD1_S86_R2_001, PG3609-T0_S317_R1_001, PG2090-EOS_S134_R1_001 and PG2090-EOS_S134_R2_001.
There were two files with overrepresented sequences: PG2090-EOS R1 and R2. Others are okay.
Adapter content was very low which is good.
The fastq files were also checked with validatefastq-assembly, which looks for signs of file corruption that can occur in large data transfers. No problematic files were detected.
Ribosomal RNA carryover can be a source of noise. The proportion should be <10% and there were a few samples in excess of this, including PG2090-EOS, PG815-EOS, PG1452-EOS and PG702-POD1.
rrna <- read.table("rrna_stats.txt")
rrna <- rrna[,c(1,5)]
rrna$V1 <- sapply(strsplit(rrna$V1,"\\."),"[[",1)
rrna$V5 <- gsub("\\(","",rrna$V5)
rrna$V5 <- gsub("%","",rrna$V5)
rrna$V5 <- as.numeric(rrna$V5)
str(rrna)
## 'data.frame': 319 obs. of 2 variables:
## $ V1: chr "3166-POD1_S266_R1_001" "3166-T0_S265_R1_001" "3167-POD1_S268_R1_001" "3167-T0_S267_R1_001" ...
## $ V5: num 0.57 1.11 0.61 0.93 0.96 0.79 0.7 5.2 1.14 2.83 ...
rrna2 <- rrna[,2]
names(rrna2) <- rrna[,1]
par(mar=c(5,8,3,1))
barplot(rrna2,horiz=TRUE,las=1,cex.names=0.5,main="rRNA carryover")
rrna2 <- rrna2[order(-rrna2)]
barplot(head(rrna2,20),horiz=TRUE,las=1,cex.names=0.6,main="rRNA carryover")
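The 10% rule of thumb mentioned above can also be applied programmatically rather than read off the barplot. The following is a minimal sketch on made-up values: `rrna_pct` and `rrna_threshold` are hypothetical names, and in this report the named vector `rrna2` would take the place of `rrna_pct`.

```r
# Toy rRNA percentages keyed by sample name (made-up values, not study data)
rrna_pct <- c("S1-T0" = 0.6, "S2-EOS" = 12.4, "S3-POD1" = 10.8, "S4-T0" = 2.1)

# Threshold taken from the text: rRNA carryover should stay below 10%
rrna_threshold <- 10

# Keep only the samples above the threshold, sorted from highest to lowest
flagged <- sort(rrna_pct[rrna_pct > rrna_threshold], decreasing = TRUE)
print(flagged)
```

Keeping the threshold in a named variable makes it easy to report which cut-off was used if it is revised later.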
tmp <- read.table("3col.tsv.gz",header=FALSE)
x <- as.matrix(acast(tmp, V2~V1, value.var="V3", fun.aggregate = sum))
x <- as.data.frame(x)
accession <- sapply((strsplit(rownames(x),"\\|")),"[[",2)
symbol<-sapply((strsplit(rownames(x),"\\|")),"[[",6)
x$geneid <- paste(accession,symbol)
xx <- aggregate(. ~ geneid,x,sum)
rownames(xx) <- xx$geneid
# rename columns in place; assigning to a variable called `colnames` would leave xx unchanged
colnames(xx) <- gsub("T0R","T0",colnames(xx))
xx$geneid = NULL
xx <- round(xx)
xx[1:10,1:6]
## 3166-POD1 3166-T0 3167-POD1 3167-T0 3171-POD1
## ENSG00000000003.15 TSPAN6 3 1 5 5 23
## ENSG00000000005.6 TNMD 0 0 0 0 0
## ENSG00000000419.14 DPM1 685 577 521 735 811
## ENSG00000000457.14 SCYL3 622 611 550 777 789
## ENSG00000000460.17 C1orf112 181 171 232 263 215
## ENSG00000000938.13 FGR 33797 44344 31524 38959 26402
## ENSG00000000971.16 CFH 106 40 98 183 195
## ENSG00000001036.14 FUCA2 1229 769 1150 868 978
## ENSG00000001084.13 GCLC 944 1085 577 961 908
## ENSG00000001167.15 NFYA 1243 1277 1295 1605 1166
## 3171-T0
## ENSG00000000003.15 TSPAN6 4
## ENSG00000000005.6 TNMD 1
## ENSG00000000419.14 DPM1 494
## ENSG00000000457.14 SCYL3 575
## ENSG00000000460.17 C1orf112 196
## ENSG00000000938.13 FGR 33751
## ENSG00000000971.16 CFH 130
## ENSG00000001036.14 FUCA2 805
## ENSG00000001084.13 GCLC 798
## ENSG00000001167.15 NFYA 1251
Let’s look at the number of reads per sample.
Most samples were in the range of 25-30 million assigned reads. Just 2 samples had fewer than 20 million reads: PG1452-EOS and PG1423-EOS. The maximum read count was about 40 million for PG7072-EOS.
xxcs <- colSums(xx)
par(mar=c(5,8,3,1))
barplot(xxcs,horiz=TRUE,las=1,main="no. reads per sample")
barplot(head(xxcs[order(xxcs)],20),horiz=TRUE,las=1,main="lowest no. reads per sample")
barplot(head(xxcs[order(-xxcs)],20),horiz=TRUE,las=1,main="highest no. reads per sample")
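To complement the barplots, low-depth samples can be listed directly. This is a small sketch on invented totals: `depth`, `min_reads` and `summary_tbl` are hypothetical names, the 20 million cut-off simply mirrors the figure quoted in the text, and in this report `xxcs` would stand in for `depth`.

```r
# Toy per-sample read totals (hypothetical values), standing in for colSums(xx)
depth <- c("S1-T0" = 27.5e6, "S2-EOS" = 18.2e6, "S3-POD1" = 31.0e6, "S4-EOS" = 19.9e6)

# Cut-off quoted in the text: fewer than 20 million assigned reads
min_reads <- 20e6

# Names of the samples falling below the cut-off
low_depth <- names(depth)[depth < min_reads]

# Compact table of depth in millions with a low-depth flag
summary_tbl <- data.frame(
  sample   = names(depth),
  millions = round(depth / 1e6, 1),
  low      = depth < min_reads,
  row.names = NULL
)
print(summary_tbl[order(summary_tbl$millions), ])
```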
Some outliers are apparent.
PG2090-EOS sits at the far left of the chart, which is clearly the effect of rRNA carryover. Other samples toward the left of the chart include PG815-EOS, PG1452-EOS and PG702-POD1, which all have elevated rRNA.
heatmap.2( cor(xx),trace="none",scale="none")
mds <- cmdscale(dist(t(xx)))
par(mar=c(5,5,3,1))
minx <- min(mds[,1])
maxx <- max(mds[,1])
miny <- min(mds[,2])
maxy <- max(mds[,2])
plot(mds, xlab="Coordinate 1", ylab="Coordinate 2",
xlim=c(minx*1.1,maxx*1.1), ylim = c(miny*1.1,maxy*1.1) ,
type = "p", col="gray", pch=19, cex.axis=1.3,cex.lab=1.3, bty='n')
text(mds, labels=rownames(mds), cex=0.8)
col <- rownames(mds)
col <- sapply(strsplit(col,"-"),"[[",2)
col <- gsub("T0","lightblue",col)
col <- gsub("POD1","orange",col)
col <- gsub("EOS","pink",col)
plot(mds, xlab="Coordinate 1", ylab="Coordinate 2",
xlim=c(minx*1.1,maxx*1.1), ylim = c(miny*1.1,maxy*1.1) , cex=1.5 ,
type = "p", col=col, pch=19, cex.axis=1.3,cex.lab=1.3, bty='n')
#text(mds, labels=rownames(mds), cex=0.8)
mtext("blue=T0, orange=POD1, pink=EOS")
Exclude PG2090-EOS and repeat the analysis.
xx <- xx[,grep("PG2090-EOS",colnames(xx),invert=TRUE)]
mds <- cmdscale(dist(t(xx)))
par(mar=c(5,5,3,1))
minx <- min(mds[,1])
maxx <- max(mds[,1])
miny <- min(mds[,2])
maxy <- max(mds[,2])
plot(mds, xlab="Coordinate 1", ylab="Coordinate 2",
xlim=c(minx*1.1,maxx*1.1), ylim = c(miny*1.1,maxy*1.1) ,
type = "p", col="gray", pch=19, cex.axis=1.3,cex.lab=1.3, bty='n')
text(mds, labels=rownames(mds), cex=0.8)
col <- rownames(mds)
col <- sapply(strsplit(col,"-"),"[[",2)
col <- gsub("T0","lightblue",col)
col <- gsub("POD1","orange",col)
col <- gsub("EOS","pink",col)
plot(mds, xlab="Coordinate 1", ylab="Coordinate 2",
xlim=c(minx*1.1,maxx*1.1), ylim = c(miny*1.1,maxy*1.1) , cex=1.5 ,
type = "p", col=col, pch=19, cex.axis=1.3,cex.lab=1.3, bty='n')
#text(mds, labels=rownames(mds), cex=0.8)
mtext("blue=T0, orange=POD1, pink=EOS")
In the MDS plot with PG2090-EOS removed, there appears to be some separation of T0, POD1 and EOS samples. POD1 (orange) are more towards the upper side of the chart and T0 (blue) are toward the bottom right. EOS (pink) are quite spread out.
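As an illustration of the MDS and colouring steps, here is a self-contained sketch on simulated counts. It is not the exact procedure used above: it assumes a log2 counts-per-million transform before computing distances (the analysis above uses the rounded counts directly), and it maps time points to colours with a lookup vector instead of chained `gsub` calls. All object names and sample names are hypothetical.

```r
set.seed(1)

# Toy count matrix: 200 genes x 6 samples, hypothetical sample names
samples <- c("A-T0", "B-T0", "A-POD1", "B-POD1", "A-EOS", "B-EOS")
counts <- matrix(rnbinom(200 * 6, mu = 100, size = 5), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), samples))

# Assumption: log2 counts per million, so library size does not dominate distances
logcpm <- log2(t(t(counts) / colSums(counts)) * 1e6 + 1)

# Classical MDS on Euclidean distances between samples
mds <- cmdscale(dist(t(logcpm)))

# Map the time point (text after the last "-") to a colour via a lookup vector
palette_tp <- c(T0 = "lightblue", POD1 = "orange", EOS = "pink")
timepoint <- sub(".*-", "", rownames(mds))
cols <- unname(palette_tp[timepoint])

plot(mds, xlab = "Coordinate 1", ylab = "Coordinate 2",
     pch = 19, cex = 1.5, col = cols, bty = "n")
legend("topright", legend = names(palette_tp), col = palette_tp, pch = 19, bty = "n")
```

A lookup vector keeps the colour key in one place, so the legend cannot drift out of sync with the points.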
PG2090-EOS suffered rRNA carryover and needs to be re-prepared. The other samples with slightly higher rRNA are not a problem, as the rRNA can be corrected for statistically. It is not yet clear what to do about samples with low numbers of unique reads.
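One possible way to correct for rRNA statistically is to include the rRNA percentage as a nuisance covariate in the model design. The report does not state which approach will be used, so the sketch below is only an assumption; `ss_demo`, `rrna_pct` and `rrna_cs` are hypothetical names and the values are invented.

```r
# Hypothetical sample sheet: time point plus rRNA percentage per sample
ss_demo <- data.frame(
  timepoint = factor(c("T0", "T0", "POD1", "POD1"), levels = c("T0", "POD1")),
  rrna_pct  = c(0.6, 5.2, 1.1, 12.4),
  row.names = c("A-T0", "B-T0", "A-POD1", "B-POD1")
)

# Centre and scale the covariate so its coefficient is on a comparable scale
ss_demo$rrna_cs <- as.numeric(scale(ss_demo$rrna_pct))

# Design matrix: rRNA as a nuisance term, time point as the effect of interest
design <- model.matrix(~ rrna_cs + timepoint, data = ss_demo)
print(design)
```

The same formula, `~ rrna_cs + timepoint`, could in principle be supplied as the `design` argument of `DESeqDataSetFromMatrix()` in DESeq2, with the covariate stored in the column data.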
xx <- xx[,order(colnames(xx))]
ss <- read.csv("PADDIgenomicsData.csv")
ss <- ss[order(ss$PG_number),]
colnames(ss)
## [1] "PG_number" "sexD"
## [3] "ageD" "weightD"
## [5] "heightD" "asaD"
## [7] "ethnicityD" "ethnicity_otherD"
## [9] "current_smokerD" "diabetes_typeD"
## [11] "daily_insulinD" "oral_hypoglycemicsD"
## [13] "non_insulin_injectablesD" "diabetes_yrs_since_diagnosisD"
## [15] "DM_years" "creatinine_preopD"
## [17] "crp_preopD" "crp_preop_typeD"
## [19] "crp_preop_naD" "hba1c_doneD"
## [21] "surgery_typeD" "surgery_procedureD"
## [23] "surgery_dominantD" "wound_typeOP"
## [25] "non_study_dexameth_steriodPOSTOP" "nonstudy_dexameth_steriodD3"
## [27] "HbA1c" "bmi"
## [29] "whodas_total_preop" "revised_whodas_preop"
## [31] "neut_lymph_ratio_d0" "neut_lymph_ratio_d1"
## [33] "neut_lymph_ratio_change_d1" "neut_lymph_ratio_d2"
## [35] "neut_lymph_ratio_change_d2" "neut_lymph_ratio_d1_2"
## [37] "neut_lymph_ratio_d2_2" "ab_noninfection"
## [39] "risk" "risk_cat"
## [41] "bmi_cat" "asa_cat"
## [43] "wound_type_cat" "oxygen_quin"
## [45] "duration_sx" "duration_sx_quin"
## [47] "anyDex" "anyDex_count"
## [49] "anyDexMiss" "anyDex2"
## [51] "treatment_group" "deltacrp"
## [53] "crp_group"
str(ss)
## 'data.frame': 117 obs. of 53 variables:
## $ PG_number : chr "3166" "3167" "3171" "3172" ...
## $ sexD : chr "Male" "Male" "Male" "Male" ...
## $ ageD : int 62 67 61 78 73 77 84 54 70 62 ...
## $ weightD : num 64.5 78.8 71.1 43 83.6 ...
## $ heightD : num 163 169 165 156 171 167 133 155 170 175 ...
## $ asaD : int 2 2 2 2 2 3 3 2 2 2 ...
## $ ethnicityD : chr "Asian" "Asian" "Asian" "Asian" ...
## $ ethnicity_otherD : chr "" "" "" "" ...
## $ current_smokerD : chr "No" "No" "No" "No" ...
## $ diabetes_typeD : chr "" "" "" "" ...
## $ daily_insulinD : chr "" "" "" "" ...
## $ oral_hypoglycemicsD : chr "" "" "" "" ...
## $ non_insulin_injectablesD : chr "" "" "" "" ...
## $ diabetes_yrs_since_diagnosisD : int NA NA NA NA NA 1 NA NA NA NA ...
## $ DM_years : int NA NA NA NA NA 1 NA NA NA NA ...
## $ creatinine_preopD : int 68 82 82 96 105 90 54 47 109 98 ...
## $ crp_preopD : chr "2.1" "0.6" "2.7" "1.2" ...
## $ crp_preop_typeD : chr "CRP" "CRP" "CRP" "CRP" ...
## $ crp_preop_naD : int 0 0 0 0 0 0 0 0 0 0 ...
## $ hba1c_doneD : chr "Yes" "Yes" "Yes" "Yes" ...
## $ surgery_typeD : chr "Laparoscopic assisted low anterior resection of rectum" "Laparoscopic sigmoidectomy" "Laparoscopic assisted anterior resection of rectum" "Robotic assisted laparoscopic radical prostatectomy, pelvic lymph node dissection" ...
## $ surgery_procedureD : chr "None of the above" "None of the above" "None of the above" "None of the above" ...
## $ surgery_dominantD : chr "Gastrointestinal" "Gastrointestinal" "Gastrointestinal" "Urology-renal" ...
## $ wound_typeOP : chr "Clean / contaminated" "Clean / contaminated" "Clean / contaminated" "Clean / contaminated" ...
## $ non_study_dexameth_steriodPOSTOP: chr "No" "No" "No" "No" ...
## $ nonstudy_dexameth_steriodD3 : chr "No" "No" "No" "No" ...
## $ HbA1c : num 5.7 6.2 6.2 6.3 6.3 ...
## $ bmi : num 24.3 27.6 26.1 17.7 28.6 ...
## $ whodas_total_preop : int 16 12 12 12 12 12 24 14 12 12 ...
## $ revised_whodas_preop : int 16 12 12 12 12 12 24 14 12 12 ...
## $ neut_lymph_ratio_d0 : num 4.3 2.94 2.29 2.93 2.62 ...
## $ neut_lymph_ratio_d1 : num 13 6.5 7.22 23.2 8.57 ...
## $ neut_lymph_ratio_change_d1 : num 8.7 3.56 4.93 20.27 5.95 ...
## $ neut_lymph_ratio_d2 : num 5.92 3.68 3.77 22 NA ...
## $ neut_lymph_ratio_change_d2 : num 1.623 0.741 1.475 19.071 NA ...
## $ neut_lymph_ratio_d1_2 : num 13 6.5 7.22 23.2 8.57 ...
## $ neut_lymph_ratio_d2_2 : num 5.92 3.68 3.77 22 NA ...
## $ ab_noninfection : int 1 1 0 1 1 1 1 1 1 1 ...
## $ risk : int 2 2 2 2 2 5 4 1 2 1 ...
## $ risk_cat : chr "Moderate" "Moderate" "Moderate" "Moderate" ...
## $ bmi_cat : chr "Normal [18.5 to <25]" "Overweight [25 to <30]" "Overweight [25 to <30]" "Underweight [BMI<18.5]" ...
## $ asa_cat : chr "1-2" "1-2" "1-2" "1-2" ...
## $ wound_type_cat : chr "Contaminated" "Contaminated" "Contaminated" "Contaminated" ...
## $ oxygen_quin : chr "0.21-0.4" "0.21-0.4" "0.21-0.4" "0.21-0.4" ...
## $ duration_sx : num 2.5 2.67 2.42 3.17 2.5 ...
## $ duration_sx_quin : chr "2.18-2.82" "2.18-2.82" "2.18-2.82" "2.83-3.75" ...
## $ anyDex : chr "No" "No" "No" "No" ...
## $ anyDex_count : int 0 0 0 0 0 0 0 0 0 0 ...
## $ anyDexMiss : int 0 0 0 0 0 0 0 0 0 0 ...
## $ anyDex2 : int 0 0 0 0 0 0 0 0 0 0 ...
## $ treatment_group : int 1 1 2 2 1 1 2 1 2 1 ...
## $ deltacrp : num 39.3 38.3 49 189.9 7.3 ...
## $ crp_group : int 1 1 1 4 1 1 4 1 4 1 ...
summary(ss)
## PG_number sexD ageD weightD
## Length:117 Length:117 Min. :25.00 Min. : 41.00
## Class :character Class :character 1st Qu.:54.00 1st Qu.: 68.50
## Mode :character Mode :character Median :62.00 Median : 82.00
## Mean :61.03 Mean : 84.55
## 3rd Qu.:69.00 3rd Qu.: 95.40
## Max. :86.00 Max. :185.00
##
## heightD asaD ethnicityD ethnicity_otherD
## Min. :133.0 Min. :1.000 Length:117 Length:117
## 1st Qu.:163.0 1st Qu.:2.000 Class :character Class :character
## Median :171.0 Median :2.000 Mode :character Mode :character
## Mean :170.2 Mean :2.308
## 3rd Qu.:178.0 3rd Qu.:3.000
## Max. :193.0 Max. :4.000
##
## current_smokerD diabetes_typeD daily_insulinD oral_hypoglycemicsD
## Length:117 Length:117 Length:117 Length:117
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
##
## non_insulin_injectablesD diabetes_yrs_since_diagnosisD DM_years
## Length:117 Min. : 1.000 Min. : 1.000
## Class :character 1st Qu.: 1.500 1st Qu.: 1.500
## Mode :character Median : 7.000 Median : 7.000
## Mean : 7.467 Mean : 7.467
## 3rd Qu.:11.000 3rd Qu.:11.000
## Max. :18.000 Max. :18.000
## NA's :102 NA's :102
## creatinine_preopD crp_preopD crp_preop_typeD crp_preop_naD
## Min. : 19.0 Length:117 Length:117 Min. :0
## 1st Qu.: 66.0 Class :character Class :character 1st Qu.:0
## Median : 76.0 Mode :character Mode :character Median :0
## Mean : 80.3 Mean :0
## 3rd Qu.: 91.0 3rd Qu.:0
## Max. :177.0 Max. :0
## NA's :10
## hba1c_doneD surgery_typeD surgery_procedureD surgery_dominantD
## Length:117 Length:117 Length:117 Length:117
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
##
## wound_typeOP non_study_dexameth_steriodPOSTOP
## Length:117 Length:117
## Class :character Class :character
## Mode :character Mode :character
##
##
##
##
## nonstudy_dexameth_steriodD3 HbA1c bmi
## Length:117 Min. : 4.500 Min. :16.59
## Class :character 1st Qu.: 5.200 1st Qu.:24.93
## Mode :character Median : 5.600 Median :28.07
## Mean : 5.714 Mean :29.00
## 3rd Qu.: 5.900 3rd Qu.:31.73
## Max. :10.000 Max. :72.27
##
## whodas_total_preop revised_whodas_preop neut_lymph_ratio_d0
## Min. :12.00 Min. :12.00 Min. : 0.5312
## 1st Qu.:12.00 1st Qu.:12.00 1st Qu.: 1.8254
## Median :14.00 Median :14.00 Median : 2.5737
## Mean :16.74 Mean :16.74 Mean : 2.8745
## 3rd Qu.:17.00 3rd Qu.:17.00 3rd Qu.: 3.3338
## Max. :50.00 Max. :50.00 Max. :11.0000
## NA's :9
## neut_lymph_ratio_d1 neut_lymph_ratio_change_d1 neut_lymph_ratio_d2
## Min. : 1.375 Min. :-1.255 Min. : 0.1235
## 1st Qu.: 5.132 1st Qu.: 2.610 1st Qu.: 3.7692
## Median : 7.353 Median : 4.450 Median : 6.7273
## Mean : 8.882 Mean : 6.088 Mean : 8.1589
## 3rd Qu.:11.627 3rd Qu.: 8.730 3rd Qu.:10.8889
## Max. :44.000 Max. :39.299 Max. :25.6042
## NA's :13 NA's :21 NA's :28
## neut_lymph_ratio_change_d2 neut_lymph_ratio_d1_2 neut_lymph_ratio_d2_2
## Min. :-6.182 Min. : 1.375 Min. : 0.1235
## 1st Qu.: 1.591 1st Qu.: 5.132 1st Qu.: 3.7692
## Median : 4.356 Median : 7.353 Median : 6.7273
## Mean : 5.356 Mean : 8.882 Mean : 8.1589
## 3rd Qu.: 7.403 3rd Qu.:11.627 3rd Qu.:10.8889
## Max. :22.776 Max. :44.000 Max. :25.6042
## NA's :35 NA's :13 NA's :28
## ab_noninfection risk risk_cat bmi_cat
## Min. :0.0000 Min. :0.000 Length:117 Length:117
## 1st Qu.:0.0000 1st Qu.:1.000 Class :character Class :character
## Median :0.0000 Median :1.000 Mode :character Mode :character
## Mean :0.4495 Mean :1.598
## 3rd Qu.:1.0000 3rd Qu.:2.000
## Max. :1.0000 Max. :6.000
## NA's :8
## asa_cat wound_type_cat oxygen_quin duration_sx
## Length:117 Length:117 Length:117 Min. : 0.6833
## Class :character Class :character Class :character 1st Qu.: 2.5000
## Mode :character Mode :character Mode :character Median : 3.3333
## Mean : 3.9007
## 3rd Qu.: 4.7667
## Max. :10.6667
##
## duration_sx_quin anyDex anyDex_count anyDexMiss
## Length:117 Length:117 Min. :0.0000 Min. :0.000000
## Class :character Class :character 1st Qu.:0.0000 1st Qu.:0.000000
## Mode :character Mode :character Median :0.0000 Median :0.000000
## Mean :0.1282 Mean :0.008547
## 3rd Qu.:0.0000 3rd Qu.:0.000000
## Max. :2.0000 Max. :1.000000
##
## anyDex2 treatment_group deltacrp crp_group
## Min. :0.0000 Min. :1.000 Min. :-16.7 Min. :1.000
## 1st Qu.:0.0000 1st Qu.:1.000 1st Qu.: 32.9 1st Qu.:1.000
## Median :0.0000 Median :2.000 Median : 49.5 Median :1.000
## Mean :0.1111 Mean :1.556 Mean :130.9 Mean :2.487
## 3rd Qu.:0.0000 3rd Qu.:2.000 3rd Qu.:221.1 3rd Qu.:4.000
## Max. :1.0000 Max. :2.000 Max. :359.0 Max. :4.000
##
ss1 <- ss
# ss has no timepoint column yet, so key rows by PG_number alone; time point suffixes are added further below
rownames(ss) <- ss$PG_number
dim(ss)
## [1] 117 53
ss$ageCS <- scale(ss$ageD)
ss$sexD <- as.numeric(factor(ss$sexD))
ss$ethnicityCAT <- ss$ethnicityD
ss$ethnicityD <- as.numeric(factor(ss$ethnicityD))
ss$current_smokerD <- as.numeric(factor(ss$current_smokerD))
ss$diabetes_typeD <- as.numeric(factor(ss$diabetes_typeD))
ss$daily_insulinD <- as.numeric(factor(ss$daily_insulinD))
ss$oral_hypoglycemicsD <- as.numeric(factor(ss$oral_hypoglycemicsD))
ss$crp_preopD <- as.numeric(gsub("<5","2.5",gsub("<1","0.5",gsub("<1.0","0.5",ss$crp_preopD))))
ss$surgery_dominantD <- as.numeric(factor(ss$surgery_dominantD))
ss$wound_typeOP <- as.numeric(factor(ss$wound_typeOP))
ss$risk_cat <- as.numeric(factor(ss$risk_cat,levels=c("Low","Moderate","High")))
ss$wound_type_cat <- as.numeric(factor(ss$wound_type_cat))
ss$anyDex <- as.numeric(factor(ss$anyDex))
ss$bmi_cat <- as.numeric(factor(ss$bmi_cat,
levels=c("Underweight [BMI<18.5]","Normal [18.5 to <25]",
"Overweight [25 to <30]","Obese [30 to <40]","Super obese [40+]")))
ss <- ss[,c("PG_number","sexD","ageD","ageCS","weightD","asaD","heightD","ethnicityCAT","ethnicityD",
"current_smokerD","diabetes_typeD","daily_insulinD","creatinine_preopD",
"surgery_dominantD","wound_typeOP","HbA1c","bmi","revised_whodas_preop",
"neut_lymph_ratio_d0","neut_lymph_ratio_d1","neut_lymph_ratio_d2","ab_noninfection",
"risk","risk_cat","bmi_cat","wound_type_cat","duration_sx","anyDex","treatment_group",
"deltacrp","crp_group")]
ss <- ss[order(rownames(ss)),]
ss_t0 <- ss
ss_eos <- ss
ss_pod1 <- ss
ss_t0$timepoint <- "T0"
ss_eos$timepoint <- "EOS"
ss_pod1$timepoint <- "POD1"
rownames(ss_t0) <- paste(ss_t0$PG_number,"T0",sep="-")
rownames(ss_eos) <- paste(ss_eos$PG_number,"EOS",sep="-")
rownames(ss_pod1) <- paste(ss_pod1$PG_number,"POD1",sep="-")
ss <- rbind(ss_t0, ss_eos, ss_pod1)
rownames(ss) <- paste(ss$PG_number,ss$timepoint,sep="-")
xt0 <- xx[,grep("T0",colnames(xx))]
xpod1 <- xx[,grep("POD1",colnames(xx))]
xeos <- xx[,grep("EOS",colnames(xx))]
xt0f <- xt0[rowMeans(xt0)>=10,]
xpod1f <- xpod1[rowMeans(xpod1)>=10,]
xeosf <- xeos[rowMeans(xeos)>=10,]
dim(xt0f)
## [1] 21935 111
dim(xpod1f)
## [1] 21313 109
dim(xeosf)
## [1] 22067 98
ss_t0 <- ss_t0[which(rownames(ss_t0) %in% colnames(xt0)),]
ss_pod1 <- ss_pod1[which(rownames(ss_pod1) %in% colnames(xpod1)),]
ss_eos <- ss_eos[which(rownames(ss_eos) %in% colnames(xeos)),]
colnames(xt0) %in% rownames(ss_t0)
## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [16] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [31] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [46] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [61] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [76] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [91] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [106] TRUE TRUE TRUE TRUE TRUE TRUE
colnames(xpod1) %in% rownames(ss_pod1)
## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [16] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [31] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [46] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [61] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [76] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [91] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [106] TRUE TRUE TRUE TRUE
colnames(xeos) %in% rownames(ss_eos)
## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [16] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [31] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [46] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [61] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [76] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [91] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
rownames(ss_t0) %in% colnames(xt0)
## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [16] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [31] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [46] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [61] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [76] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [91] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [106] TRUE TRUE TRUE TRUE TRUE TRUE
rownames(ss_pod1) %in% colnames(xpod1)
## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [16] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [31] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [46] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [61] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [76] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [91] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [106] TRUE TRUE TRUE TRUE
rownames(ss_eos) %in% colnames(xeos)
## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [16] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [31] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [46] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [61] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [76] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [91] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
xxf <- xx[rowMeans(xx)>=10,]
xxf <- xxf[,order(colnames(xxf))]
This is a clinical study and each patient has detailed clinical metadata. Not all of these variables will be important to the gene expression profiles. To determine which ones are, we correlate the top principal components (the first eight in the code below) with each clinical parameter.
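The idea behind the PC-trait correlation can be sketched in base R on simulated data. Everything here is hypothetical (`expr`, `traits`, the planted age signal). The p-value uses the Student t approximation for a Pearson correlation, which is our understanding of what `corPvalueStudent` from WGCNA computes. Note that the approximation assumes the same number of samples behind every correlation, which may be optimistic for traits with missing values.

```r
set.seed(42)
n_samples <- 30
n_genes <- 500

# Toy expression matrix (genes x samples) and toy trait table, both simulated
expr <- matrix(rnorm(n_genes * n_samples), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)),
                               paste0("S", seq_len(n_samples))))
traits <- data.frame(age = rnorm(n_samples, 60, 10),
                     bmi = rnorm(n_samples, 28, 5),
                     row.names = colnames(expr))

# Plant an age-related signal in the first 100 genes so that a PC picks it up
expr[1:100, ] <- expr[1:100, ] + rep(3 * scale(traits$age)[, 1], each = 100)

# PCA with samples as rows; pca$x holds the per-sample PC scores
pca <- prcomp(t(expr), center = TRUE, scale. = TRUE)
scores <- pca$x[, 1:5]

# Pearson correlation of each PC with each trait
pc_trait_cor <- cor(scores, traits, use = "pairwise.complete.obs")

# Student t-based two-sided p-value for each correlation
t_stat <- pc_trait_cor * sqrt((n_samples - 2) / (1 - pc_trait_cor^2))
pc_trait_p <- 2 * pt(abs(t_stat), df = n_samples - 2, lower.tail = FALSE)

print(round(pc_trait_cor, 2))
print(signif(pc_trait_p, 2))
```

In this toy setup the planted age signal should load mainly on PC1, while the simulated bmi column, being independent noise, should show only weak correlations.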
TODO: Infection
mx <- xt0f
ss2 <- ss_t0
ss2$ethnicityCAT = ss2$ageCS = NULL
ss2$timepoint = ss2$PG_number = NULL
pca <- prcomp(t(mx),center = TRUE, scale = TRUE,retx=TRUE)
loadings = pca$x
par(cex=0.75, mar = c(6, 8.5, 3, 3))
plot(pca,type="lines",col="blue")
nGenes <- nrow(mx)
nSamples <- ncol(mx)
datTraits <- ss2
moduleTraitCor <- cor(loadings[,1:8], datTraits, use = "p")
moduleTraitPvalue <- corPvalueStudent(moduleTraitCor, nSamples)
textMatrix <- paste(signif(moduleTraitCor, 2), "\n(",
signif(moduleTraitPvalue, 1), ")", sep = "")
dim(textMatrix) = dim(moduleTraitCor)
labeledHeatmap(Matrix = t(moduleTraitCor),
xLabels = colnames(loadings)[1:ncol(t(moduleTraitCor))],
yLabels = names(datTraits), colorLabels = FALSE, colors = blueWhiteRed(6),
textMatrix = t(textMatrix), setStdMargins = FALSE, cex.text = 0.5,
cex.lab.y = 0.6, zlim = c(-0.45,0.45),
main = paste("PCA-trait relationships @T0: Top principal components"))
mx <- xeosf
ss2 <- ss_eos
ss2$ethnicityCAT = ss2$ageCS = NULL
ss2$timepoint = ss2$PG_number =NULL
pca <- prcomp(t(mx),center = TRUE, scale = TRUE,retx=TRUE)
loadings = pca$x
plot(pca,type="lines",col="blue")
nGenes <- nrow(mx)
nSamples <- ncol(mx)
datTraits <- ss2
moduleTraitCor <- cor(loadings[,1:8], datTraits, use = "p")
moduleTraitPvalue <- corPvalueStudent(moduleTraitCor, nSamples)
textMatrix <- paste(signif(moduleTraitCor, 2), "\n(",
signif(moduleTraitPvalue, 1), ")", sep = "")
dim(textMatrix) = dim(moduleTraitCor)
labeledHeatmap(Matrix = t(moduleTraitCor),
xLabels = colnames(loadings)[1:ncol(t(moduleTraitCor))],
yLabels = names(datTraits), colorLabels = FALSE, colors = blueWhiteRed(6),
textMatrix = t(textMatrix), setStdMargins = FALSE, cex.text = 0.5,
cex.lab.y = 0.6, zlim = c(-0.45,0.45),
main = paste("PCA-trait relationships @EOS: Top principal components"))
## Warning in numbers2colors(data, signed, colors = colors, lim = zlim, naColor =
## naColor): Some values of 'x' are above given maximum and will be truncated to
## the maximum.
mx <- xpod1f
ss2 <- ss_pod1
ss2$ethnicityCAT = ss2$ageCS = NULL
ss2$timepoint = ss2$PG_number = NULL
pca <- prcomp(t(mx),center = TRUE, scale = TRUE,retx=TRUE)
loadings = pca$x
plot(pca,type="lines",col="blue")
nGenes <- nrow(mx)
nSamples <- ncol(mx)
datTraits <- ss2
moduleTraitCor <- cor(loadings[,1:8], datTraits, use = "p")
moduleTraitPvalue <- corPvalueStudent(moduleTraitCor, nSamples)
textMatrix <- paste(signif(moduleTraitCor, 2), "\n(",
signif(moduleTraitPvalue, 1), ")", sep = "")
dim(textMatrix) = dim(moduleTraitCor)
labeledHeatmap(Matrix = t(moduleTraitCor),
xLabels = colnames(loadings)[1:ncol(t(moduleTraitCor))],
yLabels = names(datTraits), colorLabels = FALSE, colors = blueWhiteRed(6),
textMatrix = t(textMatrix), setStdMargins = FALSE, cex.text = 0.5,
cex.lab.y = 0.6, zlim = c(-0.45,0.45),
main = paste("PCA-trait relationships @POD1: Top principal components"))
## Warning in numbers2colors(data, signed, colors = colors, lim = zlim, naColor =
## naColor): Some values of 'x' are above given maximum and will be truncated to
## the maximum.
Now export PDF.
mx <- xt0f
ss2 <- ss_t0
ss2$ethnicityCAT = ss2$ageCS = NULL
ss2$timepoint = ss2$PG_number = NULL
pca <- prcomp(t(mx),center = TRUE, scale = TRUE,retx=TRUE)
loadings = pca$x
pdf("pca_cor.pdf",height=7,width=7)
par(cex=0.75, mar = c(6, 8.5, 3, 3))
plot(pca,type="lines",col="blue")
nGenes <- nrow(mx)
nSamples <- ncol(mx)
datTraits <- ss2
moduleTraitCor <- cor(loadings[,1:8], datTraits, use = "p")
moduleTraitPvalue <- corPvalueStudent(moduleTraitCor, nSamples)
textMatrix <- paste(signif(moduleTraitCor, 2), "\n(",
signif(moduleTraitPvalue, 1), ")", sep = "")
dim(textMatrix) = dim(moduleTraitCor)
labeledHeatmap(Matrix = t(moduleTraitCor),
xLabels = colnames(loadings)[1:ncol(t(moduleTraitCor))],
yLabels = names(datTraits), colorLabels = FALSE, colors = blueWhiteRed(6),
textMatrix = t(textMatrix), setStdMargins = FALSE, cex.text = 0.5,
cex.lab.y = 0.6, zlim = c(-0.45,0.45),
main = paste("PCA-trait relationships @T0: Top principal components"))
mx <- xeosf
ss2 <- ss_eos
ss2$ethnicityCAT = ss2$ageCS = NULL
ss2$timepoint = ss2$PG_number = NULL
pca <- prcomp(t(mx),center = TRUE, scale = TRUE,retx=TRUE)
loadings = pca$x
plot(pca,type="lines",col="blue")
nGenes <- nrow(mx)
nSamples <- ncol(mx)
datTraits <- ss2
moduleTraitCor <- cor(loadings[,1:8], datTraits, use = "p")
moduleTraitPvalue <- corPvalueStudent(moduleTraitCor, nSamples)
textMatrix <- paste(signif(moduleTraitCor, 2), "\n(",
signif(moduleTraitPvalue, 1), ")", sep = "")
dim(textMatrix) = dim(moduleTraitCor)
labeledHeatmap(Matrix = t(moduleTraitCor),
xLabels = colnames(loadings)[1:ncol(t(moduleTraitCor))],
yLabels = names(datTraits), colorLabels = FALSE, colors = blueWhiteRed(6),
textMatrix = t(textMatrix), setStdMargins = FALSE, cex.text = 0.5,
cex.lab.y = 0.6, zlim = c(-0.45,0.45),
main = paste("PCA-trait relationships @EOS: Top principal components"))
## Warning in numbers2colors(data, signed, colors = colors, lim = zlim, naColor =
## naColor): Some values of 'x' are above given maximum and will be truncated to
## the maximum.
mx <- xpod1f
ss2 <- ss_pod1
ss2$ethnicityCAT = ss2$ageCS = NULL
ss2$timepoint = ss2$PG_number = NULL
pca <- prcomp(t(mx),center = TRUE, scale = TRUE,retx=TRUE)
loadings = pca$x
plot(pca,type="lines",col="blue")
nGenes <- nrow(mx)
nSamples <- ncol(mx)
datTraits <- ss2
moduleTraitCor <- cor(loadings[,1:8], datTraits, use = "p")
moduleTraitPvalue <- corPvalueStudent(moduleTraitCor, nSamples)
textMatrix <- paste(signif(moduleTraitCor, 2), "\n(",
signif(moduleTraitPvalue, 1), ")", sep = "")
dim(textMatrix) = dim(moduleTraitCor)
labeledHeatmap(Matrix = t(moduleTraitCor),
xLabels = colnames(loadings)[1:ncol(t(moduleTraitCor))],
yLabels = names(datTraits), colorLabels = FALSE, colors = blueWhiteRed(6),
textMatrix = t(textMatrix), setStdMargins = FALSE, cex.text = 0.5,
cex.lab.y = 0.6, zlim = c(-0.45,0.45),
main = paste("PCA-trait relationships @POD1: Top principal components"))
## Warning in numbers2colors(data, signed, colors = colors, lim = zlim, naColor =
## naColor): Some values of 'x' are above given maximum and will be truncated to
## the maximum.
dev.off()
## X11cairo
## 2
PCA plots
par(mfrow=c(3,3))
#T0
mx <- xt0f
ss2 <- ss_t0
pca <- prcomp(t(mx),center = TRUE, scale = TRUE,retx=TRUE)
labs=gsub("-T0","",rownames(pca$x))
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
plot(pca$x[,1:2],cex=2,col="gray",pch=19,bty="none", xlim=c(XMIN,XMAX) )
mtext("T0")
text(pca$x[,1:2],labels=labs)
plot(pca$x[,c(1,3)],cex=2,col="gray",pch=19,bty="none", xlim=c(XMIN,XMAX) )
mtext("T0")
text(pca$x[,c(1,3)],labels=labs)
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col="gray",pch=19,bty="none", xlim=c(XMIN,XMAX) )
mtext("T0")
text(pca$x[,2:3],labels=labs)
#EOS
mx <- xeosf
ss2 <- ss_eos
pca <- prcomp(t(mx),center = TRUE, scale = TRUE,retx=TRUE)
labs=gsub("-EOS","",rownames(pca$x))
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
plot(pca$x[,1:2],cex=2,col="gray",pch=19,bty="none", xlim=c(XMIN,XMAX) )
mtext("EOS")
text(pca$x[,1:2],labels=labs)
plot(pca$x[,c(1,3)],cex=2,col="gray",pch=19,bty="none", xlim=c(XMIN,XMAX) )
mtext("EOS")
text(pca$x[,c(1,3)],labels=labs)
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col="gray",pch=19,bty="none", xlim=c(XMIN,XMAX) )
mtext("EOS")
text(pca$x[,2:3],labels=labs)
#POD1
mx <- xpod1f
ss2 <- ss_pod1
pca <- prcomp(t(mx),center = TRUE, scale = TRUE,retx=TRUE)
labs=gsub("-POD1","",rownames(pca$x))
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
plot(pca$x[,1:2],cex=2,col="gray",pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs)
mtext("POD1")
plot(pca$x[,c(1,3)],cex=2,col="gray",pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs)
mtext("POD1")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col="gray",pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs)
mtext("POD1")
dev.off()
## X11cairo
## 2
pdf("pca_charts.pdf",width=9,height=9)
par(mfrow=c(3,3))
#T0
mx <- xt0f
ss2 <- ss_t0
pca <- prcomp(t(mx),center = TRUE, scale = TRUE,retx=TRUE)
labs=gsub("-T0","",rownames(pca$x))
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
plot(pca$x[,1:2],cex=2,col="gray",pch=19,bty="none", xlim=c(XMIN,XMAX) )
mtext("T0")
text(pca$x[,1:2],labels=labs)
plot(pca$x[,c(1,3)],cex=2,col="gray",pch=19,bty="none", xlim=c(XMIN,XMAX) )
mtext("T0")
text(pca$x[,c(1,3)],labels=labs)
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col="gray",pch=19,bty="none", xlim=c(XMIN,XMAX) )
mtext("T0")
text(pca$x[,2:3],labels=labs)
#EOS
mx <- xeosf
ss2 <- ss_eos
pca <- prcomp(t(mx),center = TRUE, scale = TRUE,retx=TRUE)
labs=gsub("-EOS","",rownames(pca$x))
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
plot(pca$x[,1:2],cex=2,col="gray",pch=19,bty="none", xlim=c(XMIN,XMAX) )
mtext("EOS")
text(pca$x[,1:2],labels=labs)
plot(pca$x[,c(1,3)],cex=2,col="gray",pch=19,bty="none", xlim=c(XMIN,XMAX) )
mtext("EOS")
text(pca$x[,c(1,3)],labels=labs)
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col="gray",pch=19,bty="none", xlim=c(XMIN,XMAX) )
mtext("EOS")
text(pca$x[,2:3],labels=labs)
#POD1
mx <- xpod1f
ss2 <- ss_pod1
pca <- prcomp(t(mx),center = TRUE, scale = TRUE,retx=TRUE)
labs=gsub("-POD1","",rownames(pca$x))
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
plot(pca$x[,1:2],cex=2,col="gray",pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs)
mtext("POD1")
plot(pca$x[,c(1,3)],cex=2,col="gray",pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs)
mtext("POD1")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col="gray",pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs)
mtext("POD1")
dev.off()
## X11cairo
## 2
Specific PCAs for key clinical parameters:
And ones we didn’t include:
# wound type clean (1) contaminated (2)
mx <- xt0f
ss2 <- ss_t0
pca <- prcomp(t(mx),center = TRUE, scale = TRUE,retx=TRUE)
labs=gsub("-T0","",rownames(pca$x))
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_t0$wound_type_cat)
cols <- gsub("2","red",gsub("1","gray",cols))
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("T0 - wound type")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("T0 - wound type")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("T0 - wound type")
# surg duration
my_palette <- colorRampPalette(c("yellow", "orange", "red"))(n = 10)
decile <- ntile(ss_t0$duration_sx, 10)
mycols <- my_palette[decile]
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
plot(pca$x[,1:2],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("T0 - surgical duration deciles")
plot(pca$x[,c(1,3)],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("T0 - surgical duration deciles")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("T0 - surgical duration deciles")
# Ethnicity Levels [1-4]: Asian, Maori/Polynesian, Other, White/Caucasian
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_t0$ethnicityD)
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("T0 - ethnicity")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("T0 - ethnicity")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("T0 - ethnicity")
# age
my_palette <- colorRampPalette(c("yellow", "orange", "red"))(n = 10)
decile <- ntile(ss_t0$ageD, 10)
mycols <- my_palette[decile]
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
plot(pca$x[,1:2],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("T0 - age deciles")
plot(pca$x[,c(1,3)],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("T0 - age deciles")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("T0 - age deciles")
# sex female=1 male=2
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_t0$sexD)
cols <- gsub("1","pink",gsub("2","lightblue",cols))
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("T0 - sex: female=pink, male=lightblue")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("T0 - sex: female=pink, male=lightblue")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("T0 - sex: female=pink, male=lightblue")
# bmi
my_palette <- colorRampPalette(c("yellow", "orange", "red"))(n = 10)
decile <- ntile(ss_t0$bmi, 10)
mycols <- my_palette[decile]
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
plot(pca$x[,1:2],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("T0 - BMI deciles")
plot(pca$x[,c(1,3)],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("T0 - BMI deciles")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("T0 - BMI deciles")
# asaD levels 1:4 black,red,green,blue
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_t0$asaD)
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("T0 - asaD")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("T0 - asaD")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("T0 - asaD")
# Current smoker no, yes
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_t0$current_smokerD)
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("T0 - current smoker")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("T0 - current smoker")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("T0 - current smoker")
# diabetes
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_t0$diabetes_typeD)
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("T0 - diabetes")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("T0 - diabetes")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("T0 - diabetes")
# treatment group
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_t0$treatment_group)
cols <- gsub("2","orange",gsub("1","cyan3",cols))
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("T0 - treatment group")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("T0 - treatment group")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("T0 - treatment group")
# CRP group
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_t0$crp_group)
cols <- gsub("4","orange",gsub("1","cyan3",cols))
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("T0 - CRP group")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("T0 - CRP group")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("T0 - CRP group")
EOS.
# wound type clean (1) contaminated (2)
mx <- xeosf
ss2 <- ss_eos
pca <- prcomp(t(mx),center = TRUE, scale = TRUE,retx=TRUE)
labs=gsub("-EOS","",rownames(pca$x))
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_eos$wound_type_cat)
cols <- gsub("2","red",gsub("1","gray",cols))
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("EOS - wound type")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("EOS - wound type")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("EOS - wound type")
# surg duration
my_palette <- colorRampPalette(c("yellow", "orange", "red"))(n = 10)
decile <- ntile(ss_eos$duration_sx, 10)
mycols <- my_palette[decile]
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
plot(pca$x[,1:2],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("EOS - surgical duration deciles")
plot(pca$x[,c(1,3)],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("EOS - surgical duration deciles")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("EOS - surgical duration deciles")
# Ethnicity Levels [1-4]: Asian, Maori/Polynesian, Other, White/Caucasian
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_eos$ethnicityD)
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("EOS - ethnicity")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("EOS - ethnicity")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("EOS - ethnicity")
# age
my_palette <- colorRampPalette(c("yellow", "orange", "red"))(n = 10)
decile <- ntile(ss_eos$ageD, 10)
mycols <- my_palette[decile]
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
plot(pca$x[,1:2],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("EOS - age deciles")
plot(pca$x[,c(1,3)],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("EOS - age deciles")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("EOS - age deciles")
# sex female=1 male=2
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_eos$sexD)
cols <- gsub("1","pink",gsub("2","lightblue",cols))
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("EOS - sex: female=pink, male=lightblue")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("EOS - sex: female=pink, male=lightblue")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("EOS - sex: female=pink, male=lightblue")
# bmi
my_palette <- colorRampPalette(c("yellow", "orange", "red"))(n = 10)
decile <- ntile(ss_eos$bmi, 10)
mycols <- my_palette[decile]
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
plot(pca$x[,1:2],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("EOS - BMI deciles")
plot(pca$x[,c(1,3)],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("EOS - BMI deciles")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("EOS - BMI deciles")
# asaD levels 1:4 black,red,green,blue
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_eos$asaD)
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("EOS - asaD")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("EOS - asaD")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("EOS - asaD")
# Current smoker no, yes
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_eos$current_smokerD)
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("EOS - current smoker")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("EOS - current smoker")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("EOS - current smoker")
# diabetes
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_eos$diabetes_typeD)
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("EOS - diabetes")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("EOS - diabetes")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("EOS - diabetes")
# treatment group
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_eos$treatment_group)
cols <- gsub("2","orange",gsub("1","cyan3",cols))
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("EOS - treatment group")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("EOS - treatment group")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("EOS - treatment group")
# CRP group
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_eos$crp_group)
cols <- gsub("4","orange",gsub("1","cyan3",cols))
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("EOS - CRP group")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("EOS - CRP group")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("EOS - CRP group")
POD1.
# wound type clean (1) contaminated (2)
mx <- xpod1f
ss2 <- ss_pod1
pca <- prcomp(t(mx),center = TRUE, scale = TRUE,retx=TRUE)
labs=gsub("-POD1","",rownames(pca$x))
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_pod1$wound_type_cat)
cols <- gsub("2","red",gsub("1","gray",cols))
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("POD1 - wound type")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("POD1 - wound type")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("POD1 - wound type")
# surg duration
my_palette <- colorRampPalette(c("yellow", "orange", "red"))(n = 10)
decile <- ntile(ss_pod1$duration_sx, 10)
mycols <- my_palette[decile]
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
plot(pca$x[,1:2],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("POD1 - surgical duration deciles")
plot(pca$x[,c(1,3)],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("POD1 - surgical duration deciles")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("POD1 - surgical duration deciles")
# Ethnicity Levels [1-4]: Asian, Maori/Polynesian, Other, White/Caucasian
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_pod1$ethnicityD)
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("POD1 - ethnicity")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("POD1 - ethnicity")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("POD1 - ethnicity")
# age
my_palette <- colorRampPalette(c("yellow", "orange", "red"))(n = 10)
decile <- ntile(ss_pod1$ageD, 10)
mycols <- my_palette[decile]
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
plot(pca$x[,1:2],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("POD1 - age deciles")
plot(pca$x[,c(1,3)],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("POD1 - age deciles")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("POD1 - age deciles")
# sex female=1 male=2
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_pod1$sexD)
cols <- gsub("1","pink",gsub("2","lightblue",cols))
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("POD1 - sex: female=pink, male=lightblue")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("POD1 - sex: female=pink, male=lightblue")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("POD1 - sex: female=pink, male=lightblue")
# bmi
my_palette <- colorRampPalette(c("yellow", "orange", "red"))(n = 10)
decile <- ntile(ss_pod1$bmi, 10)
mycols <- my_palette[decile]
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
plot(pca$x[,1:2],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("POD1 - BMI deciles")
plot(pca$x[,c(1,3)],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("POD1 - BMI deciles")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=mycols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("POD1 - BMI deciles")
# asaD levels 1:4 black,red,green,blue
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_pod1$asaD)
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("POD1 - asaD")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("POD1 - asaD")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("POD1 - asaD")
# Current smoker no, yes
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_pod1$current_smokerD)
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("POD1 - current smoker")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("POD1 - current smoker")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("POD1 - current smoker")
# diabetes
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_pod1$diabetes_typeD)
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("POD1 - diabetes")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("POD1 - diabetes")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("POD1 - diabetes")
# treatment group
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_pod1$treatment_group)
cols <- gsub("2","orange",gsub("1","cyan3",cols))
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("POD1 - treatment group")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("POD1 - treatment group")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("POD1 - treatment group")
# CRP group
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss_pod1$crp_group)
cols <- gsub("4","orange",gsub("1","cyan3",cols))
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("POD1 - CRP group")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("POD1 - CRP group")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("POD1 - CRP group")
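The colour assignments above rely on nested gsub() calls for categorical codes and on decile bins for continuous covariates. A named lookup vector is one alternative that is less fragile if a code ever has more than one digit or is missing. The sketch below uses made-up values, and the helper names code_to_colour and value_to_ramp are hypothetical.

```r
# Hypothetical helpers, not part of the original analysis.

# Map category codes to colours through a named lookup vector.
# Codes that are absent from the lookup (or NA) get the fallback colour.
code_to_colour <- function(codes, lookup, fallback = "black") {
  cols <- unname(lookup[as.character(codes)])
  cols[is.na(cols)] <- fallback
  cols
}

# Map a numeric covariate to a yellow-orange-red ramp using rank-based bins,
# similar in spirit to dplyr::ntile().
value_to_ramp <- function(x, n = 10) {
  ramp <- colorRampPalette(c("yellow", "orange", "red"))(n)
  r <- rank(x, ties.method = "first", na.last = "keep")
  bin <- ceiling(r * n / sum(!is.na(x)))
  ramp[bin]
}

sex_cols <- code_to_colour(c(1, 2, 2, NA), c("1" = "pink", "2" = "lightblue"))
bmi_cols <- value_to_ramp(c(19.5, 31.2, 24.8, 27.0, 22.1), n = 5)
```

The resulting vectors can be passed straight to the col argument of plot().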
Load the infection data and colour each PCA by infection status (infection30d: 0 = gray, 1 = red).
infec <- read.table("infec.tsv",header=TRUE)
head(infec)
## PG_number infection30d crp_group
## 1 PG022 0 1
## 2 PG177 0 4
## 3 PG198 1 4
## 4 PG3233 1 1
## 5 PG002 0 4
## 6 PG004 0 1
mx <- xt0f
ss2 <- ss_t0
ss2$infec <- factor(infec[match(ss2$PG_number,infec$PG_number),"infection30d"])
table(ss2$infec)
##
## 0 1
## 90 21
pca <- prcomp(t(mx),center = TRUE, scale = TRUE,retx=TRUE)
labs=gsub("-T0","",rownames(pca$x))
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss2$infec)
cols <- gsub("1","red",gsub("0","gray",cols))
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("T0 - infection")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("T0 - infection")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("T0 - infection")
mx <- xeosf
ss2 <- ss_eos
ss2$infec <- factor(infec[match(ss2$PG_number,infec$PG_number),"infection30d"])
table(ss2$infec)
##
## 0 1
## 77 21
pca <- prcomp(t(mx),center = TRUE, scale = TRUE,retx=TRUE)
labs=gsub("-EOS","",rownames(pca$x))
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss2$infec)
cols <- gsub("1","red",gsub("0","gray",cols))
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("EOS - infection")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("EOS - infection")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("EOS - infection")
mx <- xpod1f
ss2 <- ss_pod1
ss2$infec <- factor(infec[match(ss2$PG_number,infec$PG_number),"infection30d"])
table(ss2$infec)
##
## 0 1
## 90 19
pca <- prcomp(t(mx),center = TRUE, scale = TRUE,retx=TRUE)
labs=gsub("-POD1","",rownames(pca$x))
XMIN=min(pca$x[,1])*1.1
XMAX=max(pca$x[,1])*1.1
cols <- as.character(ss2$infec)
cols <- gsub("1","red",gsub("0","gray",cols))
plot(pca$x[,1:2],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,1:2],labels=labs,cex=0.7)
mtext("POD1 - infection")
plot(pca$x[,c(1,3)],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,c(1,3)],labels=labs,cex=0.7)
mtext("POD1 - infection")
XMIN=min(pca$x[,2])*1.1
XMAX=max(pca$x[,2])*1.1
plot(pca$x[,2:3],cex=2,col=cols,pch=19,bty="none", xlim=c(XMIN,XMAX) )
text(pca$x[,2:3],labels=labs,cex=0.7)
mtext("POD1 - infection")
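The infection status is attached with match(), which returns NA for any sample whose PG_number is absent from the infection table. The following sketch, on made-up toy tables that only reuse the column names of infec.tsv, shows one way to make such unmatched samples visible before plotting.

```r
# Toy tables; the column names follow infec.tsv, the values are made up.
infec_toy <- data.frame(PG_number = c("PG022", "PG177", "PG198"),
                        infection30d = c(0, 0, 1))
ss_toy <- data.frame(PG_number = c("PG198", "PG022", "PG999"))

idx <- match(ss_toy$PG_number, infec_toy$PG_number)
ss_toy$infec <- factor(infec_toy$infection30d[idx], levels = c(0, 1))

# useNA = "ifany" adds an NA column to the tally when samples are unmatched.
infec_tab <- table(ss_toy$infec, useNA = "ifany")
unmatched <- ss_toy$PG_number[is.na(idx)]

# Unmatched samples get their own colour instead of NA, so they are still drawn.
cols <- ifelse(is.na(ss_toy$infec), "black",
               ifelse(ss_toy$infec == "1", "red", "gray"))
```

If unmatched is empty, the tallies printed by table() account for every sample in the sample sheet.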
xn <- xx
gt <- as.data.frame(sapply(strsplit(rownames(xn)," "),"[[",2) )
rownames(gt) <- rownames(xx)
colnames(gt) = "genesymbol"
gt$geneID <- rownames(xx)
blood <- read.table("https://raw.githubusercontent.com/giannimonaco/ABIS/master/data/sigmatrixRNAseq.txt")
blood2 <- merge(gt,blood,by.x="genesymbol",by.y=0)
blood2 <- blood2[which(!duplicated(blood2$genesymbol)),]
rownames(blood2) <- blood2$geneID
blood2 <- blood2[,c(3:ncol(blood2))]
genes <- intersect(rownames(xx), rownames(blood2))
dec <- apply(xx[genes, , drop=FALSE], 2, function(x) coef(rlm(as.matrix(blood2[genes,]), x, maxit=100))) * 100
## Warning in rlm.default(as.matrix(blood2[genes, ]), x, maxit = 100): 'rlm'
## failed to converge in 100 steps
## Warning in rlm.default(as.matrix(blood2[genes, ]), x, maxit = 100): 'rlm'
## failed to converge in 100 steps
dec <- t(dec/colSums(dec)*100)
dec <- signif(dec, 3)
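# remove negative values: shift each cell type so its minimum is zero,
# then rescale each sample so its proportions sum to 100%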
dec2 <- t(apply(dec,2,function(x) { mymin=min(x) ; if (mymin<0) { x + (mymin * -1) } else { x } } ))
dec2 <- apply(dec2,2,function(x) {x / sum(x) *100} )
colfunc <- colorRampPalette(c("blue", "white", "red"))
heatmap.2( dec2, col=colfunc(25),scale="row",
trace="none",margins = c(5,5), cexRow=.7, cexCol=.8, main="cell type abundances")
heatmap.2( dec2, col=colfunc(25),scale="none",
trace="none",margins = c(5,5), cexRow=.7, cexCol=.8, main="cell type abundances")
par(mar=c(5,10,3,1))
boxplot(t(dec2[order(rowMeans(dec2)),]),horizontal=TRUE,las=1, xlab="estimated cell proportion (%)")
par(mar = c(5.1, 4.1, 4.1, 2.1))
heatmap.2( cor(dec2),trace="none",scale="none")
heatmap.2( cor(t(dec2)),trace="none",scale="none", margins = c(8,8))
par(mar=c(5,10,3,1))
barplot(apply(dec2,1,sd),horiz=TRUE,las=1,xlab="SD of cell proportions (%)")
which(apply(dec2,1,sd)>4)
## Monocytes.C NK T.CD8.Memory T.CD4.Naive Neutrophils.LD
## 1 2 3 4 10
Based on this analysis we can begin with correction of the five cell types with SD > 4: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive and Neutrophils.LD.
According to the correlation heatmap, these are not strongly correlated.
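The selection logic (rescale each sample to 100%, keep cell types with SD above 4, then check how the kept cell types correlate) can be sketched on a small made-up matrix. The numbers below are illustrative only. sweep() is used for the per-sample rescaling because it names the margin explicitly, whereas dividing a matrix directly by a vector recycles that vector down the columns.

```r
# Toy estimates (cell types x samples); values are made up for illustration.
toy <- rbind(
  Monocytes.C = c(30, 10, 25, 15, 35,  5, 20, 40),
  NK          = c(10, 30, 15, 25,  5, 35, 20, 10),
  B.Naive     = c(20, 20, 20, 20, 20, 20, 20, 20),
  pDCs        = c( 1,  1,  1,  1,  1,  1,  1,  1)
)
colnames(toy) <- paste0("S", 1:8)

# Rescale every sample (column) so that its proportions sum to 100%.
prop <- sweep(toy, 2, colSums(toy), "/") * 100

# Keep the cell types whose SD across samples exceeds the threshold.
sds <- apply(prop, 1, sd)
variable_types <- names(sds)[sds > 4]

# Pairwise correlation between the selected cell types across samples.
sel_cor <- cor(t(prop[variable_types, , drop = FALSE]))
```

Strongly correlated cell types in sel_cor would carry overlapping information if both were used as covariates.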
Now look at how the cell proportions change over time.
ct0 <- dec2[,grep("-T0",colnames(dec2))]
ceos <- dec2[,grep("-EOS",colnames(dec2))]
cpod1 <- dec2[,grep("-POD1",colnames(dec2))]
par(mar=c(5,10,3,1))
boxplot(t(ct0),horizontal=TRUE,las=1, xlab="estimated cell proportion (%)",main="T0")
boxplot(t(ceos),horizontal=TRUE,las=1, xlab="estimated cell proportion (%)",main="EOS")
boxplot(t(cpod1),horizontal=TRUE,las=1, xlab="estimated cell proportion (%)",main="POD1")
sscell <- as.data.frame(t(dec2))
sscell_t0 <- sscell[grep("-T0",rownames(sscell)),]
sscell_eos <- sscell[grep("-EOS",rownames(sscell)),]
sscell_pod1 <- sscell[grep("-POD1",rownames(sscell)),]
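To complement the boxplots, the change over time can also be summarised as a table of medians per cell type and time point. This is a minimal sketch on made-up values. It assumes that sample names end in the time point suffix, and the column order T0, EOS, POD1 simply follows the order used in this report.

```r
# Toy proportions (cell types x samples); sample names carry the time point suffix.
toy <- rbind(
  Monocytes.C    = c(20, 22, 30, 32, 35, 37),
  Neutrophils.LD = c(50, 48, 60, 58, 55, 53)
)
colnames(toy) <- c("PG1-T0", "PG2-T0", "PG1-EOS", "PG2-EOS", "PG1-POD1", "PG2-POD1")

# Extract the time point from the text after the last hyphen.
timepoint <- factor(sub(".*-", "", colnames(toy)), levels = c("T0", "EOS", "POD1"))

# Median proportion per cell type (rows) and time point (columns).
med_by_time <- sapply(levels(timepoint), function(tp) {
  apply(toy[, timepoint == tp, drop = FALSE], 1, median)
})
```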
Now look at how cell types associate with the PCAs.
#xt0f xeosf xpod1f
#sscell_t0 sscell_eos sscell_pod1
## T0
mx <- xt0f
ss2 <- sscell_t0
pca <- prcomp(t(mx),center = TRUE, scale = TRUE,retx=TRUE)
loadings = pca$x
par(mar = c(5.1, 4.1, 4.1, 2.1))
plot(pca,type="lines",col="blue")
nGenes <- nrow(mx)
nSamples <- ncol(mx)
datTraits <- ss2
moduleTraitCor <- cor(loadings[,1:8], datTraits, use = "p")
moduleTraitPvalue <- corPvalueStudent(moduleTraitCor, nSamples)
textMatrix <- paste(signif(moduleTraitCor, 2), "\n(",
signif(moduleTraitPvalue, 1), ")", sep = "")
dim(textMatrix) = dim(moduleTraitCor)
labeledHeatmap(Matrix = t(moduleTraitCor),
xLabels = colnames(loadings)[1:ncol(t(moduleTraitCor))],
yLabels = names(datTraits), colorLabels = FALSE, colors = blueWhiteRed(6),
textMatrix = t(textMatrix), setStdMargins = FALSE, cex.text = 0.5,
cex.lab.y = 0.6, zlim = c(-0.45,0.45),
main = paste("PCA-cell relationships @T0: Top principal components"))
## Warning in numbers2colors(data, signed, colors = colors, lim = zlim, naColor =
## naColor): Some values of 'x' are below given minimum and will be truncated to
## the minimum.
## Warning in numbers2colors(data, signed, colors = colors, lim = zlim, naColor =
## naColor): Some values of 'x' are above given maximum and will be truncated to
## the maximum.
## EOS
mx <- xeosf
ss2 <- sscell_eos
pca <- prcomp(t(mx),center = TRUE, scale = TRUE,retx=TRUE)
loadings = pca$x
plot(pca,type="lines",col="blue")
nGenes <- nrow(mx)
nSamples <- ncol(mx)
datTraits <- ss2
moduleTraitCor <- cor(loadings[,1:8], datTraits, use = "p")
moduleTraitPvalue <- corPvalueStudent(moduleTraitCor, nSamples)
textMatrix <- paste(signif(moduleTraitCor, 2), "\n(",
signif(moduleTraitPvalue, 1), ")", sep = "")
dim(textMatrix) = dim(moduleTraitCor)
labeledHeatmap(Matrix = t(moduleTraitCor),
xLabels = colnames(loadings)[1:ncol(t(moduleTraitCor))],
yLabels = names(datTraits), colorLabels = FALSE, colors = blueWhiteRed(6),
textMatrix = t(textMatrix), setStdMargins = FALSE, cex.text = 0.5,
cex.lab.y = 0.6, zlim = c(-0.45,0.45),
main = paste("PCA-cell relationships @EOS: Top principal components"))
## Warning in numbers2colors(data, signed, colors = colors, lim = zlim, naColor =
## naColor): Some values of 'x' are below given minimum and will be truncated to
## the minimum.
## Warning in numbers2colors(data, signed, colors = colors, lim = zlim, naColor =
## naColor): Some values of 'x' are above given maximum and will be truncated to
## the maximum.
## POD1
mx <- xpod1f
ss2 <- sscell_pod1
pca <- prcomp(t(mx),center = TRUE, scale = TRUE,retx=TRUE)
loadings = pca$x
plot(pca,type="lines",col="blue")
nGenes <- nrow(mx)
nSamples <- ncol(mx)
datTraits <- ss2
moduleTraitCor <- cor(loadings[,1:8], datTraits, use = "p")
moduleTraitPvalue <- corPvalueStudent(moduleTraitCor, nSamples)
textMatrix <- paste(signif(moduleTraitCor, 2), "\n(",
signif(moduleTraitPvalue, 1), ")", sep = "")
dim(textMatrix) = dim(moduleTraitCor)
labeledHeatmap(Matrix = t(moduleTraitCor),
xLabels = colnames(loadings)[1:ncol(t(moduleTraitCor))],
yLabels = names(datTraits), colorLabels = FALSE, colors = blueWhiteRed(6),
textMatrix = t(textMatrix), setStdMargins = FALSE, cex.text = 0.5,
cex.lab.y = 0.6, zlim = c(-0.45,0.45),
main = paste("PCA-cell relationships @POD1: Top principal components"))
## Warning in numbers2colors(data, signed, colors = colors, lim = zlim, naColor =
## naColor): Some values of 'x' are below given minimum and will be truncated to
## the minimum.
## Warning in numbers2colors(data, signed, colors = colors, lim = zlim, naColor =
## naColor): Some values of 'x' are above given maximum and will be truncated to
## the maximum.
The conclusion here is that the estimated cell type proportions correlate strongly with the top principal components. The good news is that the cell types we selected are the ones with the strongest associations, so we can correct for their contribution.
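The PC-to-cell-type correlations and their p-values can also be computed with base R alone. The sketch below is an assumption-laden illustration rather than the code used for the heatmaps: pc_trait_cor is a hypothetical helper, the data are random, and the rows of traits are assumed to be in the same order as the columns of mx with no missing values. The p-value uses the Student t-test for a Pearson correlation.

```r
# Hypothetical helper: correlate the first k PC scores with sample-level traits.
pc_trait_cor <- function(mx, traits, k = 8) {
  pca <- prcomp(t(mx), center = TRUE, scale. = TRUE)
  k <- min(k, ncol(pca$x))
  scores <- pca$x[, seq_len(k), drop = FALSE]
  r <- cor(scores, traits, use = "pairwise.complete.obs")
  n <- nrow(scores)
  # Student t-test for a Pearson correlation with n - 2 degrees of freedom.
  tstat <- r * sqrt((n - 2) / (1 - r^2))
  p <- 2 * pt(abs(tstat), df = n - 2, lower.tail = FALSE)
  list(cor = r, p = p)
}

# Random toy data: 100 genes x 12 samples and two made-up trait columns.
set.seed(7)
mx_toy <- matrix(rnorm(100 * 12), nrow = 100,
                 dimnames = list(NULL, paste0("S", 1:12)))
traits_toy <- data.frame(NK = runif(12, 5, 20),
                         Monocytes.C = runif(12, 10, 40),
                         row.names = colnames(mx_toy))
res_toy <- pc_trait_cor(mx_toy, traits_toy, k = 3)
```

With many PCs and traits tested at once, the p-values in res_toy$p are unadjusted and are best read as a ranking rather than as formal tests.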
Specific PCAs for key clinical parameters:
And blood composition:
And ones we didn’t include:
TODO:
age data centred and scaled
ethnicity categories unordered
CRP group comparisons not stratified for treatment group (inflammation)
Treatment group comparisons not stratified for CRP group (steroid)
CRP group comparisons stratified for treatment group: inflammation and steroid
Treatment group comparisons stratified for CRP group: steroid and inflammation
Sex differences in low CRP group (not stratified for treatment group)
Sex differences in high CRP group (not stratified for treatment group)
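The first items of the list above concern covariate preparation and stratification. A minimal sketch on a made-up sample sheet follows. The column names ageCS and ethnicityCAT appear in the design formulas of this report, but how they were derived is an assumption here.

```r
# Toy sample sheet; column names mirror this report, values are made up.
ss_toy <- data.frame(
  ageD            = c(34, 51, 67, 45, 72, 58),
  ethnicityD      = c(1, 4, 2, 4, 3, 4),
  crp_group       = c(1, 4, 1, 4, 1, 4),
  treatment_group = c(1, 1, 2, 2, 1, 2)
)

# Age centred and scaled (mean 0, SD 1).
ss_toy$ageCS <- as.numeric(scale(ss_toy$ageD))

# Ethnicity as an unordered factor, so no ranking between categories is implied.
ss_toy$ethnicityCAT <- factor(ss_toy$ethnicityD, ordered = FALSE)

# Stratify: one sample sheet per treatment group, for CRP contrasts within each.
by_treatment <- split(ss_toy, ss_toy$treatment_group)
```

Each element of by_treatment could then be matched to its count columns and analysed with a design that contains crp_group but not treatment_group.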
mx <- xt0f
ss2 <- as.data.frame(cbind(ss_t0,sscell_t0))
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ crp_group )
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 390 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000179593.16 ALOX15B 192.12350 -0.7187508 0.10489179 -6.852308
## ENSG00000141744.4 PNMT 35.64128 -0.4354625 0.09456325 -4.604986
## ENSG00000087116.16 ADAMTS2 96.08857 -0.5387022 0.12313891 -4.374752
## ENSG00000057294.16 PKP2 83.96200 -0.3049742 0.07109219 -4.289842
## ENSG00000279359.1 RP11-36D19.9 12.76771 -0.5030155 0.11986382 -4.196558
## ENSG00000276168.1 RN7SL1 591.11188 0.2489061 0.06119920 4.067147
## ENSG00000063438.20 AHRR 92.23299 -0.4595163 0.11376981 -4.039000
## ENSG00000233916.1 ZDHHC20P1 21.16714 -0.3347389 0.08736544 -3.831480
## ENSG00000189056.15 RELN 17.27434 0.1809771 0.04776302 3.789062
## ENSG00000274012.1 RN7SL2 1037.64399 0.2367552 0.06518141 3.632250
## pvalue padj
## ENSG00000179593.16 ALOX15B 7.266794e-12 1.593971e-07
## ENSG00000141744.4 PNMT 4.124926e-06 4.524013e-02
## ENSG00000087116.16 ADAMTS2 1.215705e-05 8.888828e-02
## ENSG00000057294.16 PKP2 1.788006e-05 9.804975e-02
## ENSG00000279359.1 RP11-36D19.9 2.710017e-05 1.188885e-01
## ENSG00000276168.1 RN7SL1 4.759225e-05 1.682084e-01
## ENSG00000063438.20 AHRR 5.367947e-05 1.682084e-01
## ENSG00000233916.1 ZDHHC20P1 1.273749e-04 3.492459e-01
## ENSG00000189056.15 RELN 1.512169e-04 3.685491e-01
## ENSG00000274012.1 RN7SL2 2.809608e-04 6.162874e-01
mean(abs(dge$stat))
## [1] 0.7207644
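DESeq2 reports above that `crp_group` is a numeric variable with integer values and asks whether a factor was intended. For a two-group comparison, one option is to convert the variable to a factor with an explicit reference level before building the dataset. The sketch below assumes a 0/1 coding, which is an assumption made for illustration; the actual coding of `crp_group` is not shown in this section.

```r
# Assumed coding for illustration only: 0 = low CRP, 1 = high CRP.
crp_numeric <- c(0, 1, 1, 0, 1, 0)

# Explicit levels and labels; the first level ("low") becomes the reference.
crp_factor <- factor(crp_numeric, levels = c(0, 1), labels = c("low", "high"))

levels(crp_factor)
table(crp_factor)
```

With a two-level factor the reported log2 fold change is the high versus low contrast. With a numeric predictor it is the change per unit increase, which is the same quantity only when the two groups are coded one unit apart.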
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + crp_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 19 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000223609.11 HBD 153.11532 0.6127017 0.13080850 4.683959
## ENSG00000261026.1 CTD-3247F14.2 13.74757 -1.0503473 0.23156953 -4.535775
## ENSG00000206177.7 HBM 45.70050 0.6155113 0.14218666 4.328896
## ENSG00000004939.16 SLC4A1 233.82225 0.4920445 0.12031713 4.089563
## ENSG00000169877.10 AHSP 34.93993 0.6188578 0.15316672 4.040419
## ENSG00000179593.16 ALOX15B 250.75091 -0.3691708 0.09277372 -3.979261
## ENSG00000218052.5 ADAMTS7P4 23.44923 0.2098524 0.05372181 3.906280
## ENSG00000268734.1 CTB-61M7.2 10.82400 -1.4946787 0.38782949 -3.853958
## ENSG00000166947.15 EPB42 37.40076 0.4773174 0.12488402 3.822085
## ENSG00000179388.9 EGR3 270.39768 -0.5197675 0.13823652 -3.759987
## pvalue padj
## ENSG00000223609.11 HBD 2.813859e-06 0.0617220
## ENSG00000261026.1 CTD-3247F14.2 5.739229e-06 0.0629450
## ENSG00000206177.7 HBM 1.498584e-05 0.1095715
## ENSG00000004939.16 SLC4A1 4.321860e-05 0.2340715
## ENSG00000169877.10 AHSP 5.335572e-05 0.2340715
## ENSG00000179593.16 ALOX15B 6.912983e-05 0.2527272
## ENSG00000218052.5 ADAMTS7P4 9.372795e-05 0.2937032
## ENSG00000268734.1 CTB-61M7.2 1.162234e-04 0.3186701
## ENSG00000166947.15 EPB42 1.323281e-04 0.3225129
## ENSG00000179388.9 EGR3 1.699222e-04 0.3727244
mean(abs(dge$stat))
## [1] 0.7617828
crp_t0 <- dge
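The repeated note about factor levels containing characters other than letters, numbers, '_' and '.' is informational, but it can be silenced by cleaning the level names before the dataset is built. Below is a minimal base R sketch; the factor `wound_demo` and its levels are hypothetical, since the section does not show which covariate triggers the note.

```r
# Hypothetical factor whose levels contain spaces, slashes and brackets.
wound_demo <- factor(c("open surgery", "keyhole/lap", "open surgery", "other (misc)"))

# Replace every run of characters outside letters, digits, '_' and '.' with '_'.
safe_levels <- gsub("[^A-Za-z0-9_.]+", "_", levels(wound_demo))

# Guard against two different levels collapsing to the same cleaned name.
stopifnot(!anyDuplicated(safe_levels))
levels(wound_demo) <- safe_levels

levels(wound_demo)
```

Renaming levels changes only the coefficient names, not the fitted model.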
# model with clinical and cell-type covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + crp_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 10 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000223609.11 HBD 153.11532 0.58209593 0.13422143 4.336833
## ENSG00000206177.7 HBM 45.70050 0.55382261 0.14446831 3.833523
## ENSG00000004939.16 SLC4A1 233.82225 0.46687689 0.12375126 3.772704
## ENSG00000132122.12 SPATA6 215.55260 -0.09259567 0.02479843 -3.733933
## ENSG00000169877.10 AHSP 34.93993 0.58801794 0.15784167 3.725366
## ENSG00000076864.20 RAP1GAP 14.58834 0.30003375 0.08110668 3.699248
## ENSG00000181126.13 HLA-V 366.66600 -0.44464770 0.12076431 -3.681946
## ENSG00000218052.5 ADAMTS7P4 23.44923 0.18427003 0.05011221 3.677148
## ENSG00000170153.11 RNF150 16.74377 -0.64212105 0.17646644 -3.638771
## ENSG00000166947.15 EPB42 37.40076 0.45926219 0.12825639 3.580813
## pvalue padj
## ENSG00000223609.11 HBD 1.445504e-05 0.3170712
## ENSG00000206177.7 HBM 1.263209e-04 0.6289033
## ENSG00000004939.16 SLC4A1 1.614877e-04 0.6289033
## ENSG00000132122.12 SPATA6 1.885127e-04 0.6289033
## ENSG00000169877.10 AHSP 1.950322e-04 0.6289033
## ENSG00000076864.20 RAP1GAP 2.162390e-04 0.6289033
## ENSG00000181126.13 HLA-V 2.314602e-04 0.6289033
## ENSG00000218052.5 ADAMTS7P4 2.358558e-04 0.6289033
## ENSG00000170153.11 RNF150 2.739418e-04 0.6289033
## ENSG00000166947.15 EPB42 3.425263e-04 0.6289033
mean(abs(dge$stat))
## [1] 0.7499593
crp_t0_adj <- dge
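For the model with cell-type covariates, DESeq2 warns that some numeric variables have a mean or standard deviation larger than 5 and recommends centring and scaling them. A minimal sketch of that step is shown below. The data frame `sscell_demo` and its values are invented; only the column names are taken from the design formula.

```r
# Hypothetical cell-type estimates (values invented for illustration).
sscell_demo <- data.frame(
  Monocytes.C    = c(12.1, 8.4, 15.3, 10.2),
  NK             = c(3.2, 5.1, 2.8, 4.4),
  Neutrophils.LD = c(55.0, 61.2, 48.7, 58.9)
)

# Centre and scale every column so each has mean 0 and standard deviation 1.
sscell_scaled <- as.data.frame(scale(sscell_demo))

round(colMeans(sscell_scaled), 10)
apply(sscell_scaled, 2, sd)
```

Rescaling changes the units of the cell-type coefficients. In principle it leaves the `crp_group` effect unchanged, apart from any improvement in numerical convergence.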
mx <- xeosf
ss2 <- as.data.frame(cbind(ss_eos,sscell_eos))
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ crp_group )
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 118 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000139572.4 GPR84 231.01210 0.7893177 0.09543038 8.271137
## ENSG00000113368.12 LMNB1 2665.88768 0.4224482 0.05453011 7.747062
## ENSG00000280091.1 CTC-312O10.3 32.83608 0.5061573 0.06958189 7.274267
## ENSG00000137193.14 PIM1 7966.43548 0.3023359 0.04160685 7.266494
## ENSG00000170525.21 PFKFB3 4788.95198 0.5364125 0.07387475 7.261108
## ENSG00000079385.23 CEACAM1 1095.76169 0.7812428 0.10858770 7.194579
## ENSG00000069399.15 BCL3 3591.85376 0.4579237 0.06480903 7.065740
## ENSG00000184557.4 SOCS3 13113.96513 0.6655088 0.09494900 7.009118
## ENSG00000198019.13 FCGR1B 681.83152 0.5275512 0.07634910 6.909724
## ENSG00000163251.4 FZD5 91.58576 0.4341470 0.06308193 6.882272
## pvalue padj
## ENSG00000139572.4 GPR84 1.326872e-16 2.871217e-12
## ENSG00000113368.12 LMNB1 9.404335e-15 1.017502e-10
## ENSG00000280091.1 CTC-312O10.3 3.483056e-13 1.661581e-09
## ENSG00000137193.14 PIM1 3.689370e-13 1.661581e-09
## ENSG00000170525.21 PFKFB3 3.839320e-13 1.661581e-09
## ENSG00000079385.23 CEACAM1 6.265387e-13 2.259612e-09
## ENSG00000069399.15 BCL3 1.597623e-12 4.938709e-09
## ENSG00000184557.4 SOCS3 2.398244e-12 6.486949e-09
## ENSG00000198019.13 FCGR1B 4.855988e-12 1.167541e-08
## ENSG00000163251.4 FZD5 5.890533e-12 1.274652e-08
mean(abs(dge$stat))
## [1] 1.485292
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + crp_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 10 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000139572.4 GPR84 231.01210 0.7183062 0.11564967 6.211053
## ENSG00000127954.13 STEAP4 2533.73582 0.5130516 0.09062352 5.661352
## ENSG00000184557.4 SOCS3 13113.96513 0.6216040 0.11307436 5.497303
## ENSG00000176597.12 B3GNT5 367.05640 0.4553917 0.08837595 5.152891
## ENSG00000059804.16 SLC2A3 9795.99656 0.4116953 0.08137334 5.059338
## ENSG00000170525.21 PFKFB3 4788.95198 0.4492633 0.08908047 5.043343
## ENSG00000069399.15 BCL3 3591.85376 0.3933702 0.07815095 5.033467
## ENSG00000121742.19 GJB6 54.89967 0.6668659 0.13279923 5.021610
## ENSG00000113368.12 LMNB1 2665.88768 0.3025021 0.06103820 4.955947
## ENSG00000173281.5 PPP1R3B 1142.35672 0.4439020 0.08968117 4.949780
## pvalue padj
## ENSG00000139572.4 GPR84 5.263078e-10 1.138877e-05
## ENSG00000127954.13 STEAP4 1.501854e-08 1.624930e-04
## ENSG00000184557.4 SOCS3 3.856442e-08 2.781651e-04
## ENSG00000176597.12 B3GNT5 2.565005e-07 1.385985e-03
## ENSG00000059804.16 SLC2A3 4.207135e-07 1.385985e-03
## ENSG00000170525.21 PFKFB3 4.574686e-07 1.385985e-03
## ENSG00000069399.15 BCL3 4.816882e-07 1.385985e-03
## ENSG00000121742.19 GJB6 5.124025e-07 1.385985e-03
## ENSG00000113368.12 LMNB1 7.197870e-07 1.407299e-03
## ENSG00000173281.5 PPP1R3B 7.429758e-07 1.407299e-03
mean(abs(dge$stat))
## [1] 1.185118
crp_eos <- dge
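The covariate-adjusted fits report that a handful of rows did not converge in beta and suggest a larger `maxit` for `nbinomWaldTest`. One way to follow that advice is to run the DESeq2 steps individually rather than through `DESeq()`. The sketch below uses simulated counts from `makeExampleDESeqDataSet` as a stand-in for the real data, and `maxit = 500` is an arbitrary illustrative value.

```r
suppressPackageStartupMessages(library("DESeq2"))

# Simulated counts stand in for the real data (illustration only).
set.seed(1)
dds_demo <- makeExampleDESeqDataSet(n = 200, m = 8)

# The estimation steps called one at a time so that maxit can be set.
dds_demo <- estimateSizeFactors(dds_demo)
dds_demo <- estimateDispersions(dds_demo, quiet = TRUE)
dds_demo <- nbinomWaldTest(dds_demo, maxit = 500, quiet = TRUE)

# Count rows that still failed to converge.
sum(!mcols(dds_demo)$betaConv, na.rm = TRUE)
```

Note that `DESeq()` can also replace count outliers and refit, which these stepwise calls skip. Rows that remain unconverged can be identified through `mcols(dds_demo)$betaConv` and excluded or inspected separately.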
# model with clinical and cell-type covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + crp_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 9 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000197632.9 SERPINB2 260.41250 0.3545761 0.06850137 5.176189
## ENSG00000127954.13 STEAP4 2533.73582 0.2836443 0.05488758 5.167732
## ENSG00000139572.4 GPR84 231.01210 0.4490274 0.08779641 5.114416
## ENSG00000211459.2 MT-RNR1 100058.33839 -0.2858692 0.05843432 -4.892145
## ENSG00000241560.7 ZBTB20-AS1 46.10569 0.2579937 0.05302105 4.865873
## ENSG00000210082.2 MT-RNR2 242496.36564 -0.2401137 0.04976253 -4.825190
## ENSG00000064763.11 FAR2 604.97645 -0.2053606 0.04311803 -4.762754
## ENSG00000135678.12 CPM 575.43286 -0.3101463 0.06641943 -4.669512
## ENSG00000155659.15 VSIG4 356.81388 -0.4006950 0.08583339 -4.668288
## ENSG00000050730.16 TNIP3 55.66088 0.2325018 0.04992622 4.656908
## pvalue padj
## ENSG00000197632.9 SERPINB2 2.264642e-07 0.002314927
## ENSG00000127954.13 STEAP4 2.369520e-07 0.002314927
## ENSG00000139572.4 GPR84 3.147134e-07 0.002314927
## ENSG00000211459.2 MT-RNR1 9.974275e-07 0.005029176
## ENSG00000241560.7 ZBTB20-AS1 1.139524e-06 0.005029176
## ENSG00000210082.2 MT-RNR2 1.398696e-06 0.005144171
## ENSG00000064763.11 FAR2 1.909682e-06 0.006020136
## ENSG00000135678.12 CPM 3.019165e-06 0.007083372
## ENSG00000155659.15 VSIG4 3.037196e-06 0.007083372
## ENSG00000050730.16 TNIP3 3.209939e-06 0.007083372
mean(abs(dge$stat))
## [1] 0.9721167
crp_eos_adj <- dge
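Each comparison starts by subsetting the count matrix to the samples present in the sample sheet with `%in%`. That keeps the right columns but does not by itself guarantee that the two objects contain the same samples in the same order. A more defensive variant is sketched below on toy inputs; `mx_toy` and `ss_align` are hypothetical objects with invented values.

```r
# Hypothetical toy inputs; names and values are invented.
mx_toy <- matrix(1:12, nrow = 3,
  dimnames = list(paste0("gene", 1:3), c("S3", "S1", "S4", "S2")))
ss_align <- data.frame(crp_group = c(0, 1, 1), row.names = c("S1", "S2", "S5"))

# Keep only samples present in both objects, in sample-sheet order.
shared <- intersect(rownames(ss_align), colnames(mx_toy))
mx_toy <- mx_toy[, shared, drop = FALSE]
ss_align <- ss_align[shared, , drop = FALSE]

# Columns of the counts must match rows of the sample sheet one to one.
stopifnot(identical(colnames(mx_toy), rownames(ss_align)))
dim(mx_toy)
```

The final `stopifnot()` makes a mismatch fail loudly before any model is fitted.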
mx <- xpod1f
ss2 <- as.data.frame(cbind(ss_pod1,sscell_pod1))
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ crp_group )
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 134 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000007968.7 E2F2 870.17416 0.4436761 0.03676788 12.066948
## ENSG00000137869.15 CYP19A1 81.93527 0.9960649 0.09264040 10.751949
## ENSG00000163710.9 PCOLCE2 18.25602 1.0744848 0.10096581 10.642066
## ENSG00000104918.8 RETN 1801.19251 0.7757049 0.07614859 10.186726
## ENSG00000132170.24 PPARG 168.63657 0.5429631 0.05405474 10.044690
## ENSG00000145287.11 PLAC8 4671.33188 0.3457873 0.03478431 9.940898
## ENSG00000183578.8 TNFAIP8L3 24.65972 0.7720344 0.07809643 9.885655
## ENSG00000135424.18 ITGA7 436.10324 0.5378278 0.05479987 9.814399
## ENSG00000108950.12 FAM20A 1751.41363 0.6102328 0.06295930 9.692497
## ENSG00000165092.13 ALDH1A1 410.41112 -0.5138028 0.05334920 -9.630936
## pvalue padj
## ENSG00000007968.7 E2F2 1.578813e-33 3.364925e-29
## ENSG00000137869.15 CYP19A1 5.802268e-27 6.183186e-23
## ENSG00000163710.9 PCOLCE2 1.898717e-26 1.348912e-22
## ENSG00000104918.8 RETN 2.272894e-24 1.211055e-20
## ENSG00000132170.24 PPARG 9.695213e-24 4.132682e-20
## ENSG00000145287.11 PLAC8 2.763251e-23 9.815527e-20
## ENSG00000183578.8 TNFAIP8L3 4.804293e-23 1.462770e-19
## ENSG00000135424.18 ITGA7 9.761766e-23 2.600656e-19
## ENSG00000108950.12 FAM20A 3.244949e-22 7.684400e-19
## ENSG00000165092.13 ALDH1A1 5.918789e-22 1.261472e-18
mean(abs(dge$stat))
## [1] 1.827282
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + crp_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 21 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000007968.7 E2F2 870.17416 0.3860936 0.04401299 8.772264
## ENSG00000163710.9 PCOLCE2 18.25602 0.9879318 0.11793493 8.376923
## ENSG00000137869.15 CYP19A1 81.93527 0.8122448 0.10657233 7.621535
## ENSG00000104918.8 RETN 1801.19251 0.6489135 0.08536396 7.601727
## ENSG00000132170.24 PPARG 168.63657 0.4789945 0.06318857 7.580398
## ENSG00000135424.18 ITGA7 436.10324 0.4767597 0.06544080 7.285359
## ENSG00000108950.12 FAM20A 1751.41363 0.5260557 0.07302081 7.204190
## ENSG00000169994.19 MYO7B 618.31752 0.3324557 0.04684399 7.097083
## ENSG00000165092.13 ALDH1A1 410.41112 -0.4432919 0.06264590 -7.076151
## ENSG00000116016.14 EPAS1 154.17271 0.3555413 0.05078144 7.001402
## pvalue padj
## ENSG00000007968.7 E2F2 1.751085e-18 3.732087e-14
## ENSG00000163710.9 PCOLCE2 5.432955e-17 5.789629e-13
## ENSG00000137869.15 CYP19A1 2.506768e-14 1.468447e-10
## ENSG00000104918.8 RETN 2.922041e-14 1.468447e-10
## ENSG00000132170.24 PPARG 3.444956e-14 1.468447e-10
## ENSG00000135424.18 ITGA7 3.208156e-13 1.139591e-09
## ENSG00000108950.12 FAM20A 5.839004e-13 1.777810e-09
## ENSG00000169994.19 MYO7B 1.274176e-12 3.394564e-09
## ENSG00000165092.13 ALDH1A1 1.482132e-12 3.509854e-09
## ENSG00000116016.14 EPAS1 2.534142e-12 5.401016e-09
mean(abs(dge$stat))
## [1] 1.324953
crp_pod1 <- dge
# model with clinical and cell-type covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + crp_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 8 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000007968.7 E2F2 870.17416 0.3472345 0.04014188 8.650180
## ENSG00000165092.13 ALDH1A1 410.41112 -0.4939869 0.05929803 -8.330579
## ENSG00000137869.15 CYP19A1 81.93527 0.7605312 0.09261922 8.211375
## ENSG00000132170.24 PPARG 168.63657 0.4271253 0.05248859 8.137489
## ENSG00000163710.9 PCOLCE2 18.25602 0.7905130 0.09924488 7.965277
## ENSG00000108950.12 FAM20A 1751.41363 0.4989988 0.06457599 7.727313
## ENSG00000135424.18 ITGA7 436.10324 0.4381670 0.05874170 7.459216
## ENSG00000116016.14 EPAS1 154.17271 0.2817621 0.03934883 7.160622
## ENSG00000104918.8 RETN 1801.19251 0.5040455 0.07279215 6.924449
## ENSG00000169994.19 MYO7B 618.31752 0.3134418 0.04566546 6.863870
## pvalue padj
## ENSG00000007968.7 E2F2 5.141826e-18 1.095877e-13
## ENSG00000165092.13 ALDH1A1 8.044826e-17 8.572969e-13
## ENSG00000137869.15 CYP19A1 2.186696e-16 1.553502e-12
## ENSG00000132170.24 PPARG 4.035604e-16 2.150271e-12
## ENSG00000163710.9 PCOLCE2 1.648539e-15 7.027064e-12
## ENSG00000108950.12 FAM20A 1.098407e-14 3.901725e-11
## ENSG00000135424.18 ITGA7 8.703888e-14 2.650085e-10
## ENSG00000116016.14 EPAS1 8.031209e-13 2.139615e-09
## ENSG00000104918.8 RETN 4.376743e-12 1.036461e-08
## ENSG00000169994.19 MYO7B 6.701962e-12 1.428389e-08
mean(abs(dge$stat))
## [1] 1.110871
crp_pod1_adj <- dge
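The `mean(abs(dge$stat))` values printed for the nine CRP-group fits are easier to compare side by side. The sketch below simply collects the numbers reported above into a matrix. The object name `mean_abs_stat` and the model labels are chosen here for illustration.

```r
# Mean |stat| values copied from the output of the nine CRP-group fits.
mean_abs_stat <- matrix(
  c(0.7207644, 0.7617828, 0.7499593,
    1.485292,  1.185118,  0.9721167,
    1.827282,  1.324953,  1.110871),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    timepoint = c("T0", "EOS", "POD1"),
    model = c("base", "clinical", "clinical_cell")
  )
)

round(mean_abs_stat, 3)

# Ratio of each model to the base model at the same timepoint.
round(mean_abs_stat / mean_abs_stat[, "base"], 2)
```

Read this way, the average test statistic at EOS and POD1 shrinks as covariates are added, while at T0 it stays close to the base model. This suggests that part of the unadjusted CRP signal at the later timepoints is shared with the clinical and cell-type covariates.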
mx <- xt0f
ss2 <- as.data.frame(cbind(ss_t0,sscell_t0))
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ treatment_group )
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 364 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000179593.16 ALOX15B 250.75091 -2.4535079 0.3356004 -7.310802
## ENSG00000279359.1 RP11-36D19.9 24.97417 -2.8528835 0.4030002 -7.079112
## ENSG00000141744.4 PNMT 35.64128 -1.7581232 0.2682321 -6.554486
## ENSG00000276085.1 CCL3L1 308.64747 -1.7142384 0.3110313 -5.511466
## ENSG00000057294.16 PKP2 92.33310 -1.2052440 0.2230360 -5.403810
## ENSG00000079215.15 SLC1A3 219.79787 -1.7005517 0.3337533 -5.095236
## ENSG00000164056.11 SPRY1 69.48429 -1.1641127 0.2336978 -4.981274
## ENSG00000233916.1 ZDHHC20P1 21.16714 -1.2626226 0.2544397 -4.962364
## ENSG00000277632.2 CCL3 559.74862 -1.3892331 0.2870354 -4.839937
## ENSG00000122644.13 ARL4A 383.47521 -0.6984909 0.1479189 -4.722121
## pvalue padj
## ENSG00000179593.16 ALOX15B 2.655536e-13 5.824918e-09
## ENSG00000279359.1 RP11-36D19.9 1.450807e-12 1.591172e-08
## ENSG00000141744.4 PNMT 5.583392e-11 4.082390e-07
## ENSG00000276085.1 CCL3L1 3.558581e-08 1.951437e-04
## ENSG00000057294.16 PKP2 6.523988e-08 2.862073e-04
## ENSG00000079215.15 SLC1A3 3.483078e-07 1.273355e-03
## ENSG00000164056.11 SPRY1 6.316694e-07 1.909446e-03
## ENSG00000233916.1 ZDHHC20P1 6.964017e-07 1.909446e-03
## ENSG00000277632.2 CCL3 1.298804e-06 3.165474e-03
## ENSG00000122644.13 ARL4A 2.333975e-06 5.119575e-03
mean(abs(dge$stat))
## [1] 0.733293
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + treatment_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
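The messages above flag three things about the sample sheet: character columns are coerced to factors, an integer-coded column (`treatment_group` triggers the same message in the base model) is treated as a continuous variable, and some factor levels contain characters other than letters, numbers, '_' and '.'. Below is a minimal sketch of one way to tidy this before building the DESeqDataSet. The sample sheet values and the helper `tidy_factor` are hypothetical and are not taken from the study data.

```r
# Hypothetical sample sheet; values are made up for illustration only
ss_demo <- data.frame(
  treatment_group = c(1L, 2L, 1L, 2L),
  ethnicityCAT = c("Group A", "Group B/C", "Group A", "Group B/C"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: convert to a factor and make the levels syntactically safe
tidy_factor <- function(x) {
  f <- factor(x)
  levels(f) <- make.names(levels(f), unique = TRUE)
  f
}

ss_demo$treatment_group <- tidy_factor(ss_demo$treatment_group)
ss_demo$ethnicityCAT <- tidy_factor(ss_demo$ethnicityCAT)
str(ss_demo)
```

The first factor level becomes the reference level, so it may be worth checking it with `levels()` (and changing it with `relevel()` if needed) before interpreting the sign of the log2 fold changes.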
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 14 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
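The convergence message above suggests a larger `maxit` for `nbinomWaldTest`. The following sketch shows how the individual steps that `DESeq()` wraps can be run separately so that `maxit` can be raised. It uses the simulated data set from `makeExampleDESeqDataSet()` as a stand-in for the study data, and the value 500 is an arbitrary choice. Note that running the steps separately omits the automatic outlier replacement that `DESeq()` can perform.

```r
library("DESeq2")

# Simulated data set shipped with DESeq2, used here only as a stand-in
set.seed(1)
dds_demo <- makeExampleDESeqDataSet(n = 1000, m = 8)

# Run the steps wrapped by DESeq() one at a time so that maxit can be raised
dds_demo <- estimateSizeFactors(dds_demo)
dds_demo <- estimateDispersions(dds_demo)
dds_demo <- nbinomWaldTest(dds_demo, maxit = 500)

# Tabulate convergence status per row (NA for rows that were not fitted)
table(mcols(dds_demo)$betaConv, useNA = "ifany")
```

Rows that still fail to converge can be identified with `which(!mcols(dds_demo)$betaConv)` and inspected or excluded before interpreting the results.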
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000123838.11 C4BPA 33.66576 2.4906731 0.51109471 4.873212
## ENSG00000131845.15 ZNF304 418.79903 0.2874110 0.06121095 4.695418
## ENSG00000277632.2 CCL3 559.74862 -1.2496211 0.26775579 -4.667018
## ENSG00000179593.16 ALOX15B 250.75091 -1.0955858 0.23730414 -4.616800
## ENSG00000122644.13 ARL4A 383.47521 -0.6835449 0.15051405 -4.541403
## ENSG00000229807.13 XIST 10626.92285 -1.8128259 0.41367719 -4.382223
## ENSG00000162599.17 NFIA 232.38832 -0.3571616 0.08172420 -4.370328
## ENSG00000276085.1 CCL3L1 308.64747 -1.2489298 0.29873614 -4.180712
## ENSG00000079215.15 SLC1A3 219.79787 -0.9048776 0.21710493 -4.167927
## ENSG00000115306.16 SPTBN1 3986.49830 0.2464913 0.06133022 4.019084
## pvalue padj
## ENSG00000123838.11 C4BPA 1.097980e-06 0.01722679
## ENSG00000131845.15 ZNF304 2.660615e-06 0.01722679
## ENSG00000277632.2 CCL3 3.056028e-06 0.01722679
## ENSG00000179593.16 ALOX15B 3.897024e-06 0.01722679
## ENSG00000122644.13 ARL4A 5.588117e-06 0.01976182
## ENSG00000229807.13 XIST 1.174742e-05 0.03133750
## ENSG00000162599.17 NFIA 1.240598e-05 0.03133750
## ENSG00000276085.1 CCL3L1 2.905975e-05 0.06039029
## ENSG00000079215.15 SLC1A3 3.073818e-05 0.06039029
## ENSG00000115306.16 SPTBN1 5.842491e-05 0.09853338
mean(abs(dge$stat))
## [1] 0.8180539
avb_t0 <- dge
# model with clinical and cell-type covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + treatment_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
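The message above recommends centring and scaling numeric design variables with a large mean or standard deviation. A minimal sketch of how that could be done with base R's `scale()` follows. The covariate values are made up, and which columns need scaling in the real sample sheet is an assumption to be checked.

```r
# Hypothetical covariates; values are made up for illustration only
ss_demo <- data.frame(
  Monocytes.C = c(12.1, 8.4, 15.3, 10.2),
  Neutrophils.LD = c(55.0, 61.2, 48.7, 58.1)
)

num_cols <- c("Monocytes.C", "Neutrophils.LD")

# scale() returns a one-column matrix, so convert back to a plain numeric vector
ss_demo[num_cols] <- lapply(ss_demo[num_cols], function(x) as.numeric(scale(x)))

# After scaling, each column has mean 0 and standard deviation 1
round(colMeans(ss_demo[num_cols]), 10)
sapply(ss_demo[num_cols], sd)
```

Scaling changes the units of the covariate coefficients (per standard deviation instead of per raw unit) but not the interpretation of the `treatment_group` coefficient.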
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 19 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000169429.11 CXCL8 1371.70225 -2.0614320 0.36621890 -5.628961
## ENSG00000131845.15 ZNF304 418.79903 0.2742914 0.05932004 4.623924
## ENSG00000234665.9 LERFS 41.30684 -1.7954846 0.40114817 -4.475864
## ENSG00000122644.13 ARL4A 383.47521 -0.6387885 0.14587971 -4.378872
## ENSG00000104361.10 NIPAL2 763.75731 -0.3251850 0.07817291 -4.159817
## ENSG00000276085.1 CCL3L1 308.64747 -1.2043196 0.29535827 -4.077488
## ENSG00000162599.17 NFIA 232.38832 -0.3313865 0.08168434 -4.056915
## ENSG00000256128.6 LINC00944 120.56609 -0.4234680 0.10605135 -3.993047
## ENSG00000166394.15 CYB5R2 33.12472 -0.5263833 0.13248031 -3.973294
## ENSG00000115306.16 SPTBN1 3986.49830 0.1888577 0.04792220 3.940922
## pvalue padj
## ENSG00000169429.11 CXCL8 1.812982e-08 0.0003976776
## ENSG00000131845.15 ZNF304 3.765473e-06 0.0412978268
## ENSG00000234665.9 LERFS 7.610294e-06 0.0556439312
## ENSG00000122644.13 ARL4A 1.192954e-05 0.0654186306
## ENSG00000104361.10 NIPAL2 3.185028e-05 0.1397271679
## ENSG00000276085.1 CCL3L1 4.552497e-05 0.1558172013
## ENSG00000162599.17 NFIA 4.972512e-05 0.1558172013
## ENSG00000256128.6 LINC00944 6.522973e-05 0.1618763170
## ENSG00000166394.15 CYB5R2 7.088532e-05 0.1618763170
## ENSG00000115306.16 SPTBN1 8.116910e-05 0.1618763170
mean(abs(dge$stat))
## [1] 0.8369
avb_t0_adj <- dge
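With the covariate-adjusted tables saved, it can be useful to check how many genes pass an FDR threshold in each model and how many are shared. The sketch below uses two small made-up tables. The helper `sig_genes` and the 0.05 threshold are assumptions for illustration, not part of the original analysis.

```r
# Hypothetical result tables; gene names and values are made up
dge_a <- data.frame(padj = c(0.001, 0.02, 0.30, 0.04),
                    row.names = c("gene1", "gene2", "gene3", "gene4"))
dge_b <- data.frame(padj = c(0.01, 0.20, 0.03, NA),
                    row.names = c("gene1", "gene2", "gene3", "gene4"))

# Hypothetical helper: names of genes passing an FDR threshold
# which() drops NA padj values (for example, genes removed by filtering)
sig_genes <- function(dge, fdr = 0.05) {
  rownames(dge)[which(dge$padj < fdr)]
}

sig_a <- sig_genes(dge_a)
sig_b <- sig_genes(dge_b)

c(model_a = length(sig_a), model_b = length(sig_b),
  shared = length(intersect(sig_a, sig_b)))
```

Applied to the objects in this report, the same call would be `sig_genes(avb_t0)` and `sig_genes(avb_t0_adj)`.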
# EOS samples
mx <- xeosf
ss2 <- as.data.frame(cbind(ss_eos,sscell_eos))
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ treatment_group )
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 129 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000164056.11 SPRY1 165.16703 -2.6495876 0.15023175 -17.63667
## ENSG00000141744.4 PNMT 87.09740 -3.6453219 0.21308129 -17.10766
## ENSG00000048740.18 CELF2 14860.56325 -0.8538206 0.06266085 -13.62606
## ENSG00000279359.1 RP11-36D19.9 103.42684 -3.9721070 0.31107434 -12.76900
## ENSG00000179593.16 ALOX15B 847.15559 -3.1265685 0.24545606 -12.73779
## ENSG00000057294.16 PKP2 172.03315 -2.2928606 0.18394552 -12.46489
## ENSG00000064300.9 NGFR 61.81062 -2.2893270 0.18445275 -12.41145
## ENSG00000196935.9 SRGAP1 329.84060 -1.7220092 0.14799455 -11.63563
## ENSG00000272870.3 SAP30-DT 136.77299 -0.7521312 0.06569743 -11.44841
## ENSG00000145990.11 GFOD1 1933.33604 -1.2571490 0.11028762 -11.39882
## pvalue padj
## ENSG00000164056.11 SPRY1 1.288377e-69 2.843061e-65
## ENSG00000141744.4 PNMT 1.301255e-65 1.435740e-61
## ENSG00000048740.18 CELF2 2.802994e-42 2.061789e-38
## ENSG00000279359.1 RP11-36D19.9 2.442852e-37 1.347661e-33
## ENSG00000179593.16 ALOX15B 3.645637e-37 1.608966e-33
## ENSG00000057294.16 PKP2 1.160326e-35 4.267485e-32
## ENSG00000064300.9 NGFR 2.265006e-35 7.140270e-32
## ENSG00000196935.9 SRGAP1 2.715924e-31 7.491537e-28
## ENSG00000272870.3 SAP30-DT 2.394942e-30 5.872132e-27
## ENSG00000145990.11 GFOD1 4.238097e-30 9.352208e-27
mean(abs(dge$stat))
## [1] 1.492199
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + treatment_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 9 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000164056.11 SPRY1 165.16703 -2.7059237 0.15638312 -17.30317
## ENSG00000141744.4 PNMT 87.09740 -3.5010599 0.21937506 -15.95924
## ENSG00000279359.1 RP11-36D19.9 103.42684 -4.2902517 0.32298993 -13.28293
## ENSG00000179593.16 ALOX15B 847.15559 -3.2375724 0.25327794 -12.78269
## ENSG00000196935.9 SRGAP1 329.84060 -1.8289813 0.14310723 -12.78050
## ENSG00000048740.18 CELF2 14860.56325 -0.8384829 0.06660418 -12.58904
## ENSG00000057294.16 PKP2 172.03315 -2.2412930 0.18649597 -12.01792
## ENSG00000064300.9 NGFR 61.81062 -2.3210963 0.19383753 -11.97444
## ENSG00000272870.3 SAP30-DT 136.77299 -0.7544104 0.06880368 -10.96468
## ENSG00000145990.11 GFOD1 1933.33604 -1.2370843 0.11592467 -10.67145
## pvalue padj
## ENSG00000164056.11 SPRY1 4.452072e-67 9.824386e-63
## ENSG00000141744.4 PNMT 2.456924e-57 2.710847e-53
## ENSG00000279359.1 RP11-36D19.9 2.908007e-40 2.139033e-36
## ENSG00000179593.16 ALOX15B 2.048692e-37 9.300053e-34
## ENSG00000196935.9 SRGAP1 2.107231e-37 9.300053e-34
## ENSG00000048740.18 CELF2 2.425963e-36 8.922286e-33
## ENSG00000057294.16 PKP2 2.860952e-33 9.018946e-30
## ENSG00000064300.9 NGFR 4.836794e-33 1.334169e-29
## ENSG00000272870.3 SAP30-DT 5.649914e-28 1.385296e-24
## ENSG00000145990.11 GFOD1 1.384500e-26 3.055175e-23
mean(abs(dge$stat))
## [1] 1.414198
avb_eos <- dge
# model with clinical and cell-type covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + treatment_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 12 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000164056.11 SPRY1 165.1670 -2.7923982 0.17387073 -16.06020
## ENSG00000179593.16 ALOX15B 847.1556 -3.8150687 0.26013490 -14.66573
## ENSG00000198585.12 NUDT16 5301.4426 -1.3105279 0.09401215 -13.93998
## ENSG00000135678.12 CPM 575.4329 -1.7177297 0.12683010 -13.54355
## ENSG00000111666.11 CHPT1 1197.7668 -1.0307176 0.07716078 -13.35805
## ENSG00000279359.1 RP11-36D19.9 103.4268 -4.1756341 0.32422360 -12.87887
## ENSG00000141744.4 PNMT 87.0974 -3.0318991 0.23852957 -12.71079
## ENSG00000177575.13 CD163 25120.8560 -2.0894785 0.16631212 -12.56360
## ENSG00000171105.14 INSR 1211.6882 -1.3065370 0.10720954 -12.18676
## ENSG00000136478.8 TEX2 944.3836 -0.9718897 0.08209004 -11.83931
## pvalue padj
## ENSG00000164056.11 SPRY1 4.850104e-58 1.070273e-53
## ENSG00000179593.16 ALOX15B 1.068588e-48 1.179026e-44
## ENSG00000198585.12 NUDT16 3.620213e-44 2.662908e-40
## ENSG00000135678.12 CPM 8.650568e-42 4.772302e-38
## ENSG00000111666.11 CHPT1 1.063116e-40 4.691957e-37
## ENSG00000279359.1 RP11-36D19.9 5.919469e-38 2.177082e-34
## ENSG00000141744.4 PNMT 5.151169e-37 1.623869e-33
## ENSG00000177575.13 CD163 3.347600e-36 9.233935e-33
## ENSG00000171105.14 INSR 3.656660e-34 8.965724e-31
## ENSG00000136478.8 TEX2 2.444439e-32 5.394144e-29
mean(abs(dge$stat))
## [1] 1.213911
avb_eos_adj <- dge
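The same three designs (base, clinical covariates, clinical plus cell-type covariates) are fitted for each set of samples. One way to avoid retyping them is to build the formulas once with base R's `reformulate()`. The sketch below uses the covariate names that appear in the design formulas of this report. The list name `designs` is an assumption for illustration.

```r
# Covariate names as used in the design formulas of this report
clin <- c("sexD", "wound_typeOP", "duration_sx", "ethnicityCAT", "ageCS")
cell <- c("Monocytes.C", "NK", "T.CD8.Memory", "T.CD4.Naive", "Neutrophils.LD")

# Build one-sided formulas; treatment_group is kept last so that it is the
# coefficient reported by results() by default
designs <- list(
  base = reformulate("treatment_group"),
  clinical = reformulate(c(clin, "treatment_group")),
  clinical_cell = reformulate(c(clin, cell, "treatment_group"))
)

designs$clinical
```

Each element can then be passed as the `design` argument, for example `design = designs$clinical_cell`.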
# POD1 samples
mx <- xpod1f
ss2 <- as.data.frame(cbind(ss_pod1,sscell_pod1))
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ treatment_group )
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 253 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000186081.12 KRT5 14.54120 2.1890736 0.27368008 7.998659
## ENSG00000115414.21 FN1 184.93708 -1.5590144 0.21045312 -7.407894
## ENSG00000155659.15 VSIG4 1506.08672 -2.0082041 0.28041210 -7.161617
## ENSG00000149534.9 MS4A2 83.83329 1.4807287 0.21267882 6.962276
## ENSG00000154269.15 ENPP3 37.09384 1.2813396 0.18714544 6.846758
## ENSG00000131016.17 AKAP12 103.10713 1.4235129 0.21406789 6.649820
## ENSG00000259162.1 RP11-203M5.6 24.74288 1.3334574 0.20408383 6.533871
## ENSG00000140287.11 HDC 496.98643 1.4313408 0.22105777 6.474963
## ENSG00000179348.12 GATA2 687.51559 1.2570611 0.19461993 6.459056
## ENSG00000163050.18 COQ8A 2009.95700 -0.3111471 0.04862536 -6.398865
## pvalue padj
## ENSG00000186081.12 KRT5 1.257816e-15 2.680784e-11
## ENSG00000115414.21 FN1 1.283207e-13 1.367449e-09
## ENSG00000155659.15 VSIG4 7.973092e-13 5.664350e-09
## ENSG00000149534.9 MS4A2 3.348181e-12 1.783995e-08
## ENSG00000154269.15 ENPP3 7.554214e-12 3.220059e-08
## ENSG00000131016.17 AKAP12 2.934514e-11 1.042388e-07
## ENSG00000259162.1 RP11-203M5.6 6.409120e-11 1.951394e-07
## ENSG00000140287.11 HDC 9.483534e-11 2.494996e-07
## ENSG00000179348.12 GATA2 1.053581e-10 2.494996e-07
## ENSG00000163050.18 COQ8A 1.565362e-10 3.336256e-07
mean(abs(dge$stat))
## [1] 1.083107
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + treatment_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 19 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000186081.12 KRT5 14.54120 2.189179 0.2813120 7.782032
## ENSG00000155659.15 VSIG4 1506.08672 -2.046371 0.2808756 -7.285686
## ENSG00000149534.9 MS4A2 83.83329 1.501000 0.2091159 7.177839
## ENSG00000259162.1 RP11-203M5.6 24.74288 1.419827 0.1981389 7.165817
## ENSG00000229961.3 RP11-71G12.1 66.93635 1.279558 0.1908412 6.704834
## ENSG00000154269.15 ENPP3 37.09384 1.266610 0.1901442 6.661315
## ENSG00000131016.17 AKAP12 103.10713 1.405739 0.2170297 6.477170
## ENSG00000179348.12 GATA2 687.51559 1.268771 0.1991819 6.369909
## ENSG00000115414.21 FN1 184.93708 -1.183497 0.1874768 -6.312766
## ENSG00000140287.11 HDC 496.98643 1.423136 0.2259646 6.298047
## pvalue padj
## ENSG00000186081.12 KRT5 7.136868e-15 1.521081e-10
## ENSG00000155659.15 VSIG4 3.200364e-13 3.410468e-09
## ENSG00000149534.9 MS4A2 7.082214e-13 4.120009e-09
## ENSG00000259162.1 RP11-203M5.6 7.732388e-13 4.120009e-09
## ENSG00000229961.3 RP11-71G12.1 2.016353e-11 8.594904e-08
## ENSG00000154269.15 ENPP3 2.713883e-11 9.640164e-08
## ENSG00000131016.17 AKAP12 9.345891e-11 2.845557e-07
## ENSG00000179348.12 GATA2 1.891400e-10 5.038926e-07
## ENSG00000115414.21 FN1 2.740912e-10 6.424139e-07
## ENSG00000140287.11 HDC 3.014188e-10 6.424139e-07
mean(abs(dge$stat))
## [1] 0.9476336
avb_pod1 <- dge
# model with clinical and cell-type covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + treatment_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 7 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000186081.12 KRT5 14.54120 2.2389705 0.28501584 7.855600
## ENSG00000155659.15 VSIG4 1506.08672 -1.8413863 0.23966497 -7.683168
## ENSG00000259162.1 RP11-203M5.6 24.74288 1.4492545 0.19622316 7.385746
## ENSG00000229961.3 RP11-71G12.1 66.93635 1.3530051 0.19043462 7.104828
## ENSG00000149534.9 MS4A2 83.83329 1.4271747 0.20531024 6.951308
## ENSG00000105426.17 PTPRS 203.16124 0.8200685 0.12225141 6.708049
## ENSG00000131016.17 AKAP12 103.10713 1.4256435 0.21310542 6.689851
## ENSG00000154269.15 ENPP3 37.09384 1.1882697 0.18396236 6.459309
## ENSG00000135218.19 CD36 11489.98755 0.4990621 0.07877094 6.335612
## ENSG00000070915.10 SLC12A3 21.89994 1.1148229 0.17749304 6.280938
## pvalue padj
## ENSG00000186081.12 KRT5 3.978614e-15 8.479620e-11
## ENSG00000155659.15 VSIG4 1.552015e-14 1.653905e-10
## ENSG00000259162.1 RP11-203M5.6 1.516008e-13 1.077022e-09
## ENSG00000229961.3 RP11-71G12.1 1.204729e-12 6.419095e-09
## ENSG00000149534.9 MS4A2 3.619152e-12 1.542700e-08
## ENSG00000105426.17 PTPRS 1.972438e-11 6.801829e-08
## ENSG00000131016.17 AKAP12 2.233979e-11 6.801829e-08
## ENSG00000154269.15 ENPP3 1.051821e-10 2.802183e-07
## ENSG00000135218.19 CD36 2.364016e-10 5.598251e-07
## ENSG00000070915.10 SLC12A3 3.365353e-10 7.172577e-07
mean(abs(dge$stat))
## [1] 1.031521
avb_pod1_adj <- dge
# T0 samples, restricted to crp_group == 1
mx <- xt0f
ss2 <- as.data.frame(cbind(ss_t0,sscell_t0))
ss2 <- subset(ss2,crp_group==1)
mx <- mx[,colnames(mx) %in% rownames(ss2)]
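The `%in%` filter above keeps the count columns whose names appear in the sample sheet, but it does not reorder them and does not drop sample-sheet rows that have no counts. DESeq2 expects the columns of the count matrix and the rows of the sample sheet to describe the same samples in the same order. Earlier steps in this report may already guarantee that, but a defensive version could look like the sketch below. The objects `mx_demo` and `ss_demo` are made-up stand-ins.

```r
# Hypothetical count matrix and sample sheet; values are made up
mx_demo <- matrix(1:12, nrow = 3,
                  dimnames = list(c("g1", "g2", "g3"), c("s3", "s1", "s2", "s4")))
ss_demo <- data.frame(treatment_group = c(1, 2, 1),
                      row.names = c("s1", "s2", "s5"))

# Keep only samples present in both objects, in sample-sheet order
common <- intersect(rownames(ss_demo), colnames(mx_demo))
mx_demo <- mx_demo[, common, drop = FALSE]
ss_demo <- ss_demo[common, , drop = FALSE]

# Fail early if the two objects are not aligned
stopifnot(identical(colnames(mx_demo), rownames(ss_demo)))
```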
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ treatment_group )
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 480 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000279359.1 RP11-36D19.9 39.871182 -3.3255603 0.6426406 -5.174837
## ENSG00000141744.4 PNMT 51.670695 -2.1905424 0.4563927 -4.799687
## ENSG00000204936.10 CD177 229.544285 -2.5908627 0.5761080 -4.497182
## ENSG00000169429.11 CXCL8 901.340924 -2.4617931 0.5620815 -4.379780
## ENSG00000179593.16 ALOX15B 395.592390 -2.3562217 0.5403176 -4.360808
## ENSG00000122644.13 ARL4A 438.849245 -0.9185857 0.2291829 -4.008090
## ENSG00000115155.19 OTOF 120.892044 1.2138657 0.3046429 3.984553
## ENSG00000258471.2 RP11-84C10.4 14.882458 0.8381249 0.2161208 3.878039
## ENSG00000253230.9 MIR124-1HG 8.779545 -3.3333199 0.8661717 -3.848336
## ENSG00000079215.15 SLC1A3 309.537268 -2.0483519 0.5335749 -3.838921
## pvalue padj
## ENSG00000279359.1 RP11-36D19.9 2.281100e-07 0.005003594
## ENSG00000141744.4 PNMT 1.589136e-06 0.017428844
## ENSG00000204936.10 CD177 6.885989e-06 0.050348057
## ENSG00000169429.11 CXCL8 1.187994e-05 0.056847985
## ENSG00000179593.16 ALOX15B 1.295828e-05 0.056847985
## ENSG00000122644.13 ARL4A 6.121173e-05 0.211852584
## ENSG00000115155.19 OTOF 6.760739e-05 0.211852584
## ENSG00000258471.2 RP11-84C10.4 1.053019e-04 0.257544790
## ENSG00000253230.9 MIR124-1HG 1.189227e-04 0.257544790
## ENSG00000079215.15 SLC1A3 1.235763e-04 0.257544790
mean(abs(dge$stat))
## [1] 0.7426341
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + treatment_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 14 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
# rank genes by p-value; columns 1-6 hold the DESeq2 result statistics
dge <- zz[order(zz$pvalue),]
head(dge[,1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000123838.11 C4BPA 56.27087 4.1714068 0.7549693 5.525267
## ENSG00000115155.19 OTOF 120.89204 1.3340109 0.2995082 4.454005
## ENSG00000258471.2 RP11-84C10.4 14.88246 0.9535482 0.2237348 4.261957
## ENSG00000234665.9 LERFS 57.91435 -2.2458162 0.5334487 -4.209995
## ENSG00000119922.11 IFIT2 1771.40268 1.5450687 0.3900461 3.961246
## ENSG00000126262.5 FFAR2 1188.14797 1.9522596 0.5036095 3.876534
## ENSG00000185745.10 IFIT1 822.12911 1.4987801 0.3868363 3.874456
## ENSG00000215630.6 GUSBP9 203.39614 -0.6165686 0.1653777 -3.728245
## ENSG00000119917.15 IFIT3 1355.72352 1.3841010 0.3721027 3.719675
## ENSG00000287095.1 CTC-215C12.2 51.19664 0.7011972 0.1942702 3.609391
## pvalue padj
## ENSG00000123838.11 C4BPA 3.289854e-08 0.0007216295
## ENSG00000115155.19 OTOF 8.428321e-06 0.0924376158
## ENSG00000258471.2 RP11-84C10.4 2.026441e-05 0.1400421942
## ENSG00000234665.9 LERFS 2.553767e-05 0.1400421942
## ENSG00000119922.11 IFIT2 7.455966e-05 0.3270932241
## ENSG00000126262.5 FFAR2 1.059549e-04 0.3348632016
## ENSG00000185745.10 IFIT1 1.068631e-04 0.3348632016
## ENSG00000215630.6 GUSBP9 1.928181e-04 0.4861753379
## ENSG00000119917.15 IFIT3 1.994793e-04 0.4861753379
## ENSG00000287095.1 CTC-215C12.2 3.069169e-04 0.5721559718
mean(abs(dge$stat))
## [1] 0.7451909
avb_crplo_t0 <- dge
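The "rows did not converge in beta" message above suggests rerunning the Wald test with a larger `maxit`. `DESeq()` has no `maxit` argument, so one option described in the DESeq2 documentation is to run its three steps separately. The sketch below uses simulated counts from `makeExampleDESeqDataSet()` rather than the study data, and `maxit = 500` is an arbitrary choice. Unlike `DESeq()`, this stepwise form does not perform the outlier replacement and refit.

```r
# Sketch only: stepwise DESeq2 fit with more Wald-test iterations.
# Runs only if DESeq2 is installed; uses simulated data, not the study counts.
if (requireNamespace("DESeq2", quietly = TRUE)) {
  suppressPackageStartupMessages(library(DESeq2))
  set.seed(1)
  dds_demo <- makeExampleDESeqDataSet()            # simulated counts
  dds_demo <- estimateSizeFactors(dds_demo)
  dds_demo <- estimateDispersions(dds_demo, quiet = TRUE)
  dds_demo <- nbinomWaldTest(dds_demo, maxit = 500, quiet = TRUE)
  # betaConv records, per gene, whether the coefficient estimates converged
  print(table(mcols(dds_demo)$betaConv, useNA = "ifany"))
}
```

Rows that still fail to converge can be excluded afterwards by filtering on `mcols(dds)$betaConv`.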
# model with clinical and cell covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + treatment_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 34 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- zz[order(zz$pvalue),]
head(dge[,1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000074803.20 SLC12A1 84.81635 -2.4537886 0.56636642 -4.332511
## ENSG00000198794.12 SCAMP5 140.06582 0.6687557 0.16129243 4.146231
## ENSG00000258471.2 RP11-84C10.4 14.88246 0.9213004 0.22556852 4.084348
## ENSG00000146426.19 TIAM2 156.83010 0.2658418 0.06670186 3.985523
## ENSG00000165029.17 ABCA1 721.16202 0.4761514 0.12104587 3.933644
## ENSG00000177191.2 B3GNT8 337.40623 -0.4161721 0.10764360 -3.866204
## ENSG00000215630.6 GUSBP9 203.39614 -0.5856877 0.15250841 -3.840363
## ENSG00000125384.7 PTGER2 3444.28853 -0.3095152 0.08088605 -3.826559
## ENSG00000115155.19 OTOF 120.89204 1.1760123 0.31810878 3.696887
## ENSG00000234665.9 LERFS 57.91435 -1.9741234 0.53705027 -3.675863
## pvalue padj
## ENSG00000074803.20 SLC12A1 1.474184e-05 0.3231808
## ENSG00000198794.12 SCAMP5 3.379930e-05 0.3231808
## ENSG00000258471.2 RP11-84C10.4 4.420071e-05 0.3231808
## ENSG00000146426.19 TIAM2 6.733154e-05 0.3562985
## ENSG00000165029.17 ABCA1 8.366765e-05 0.3562985
## ENSG00000177191.2 B3GNT8 1.105425e-04 0.3562985
## ENSG00000215630.6 GUSBP9 1.228525e-04 0.3562985
## ENSG00000125384.7 PTGER2 1.299470e-04 0.3562985
## ENSG00000115155.19 OTOF 2.182591e-04 0.5031082
## ENSG00000234665.9 LERFS 2.370464e-04 0.5031082
mean(abs(dge$stat))
## [1] 0.7355129
avb_crplo_t0_adj <- dge
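The DESeq2 message above recommends centring and scaling numeric design variables to help the GLM converge. A minimal base-R sketch follows, using a small made-up sample sheet. The column names mirror the design formula, but the values, and which columns are continuous in the real `ss2`, are assumptions.

```r
# Toy sample sheet: values are invented for illustration only.
ss_demo <- data.frame(
  duration_sx     = c(120, 95, 180, 60, 150, 110),
  Monocytes.C     = c(8.2, 11.5, 6.9, 14.1, 9.8, 10.3),
  Neutrophils.LD  = c(55.1, 62.4, 48.7, 70.2, 58.9, 51.6),
  treatment_group = c(1, 2, 1, 2, 1, 2)
)

# Columns assumed to be continuous covariates (hypothetical selection).
num_cols <- c("duration_sx", "Monocytes.C", "Neutrophils.LD")

# scale() centres each column to mean 0 and rescales it to SD 1;
# as.numeric() drops the matrix attributes that scale() returns.
ss_scaled <- ss_demo
ss_scaled[num_cols] <- lapply(ss_demo[num_cols], function(v) as.numeric(scale(v)))

round(colMeans(ss_scaled[num_cols]), 10)
sapply(ss_scaled[num_cols], sd)
```

Scaling changes the units of each covariate's coefficient (per standard deviation rather than per original unit); in principle it should not alter the test on `treatment_group`, only the numerical stability of the fit.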
# T0 samples, high-CRP subgroup (crp_group == 4)
mx <- xt0f
ss2 <- as.data.frame(cbind(ss_t0, sscell_t0))
ss2 <- subset(ss2, crp_group == 4)
mx <- mx[, colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ treatment_group )
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 280 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- zz[order(zz$pvalue),]
head(dge[,1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000211652.2 IGLV7-43 57.72448 2.1239087 0.4170212 5.093048
## ENSG00000276085.1 CCL3L1 255.05022 -2.1290046 0.4413594 -4.823744
## ENSG00000263711.6 LINC02864 15.37815 -1.0912344 0.2360114 -4.623651
## ENSG00000211655.3 IGLV1-36 15.01384 1.7966484 0.3932588 4.568616
## ENSG00000278920.1 RP3-412A9.17 103.00331 -0.4584785 0.1045193 -4.386546
## ENSG00000211644.3 IGLV1-51 213.01635 1.6268219 0.3714944 4.379130
## ENSG00000211640.4 IGLV6-57 85.08443 1.7610091 0.4123498 4.270669
## ENSG00000211673.2 IGLV3-1 232.41139 1.6806179 0.3957575 4.246585
## ENSG00000211659.2 IGLV3-25 147.02115 1.3601124 0.3264996 4.165740
## ENSG00000203999.9 LINC01270 100.96329 -0.9138279 0.2252296 -4.057317
## pvalue padj
## ENSG00000211652.2 IGLV7-43 3.523525e-07 0.007728851
## ENSG00000276085.1 CCL3L1 1.408886e-06 0.015451954
## ENSG00000263711.6 LINC02864 3.770448e-06 0.026922780
## ENSG00000211655.3 IGLV1-36 4.909556e-06 0.026922780
## ENSG00000278920.1 RP3-412A9.17 1.151649e-05 0.043560797
## ENSG00000211644.3 IGLV1-51 1.191542e-05 0.043560797
## ENSG00000211640.4 IGLV6-57 1.948878e-05 0.059513437
## ENSG00000211673.2 IGLV3-1 2.170538e-05 0.059513437
## ENSG00000211659.2 IGLV3-25 3.103448e-05 0.075637926
## ENSG00000203999.9 LINC01270 4.963970e-05 0.108884692
mean(abs(dge$stat))
## [1] 0.9391897
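The "did you mean for this to be a factor?" message appears even for the base model, which suggests that `treatment_group` is stored as an integer-valued numeric. If there are only two groups coded one unit apart, the fitted contrast is equivalent either way, but an explicit factor makes the reference level visible. The sketch below assumes two groups coded 1 and 2 and uses hypothetical labels `A` and `B`; the real coding is not shown in this report.

```r
# Hypothetical coding: 1 = group A, 2 = group B (an assumption).
treatment_group <- c(1, 2, 2, 1, 2, 1)

# Explicit levels fix the reference group (the first level) for the contrast.
treatment_factor <- factor(treatment_group, levels = c(1, 2), labels = c("A", "B"))

levels(treatment_factor)  # the first level is the reference
table(treatment_factor)
```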
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + treatment_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 7 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- zz[order(zz$pvalue),]
head(dge[,1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000274611.4 TBC1D3 75.11060 28.5191889 3.4788629 8.197848
## ENSG00000225630.1 MTND2P28 130.06123 -2.3686307 0.4535022 -5.222975
## ENSG00000276085.1 CCL3L1 255.05022 -2.3116418 0.4587641 -5.038847
## ENSG00000211652.2 IGLV7-43 57.72448 2.0854416 0.4313195 4.835027
## ENSG00000278920.1 RP3-412A9.17 103.00331 -0.5045173 0.1058888 -4.764595
## ENSG00000263711.6 LINC02864 15.37815 -1.0913346 0.2490844 -4.381385
## ENSG00000211935.3 IGHV1-3 103.80490 1.9185937 0.4471155 4.291047
## ENSG00000272763.1 RP11-357H14.17 12.74908 -1.7976987 0.4292167 -4.188324
## ENSG00000248801.7 C8orf34-AS1 64.73886 -0.4796651 0.1151577 -4.165288
## ENSG00000211655.3 IGLV1-36 15.01384 1.5932344 0.3874480 4.112124
## pvalue padj
## ENSG00000274611.4 TBC1D3 2.447286e-16 5.368122e-12
## ENSG00000225630.1 MTND2P28 1.760713e-07 1.931062e-03
## ENSG00000276085.1 CCL3L1 4.683455e-07 3.424386e-03
## ENSG00000211652.2 IGLV7-43 1.331273e-06 7.300367e-03
## ENSG00000278920.1 RP3-412A9.17 1.892338e-06 8.301685e-03
## ENSG00000263711.6 LINC02864 1.179272e-05 4.311222e-02
## ENSG00000211935.3 IGHV1-3 1.778326e-05 5.572511e-02
## ENSG00000272763.1 RP11-357H14.17 2.810225e-05 7.578767e-02
## ENSG00000248801.7 C8orf34-AS1 3.109592e-05 7.578767e-02
## ENSG00000211655.3 IGLV1-36 3.920355e-05 8.599298e-02
mean(abs(dge$stat))
## [1] 0.9639989
avb_crphi_t0 <- dge
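TBC1D3 tops this table with a log2 fold change of about 28.5, far outside the range of the other genes. Estimates this large can arise when counts are concentrated in very few samples, so they may merit a closer look before interpretation. The sketch below flags such rows; it reuses the first four rows of the table above, and the cutoff of 10 is an arbitrary assumption.

```r
# Values copied from the first four rows of the table above.
top <- data.frame(
  gene = c("TBC1D3", "MTND2P28", "CCL3L1", "IGLV7-43"),
  log2FoldChange = c(28.5191889, -2.3686307, -2.3116418, 2.0854416),
  lfcSE = c(3.4788629, 0.4535022, 0.4587641, 0.4313195)
)

lfc_cutoff <- 10  # arbitrary threshold, chosen only for illustration
flagged <- top[abs(top$log2FoldChange) > lfc_cutoff, ]
flagged$gene
```

For a flagged gene, plotting its normalised counts per sample (for example with DESeq2's `plotCounts()`) shows whether a handful of samples drive the estimate.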
# model with clinical and cell covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + treatment_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 40 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- zz[order(zz$pvalue),]
head(dge[,1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000278920.1 RP3-412A9.17 103.00331 -0.5815433 0.10376894 -5.604213
## ENSG00000184166.3 OR1D2 43.84299 -0.7606212 0.14772438 -5.148921
## ENSG00000130368.7 MAS1 36.87331 -0.6653173 0.13143995 -5.061759
## ENSG00000243290.3 IGKV1-12 66.84308 1.3286837 0.28374046 4.682743
## ENSG00000100304.13 TTLL12 1016.56775 0.3673839 0.07974214 4.607149
## ENSG00000248801.7 C8orf34-AS1 64.73886 -0.5485503 0.12079641 -4.541115
## ENSG00000261501.1 BBS7-DT 61.88604 -0.6923031 0.15272923 -4.532879
## ENSG00000287671.1 RP11-728E14.5 125.36742 -0.5274317 0.11684974 -4.513760
## ENSG00000142910.16 TINAGL1 22.44874 -0.6820575 0.15340449 -4.446138
## ENSG00000229321.2 AC008269.2 16.95875 -0.7855289 0.17956739 -4.374563
## pvalue padj
## ENSG00000278920.1 RP3-412A9.17 2.092028e-08 0.0004588863
## ENSG00000184166.3 OR1D2 2.619887e-07 0.0028733615
## ENSG00000130368.7 MAS1 4.154066e-07 0.0030373144
## ENSG00000243290.3 IGKV1-12 2.830609e-06 0.0155223525
## ENSG00000100304.13 TTLL12 4.082279e-06 0.0174625398
## ENSG00000248801.7 C8orf34-AS1 5.595761e-06 0.0174625398
## ENSG00000261501.1 BBS7-DT 5.818510e-06 0.0174625398
## ENSG00000287671.1 RP11-728E14.5 6.368831e-06 0.0174625398
## ENSG00000142910.16 TINAGL1 8.742790e-06 0.0213081227
## ENSG00000229321.2 AC008269.2 1.216759e-05 0.0248308670
mean(abs(dge$stat))
## [1] 1.015081
avb_crphi_t0_adj <- dge
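The recurring note about factor levels is triggered by values containing characters other than letters, numbers, `_` and `.`. One possible way to silence it is to clean the levels before building the DESeq2 object. The sketch below uses made-up level strings, since the actual values of `wound_typeOP` and `ethnicityCAT` are not shown here.

```r
# Made-up levels for illustration; the study's real values are not shown.
ethnicity_demo <- factor(c("Group 1", "Group-2", "Group 1", "Other/unknown"))

# Replace every character outside letters, digits, '_' and '.' with '_'.
clean_levels <- gsub("[^A-Za-z0-9_.]", "_", levels(ethnicity_demo))

# Guard against two distinct levels collapsing to the same cleaned name.
stopifnot(!anyDuplicated(clean_levels))
levels(ethnicity_demo) <- clean_levels
levels(ethnicity_demo)
```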
# EOS samples, low-CRP subgroup (crp_group == 1)
mx <- xeosf
ss2 <- as.data.frame(cbind(ss_eos, sscell_eos))
ss2 <- subset(ss2, crp_group == 1)
mx <- mx[, colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ treatment_group )
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 107 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- zz[order(zz$pvalue),]
head(dge[,1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000279359.1 RP11-36D19.9 119.04727 -5.1395504 0.37985433 -13.530320
## ENSG00000141744.4 PNMT 139.75155 -3.9629221 0.29974528 -13.220966
## ENSG00000164056.11 SPRY1 233.24893 -2.7293406 0.22123908 -12.336611
## ENSG00000101187.16 SLCO4A1 173.38666 -2.6729667 0.23016884 -11.613069
## ENSG00000145990.11 GFOD1 2388.12563 -1.6208396 0.15147358 -10.700478
## ENSG00000048740.18 CELF2 17256.47082 -0.9431284 0.08907268 -10.588301
## ENSG00000079215.15 SLC1A3 1233.66208 -3.3548344 0.32004337 -10.482437
## ENSG00000064300.9 NGFR 79.80579 -2.7214417 0.27282216 -9.975149
## ENSG00000057294.16 PKP2 259.99533 -2.5509565 0.28208476 -9.043227
## ENSG00000168807.16 SNTB2 1676.95276 -0.9053557 0.10490101 -8.630572
## pvalue padj
## ENSG00000279359.1 RP11-36D19.9 1.035724e-41 2.285429e-37
## ENSG00000141744.4 PNMT 6.640567e-40 7.326538e-36
## ENSG00000164056.11 SPRY1 5.752639e-35 4.231258e-31
## ENSG00000101187.16 SLCO4A1 3.536877e-31 1.951118e-27
## ENSG00000145990.11 GFOD1 1.012550e-26 4.468585e-23
## ENSG00000048740.18 CELF2 3.376656e-26 1.241821e-22
## ENSG00000079215.15 SLC1A3 1.040281e-25 3.279263e-22
## ENSG00000064300.9 NGFR 1.958070e-23 5.400845e-20
## ENSG00000057294.16 PKP2 1.521137e-19 3.729489e-16
## ENSG00000168807.16 SNTB2 6.104587e-18 1.347038e-14
mean(abs(dge$stat))
## [1] 1.272032
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + treatment_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 16 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- zz[order(zz$pvalue),]
head(dge[,1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000279359.1 RP11-36D19.9 119.04727 -5.3472996 0.39841130 -13.421556
## ENSG00000141744.4 PNMT 139.75155 -3.7766877 0.30856468 -12.239533
## ENSG00000164056.11 SPRY1 233.24893 -2.8225809 0.23140843 -12.197399
## ENSG00000079215.15 SLC1A3 1233.66208 -3.7771409 0.31036871 -12.169851
## ENSG00000101187.16 SLCO4A1 173.38666 -2.7558923 0.24132532 -11.419822
## ENSG00000048740.18 CELF2 17256.47082 -0.9616907 0.09345649 -10.290251
## ENSG00000145990.11 GFOD1 2388.12563 -1.5620788 0.15651131 -9.980613
## ENSG00000057294.16 PKP2 259.99533 -2.7302298 0.28018872 -9.744253
## ENSG00000064300.9 NGFR 79.80579 -2.5660575 0.28091181 -9.134744
## ENSG00000119138.5 KLF9 2597.08994 -1.0787187 0.11881849 -9.078710
## pvalue padj
## ENSG00000279359.1 RP11-36D19.9 4.521202e-41 9.976937e-37
## ENSG00000141744.4 PNMT 1.911285e-34 2.108816e-30
## ENSG00000164056.11 SPRY1 3.209131e-34 2.360530e-30
## ENSG00000079215.15 SLC1A3 4.499039e-34 2.482007e-30
## ENSG00000101187.16 SLCO4A1 3.329138e-30 1.469282e-26
## ENSG00000048740.18 CELF2 7.797324e-25 2.867726e-21
## ENSG00000145990.11 GFOD1 1.853188e-23 5.842043e-20
## ENSG00000057294.16 PKP2 1.952091e-22 5.384598e-19
## ENSG00000064300.9 NGFR 6.556155e-20 1.607496e-16
## ENSG00000119138.5 KLF9 1.098688e-19 2.424474e-16
mean(abs(dge$stat))
## [1] 1.214831
avb_crplo_eos <- dge
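Each block subsets the count matrix with `colnames(mx) %in% rownames(ss2)`, which keeps the matching columns but leaves them in the matrix's original order. Indexing both objects by their shared sample names makes the match explicit. A small sketch with toy objects (all names and values are invented):

```r
# Toy count matrix (genes x samples) and sample sheet; values are invented.
mx_demo <- matrix(1:12, nrow = 3,
                  dimnames = list(paste0("gene", 1:3), c("s1", "s2", "s3", "s4")))
ss_demo <- data.frame(crp_group = c(4, 1, 4), row.names = c("s3", "s2", "s1"))

# Keep the subgroup of interest, then align both objects on shared sample names.
ss_sub <- subset(ss_demo, crp_group == 4)
shared <- intersect(rownames(ss_sub), colnames(mx_demo))
ss_sub <- ss_sub[shared, , drop = FALSE]
mx_sub <- mx_demo[, shared, drop = FALSE]

# Columns of the counts and rows of the sample sheet now agree in order.
stopifnot(identical(colnames(mx_sub), rownames(ss_sub)))
colnames(mx_sub)
```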
# model with clinical and cell covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + treatment_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 36 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- zz[order(zz$pvalue),]
head(dge[,1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000279359.1 RP11-36D19.9 119.0473 -5.771853 0.4355807 -13.250941
## ENSG00000079215.15 SLC1A3 1233.6621 -4.437439 0.3485794 -12.730066
## ENSG00000164056.11 SPRY1 233.2489 -2.991515 0.2506529 -11.934892
## ENSG00000177575.13 CD163 22464.4744 -2.338915 0.2145180 -10.903116
## ENSG00000198363.18 ASPH 1869.6176 -1.762933 0.1692565 -10.415751
## ENSG00000101187.16 SLCO4A1 173.3867 -2.889167 0.2818293 -10.251477
## ENSG00000179593.16 ALOX15B 1208.5228 -4.514366 0.4436746 -10.174949
## ENSG00000174705.13 SH3PXD2B 449.8765 -2.978486 0.2929394 -10.167583
## ENSG00000111666.11 CHPT1 1299.4517 -1.180981 0.1223822 -9.649940
## ENSG00000135678.12 CPM 757.4190 -2.017817 0.2097880 -9.618365
## pvalue padj
## ENSG00000279359.1 RP11-36D19.9 4.455820e-40 9.832658e-36
## ENSG00000079215.15 SLC1A3 4.024988e-37 4.440971e-33
## ENSG00000164056.11 SPRY1 7.786033e-33 5.727146e-29
## ENSG00000177575.13 CD163 1.113768e-27 6.144379e-24
## ENSG00000198363.18 ASPH 2.101319e-25 9.273963e-22
## ENSG00000101187.16 SLCO4A1 1.165485e-24 4.286460e-21
## ENSG00000179593.16 ALOX15B 2.565331e-24 7.632111e-21
## ENSG00000174705.13 SH3PXD2B 2.766887e-24 7.632111e-21
## ENSG00000111666.11 CHPT1 4.918461e-22 1.205952e-18
## ENSG00000135678.12 CPM 6.688594e-22 1.475972e-18
mean(abs(dge$stat))
## [1] 1.063154
avb_crplo_eos_adj <- dge
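The `mean(abs(dge$stat))` value printed after each model gives a rough summary of the overall size of the test statistics under that design. Collecting the values side by side makes the three designs easier to compare. The numbers below are copied from the output above; the row labels follow the names of the saved objects, and the base-model value in the first row is assumed to belong to the same subset as the clinical model that follows it.

```r
# Mean absolute Wald statistic per model, copied from the printed output.
mean_abs_stat <- rbind(
  crplo_t0  = c(base = 0.7426341, clinical = 0.7451909, clinical_cell = 0.7355129),
  crphi_t0  = c(base = 0.9391897, clinical = 0.9639989, clinical_cell = 1.015081),
  crplo_eos = c(base = 1.272032,  clinical = 1.214831,  clinical_cell = 1.063154)
)

# Change relative to the base model within each subset
# (the vector is recycled down the rows, so each row uses its own base value).
delta_vs_base <- round(mean_abs_stat - mean_abs_stat[, "base"], 3)
delta_vs_base
```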
# EOS samples, high-CRP subgroup (crp_group == 4)
mx <- xeosf
ss2 <- as.data.frame(cbind(ss_eos, sscell_eos))
ss2 <- subset(ss2, crp_group == 4)
mx <- mx[, colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ treatment_group )
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 138 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- zz[order(zz$pvalue),]
head(dge[,1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000164056.11 SPRY1 108.76416 -2.532611 0.23899712 -10.596828
## ENSG00000141744.4 PNMT 43.59054 -3.129900 0.34146482 -9.166098
## ENSG00000179593.16 ALOX15B 547.79622 -3.162199 0.34916134 -9.056555
## ENSG00000279359.1 RP11-36D19.9 90.34984 -4.142272 0.49438312 -8.378669
## ENSG00000196935.9 SRGAP1 282.89286 -1.854628 0.22365011 -8.292544
## ENSG00000078053.17 AMPH 157.80806 -2.281000 0.27608001 -8.262098
## ENSG00000162599.17 NFIA 308.67237 -1.033152 0.12746020 -8.105685
## ENSG00000272870.3 SAP30-DT 121.11095 -0.790170 0.09793683 -8.068160
## ENSG00000110721.12 CHKA 869.95783 -1.224275 0.15266598 -8.019306
## ENSG00000121578.13 B4GALT4 865.64102 -1.070193 0.13932538 -7.681251
## pvalue padj
## ENSG00000164056.11 SPRY1 3.082572e-26 6.802313e-22
## ENSG00000141744.4 PNMT 4.904496e-20 5.411376e-16
## ENSG00000179593.16 ALOX15B 1.346361e-19 9.903380e-16
## ENSG00000279359.1 RP11-36D19.9 5.352966e-17 2.953098e-13
## ENSG00000196935.9 SRGAP1 1.108512e-16 4.892306e-13
## ENSG00000078053.17 AMPH 1.431340e-16 5.264228e-13
## ENSG00000162599.17 NFIA 5.244902e-16 1.653418e-12
## ENSG00000272870.3 SAP30-DT 7.136550e-16 1.968528e-12
## ENSG00000110721.12 CHKA 1.063440e-15 2.607436e-12
## ENSG00000121578.13 B4GALT4 1.575425e-14 3.476490e-11
mean(abs(dge$stat))
## [1] 1.198902
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + treatment_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 4 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- zz[order(zz$pvalue),]
head(dge[,1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000164056.11 SPRY1 108.76416 -2.6668649 0.24704112 -10.795227
## ENSG00000279359.1 RP11-36D19.9 90.34984 -4.7806957 0.51198611 -9.337550
## ENSG00000141744.4 PNMT 43.59054 -3.2127745 0.34991481 -9.181590
## ENSG00000196935.9 SRGAP1 282.89286 -1.9362951 0.21148877 -9.155545
## ENSG00000179593.16 ALOX15B 547.79622 -3.1108104 0.36495565 -8.523804
## ENSG00000272870.3 SAP30-DT 121.11095 -0.8277754 0.09990025 -8.286020
## ENSG00000198585.12 NUDT16 4719.70533 -1.1852330 0.14956356 -7.924611
## ENSG00000121578.13 B4GALT4 865.64102 -1.0802468 0.13973364 -7.730757
## ENSG00000078053.17 AMPH 157.80806 -2.1961542 0.28426014 -7.725860
## ENSG00000124523.17 SIRT5 897.95933 -0.9857449 0.12942403 -7.616398
## pvalue padj
## ENSG00000164056.11 SPRY1 3.625670e-27 8.000766e-23
## ENSG00000279359.1 RP11-36D19.9 9.858924e-21 1.087784e-16
## ENSG00000141744.4 PNMT 4.247710e-20 2.983674e-16
## ENSG00000196935.9 SRGAP1 5.408391e-20 2.983674e-16
## ENSG00000179593.16 ALOX15B 1.543970e-17 6.814156e-14
## ENSG00000272870.3 SAP30-DT 1.171015e-16 4.306798e-13
## ENSG00000198585.12 NUDT16 2.288621e-15 7.214715e-12
## ENSG00000121578.13 B4GALT4 1.069090e-14 2.724055e-11
## ENSG00000078053.17 AMPH 1.111002e-14 2.724055e-11
## ENSG00000124523.17 SIRT5 2.608535e-14 5.404484e-11
mean(abs(dge$stat))
## [1] 1.268338
avb_crphi_eos <- dge
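The "rows did not converge in beta" message above suggests a larger `maxit` for `nbinomWaldTest`. A minimal sketch of that follow-up is shown here on simulated counts from `makeExampleDESeqDataSet()`, which stand in for `mx` and `ss2`; the value `maxit = 500` and the `dds_demo` object are assumptions for illustration, not settings used in this report.

```r
suppressPackageStartupMessages(library("DESeq2"))

# Simulated counts stand in for `mx`/`ss2`; this is NOT the study data.
set.seed(1)
dds_demo <- makeExampleDESeqDataSet()

# DESeq() wraps these three steps; running them one by one makes it
# possible to pass a larger `maxit` to the Wald test.
dds_demo <- estimateSizeFactors(dds_demo)
dds_demo <- estimateDispersions(dds_demo)
dds_demo <- nbinomWaldTest(dds_demo, maxit = 500)

# betaConv is TRUE/FALSE per gene (NA for rows with all-zero counts)
conv <- mcols(dds_demo)$betaConv
table(conv, useNA = "ifany")

# Optionally keep only the rows whose coefficients converged
dds_conv <- dds_demo[which(conv), ]
```

If rows still fail to converge after raising `maxit`, dropping them as in the last line is one option; whether that is appropriate for the handful of genes flagged here is a judgment call.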
# model with clinical and cell covariates
# cell covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
    Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + treatment_group)
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 21 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000164056.11 SPRY1 108.76416 -2.6240004 0.2866809 -9.153036
## ENSG00000198585.12 NUDT16 4719.70533 -1.1065368 0.1246267 -8.878812
## ENSG00000179593.16 ALOX15B 547.79622 -2.9766733 0.3371682 -8.828453
## ENSG00000141744.4 PNMT 43.59054 -2.7567440 0.3639445 -7.574628
## ENSG00000135678.12 CPM 424.57761 -1.1706712 0.1570310 -7.455033
## ENSG00000136478.8 TEX2 868.89100 -0.8306822 0.1119672 -7.418975
## ENSG00000279359.1 RP11-36D19.9 90.34984 -3.6900821 0.4986320 -7.400412
## ENSG00000196935.9 SRGAP1 282.89286 -1.4946291 0.2031922 -7.355740
## ENSG00000177575.13 CD163 27248.95998 -1.9982460 0.2717813 -7.352403
## ENSG00000111666.11 CHPT1 1111.30704 -0.9224094 0.1255707 -7.345735
## pvalue padj
## ENSG00000164056.11 SPRY1 5.535555e-20 1.221531e-15
## ENSG00000198585.12 NUDT16 6.757838e-19 7.456261e-15
## ENSG00000179593.16 ALOX15B 1.061335e-18 7.806830e-15
## ENSG00000141744.4 PNMT 3.601587e-14 1.986906e-10
## ENSG00000135678.12 CPM 8.984555e-14 3.965244e-10
## ENSG00000136478.8 TEX2 1.180302e-13 4.279816e-10
## ENSG00000279359.1 RP11-36D19.9 1.357625e-13 4.279816e-10
## ENSG00000196935.9 SRGAP1 1.898722e-13 4.515593e-10
## ENSG00000177575.13 CD163 1.946742e-13 4.515593e-10
## ENSG00000111666.11 CHPT1 2.046310e-13 4.515593e-10
mean(abs(dge$stat))
## [1] 0.9300171
avb_crphi_eos_adj <- dge
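The message about numeric variables with "mean or standard deviation larger than 5" appears only in the models that include the cell-type covariates, so those columns are the likely trigger. A minimal sketch of centering and scaling them before building the design follows. The data frame `ss_demo` and its values are made up for illustration; only the column names are taken from the design formula above.

```r
# Hypothetical stand-in for `ss2`; the values below are invented for illustration.
ss_demo <- data.frame(
  Monocytes.C    = c(8.1, 6.4, 9.9, 7.2, 10.5, 5.8),
  NK             = c(3.2, 4.1, 2.7, 5.0, 3.8, 4.4),
  T.CD8.Memory   = c(6.0, 7.5, 5.1, 8.2, 6.6, 7.0),
  T.CD4.Naive    = c(12.3, 10.8, 14.1, 9.7, 11.5, 13.0),
  Neutrophils.LD = c(55.2, 61.0, 48.7, 70.3, 66.1, 52.4)
)

cell_cols <- c("Monocytes.C", "NK", "T.CD8.Memory", "T.CD4.Naive", "Neutrophils.LD")

# scale() returns a one-column matrix, so as.numeric() keeps plain numeric columns
ss_scaled <- ss_demo
ss_scaled[cell_cols] <- lapply(ss_demo[cell_cols], function(x) as.numeric(scale(x)))

# Each column should now have mean ~0 and standard deviation 1
round(colMeans(ss_scaled[cell_cols]), 10)
sapply(ss_scaled[cell_cols], sd)
```

Scaling changes the units of the covariate coefficients but not the comparison of interest (`treatment_group`), and it may reduce the number of rows that fail to converge.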
# POD1 samples, crp_group==1 subset (results saved as avb_crplo_pod1): effect of treatment_group
mx <- xpod1f
ss2 <- as.data.frame(cbind(ss_pod1,sscell_pod1))
ss2 <- subset(ss2,crp_group==1)
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ treatment_group )
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 101 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000186081.12 KRT5 14.54816 2.576551 0.4235145 6.083739
## ENSG00000204044.6 SLC12A5-AS1 25.14097 -3.075518 0.5156713 -5.964106
## ENSG00000152463.15 OLAH 24.53332 -2.873612 0.4851144 -5.923576
## ENSG00000146072.6 TNFRSF21 42.89273 1.040202 0.1856047 5.604397
## ENSG00000259162.1 RP11-203M5.6 26.14874 1.581181 0.2897282 5.457464
## ENSG00000142627.13 EPHA2 19.20075 1.637299 0.3055916 5.357800
## ENSG00000204936.10 CD177 618.22625 -3.037256 0.5784007 -5.251128
## ENSG00000155659.15 VSIG4 1772.93303 -2.147337 0.4164836 -5.155873
## ENSG00000154269.15 ENPP3 32.41553 1.154158 0.2238815 5.155219
## ENSG00000229961.3 RP11-71G12.1 75.23398 1.526810 0.2963257 5.152473
## pvalue padj
## ENSG00000186081.12 KRT5 1.174117e-09 2.237975e-05
## ENSG00000204044.6 SLC12A5-AS1 2.459774e-09 2.237975e-05
## ENSG00000152463.15 OLAH 3.150155e-09 2.237975e-05
## ENSG00000146072.6 TNFRSF21 2.089813e-08 1.113505e-04
## ENSG00000259162.1 RP11-203M5.6 4.829843e-08 2.058769e-04
## ENSG00000142627.13 EPHA2 8.424125e-08 2.992389e-04
## ENSG00000204936.10 CD177 1.511709e-07 4.602722e-04
## ENSG00000155659.15 VSIG4 2.524517e-07 5.478997e-04
## ENSG00000154269.15 ENPP3 2.533344e-07 5.478997e-04
## ENSG00000229961.3 RP11-71G12.1 2.570730e-07 5.478997e-04
mean(abs(dge$stat))
## [1] 1.083967
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + treatment_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 13 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000155659.15 VSIG4 1772.93303 -2.8430229 0.3779008 -7.523199
## ENSG00000087116.16 ADAMTS2 1436.45625 -3.7809954 0.5047904 -7.490228
## ENSG00000186081.12 KRT5 14.54816 2.8196082 0.4406358 6.398953
## ENSG00000100985.7 MMP9 3694.58444 -2.9906166 0.5378907 -5.559897
## ENSG00000229961.3 RP11-71G12.1 75.23398 1.6244682 0.2922447 5.558589
## ENSG00000259162.1 RP11-203M5.6 26.14874 1.5474793 0.2819920 5.487672
## ENSG00000105223.20 PLD3 5830.74295 -0.5634004 0.1044934 -5.391731
## ENSG00000142627.13 EPHA2 19.20075 1.7165708 0.3199429 5.365241
## ENSG00000149534.9 MS4A2 73.42690 1.3047162 0.2497248 5.224615
## ENSG00000115590.14 IL1R2 1210.70599 -2.4983175 0.4790051 -5.215638
## pvalue padj
## ENSG00000155659.15 VSIG4 5.345199e-14 4.485859e-10
## ENSG00000087116.16 ADAMTS2 6.875406e-14 4.485859e-10
## ENSG00000186081.12 KRT5 1.564457e-10 NA
## ENSG00000100985.7 MMP9 2.699346e-08 1.174126e-04
## ENSG00000229961.3 RP11-71G12.1 2.719649e-08 NA
## ENSG00000259162.1 RP11-203M5.6 4.072656e-08 NA
## ENSG00000105223.20 PLD3 6.978235e-08 2.276475e-04
## ENSG00000142627.13 EPHA2 8.084104e-08 NA
## ENSG00000149534.9 MS4A2 1.745178e-07 NA
## ENSG00000115590.14 IL1R2 1.831854e-07 4.780772e-04
mean(abs(dge$stat))
## [1] 1.269171
avb_crplo_pod1 <- dge
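Several genes in the table above have a p-value but `padj` of `NA`. In DESeq2 this is the signature of independent filtering: `results()` by default drops genes with low mean normalized counts from the multiple-testing adjustment and reports their `padj` as `NA`. The sketch below checks that the ten rows shown are consistent with such a mean-count filter. The values are copied from the output above; the data frame `top` is a hypothetical helper, not an object from this report.

```r
# Values copied from the top-10 table above (clinical-covariate model, POD1, crp_group==1)
top <- data.frame(
  gene = c("VSIG4", "ADAMTS2", "KRT5", "MMP9", "RP11-71G12.1",
           "RP11-203M5.6", "PLD3", "EPHA2", "MS4A2", "IL1R2"),
  baseMean = c(1772.93303, 1436.45625, 14.54816, 3694.58444, 75.23398,
               26.14874, 5830.74295, 19.20075, 73.42690, 1210.70599),
  padj_is_na = c(FALSE, FALSE, TRUE, FALSE, TRUE,
                 TRUE, FALSE, TRUE, TRUE, FALSE)
)

# If filtering is on mean count, every filtered gene should have a lower
# baseMean than every retained gene.
max_filtered <- max(top$baseMean[top$padj_is_na])
min_kept     <- min(top$baseMean[!top$padj_is_na])
max_filtered < min_kept

# With the real objects, the threshold and an unfiltered table could be
# inspected like this (not run here):
# metadata(z)$filterThreshold
# z_nofilter <- results(res, independentFiltering = FALSE)
```

Turning the filter off assigns an adjusted p-value to every tested gene, at the cost of a heavier multiple-testing burden.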
# model with clinical and cell covariates
# cell covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
    Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + treatment_group)
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 31 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000155659.15 VSIG4 1772.93303 -2.0162400 0.33720409 -5.979287
## ENSG00000087116.16 ADAMTS2 1436.45625 -2.5921949 0.44416561 -5.836100
## ENSG00000186081.12 KRT5 14.54816 2.3884544 0.46530924 5.133047
## ENSG00000101004.15 NINL 135.28899 -0.8218318 0.16059437 -5.117438
## ENSG00000229961.3 RP11-71G12.1 75.23398 1.5446697 0.30769579 5.020120
## ENSG00000135218.19 CD36 8887.60274 0.5861706 0.12676967 4.623902
## ENSG00000259162.1 RP11-203M5.6 26.14874 1.3790996 0.29872566 4.616609
## ENSG00000149534.9 MS4A2 73.42690 1.1437686 0.24915410 4.590607
## ENSG00000146072.6 TNFRSF21 42.89273 0.9927942 0.21822908 4.549321
## ENSG00000102524.12 TNFSF13B 1891.40085 0.4415408 0.09736608 4.534853
## pvalue padj
## ENSG00000155659.15 VSIG4 2.241164e-09 4.776592e-05
## ENSG00000087116.16 ADAMTS2 5.343688e-09 5.694501e-05
## ENSG00000186081.12 KRT5 2.850879e-07 1.650231e-03
## ENSG00000101004.15 NINL 3.097136e-07 1.650231e-03
## ENSG00000229961.3 RP11-71G12.1 5.163929e-07 2.201176e-03
## ENSG00000135218.19 CD36 3.765872e-06 1.177432e-02
## ENSG00000259162.1 RP11-203M5.6 3.900613e-06 1.177432e-02
## ENSG00000149534.9 MS4A2 4.419584e-06 1.177432e-02
## ENSG00000146072.6 TNFRSF21 5.381922e-06 1.228560e-02
## ENSG00000102524.12 TNFSF13B 5.764368e-06 1.228560e-02
mean(abs(dge$stat))
## [1] 1.007265
avb_crplo_pod1_adj <- dge
# POD1 samples, crp_group==4 subset (results saved as avb_crphi_pod1): effect of treatment_group
mx <- xpod1f
ss2 <- as.data.frame(cbind(ss_pod1,sscell_pod1))
ss2 <- subset(ss2,crp_group==4)
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ treatment_group )
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 250 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000131016.17 AKAP12 118.08416 1.904352 0.3589243 5.305720
## ENSG00000149534.9 MS4A2 93.32501 1.866884 0.3696944 5.049802
## ENSG00000186081.12 KRT5 14.48988 2.017019 0.4076553 4.947854
## ENSG00000229961.3 RP11-71G12.1 58.76722 1.339119 0.2735828 4.894748
## ENSG00000140287.11 HDC 571.69806 1.885327 0.3856378 4.888854
## ENSG00000179348.12 GATA2 810.50517 1.608564 0.3351194 4.799972
## ENSG00000158715.6 SLC45A3 258.52371 1.306224 0.2724756 4.793914
## ENSG00000155659.15 VSIG4 1247.57497 -2.125654 0.4456523 -4.769759
## ENSG00000246363.3 LINC02458 30.59379 1.828502 0.3945107 4.634861
## ENSG00000259162.1 RP11-203M5.6 23.31281 1.481681 0.3212551 4.612163
## pvalue padj
## ENSG00000131016.17 AKAP12 1.122292e-07 0.002252664
## ENSG00000149534.9 MS4A2 4.422678e-07 0.004071602
## ENSG00000186081.12 KRT5 7.503634e-07 0.004071602
## ENSG00000229961.3 RP11-71G12.1 9.843169e-07 0.004071602
## ENSG00000140287.11 HDC 1.014249e-06 0.004071602
## ENSG00000179348.12 GATA2 1.586878e-06 0.004627772
## ENSG00000158715.6 SLC45A3 1.635586e-06 0.004627772
## ENSG00000155659.15 VSIG4 1.844469e-06 0.004627772
## ENSG00000246363.3 LINC02458 3.571771e-06 0.007965842
## ENSG00000259162.1 RP11-203M5.6 3.985011e-06 0.007998713
mean(abs(dge$stat))
## [1] 0.7008559
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + treatment_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 12 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000274611.4 TBC1D3 56.78775 24.6259129 3.4483190 7.141425
## ENSG00000155659.15 VSIG4 1247.57497 -2.6203399 0.4278950 -6.123792
## ENSG00000225630.1 MTND2P28 106.98061 -2.1843250 0.4275721 -5.108671
## ENSG00000078053.17 AMPH 167.98498 -1.4433937 0.2894022 -4.987501
## ENSG00000186081.12 KRT5 14.48988 2.0084378 0.4078153 4.924871
## ENSG00000198794.12 SCAMP5 65.44507 0.9688744 0.1982756 4.886503
## ENSG00000131016.17 AKAP12 118.08416 1.8265393 0.3775147 4.838327
## ENSG00000211935.3 IGHV1-3 64.54582 1.5403262 0.3349438 4.598760
## ENSG00000179348.12 GATA2 810.50517 1.6245124 0.3536185 4.593969
## ENSG00000158715.6 SLC45A3 258.52371 1.3109805 0.2872035 4.564640
## pvalue padj
## ENSG00000274611.4 TBC1D3 9.236800e-13 1.968639e-08
## ENSG00000155659.15 VSIG4 9.137432e-10 9.737304e-06
## ENSG00000225630.1 MTND2P28 3.244325e-07 2.304877e-03
## ENSG00000078053.17 AMPH 6.116527e-07 3.259039e-03
## ENSG00000186081.12 KRT5 8.441599e-07 3.598316e-03
## ENSG00000198794.12 SCAMP5 1.026427e-06 3.646041e-03
## ENSG00000131016.17 AKAP12 1.309367e-06 3.986649e-03
## ENSG00000211935.3 IGHV1-3 4.250123e-06 1.000198e-02
## ENSG00000179348.12 GATA2 4.348936e-06 1.000198e-02
## ENSG00000158715.6 SLC45A3 5.003516e-06 1.000198e-02
mean(abs(dge$stat))
## [1] 0.6749622
avb_crphi_pod1 <- dge
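The top hit in the table above, TBC1D3, has a log2 fold change of about 24.6 at a baseMean of about 57, which corresponds to a fold change of roughly 2.5 × 10^7 and is unlikely to be biologically meaningful. Estimates like this often arise when a few samples carry nearly all of a gene's counts, although that has not been verified here. A small sketch for flagging such rows follows; the cutoff of 10 is an arbitrary assumption, and `top` is a hypothetical helper holding values copied from the output above.

```r
# First five rows of the table above (values copied from the output)
top <- data.frame(
  log2FoldChange = c(24.6259129, -2.6203399, -2.1843250, -1.4433937, 2.0084378),
  lfcSE          = c(3.4483190, 0.4278950, 0.4275721, 0.2894022, 0.4078153),
  row.names = c("ENSG00000274611.4 TBC1D3", "ENSG00000155659.15 VSIG4",
                "ENSG00000225630.1 MTND2P28", "ENSG00000078053.17 AMPH",
                "ENSG00000186081.12 KRT5")
)

lfc_cutoff <- 10  # hypothetical threshold for an implausibly large effect
flagged <- rownames(top)[abs(top$log2FoldChange) > lfc_cutoff]
flagged

# With the real objects, the per-sample counts of flagged genes could be
# inspected like this (not run here):
# counts(res, normalized = TRUE)[flagged, ]
```

The same check could be applied to the full `dge` table by replacing `top` with `dge`.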
# model with clinical and cell covariates
# cell covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
    Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + treatment_group)
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 27 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000155659.15 VSIG4 1247.57497 -2.2950320 0.4103782 -5.592480
## ENSG00000186081.12 KRT5 14.48988 2.2017581 0.4208578 5.231596
## ENSG00000244116.3 IGKV2-28 94.97033 1.5259902 0.2978606 5.123169
## ENSG00000078053.17 AMPH 167.98498 -1.3607277 0.2682920 -5.071816
## ENSG00000198794.12 SCAMP5 65.44507 1.0198080 0.2040033 4.998977
## ENSG00000100453.13 GZMB 1428.85140 0.7408132 0.1523886 4.861341
## ENSG00000111249.14 CUX2 26.61410 1.3377186 0.2819178 4.745066
## ENSG00000132465.12 JCHAIN 1006.89958 1.1477793 0.2442314 4.699557
## ENSG00000211644.3 IGLV1-51 142.58626 1.2413126 0.2645593 4.692001
## ENSG00000211648.2 IGLV1-47 132.22632 1.2555063 0.2699166 4.651460
## pvalue padj
## ENSG00000155659.15 VSIG4 2.238485e-08 0.0004770883
## ENSG00000186081.12 KRT5 1.680525e-07 0.0017908519
## ENSG00000244116.3 IGKV2-28 3.004422e-07 0.0020995293
## ENSG00000078053.17 AMPH 3.940373e-07 0.0020995293
## ENSG00000198794.12 SCAMP5 5.763528e-07 0.0024567613
## ENSG00000100453.13 GZMB 1.165930e-06 0.0041415789
## ENSG00000111249.14 CUX2 2.084384e-06 0.0063463532
## ENSG00000132465.12 JCHAIN 2.607268e-06 0.0064068377
## ENSG00000211644.3 IGLV1-51 2.705463e-06 0.0064068377
## ENSG00000211648.2 IGLV1-47 3.295937e-06 0.0070246314
mean(abs(dge$stat))
## [1] 0.7716238
avb_crphi_pod1_adj <- dge
# T0 samples, treatment_group==1 subset (results saved as crp_t0_a): effect of crp_group
mx <- xt0f
ss2 <- as.data.frame(cbind(ss_t0,sscell_t0))
ss2 <- subset(ss2,treatment_group==1)
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ crp_group )
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 116 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000211640.4 IGLV6-57 73.94116 -0.5012755 0.09503047 -5.274892
## ENSG00000211644.3 IGLV1-51 249.02248 -0.6160909 0.11760002 -5.238867
## ENSG00000211652.2 IGLV7-43 52.51336 -0.6502306 0.12553982 -5.179477
## ENSG00000211655.3 IGLV1-36 15.50197 -0.6064267 0.12419400 -4.882899
## ENSG00000279359.1 RP11-36D19.9 43.68155 -1.1247960 0.24645555 -4.563890
## ENSG00000263711.6 LINC02864 16.11890 0.3756337 0.08415429 4.463631
## ENSG00000211673.2 IGLV3-1 180.11226 -0.4099034 0.09521025 -4.305244
## ENSG00000113790.11 EHHADH 70.48409 0.1809165 0.04324047 4.183961
## ENSG00000087116.16 ADAMTS2 144.89306 -0.8669518 0.21120081 -4.104869
## ENSG00000211649.3 IGLV7-46 63.48682 -0.5888044 0.14456169 -4.073032
## pvalue padj
## ENSG00000211640.4 IGLV6-57 1.328340e-07 0.001595313
## ENSG00000211644.3 IGLV1-51 1.615655e-07 0.001595313
## ENSG00000211652.2 IGLV7-43 2.225087e-07 0.001595313
## ENSG00000211655.3 IGLV1-36 1.045375e-06 0.005621245
## ENSG00000279359.1 RP11-36D19.9 5.021440e-06 0.021601231
## ENSG00000263711.6 LINC02864 8.058233e-06 0.028887422
## ENSG00000211673.2 IGLV3-1 1.668016e-05 0.051253362
## ENSG00000113790.11 EHHADH 2.864731e-05 0.077021866
## ENSG00000087116.16 ADAMTS2 4.045433e-05 0.096681348
## ENSG00000211649.3 IGLV7-46 4.640502e-05 0.099812567
mean(abs(dge$stat))
## [1] 0.8542119
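DESeq2 notes above that the design contains a numeric variable with integer values. When a two-level grouping is left numeric, the reported log2 fold change is per unit of the variable rather than the difference between the two groups. The toy linear model below illustrates the effect for a group coded 1 and 4. The data are invented, and whether `crp_group` takes only the values 1 and 4 in `ss2` is an assumption based on the subsets used earlier in this report.

```r
# Toy data, not the study data: two groups coded 1 and 4
grp <- rep(c(1, 4), each = 5)
y   <- c(10.1, 9.8, 10.3, 9.9, 9.9, 13.2, 12.9, 13.1, 12.8, 13.0)

fit_num <- lm(y ~ grp)          # numeric coding: slope per unit of grp
fit_fac <- lm(y ~ factor(grp))  # factor coding: level 4 minus level 1

slope_per_unit <- unname(coef(fit_num)["grp"])
group_diff     <- unname(coef(fit_fac)["factor(grp)4"])

c(slope_per_unit = slope_per_unit, group_diff = group_diff)

# The group difference is the per-unit slope times the spacing between codes (4 - 1)
all.equal(group_diff, 3 * slope_per_unit)

# In the DESeq2 workflow the equivalent change would be made before building
# the dataset (not run here):
# ss2$crp_group <- factor(ss2$crp_group)
```

For a two-level variable the Wald statistic and p-values should not change with the coding; only the scale of `log2FoldChange` and `lfcSE` does, which matters when effect sizes are compared across contrasts.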
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + crp_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 32 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000274611.4 TBC1D3 61.02093 -9.1421319 1.32357018 -6.907176
## ENSG00000278599.6 TBC1D3E 18.70446 -7.7732985 1.32279633 -5.876414
## ENSG00000280035.1 RP11-10J21.2 19.49861 0.4739069 0.10831763 4.375159
## ENSG00000203999.9 LINC01270 118.07792 0.3907644 0.09069852 4.308388
## ENSG00000203814.6 H2BC18 64.74206 0.4615320 0.10714744 4.307448
## ENSG00000211652.2 IGLV7-43 52.51336 -0.6209722 0.14601307 -4.252854
## ENSG00000211655.3 IGLV1-36 15.50197 -0.5985422 0.14468746 -4.136794
## ENSG00000225764.2 P3H2-AS1 10.20911 0.3674466 0.08926867 4.116188
## ENSG00000211679.2 IGLC3 802.12582 -0.6140036 0.14922781 -4.114539
## ENSG00000267303.1 CTD-2369P2.12 14.16784 1.8578040 0.45893943 4.048037
## pvalue padj
## ENSG00000274611.4 TBC1D3 4.943970e-12 1.084460e-07
## ENSG00000278599.6 TBC1D3E 4.192497e-09 4.598122e-05
## ENSG00000280035.1 RP11-10J21.2 1.213438e-05 7.245086e-02
## ENSG00000203999.9 LINC01270 1.644490e-05 7.245086e-02
## ENSG00000203814.6 H2BC18 1.651490e-05 7.245086e-02
## ENSG00000211652.2 IGLV7-43 2.110636e-05 7.716132e-02
## ENSG00000211655.3 IGLV1-36 3.521917e-05 9.455306e-02
## ENSG00000225764.2 P3H2-AS1 3.851906e-05 9.455306e-02
## ENSG00000211679.2 IGLC3 3.879542e-05 9.455306e-02
## ENSG00000267303.1 CTD-2369P2.12 5.164888e-05 1.132918e-01
mean(abs(dge$stat))
## [1] 0.8946569
crp_t0_a <- dge
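The "rows did not converge in beta" message above suggests a larger maxit argument with nbinomWaldTest. A minimal sketch of that approach follows, assuming simulated counts from makeExampleDESeqDataSet() in place of the study data; the object names (dds_demo, dds_conv, res_demo) are illustrative and not part of this analysis.

```r
suppressPackageStartupMessages(library("DESeq2"))

# Simulated data stands in for the real count matrix (assumption)
set.seed(1)
dds_demo <- makeExampleDESeqDataSet(n = 500, m = 12)

# Run the DESeq() steps one at a time so that maxit can be raised
dds_demo <- estimateSizeFactors(dds_demo)
dds_demo <- estimateDispersions(dds_demo, quiet = TRUE)
dds_demo <- nbinomWaldTest(dds_demo, maxit = 500, quiet = TRUE)

# betaConv is TRUE for rows whose coefficients converged
conv <- mcols(dds_demo)$betaConv
n_not_converged <- sum(!conv, na.rm = TRUE)

# Optionally keep only the converged rows before extracting results
dds_conv <- dds_demo[which(conv), ]
res_demo <- results(dds_conv)
```

Whether the non-converged rows matter depends on whether any of them appear among the top-ranked genes, so inspecting them before dropping them is probably worthwhile.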
# model with clinical and cell-type covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
    Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + crp_group)
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 64 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000159189.12 C1QC 45.19243 -0.61040271 0.11320809 -5.391865
## ENSG00000087116.16 ADAMTS2 144.89306 -0.85240097 0.18580458 -4.587621
## ENSG00000211679.2 IGLC3 802.12582 -0.62143112 0.13983515 -4.444027
## ENSG00000173372.17 C1QA 192.01764 -0.39137607 0.09031117 -4.333640
## ENSG00000203999.9 LINC01270 118.07792 0.22311851 0.05197879 4.292491
## ENSG00000173369.17 C1QB 92.28254 -0.43614730 0.10549032 -4.134477
## ENSG00000213614.11 HEXA 3639.45630 -0.06053357 0.01485216 -4.075741
## ENSG00000255446.1 CTD-2531D15.4 17.55993 0.76019150 0.18656840 4.074600
## ENSG00000225764.2 P3H2-AS1 10.20911 0.38295801 0.09524856 4.020618
## ENSG00000211652.2 IGLV7-43 52.51336 -0.56403860 0.14280690 -3.949659
## pvalue padj
## ENSG00000159189.12 C1QC 6.973031e-08 0.001529534
## ENSG00000087116.16 ADAMTS2 4.483259e-06 0.049170146
## ENSG00000211679.2 IGLC3 8.829069e-06 0.064555213
## ENSG00000173372.17 C1QA 1.466640e-05 0.077509273
## ENSG00000203999.9 LINC01270 1.766794e-05 0.077509273
## ENSG00000173369.17 C1QB 3.557638e-05 0.126382864
## ENSG00000213614.11 HEXA 4.586806e-05 0.126382864
## ENSG00000255446.1 CTD-2531D15.4 4.609359e-05 0.126382864
## ENSG00000225764.2 P3H2-AS1 5.804578e-05 0.141470463
## ENSG00000211652.2 IGLV7-43 7.826251e-05 0.164623107
mean(abs(dge$stat))
## [1] 0.9127956
crp_t0_a_adj <- dge
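The DESeq2 message for the cell-covariate model recommends centering and scaling numeric variables with a mean or standard deviation above 5. A minimal base R sketch is shown below; the column names are borrowed from the design formula, but the values are invented for illustration and ss_demo is a hypothetical stand-in for the sample sheet.

```r
# Toy sample sheet: column names mirror the design, values are invented
ss_demo <- data.frame(
  Neutrophils.LD = c(48.2, 61.5, 55.0, 70.3, 52.8, 64.1),
  Monocytes.C    = c(6.1, 8.4, 5.2, 9.0, 7.3, 6.6)
)

num_cols <- c("Neutrophils.LD", "Monocytes.C")

# scale() centres each column to mean 0 and rescales it to standard deviation 1
ss_scaled <- ss_demo
ss_scaled[num_cols] <- lapply(ss_demo[num_cols], function(v) as.numeric(scale(v)))

# Check the result: means close to 0, standard deviations equal to 1
round(colMeans(ss_scaled[num_cols]), 10)
sapply(ss_scaled[num_cols], sd)
```

After scaling, a log2 fold change for such a covariate is per standard deviation rather than per original unit; the test for crp_group itself should be largely unaffected apart from improved convergence.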
# T0 samples, treatment group 2
mx <- xt0f
ss2 <- as.data.frame(cbind(ss_t0, sscell_t0))
ss2 <- subset(ss2, treatment_group == 2)
mx <- mx[, colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ crp_group)
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 698 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000274012.1 RN7SL2 1005.235348 0.3958682 0.08669141 4.566406
## ENSG00000276168.1 RN7SL1 552.306545 0.3503249 0.07971834 4.394533
## ENSG00000165029.17 ABCA1 676.864267 -0.2828435 0.07210273 -3.922786
## ENSG00000050767.18 COL23A1 32.027614 0.3304131 0.08555804 3.861860
## ENSG00000134321.13 RSAD2 562.578486 -0.3324054 0.08678497 -3.830218
## ENSG00000183117.19 CSMD1 45.694830 -0.4349344 0.11463824 -3.793973
## ENSG00000160179.19 ABCG1 431.970974 -0.1967463 0.05391060 -3.649492
## ENSG00000170153.11 RNF150 8.709392 -0.6451407 0.18034165 -3.577325
## ENSG00000196565.15 HBG2 259.037625 0.6586983 0.18849443 3.494524
## ENSG00000049247.14 UTS2 39.902866 0.3607631 0.10531292 3.425630
## pvalue padj
## ENSG00000274012.1 RN7SL2 4.961569e-06 0.1088320
## ENSG00000276168.1 RN7SL1 1.110112e-05 0.1217515
## ENSG00000165029.17 ABCA1 8.753099e-05 0.5419992
## ENSG00000050767.18 COL23A1 1.125272e-04 0.5419992
## ENSG00000134321.13 RSAD2 1.280297e-04 0.5419992
## ENSG00000183117.19 CSMD1 1.482560e-04 0.5419992
## ENSG00000160179.19 ABCG1 2.627594e-04 0.8233755
## ENSG00000170153.11 RNF150 3.471282e-04 0.9483616
## ENSG00000196565.15 HBG2 4.749080e-04 0.9483616
## ENSG00000049247.14 UTS2 6.133744e-04 0.9483616
mean(abs(dge$stat))
## [1] 0.8139089
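DESeq2 notes that crp_group is a numeric variable with integer values, so the model fits a single slope across its values. Whether that is intended is not stated here. The sketch below uses an invented three-level coding to illustrate how the design matrix differs between numeric and factor coding; if crp_group has only two values, both codings test the same contrast.

```r
# Toy coding of crp_group; the real coding is not shown in this report
ss_demo <- data.frame(crp_group = c(1, 1, 2, 2, 4, 4))

# Numeric coding: intercept plus a single slope column
mm_numeric <- model.matrix(~ crp_group, data = ss_demo)

# Factor coding: intercept plus one indicator column per non-reference level
ss_demo$crp_group_f <- factor(ss_demo$crp_group)
mm_factor <- model.matrix(~ crp_group_f, data = ss_demo)

colnames(mm_numeric)
colnames(mm_factor)
```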
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + crp_group)
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 10 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000261026.1 CTD-3247F14.2 17.85183 -1.5571374 0.32799079 -4.747503
## ENSG00000078114.19 NEBL 35.44421 -0.8971179 0.19693027 -4.555510
## ENSG00000181126.13 HLA-V 336.08002 -0.6596009 0.14626672 -4.509576
## ENSG00000243224.1 RP5-1157M23.2 44.63609 0.2250026 0.05160220 4.360330
## ENSG00000139287.13 TPH2 42.04591 -0.1966640 0.04960475 -3.964621
## ENSG00000123838.11 C4BPA 52.54006 -1.0745917 0.27120418 -3.962298
## ENSG00000152767.17 FARP1 160.37684 -0.1615427 0.04078672 -3.960669
## ENSG00000254681.6 PKD1P5 2181.95360 0.3720053 0.09477315 3.925218
## ENSG00000234200.2 U82671.8 7.77978 -3.3541119 0.85766640 -3.910742
## ENSG00000119922.11 IFIT2 1651.64063 -0.5395922 0.13801521 -3.909657
## pvalue padj
## ENSG00000261026.1 CTD-3247F14.2 2.059429e-06 0.04517357
## ENSG00000078114.19 NEBL 5.225851e-06 0.04749455
## ENSG00000181126.13 HLA-V 6.495722e-06 0.04749455
## ENSG00000243224.1 RP5-1157M23.2 1.298667e-05 0.07121566
## ENSG00000139287.13 TPH2 7.351253e-05 0.20273898
## ENSG00000123838.11 C4BPA 7.423193e-05 0.20273898
## ENSG00000152767.17 FARP1 7.474008e-05 0.20273898
## ENSG00000254681.6 PKD1P5 8.665112e-05 0.20273898
## ENSG00000234200.2 U82671.8 9.201305e-05 0.20273898
## ENSG00000119922.11 IFIT2 9.242716e-05 0.20273898
mean(abs(dge$stat))
## [1] 1.048837
crp_t0_b <- dge
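The repeated note about factor levels containing characters other than letters, numbers, '_' and '.' is harmless, but it can be silenced by cleaning the level names before building the DESeqDataSet. A minimal sketch with base R make.names() follows; the level names are hypothetical, as the actual levels in the sample sheet are not shown.

```r
# Hypothetical factor with levels containing spaces, hyphens and slashes
f <- factor(c("level-A", "level B", "level/C", "level-A"))

# make.names() replaces unsafe characters with '.'
levels(f) <- make.names(levels(f))
levels(f)
```

If two distinct levels map to the same cleaned name, assigning them this way merges those levels, so it is worth comparing nlevels() before and after.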
# model with clinical and cell-type covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
    Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + crp_group)
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 37 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000074803.20 SLC12A1 35.08453 0.9232535 0.20493435 4.505118
## ENSG00000181126.13 HLA-V 336.08002 -0.6874493 0.15532026 -4.426012
## ENSG00000260447.1 RP11-304L19.2 11.22085 0.5616031 0.13834177 4.059534
## ENSG00000243224.1 RP5-1157M23.2 44.63609 0.1916944 0.04730875 4.051987
## ENSG00000102854.16 MSLN 53.97058 -0.8976728 0.22200779 -4.043429
## ENSG00000274012.1 RN7SL2 1005.23535 0.4319977 0.11522675 3.749110
## ENSG00000154620.6 TMSB4Y 57.87988 -0.3468061 0.09388817 -3.693821
## ENSG00000249021.1 CTC-505O3.3 12.00956 -0.2429091 0.06703903 -3.623398
## ENSG00000275329.1 RP11-83N9.6 9.92069 0.3085148 0.08571373 3.599363
## ENSG00000276168.1 RN7SL1 552.30655 0.3722778 0.10422593 3.571835
## pvalue padj
## ENSG00000074803.20 SLC12A1 6.633602e-06 0.1052785
## ENSG00000181126.13 HLA-V 9.599131e-06 0.1052785
## ENSG00000260447.1 RP11-304L19.2 4.917078e-05 0.2310855
## ENSG00000243224.1 RP5-1157M23.2 5.078452e-05 0.2310855
## ENSG00000102854.16 MSLN 5.267506e-05 0.2310855
## ENSG00000274012.1 RN7SL2 1.774633e-04 0.6487762
## ENSG00000154620.6 TMSB4Y 2.209090e-04 0.6895471
## ENSG00000249021.1 CTC-505O3.3 2.907582e-04 0.6895471
## ENSG00000275329.1 RP11-83N9.6 3.189981e-04 0.6895471
## ENSG00000276168.1 RN7SL1 3.544892e-04 0.6895471
mean(abs(dge$stat))
## [1] 0.9050121
crp_t0_b_adj <- dge
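The same sequence of steps (build the dataset, run DESeq, extract results, append variance-stabilised values, sort by p-value) is repeated for every model above. A possible way to reduce duplication is a small wrapper; run_de() below is a hypothetical helper that is not part of the original analysis, and it is demonstrated on simulated data only.

```r
suppressPackageStartupMessages(library("DESeq2"))

# run_de() is a hypothetical helper mirroring the steps used in this report
run_de <- function(counts, coldata, design) {
  dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata,
    design = design)
  res <- DESeq(dds, quiet = TRUE)
  z <- results(res)
  vsd <- vst(dds, blind = FALSE)
  zz <- cbind(as.data.frame(z), assay(vsd))
  # Sort once by p-value; NA p-values are placed last
  zz[order(zz$pvalue), ]
}

# Demonstration on simulated counts (assumption: stands in for mx and ss2)
set.seed(1)
sim <- makeExampleDESeqDataSet(n = 2000, m = 12)
dge_demo <- run_de(counts(sim), as.data.frame(colData(sim)), ~ condition)
head(dge_demo[, 1:6], 3)
```

With such a helper, each model would reduce to a single call with a different design formula, for example run_de(mx, ss2, ~ crp_group).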
# EOS samples, treatment group 1
mx <- xeosf
ss2 <- as.data.frame(cbind(ss_eos, sscell_eos))
ss2 <- subset(ss2, treatment_group == 1)
mx <- mx[, colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ crp_group)
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 147 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000234200.2 U82671.8 23.74781 -8.4175144 0.77303756 -10.888881
## ENSG00000204936.10 CD177 2664.36071 1.3040566 0.17489685 7.456147
## ENSG00000139572.4 GPR84 194.83068 0.9793090 0.14085127 6.952788
## ENSG00000170525.21 PFKFB3 4648.61993 0.6889378 0.10435049 6.602152
## ENSG00000176597.12 B3GNT5 364.32833 0.6671879 0.10515716 6.344674
## ENSG00000132170.24 PPARG 105.03591 0.8466718 0.13451424 6.294291
## ENSG00000079385.23 CEACAM1 1083.04204 0.9180621 0.14625935 6.276946
## ENSG00000187775.17 DNAH17 603.73830 0.4485171 0.07177030 6.249342
## ENSG00000136634.7 IL10 75.57604 0.7835233 0.12565466 6.235529
## ENSG00000135916.16 ITM2C 660.76621 -0.3402779 0.05494465 -6.193103
## pvalue padj
## ENSG00000234200.2 U82671.8 1.302299e-27 2.873783e-23
## ENSG00000204936.10 CD177 8.908934e-14 9.829672e-10
## ENSG00000139572.4 GPR84 3.581374e-12 2.634339e-08
## ENSG00000170525.21 PFKFB3 4.052308e-11 2.235557e-07
## ENSG00000176597.12 B3GNT5 2.228974e-10 9.837353e-07
## ENSG00000132170.24 PPARG 3.088083e-10 1.088493e-06
## ENSG00000079385.23 CEACAM1 3.452870e-10 1.088493e-06
## ENSG00000187775.17 DNAH17 4.121860e-10 1.103974e-06
## ENSG00000136634.7 IL10 4.502544e-10 1.103974e-06
## ENSG00000135916.16 ITM2C 5.899107e-10 1.301756e-06
mean(abs(dge$stat))
## [1] 1.38808
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + crp_group)
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 15 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000278599.6 TBC1D3E 21.72131 -9.5091840 1.36712905 -6.955586
## ENSG00000258035.2 RP11-74K11.2 20.51590 0.7829509 0.14913630 5.249902
## ENSG00000076356.7 PLXNA2 257.73317 0.4637443 0.09006613 5.148932
## ENSG00000204936.10 CD177 2664.36071 1.1813741 0.23038336 5.127862
## ENSG00000159339.13 PADI4 13293.00692 0.8773302 0.17145591 5.116943
## ENSG00000283345.1 CTD-3092A11.3 36.24736 -0.8647693 0.17126294 -5.049366
## ENSG00000235750.10 KIAA0040 3047.27341 0.3626779 0.07233427 5.013916
## ENSG00000211966.2 IGHV5-51 190.52981 -0.6341490 0.12802140 -4.953461
## ENSG00000176597.12 B3GNT5 364.32833 0.6767111 0.13768368 4.914969
## ENSG00000203999.9 LINC01270 154.60133 0.5209756 0.10632606 4.899793
## pvalue padj
## ENSG00000278599.6 TBC1D3E 3.510989e-12 7.597430e-08
## ENSG00000258035.2 RP11-74K11.2 1.521803e-07 1.343898e-03
## ENSG00000076356.7 PLXNA2 2.619744e-07 1.343898e-03
## ENSG00000204936.10 CD177 2.930508e-07 1.343898e-03
## ENSG00000159339.13 PADI4 3.105268e-07 1.343898e-03
## ENSG00000283345.1 CTD-3092A11.3 4.432785e-07 1.598684e-03
## ENSG00000235750.10 KIAA0040 5.333326e-07 1.648683e-03
## ENSG00000211966.2 IGHV5-51 7.290485e-07 1.971985e-03
## ENSG00000176597.12 B3GNT5 8.879630e-07 2.075999e-03
## ENSG00000203999.9 LINC01270 9.593786e-07 2.075999e-03
mean(abs(dge$stat))
## [1] 1.342191
crp_eos_a <- dge
# model with clinical and cell-type covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
    Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + crp_group)
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 40 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000278599.6 TBC1D3E 21.721315 -10.2553792 1.57048944 -6.530053
## ENSG00000283345.1 CTD-3092A11.3 36.247361 -1.0143138 0.19599438 -5.175219
## ENSG00000282339.1 LLNLF-176F2.1 8.210905 -7.0092066 1.56534067 -4.477752
## ENSG00000228668.1 TRGV5P 68.256402 -0.6121273 0.14417889 -4.245609
## ENSG00000102524.12 TNFSF13B 1626.167180 0.2026085 0.04878493 4.153095
## ENSG00000233937.7 CTC-338M12.4 139.717383 -0.1962152 0.04828903 -4.063350
## ENSG00000087116.16 ADAMTS2 779.297104 -0.8224640 0.21388124 -3.845424
## ENSG00000204001.10 LCN8 74.503582 1.3227176 0.35412841 3.735135
## ENSG00000260271.3 RP1-45N11.1 58.349444 0.2237876 0.05997524 3.731333
## ENSG00000258035.2 RP11-74K11.2 20.515904 0.4374891 0.12062897 3.626734
## pvalue padj
## ENSG00000278599.6 TBC1D3E 6.574645e-11 1.450827e-06
## ENSG00000283345.1 CTD-3092A11.3 2.276443e-07 2.511713e-03
## ENSG00000282339.1 LLNLF-176F2.1 7.543333e-06 5.548624e-02
## ENSG00000228668.1 TRGV5P 2.180001e-05 1.202652e-01
## ENSG00000102524.12 TNFSF13B 3.280081e-05 1.447631e-01
## ENSG00000233937.7 CTC-338M12.4 4.837345e-05 1.779095e-01
## ENSG00000087116.16 ADAMTS2 1.203443e-04 3.793770e-01
## ENSG00000204001.10 LCN8 1.876142e-04 4.670096e-01
## ENSG00000260271.3 RP1-45N11.1 1.904693e-04 4.670096e-01
## ENSG00000258035.2 RP11-74K11.2 2.870291e-04 5.974092e-01
mean(abs(dge$stat))
## [1] 0.8431272
crp_eos_a_adj <- dge
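Each model above is summarised with mean(abs(dge$stat)), and the padj column indicates how many genes survive multiple-testing correction. To compare models side by side, the two summaries could be collected into one table. The sketch below assumes a hypothetical helper summarise_de() and uses simulated result tables, not the study results.

```r
# summarise_de() is a hypothetical helper; alpha is the padj cutoff
summarise_de <- function(dge, alpha = 0.05) {
  c(mean_abs_stat = mean(abs(dge$stat), na.rm = TRUE),
    n_sig = sum(dge$padj < alpha, na.rm = TRUE))
}

# Simulated result tables with the same column names as a DESeq2 results table
set.seed(1)
make_demo <- function(n) {
  stat <- rnorm(n)
  pvalue <- 2 * pnorm(-abs(stat))
  data.frame(stat = stat, pvalue = pvalue,
    padj = p.adjust(pvalue, method = "BH"))
}
demo_list <- list(model_a = make_demo(1000), model_b = make_demo(1000))

# One row per model, one column per summary
summary_tab <- t(sapply(demo_list, summarise_de))
summary_tab
```

In this report the list would presumably hold objects such as crp_t0_a, crp_t0_a_adj, crp_eos_a and crp_eos_a_adj.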
# EOS samples, treatment group 2
mx <- xeosf
ss2 <- as.data.frame(cbind(ss_eos, sscell_eos))
ss2 <- subset(ss2, treatment_group == 2)
mx <- mx[, colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ crp_group)
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 128 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000254873.1 RP11-770J1.5 47.54511 -0.5668914 0.10956959 -5.173802
## ENSG00000224370.1 RP11-814E24.3 49.76184 0.4472603 0.09003302 4.967737
## ENSG00000241860.7 RP11-34P13.13 4903.99846 0.3733341 0.07548746 4.945644
## ENSG00000236911.6 RP11-78B10.2 74.74105 0.5093406 0.10317766 4.936539
## ENSG00000238035.8 AC138035.2 1313.25226 0.3740127 0.07824059 4.780290
## ENSG00000280279.1 LINC02887 485.42034 0.3717109 0.07800330 4.765323
## ENSG00000101187.16 SLCO4A1 97.59953 0.5650296 0.11859698 4.764283
## ENSG00000260528.5 FAM157C 3056.40251 0.3665312 0.07752802 4.727726
## ENSG00000230724.9 LINC01001 3269.93917 0.3263093 0.06964405 4.685387
## ENSG00000264769.1 RP11-498C9.12 41.75148 0.2941786 0.06310258 4.661910
## pvalue padj
## ENSG00000254873.1 RP11-770J1.5 2.293775e-07 0.004386980
## ENSG00000224370.1 RP11-814E24.3 6.773878e-07 0.004386980
## ENSG00000241860.7 RP11-34P13.13 7.589257e-07 0.004386980
## ENSG00000236911.6 RP11-78B10.2 7.952109e-07 0.004386980
## ENSG00000238035.8 AC138035.2 1.750423e-06 0.005974689
## ENSG00000280279.1 LINC02887 1.885513e-06 0.005974689
## ENSG00000101187.16 SLCO4A1 1.895265e-06 0.005974689
## ENSG00000260528.5 FAM157C 2.270486e-06 0.006262853
## ENSG00000230724.9 LINC01001 2.794320e-06 0.006349407
## ENSG00000264769.1 RP11-498C9.12 3.132880e-06 0.006349407
mean(abs(dge$stat))
## [1] 1.041369
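The base model output reports "replacing outliers and refitting" with minReplicatesForReplace = 7. To check how sensitive the results are to this automatic replacement, DESeq() can be run with minReplicatesForReplace = Inf, which disables it. The sketch below uses simulated data as a stand-in for the study counts, and all object names are illustrative.

```r
suppressPackageStartupMessages(library("DESeq2"))

# Simulated data with 8 samples per group (assumption: stands in for mx and ss2)
set.seed(1)
dds_demo <- makeExampleDESeqDataSet(n = 500, m = 16)

# Default behaviour: outlier counts may be replaced and the affected genes refitted
res_default <- DESeq(dds_demo, quiet = TRUE)

# Outlier replacement switched off
res_noreplace <- DESeq(dds_demo, minReplicatesForReplace = Inf, quiet = TRUE)

# Number of genes with padj < 0.05 under each setting
n_sig <- c(
  default = sum(results(res_default)$padj < 0.05, na.rm = TRUE),
  no_replacement = sum(results(res_noreplace)$padj < 0.05, na.rm = TRUE)
)
n_sig
```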
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + crp_group)
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 6 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000181126.13 HLA-V 396.17882 -0.8065180 0.16968812 -4.752943
## ENSG00000254873.1 RP11-770J1.5 47.54511 -0.5826915 0.13036812 -4.469586
## ENSG00000175874.10 CREG2 15.85760 0.3267149 0.07358462 4.439989
## ENSG00000074803.20 SLC12A1 36.66028 0.8797267 0.20623352 4.265682
## ENSG00000147852.17 VLDLR 60.44142 0.2384517 0.05590948 4.264959
## ENSG00000136274.9 NACAD 13.87223 -0.6254904 0.14828300 -4.218221
## ENSG00000158321.18 AUTS2 1107.33233 0.2789351 0.07162732 3.894256
## ENSG00000241484.10 ARHGAP8 48.02920 0.3161707 0.08197094 3.857107
## ENSG00000179841.8 AKAP5 164.98261 0.2475874 0.06420904 3.855958
## ENSG00000043514.17 TRIT1 438.15726 -0.1127517 0.02930561 -3.847443
## pvalue padj
## ENSG00000181126.13 HLA-V 2.004768e-06 0.04423922
## ENSG00000254873.1 RP11-770J1.5 7.837117e-06 0.06617416
## ENSG00000175874.10 CREG2 8.996352e-06 0.06617416
## ENSG00000074803.20 SLC12A1 1.992921e-05 0.08824094
## ENSG00000147852.17 VLDLR 1.999387e-05 0.08824094
## ENSG00000136274.9 NACAD 2.462376e-05 0.09056209
## ENSG00000158321.18 AUTS2 9.850051e-05 0.26338539
## ENSG00000241484.10 ARHGAP8 1.147368e-04 0.26338539
## ENSG00000179841.8 AKAP5 1.152772e-04 0.26338539
## ENSG00000043514.17 TRIT1 1.193571e-04 0.26338539
mean(abs(dge$stat))
## [1] 0.8858679
crp_eos_b <- dge
# model with clinical and cell-type covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
    Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + crp_group)
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 43 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000074803.20 SLC12A1 36.66028 0.9974386 0.22461306 4.440697
## ENSG00000099139.14 PCSK5 1284.53014 0.3871194 0.09428587 4.105805
## ENSG00000175874.10 CREG2 15.85760 0.2917051 0.07128112 4.092320
## ENSG00000112799.9 LY86 950.19080 -0.1665188 0.04275946 -3.894314
## ENSG00000181126.13 HLA-V 396.17882 -0.7233685 0.18577054 -3.893882
## ENSG00000167680.17 SEMA6B 577.15312 0.6250133 0.16781047 3.724519
## ENSG00000225813.1 AC009299.4 13.97230 0.6532641 0.17553863 3.721484
## ENSG00000264522.6 OTUD7B 213.51620 0.1084653 0.02937036 3.693019
## ENSG00000254873.1 RP11-770J1.5 47.54511 -0.4819888 0.13089433 -3.682274
## ENSG00000188599.17 NPIPP1 219.50741 -0.1879393 0.05163696 -3.639628
## pvalue padj
## ENSG00000074803.20 SLC12A1 8.966780e-06 0.1978699
## ENSG00000099139.14 PCSK5 4.029098e-05 0.3141451
## ENSG00000175874.10 CREG2 4.270791e-05 0.3141451
## ENSG00000112799.9 LY86 9.847704e-05 0.4353942
## ENSG00000181126.13 HLA-V 9.865279e-05 0.4353942
## ENSG00000167680.17 SEMA6B 1.956880e-04 0.5667857
## ENSG00000225813.1 AC009299.4 1.980558e-04 0.5667857
## ENSG00000264522.6 OTUD7B 2.216079e-04 0.5667857
## ENSG00000254873.1 RP11-770J1.5 2.311629e-04 0.5667857
## ENSG00000188599.17 NPIPP1 2.730319e-04 0.6024996
mean(abs(dge$stat))
## [1] 0.812993
crp_eos_b_adj <- dge
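The messages printed by `DESeqDataSetFromMatrix` in the models above point at two things in the sample sheet: integer-coded group variables that are treated as numeric, and numeric covariates with a large mean or standard deviation. A minimal sketch of how a sample sheet could be prepared before building the design is shown here. The data frame `ss_demo` and all of its values are hypothetical; only the column names `crp_group`, `sexD` and `Neutrophils.LD` are taken from the analysis. For a two-level variable coded 1/2, converting to a factor should in principle leave the estimated coefficient unchanged and mainly makes the reference level explicit.

```r
# Hypothetical sample sheet; the values are made up for illustration only.
ss_demo <- data.frame(
  crp_group      = c(1, 2, 1, 2, 2, 1),
  sexD           = c(1, 1, 2, 2, 1, 2),
  Neutrophils.LD = c(55.2, 61.8, 48.9, 70.3, 66.1, 52.4),
  row.names      = paste0("sample", 1:6)
)

# Make the integer-coded groups explicit factors; the first level is the reference.
ss_demo$crp_group <- factor(ss_demo$crp_group, levels = c(1, 2))
ss_demo$sexD      <- factor(ss_demo$sexD, levels = c(1, 2))

# Centre and scale a numeric covariate (mean 0, standard deviation 1).
ss_demo$Neutrophils.LD <- as.numeric(scale(ss_demo$Neutrophils.LD))

str(ss_demo)
```

Whether scaling also reduces the number of rows that fail to converge in this dataset would need to be checked by rerunning the models.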
mx <- xpod1f
ss2 <- as.data.frame(cbind(ss_pod1,sscell_pod1))
ss2 <- subset(ss2,treatment_group==1)
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ crp_group )
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 189 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000007968.7 E2F2 650.98716 0.3793948 0.06004490 6.318518
## ENSG00000137869.15 CYP19A1 60.07323 0.9765750 0.15547632 6.281181
## ENSG00000157064.11 NMNAT2 41.43058 0.4835623 0.07985053 6.055844
## ENSG00000229647.2 MYOSLID 64.61066 0.3230300 0.05367352 6.018424
## ENSG00000165092.13 ALDH1A1 556.69422 -0.4848466 0.08067903 -6.009574
## ENSG00000145287.11 PLAC8 3561.63102 0.2630043 0.04382192 6.001663
## ENSG00000138821.14 SLC39A8 521.43838 0.2641136 0.04467962 5.911276
## ENSG00000132170.24 PPARG 144.66113 0.5565381 0.09494223 5.861861
## ENSG00000198018.7 ENTPD7 368.85840 0.2161036 0.03790805 5.700732
## ENSG00000116016.14 EPAS1 116.95950 0.3548271 0.06246220 5.680670
## pvalue padj
## ENSG00000007968.7 E2F2 2.640843e-10 2.955877e-06
## ENSG00000137869.15 CYP19A1 3.360096e-10 2.955877e-06
## ENSG00000157064.11 NMNAT2 1.396832e-09 5.727059e-06
## ENSG00000229647.2 MYOSLID 1.761237e-09 5.727059e-06
## ENSG00000165092.13 ALDH1A1 1.860109e-09 5.727059e-06
## ENSG00000145287.11 PLAC8 1.953072e-09 5.727059e-06
## ENSG00000138821.14 SLC39A8 3.394684e-09 8.532295e-06
## ENSG00000132170.24 PPARG 4.577089e-09 1.006616e-05
## ENSG00000198018.7 ENTPD7 1.192940e-08 2.332066e-05
## ENSG00000116016.14 EPAS1 1.341684e-08 2.360559e-05
mean(abs(dge$stat))
## [1] 1.199973
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + crp_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 24 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000274611.4 TBC1D3 51.46372 -29.9971014 1.33164948 -22.526274
## ENSG00000278599.6 TBC1D3E 16.34998 -24.6438767 1.33044880 -18.522980
## ENSG00000124508.17 BTN2A2 1048.79703 -0.1524054 0.02952679 -5.161599
## ENSG00000215883.11 CYB5RL 377.01565 -0.1494372 0.02926069 -5.107098
## ENSG00000145287.11 PLAC8 3561.63102 0.2386112 0.05025490 4.748019
## ENSG00000116016.14 EPAS1 116.95950 0.3652741 0.07758263 4.708195
## ENSG00000157064.11 NMNAT2 41.43058 0.4652492 0.10224666 4.550263
## ENSG00000132170.24 PPARG 144.66113 0.5281924 0.11718850 4.507203
## ENSG00000128928.10 IVD 1354.85885 -0.1506996 0.03353677 -4.493564
## ENSG00000229647.2 MYOSLID 64.61066 0.2852099 0.06372435 4.475682
## pvalue padj
## ENSG00000274611.4 TBC1D3 2.294649e-112 3.942437e-108
## ENSG00000278599.6 TBC1D3E 1.347666e-76 NA
## ENSG00000124508.17 BTN2A2 2.448497e-07 1.873552e-03
## ENSG00000215883.11 CYB5RL 3.271437e-07 1.873552e-03
## ENSG00000145287.11 PLAC8 2.054189e-06 8.587756e-03
## ENSG00000116016.14 EPAS1 2.499202e-06 8.587756e-03
## ENSG00000157064.11 NMNAT2 5.357881e-06 1.454044e-02
## ENSG00000132170.24 PPARG 6.568765e-06 1.454044e-02
## ENSG00000128928.10 IVD 7.004101e-06 1.454044e-02
## ENSG00000229647.2 MYOSLID 7.616782e-06 1.454044e-02
mean(abs(dge$stat))
## [1] 0.9984722
crp_pod1_a <- dge
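The two top hits of this clinical-covariate model, TBC1D3 and TBC1D3E, have log2 fold changes near -30 and -25, far outside the range of the other genes, and TBC1D3E has an `NA` adjusted p-value. In DESeq2, a non-missing p-value with a missing `padj` usually means the gene was removed by independent filtering. Estimates of this size are worth checking against the raw counts before they are interpreted. The sketch below shows one way to flag them; the cutoff of 10 and the helper name `flag_extreme_lfc` are hypothetical choices, and the toy table re-enters the first four rows printed above.

```r
# Toy table reusing the first four rows printed for the clinical-covariate model.
dge_demo <- data.frame(
  baseMean       = c(51.46372, 16.34998, 1048.79703, 377.01565),
  log2FoldChange = c(-29.9971014, -24.6438767, -0.1524054, -0.1494372),
  padj           = c(3.942437e-108, NA, 1.873552e-03, 1.873552e-03),
  row.names      = c("ENSG00000274611.4 TBC1D3", "ENSG00000278599.6 TBC1D3E",
                     "ENSG00000124508.17 BTN2A2", "ENSG00000215883.11 CYB5RL")
)

# Hypothetical helper: return rows whose absolute log2 fold change exceeds a cutoff.
flag_extreme_lfc <- function(dge, cutoff = 10) {
  keep <- !is.na(dge$log2FoldChange) & abs(dge$log2FoldChange) > cutoff
  dge[keep, , drop = FALSE]
}

flagged <- flag_extreme_lfc(dge_demo)
rownames(flagged)
```

With the real objects, the flagged row names could be used to look at `mx[rownames(flagged), ]` split by `crp_group`, since `dge` keeps the row names of the count matrix.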
# model with clinical and cell-type covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
    Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + crp_group)
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 43 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000157064.11 NMNAT2 41.43058 0.4307897 0.07049745 6.110713
## ENSG00000137869.15 CYP19A1 60.07323 0.7601826 0.12811907 5.933407
## ENSG00000132170.24 PPARG 144.66113 0.4814874 0.08507043 5.659869
## ENSG00000188404.10 SELL 17514.30914 0.2026588 0.03680197 5.506737
## ENSG00000116016.14 EPAS1 116.95950 0.3446736 0.06274336 5.493387
## ENSG00000145287.11 PLAC8 3561.63102 0.2334478 0.04478158 5.213032
## ENSG00000124508.17 BTN2A2 1048.79703 -0.1590625 0.03058358 -5.200911
## ENSG00000148926.10 ADM 1409.50883 0.3668759 0.07139352 5.138785
## ENSG00000121316.11 PLBD1 16469.89344 0.2213555 0.04552841 4.861920
## ENSG00000213694.6 S1PR3 1119.21786 -0.2721949 0.05603887 -4.857252
## pvalue padj
## ENSG00000157064.11 NMNAT2 9.918731e-10 1.622109e-05
## ENSG00000137869.15 CYP19A1 2.967124e-09 2.426217e-05
## ENSG00000132170.24 PPARG 1.514889e-08 8.258167e-05
## ENSG00000188404.10 SELL 3.655462e-08 1.289671e-04
## ENSG00000116016.14 EPAS1 3.942984e-08 1.289671e-04
## ENSG00000145287.11 PLAC8 1.857793e-07 4.633183e-04
## ENSG00000124508.17 BTN2A2 1.983140e-07 4.633183e-04
## ENSG00000148926.10 ADM 2.765205e-07 5.652770e-04
## ENSG00000121316.11 PLBD1 1.162527e-06 1.835260e-03
## ENSG00000213694.6 S1PR3 1.190260e-06 1.835260e-03
mean(abs(dge$stat))
## [1] 1.026972
crp_pod1_a_adj <- dge
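Each comparison subsets the count matrix with `colnames(mx) %in% rownames(ss2)`, which keeps the matching columns but leaves them in the order they have in `mx`. A sketch of a more explicit alignment is given here, so that the columns of the counts and the rows of the sample sheet correspond one to one. The objects `mx_demo` and `ss_demo` and the helper `align_samples` are hypothetical stand-ins, not part of the original analysis.

```r
# Hypothetical toy inputs: 3 genes x 4 samples, and a sample sheet in a different order.
mx_demo <- matrix(1:12, nrow = 3,
                  dimnames = list(paste0("gene", 1:3), c("s1", "s2", "s3", "s4")))
ss_demo <- data.frame(crp_group = c(2, 1, 1),
                      row.names = c("s3", "s1", "s4"))

# Hypothetical helper: keep shared samples and put both objects in the same order.
align_samples <- function(counts, samplesheet) {
  common <- intersect(colnames(counts), rownames(samplesheet))
  list(counts      = counts[, common, drop = FALSE],
       samplesheet = samplesheet[common, , drop = FALSE])
}

aligned <- align_samples(mx_demo, ss_demo)

# Fail early if the sample order does not match.
stopifnot(identical(colnames(aligned$counts), rownames(aligned$samplesheet)))
colnames(aligned$counts)
```

The explicit `stopifnot` makes any mismatch visible before the design is built, rather than relying on the two objects happening to share an order.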
mx <- xpod1f
ss2 <- as.data.frame(cbind(ss_pod1,sscell_pod1))
ss2 <- subset(ss2,treatment_group==2)
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ crp_group )
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 155 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000163710.9 PCOLCE2 25.62512 1.3200438 0.14618904 9.029704
## ENSG00000108950.12 FAM20A 2102.38150 0.6575075 0.07427370 8.852494
## ENSG00000100985.7 MMP9 18630.15953 1.8475668 0.21250458 8.694245
## ENSG00000007968.7 E2F2 1048.78071 0.4485439 0.05226883 8.581478
## ENSG00000137869.15 CYP19A1 99.79869 1.1020306 0.13128996 8.393869
## ENSG00000132170.24 PPARG 188.27369 0.5948070 0.07128398 8.344190
## ENSG00000204044.6 SLC12A5-AS1 114.09553 1.7590874 0.21836871 8.055583
## ENSG00000104918.8 RETN 2356.87387 0.9009685 0.11335111 7.948475
## ENSG00000170439.7 METTL7B 242.42694 0.8427309 0.10841597 7.773125
## ENSG00000135424.18 ITGA7 522.88039 0.5671980 0.07353784 7.713009
## pvalue padj
## ENSG00000163710.9 PCOLCE2 1.721371e-19 3.668585e-15
## ENSG00000108950.12 FAM20A 8.558475e-19 9.119911e-15
## ENSG00000100985.7 MMP9 3.491438e-18 2.480318e-14
## ENSG00000007968.7 E2F2 9.366156e-18 4.990288e-14
## ENSG00000137869.15 CYP19A1 4.704022e-17 2.005042e-13
## ENSG00000132170.24 PPARG 7.170384e-17 2.546921e-13
## ENSG00000204044.6 SLC12A5-AS1 7.910065e-16 2.408276e-12
## ENSG00000104918.8 RETN 1.888211e-15 5.030195e-12
## ENSG00000170439.7 METTL7B 7.657300e-15 1.813249e-11
## ENSG00000135424.18 ITGA7 1.228856e-14 2.618938e-11
mean(abs(dge$stat))
## [1] 1.485697
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + crp_group )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 9 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000163710.9 PCOLCE2 25.62512 1.2259018 0.16300795 7.520503
## ENSG00000108950.12 FAM20A 2102.38150 0.5855328 0.08329079 7.029983
## ENSG00000007968.7 E2F2 1048.78071 0.3962539 0.05925388 6.687392
## ENSG00000104918.8 RETN 2356.87387 0.6987862 0.11335834 6.164401
## ENSG00000132170.24 PPARG 188.27369 0.4927135 0.08019928 6.143615
## ENSG00000135424.18 ITGA7 522.88039 0.5073466 0.08559086 5.927579
## ENSG00000169994.19 MYO7B 752.03451 0.3289064 0.05770625 5.699667
## ENSG00000170439.7 METTL7B 242.42694 0.7107035 0.12800229 5.552272
## ENSG00000050767.18 COL23A1 59.82685 0.4828160 0.08728568 5.531446
## ENSG00000137869.15 CYP19A1 99.79869 0.8331307 0.15065203 5.530166
## pvalue padj
## ENSG00000163710.9 PCOLCE2 5.456575e-14 1.095298e-09
## ENSG00000108950.12 FAM20A 2.065586e-12 2.073126e-08
## ENSG00000007968.7 E2F2 2.271825e-11 1.520078e-07
## ENSG00000104918.8 RETN 7.075058e-10 3.238341e-06
## ENSG00000132170.24 PPARG 8.066410e-10 3.238341e-06
## ENSG00000135424.18 ITGA7 3.074338e-09 1.028520e-05
## ENSG00000169994.19 MYO7B 1.200415e-08 3.442275e-05
## ENSG00000170439.7 METTL7B 2.819810e-08 6.421919e-05
## ENSG00000050767.18 COL23A1 3.176018e-08 6.421919e-05
## ENSG00000137869.15 CYP19A1 3.199282e-08 6.421919e-05
mean(abs(dge$stat))
## [1] 1.126178
crp_pod1_b <- dge
# model with clinical and cell-type covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
    Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + crp_group)
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 11 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000108950.12 FAM20A 2102.38150 0.5299315 0.07604550 6.968611
## ENSG00000163710.9 PCOLCE2 25.62512 0.9580851 0.14903229 6.428708
## ENSG00000132170.24 PPARG 188.27369 0.3979331 0.06842514 5.815597
## ENSG00000007968.7 E2F2 1048.78071 0.3171167 0.05559108 5.704454
## ENSG00000050767.18 COL23A1 59.82685 0.4708113 0.08827758 5.333306
## ENSG00000169994.19 MYO7B 752.03451 0.2899857 0.05465457 5.305789
## ENSG00000165092.13 ALDH1A1 290.37570 -0.4508214 0.08776359 -5.136770
## ENSG00000135424.18 ITGA7 522.88039 0.4244227 0.08275511 5.128659
## ENSG00000101187.16 SLCO4A1 83.64489 0.3936650 0.07837397 5.022906
## ENSG00000104918.8 RETN 2356.87387 0.4949748 0.10112306 4.894776
## pvalue padj
## ENSG00000108950.12 FAM20A 3.200849e-12 6.425064e-08
## ENSG00000163710.9 PCOLCE2 1.286929e-10 1.291626e-06
## ENSG00000132170.24 PPARG 6.041783e-09 4.042557e-05
## ENSG00000007968.7 E2F2 1.167168e-08 5.857141e-05
## ENSG00000050767.18 COL23A1 9.644083e-08 3.753197e-04
## ENSG00000169994.19 MYO7B 1.121864e-07 3.753197e-04
## ENSG00000165092.13 ALDH1A1 2.795007e-07 7.321965e-04
## ENSG00000135424.18 ITGA7 2.918135e-07 7.321965e-04
## ENSG00000101187.16 SLCO4A1 5.089559e-07 1.135141e-03
## ENSG00000104918.8 RETN 9.841752e-07 1.975535e-03
mean(abs(dge$stat))
## [1] 0.9390948
crp_pod1_b_adj <- dge
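The `mean(abs(dge$stat))` values printed for the POD1 models can be collected into one table to compare how the covariates change the overall strength of the CRP signal. The sketch below only re-enters the numbers reported in this section; the names `stat_summary` and `wide` are hypothetical labels.

```r
# Mean absolute Wald statistic per POD1 model, copied from the output above.
stat_summary <- data.frame(
  treatment_group = rep(c(1, 2), each = 3),
  model           = rep(c("base", "clinical", "clinical + cell"), times = 2),
  mean_abs_stat   = c(1.199973, 0.9984722, 1.026972,
                      1.485697, 1.126178, 0.9390948)
)

# Wide layout: one row per treatment group, one column per model.
wide <- xtabs(mean_abs_stat ~ treatment_group + model, data = stat_summary)
round(wide, 3)
```

In both treatment groups the base model gives the largest value. Adding cell-type covariates lowers it further in group 2 but not in group 1.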
sexD coding: 1 = female and 2 = male. I confirmed this coding with the expression data.
No correction for treatment group.
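One way to confirm the `sexD` coding from expression data is to compare a female-specific transcript such as XIST with a Y-linked gene such as RPS4Y1, using the matrix before the chrX and chrY genes are removed. The sketch below uses made-up counts; the matrix `counts_demo`, the vector `sexD_demo` and the choice of marker genes are assumptions for illustration, not the check that was actually run for this report.

```r
# Made-up counts for two marker genes across four hypothetical samples.
counts_demo <- matrix(c(5200,   3, 4800,    7,    # XIST
                           2, 910,    0, 1050),   # RPS4Y1
                      nrow = 2, byrow = TRUE,
                      dimnames = list(c("XIST", "RPS4Y1"), paste0("s", 1:4)))
sexD_demo <- c(s1 = 1, s2 = 2, s3 = 1, s4 = 2)   # 1 = female, 2 = male

# Infer sex from which marker dominates, then compare with the recorded coding.
inferred <- ifelse(counts_demo["XIST", ] > counts_demo["RPS4Y1", ], 1, 2)
table(recorded = sexD_demo, inferred = inferred)
```

A table with counts only on the diagonal indicates that the recorded coding agrees with the expression-based call for every sample.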
# load chromosome-to-gene table
chr2gene <- read.table("../ref/chr2gene.tsv")
xyg <- subset(chr2gene,V1=="chrX" | V1=="chrY")
mx <- xt0
dim(mx)
## [1] 60649 111
mx <- mx[which(! sapply(strsplit(rownames(mx)," "),"[[",1) %in% xyg$V2),]
dim(mx)
## [1] 57660 111
ss2 <- as.data.frame(cbind(ss_t0,sscell_t0))
ss2 <- subset(ss2,crp_group==1)
mx <- mx[,colnames(mx) %in% rownames(ss2)]
mx <- mx[which(rowMeans(mx)>10),]
dim(mx)
## [1] 21291 56
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD )
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 344 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000234551.2 LINC01309 33.64159 2.3288688 0.1992312 11.689275
## ENSG00000223078.1 RNU2-55P 14.21314 2.1704799 0.2226793 9.747112
## ENSG00000287059.1 RP11-14A10.1 32.94434 1.8071869 0.2310635 7.821169
## ENSG00000249036.1 RP11-625I7.1 28.25079 -1.5570247 0.2110902 -7.376110
## ENSG00000280384.1 RP4-695O20.1 17.21499 0.8828052 0.1408053 6.269688
## ENSG00000196415.10 PRTN3 119.82185 2.4998558 0.4519385 5.531407
## ENSG00000247081.8 BAALC-AS1 32.75231 0.7696459 0.1396175 5.512532
## ENSG00000205611.5 LINC01597 73.26068 0.7738234 0.1414218 5.471742
## ENSG00000287763.1 RP11-153P14.1 75.91258 -2.5117536 0.4793555 -5.239855
## ENSG00000164821.5 DEFA4 258.59435 2.1153822 0.4041379 5.234308
## pvalue padj
## ENSG00000234551.2 LINC01309 1.446156e-31 3.079011e-27
## ENSG00000223078.1 RNU2-55P 1.897902e-22 2.020412e-18
## ENSG00000287059.1 RP11-14A10.1 5.233482e-15 3.714202e-11
## ENSG00000249036.1 RP11-625I7.1 1.629809e-13 8.675068e-10
## ENSG00000280384.1 RP4-695O20.1 3.617720e-10 1.540498e-06
## ENSG00000196415.10 PRTN3 3.176724e-08 1.075830e-04
## ENSG00000247081.8 BAALC-AS1 3.537085e-08 1.075830e-04
## ENSG00000205611.5 LINC01597 4.456328e-08 1.185996e-04
## ENSG00000287763.1 RP11-153P14.1 1.607025e-07 3.525872e-04
## ENSG00000164821.5 DEFA4 1.656039e-07 3.525872e-04
mean(abs(dge$stat))
## [1] 0.8831267
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ wound_typeOP + duration_sx + ethnicityCAT + ageCS + sexD )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 11 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000234551.2 LINC01309 33.64159 2.4713038 0.23173499 10.664353
## ENSG00000223078.1 RNU2-55P 14.21314 2.1079567 0.25785314 8.175028
## ENSG00000287059.1 RP11-14A10.1 32.94434 1.7241550 0.26989561 6.388229
## ENSG00000249036.1 RP11-625I7.1 28.25079 -1.4908856 0.24549422 -6.072997
## ENSG00000205611.5 LINC01597 73.26068 0.8616120 0.15206704 5.666001
## ENSG00000280384.1 RP4-695O20.1 17.21499 0.8767754 0.16064316 5.457907
## ENSG00000261795.1 RP11-90P13.1 15.89334 -3.3636765 0.67220055 -5.003978
## ENSG00000184385.2 UMODL1-AS1 14.24879 3.4958582 0.72614584 4.814265
## ENSG00000196415.10 PRTN3 119.82185 2.3118201 0.48338320 4.782583
## ENSG00000128872.10 TMOD2 1918.82328 -0.4642050 0.09895551 -4.691047
## pvalue padj
## ENSG00000234551.2 LINC01309 1.494350e-26 3.181620e-22
## ENSG00000223078.1 RNU2-55P 2.957964e-16 3.148900e-12
## ENSG00000287059.1 RP11-14A10.1 1.678178e-10 1.191003e-06
## ENSG00000249036.1 RP11-625I7.1 1.255450e-09 6.682449e-06
## ENSG00000205611.5 LINC01597 1.461687e-08 6.224154e-05
## ENSG00000280384.1 RP4-695O20.1 4.817810e-08 1.709600e-04
## ENSG00000261795.1 RP11-90P13.1 5.615930e-07 1.708125e-03
## ENSG00000184385.2 UMODL1-AS1 1.477430e-06 3.931995e-03
## ENSG00000196415.10 PRTN3 1.730573e-06 4.093959e-03
## ENSG00000128872.10 TMOD2 2.718100e-06 5.787108e-03
mean(abs(dge$stat))
## [1] 0.9241176
mvf_lo_t0 <- dge
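To see how stable the sex-associated genes are between the base model and the clinical-covariate model, the top-ranked genes of the two tables can be intersected. The sketch below re-enters the ten gene symbols printed for each model; the helper name `top_overlap` is hypothetical, and with the real objects the row names of the two `dge` tables would be used instead.

```r
# Top 10 gene symbols as printed above for the two sex models.
top_base <- c("LINC01309", "RNU2-55P", "RP11-14A10.1", "RP11-625I7.1",
              "RP4-695O20.1", "PRTN3", "BAALC-AS1", "LINC01597",
              "RP11-153P14.1", "DEFA4")
top_clin <- c("LINC01309", "RNU2-55P", "RP11-14A10.1", "RP11-625I7.1",
              "LINC01597", "RP4-695O20.1", "RP11-90P13.1", "UMODL1-AS1",
              "PRTN3", "TMOD2")

# Hypothetical helper: genes shared by the first n entries of two ranked lists.
top_overlap <- function(a, b, n = 10) {
  intersect(head(a, n), head(b, n))
}

shared <- top_overlap(top_base, top_clin)
length(shared)
shared
```

Seven of the ten top genes are shared, and the four highest-ranked genes are the same in both models.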
# model with clinical and cell-type covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ wound_typeOP + duration_sx + ethnicityCAT + ageCS +
    Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + sexD)
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 32 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000234551.2 LINC01309 33.64159 2.4518641 0.2491134 9.842363
## ENSG00000223078.1 RNU2-55P 14.21314 2.0931762 0.2679161 7.812805
## ENSG00000249036.1 RP11-625I7.1 28.25079 -1.5651460 0.2478516 -6.314851
## ENSG00000287059.1 RP11-14A10.1 32.94434 1.6366468 0.2682068 6.102182
## ENSG00000205611.5 LINC01597 73.26068 0.7846061 0.1529007 5.131476
## ENSG00000110203.9 FOLR3 799.87269 1.6227754 0.3273100 4.957916
## ENSG00000261795.1 RP11-90P13.1 15.89334 -3.3514359 0.6905475 -4.853302
## ENSG00000196415.10 PRTN3 119.82185 2.4205744 0.5002323 4.838901
## ENSG00000280384.1 RP4-695O20.1 17.21499 0.8158338 0.1693220 4.818240
## ENSG00000165029.17 ABCA1 721.24100 -0.7541062 0.1577748 -4.779636
## pvalue padj
## ENSG00000234551.2 LINC01309 7.395309e-23 1.543993e-18
## ENSG00000223078.1 RNU2-55P 5.592889e-15 5.838417e-11
## ENSG00000249036.1 RP11-625I7.1 2.704224e-10 1.881960e-06
## ENSG00000287059.1 RP11-14A10.1 1.046300e-09 5.461163e-06
## ENSG00000205611.5 LINC01597 2.874783e-07 1.200394e-03
## ENSG00000110203.9 FOLR3 7.125324e-07 2.479375e-03
## ENSG00000261795.1 RP11-90P13.1 1.214223e-06 3.358953e-03
## ENSG00000196415.10 PRTN3 1.305591e-06 3.358953e-03
## ENSG00000280384.1 RP4-695O20.1 1.448304e-06 3.358953e-03
## ENSG00000165029.17 ABCA1 1.756131e-06 3.358953e-03
mean(abs(dge$stat))
## [1] 1.053754
mvf_lo_t0_adj <- dge
dim(subset(mvf_lo_t0,padj<0.05))
## [1] 19 62
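The DESeq2 messages printed during the covariate fits above point at three things in the sample sheet: integer-coded variables treated as numeric, numeric covariates with a large mean or standard deviation, and factor levels containing characters other than letters, numbers, '_' and '.'. Here is a minimal base-R sketch of how these could be tidied before building the design. The data frame and all of its values (including the ethnicity labels) are invented for illustration. Only the column names are borrowed from the design formula, and treating `sexD` as a factor is an assumption rather than something this report states.

```r
# Invented sample sheet; column names borrowed from the report's design formula
toy_ss <- data.frame(
  sexD = c(0, 1, 1, 0, 1),
  Monocytes.C = c(12.5, 30.1, 8.7, 22.4, 15.0),
  ethnicityCAT = c("Asian", "Caucasian/European", "Other", "Asian", "Other"),
  stringsAsFactors = FALSE
)

# integer-coded grouping variable -> explicit factor
toy_ss$sexD <- factor(toy_ss$sexD)

# centre and scale a numeric covariate (mean 0, sd 1)
toy_ss$Monocytes.C <- as.numeric(scale(toy_ss$Monocytes.C))

# replace characters other than letters, numbers, '_' and '.' in level names
toy_ss$ethnicityCAT <- factor(make.names(toy_ss$ethnicityCAT))

str(toy_ss)
```

If the two sex codes differ by one, converting to a factor should not change the sex coefficient; it only makes the intent explicit and silences the message. Scaling a covariate changes the units of its own coefficient but not the test for `sexD`.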
No correction for treatment group.
# load chromosome2gene table
chr2gene <- read.table("../ref/chr2gene.tsv")
xyg <- subset(chr2gene,V1=="chrX" | V1=="chrY")
mx <- xeos
dim(mx)
## [1] 60649 98
mx <- mx[which(! sapply(strsplit(rownames(mx)," "),"[[",1) %in% xyg$V2),]
dim(mx)
## [1] 57660 98
ss2 <- as.data.frame(cbind(ss_eos,sscell_eos))
ss2 <- subset(ss2,crp_group==1)
mx <- mx[,colnames(mx) %in% rownames(ss2)]
mx <- mx[which(rowMeans(mx)>10),]
dim(mx)
## [1] 21512 46
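The filtering steps above (drop chrX/chrY genes, keep one CRP group, keep genes with mean count above 10) can be sketched on toy data as follows. The objects `toy_counts`, `toy_samples` and `toy_chr2gene` are made up for illustration; only the logic mirrors the report. One difference is deliberate: the sketch indexes the count columns by `rownames(ss2)` so that column order follows the sample sheet explicitly, which assumes every sample in the sheet has a column in the count matrix.

```r
# Toy stand-ins for the count matrix, sample sheet and chr2gene table
toy_counts <- matrix(
  c(50, 60, 55, 70,
     2,  1,  0,  3,
    30, 25, 40, 35,
    90, 80, 85, 95),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("ENSG01.1 GENEA", "ENSG02.1 GENEB", "ENSG03.1 GENEX", "ENSG04.1 GENEC"),
    c("s1", "s2", "s3", "s4")))
toy_samples <- data.frame(crp_group = c(1, 4, 1, 1),
                          row.names = c("s3", "s2", "s1", "s4"))
toy_chr2gene <- data.frame(
  V1 = c("chr1", "chr2", "chrX", "chr3"),
  V2 = c("ENSG01.1", "ENSG02.1", "ENSG03.1", "ENSG04.1"))

# 1. drop genes on the sex chromosomes (gene ID is the part before the space)
xyg <- subset(toy_chr2gene, V1 %in% c("chrX", "chrY"))
gene_id <- sapply(strsplit(rownames(toy_counts), " "), "[[", 1)
mx <- toy_counts[!gene_id %in% xyg$V2, , drop = FALSE]

# 2. keep one CRP group and align count columns to the sample sheet order
ss2 <- subset(toy_samples, crp_group == 1)
mx <- mx[, rownames(ss2), drop = FALSE]

# 3. keep genes with mean count above 10
mx <- mx[rowMeans(mx) > 10, , drop = FALSE]
dim(mx)
```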
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD )
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 106 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000234551.2 LINC01309 32.07936 2.626331 0.2043337 12.853148
## ENSG00000223078.1 RNU2-55P 16.91390 2.317063 0.2134552 10.855030
## ENSG00000287059.1 RP11-14A10.1 31.85305 2.343386 0.2446115 9.580030
## ENSG00000249036.1 RP11-625I7.1 25.07900 -1.818686 0.1976662 -9.200793
## ENSG00000241111.1 PRICKLE2-AS1 13.60330 1.929835 0.2331559 8.277016
## ENSG00000261618.2 LINC02605 34.62957 1.152047 0.1606795 7.169844
## ENSG00000280384.1 RP4-695O20.1 17.77907 1.061146 0.1746189 6.076926
## ENSG00000279319.1 RP11-693M3.1 16.85722 1.165895 0.2188475 5.327430
## ENSG00000284692.2 RP1-58B11.2 15.70704 1.211100 0.2331978 5.193448
## ENSG00000159212.13 CLIC6 13.32563 1.538270 0.3144323 4.892215
## pvalue padj
## ENSG00000234551.2 LINC01309 8.257972e-38 1.776455e-33
## ENSG00000223078.1 RNU2-55P 1.887450e-27 2.030141e-23
## ENSG00000287059.1 RP11-14A10.1 9.701656e-22 6.956734e-18
## ENSG00000249036.1 RP11-625I7.1 3.553184e-20 1.910902e-16
## ENSG00000241111.1 PRICKLE2-AS1 1.263003e-16 5.433946e-13
## ENSG00000261618.2 LINC02605 7.508315e-13 2.691981e-09
## ENSG00000280384.1 RP4-695O20.1 1.225079e-09 3.764843e-06
## ENSG00000279319.1 RP11-693M3.1 9.961199e-08 2.678566e-04
## ENSG00000284692.2 RP1-58B11.2 2.064350e-07 4.934255e-04
## ENSG00000159212.13 CLIC6 9.970758e-07 2.144910e-03
mean(abs(dge$stat))
## [1] 0.9159988
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ wound_typeOP + duration_sx + ethnicityCAT + ageCS + sexD )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 2 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000234551.2 LINC01309 32.07936 2.589031 0.2311757 11.199412
## ENSG00000223078.1 RNU2-55P 16.91390 2.527264 0.2374445 10.643597
## ENSG00000287059.1 RP11-14A10.1 31.85305 2.572627 0.2805731 9.169186
## ENSG00000241111.1 PRICKLE2-AS1 13.60330 2.013881 0.2721151 7.400843
## ENSG00000249036.1 RP11-625I7.1 25.07900 -1.735733 0.2349588 -7.387393
## ENSG00000261618.2 LINC02605 34.62957 1.170551 0.1797314 6.512778
## ENSG00000184385.2 UMODL1-AS1 59.58609 5.093233 0.9088449 5.604073
## ENSG00000280384.1 RP4-695O20.1 17.77907 1.056001 0.1916923 5.508834
## ENSG00000205611.5 LINC01597 94.98889 1.075723 0.2276478 4.725385
## ENSG00000284692.2 RP1-58B11.2 15.70704 1.226824 0.2615174 4.691173
## pvalue padj
## ENSG00000234551.2 LINC01309 4.104502e-29 8.829604e-25
## ENSG00000223078.1 RNU2-55P 1.867768e-26 2.008971e-22
## ENSG00000287059.1 RP11-14A10.1 4.766060e-20 3.417583e-16
## ENSG00000241111.1 PRICKLE2-AS1 1.353225e-13 6.442204e-10
## ENSG00000249036.1 RP11-625I7.1 1.497351e-13 6.442204e-10
## ENSG00000261618.2 LINC02605 7.377374e-11 2.645034e-07
## ENSG00000184385.2 UMODL1-AS1 2.093725e-08 6.434317e-05
## ENSG00000280384.1 RP4-695O20.1 3.612176e-08 9.713142e-05
## ENSG00000205611.5 LINC01597 2.296802e-06 5.464631e-03
## ENSG00000284692.2 RP1-58B11.2 2.716431e-06 5.464631e-03
mean(abs(dge$stat))
## [1] 1.04124
mvf_lo_eos <- dge
# model with clinical and cell covariates
# Monocytes.C NK T.CD8.Memory T.CD4.Naive Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + sexD )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 44 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000234551.2 LINC01309 32.07936 2.5687275 0.2647720 9.701658
## ENSG00000223078.1 RNU2-55P 16.91390 2.4648245 0.2728490 9.033659
## ENSG00000287059.1 RP11-14A10.1 31.85305 2.4178005 0.2710450 8.920291
## ENSG00000249036.1 RP11-625I7.1 25.07900 -1.8549612 0.2709654 -6.845748
## ENSG00000241111.1 PRICKLE2-AS1 13.60330 1.8163376 0.2918429 6.223682
## ENSG00000261618.2 LINC02605 34.62957 1.1442358 0.2006206 5.703481
## ENSG00000205611.5 LINC01597 94.98889 1.0459231 0.1887536 5.541208
## ENSG00000165617.15 DACT1 178.23996 -1.3978545 0.2563630 -5.452638
## ENSG00000157985.19 AGAP1 505.51722 0.8943585 0.1906237 4.691749
## ENSG00000287763.1 RP11-153P14.1 61.09995 -3.3614571 0.7429108 -4.524712
## pvalue padj
## ENSG00000234551.2 LINC01309 2.966372e-22 6.257266e-18
## ENSG00000223078.1 RNU2-55P 1.660257e-19 1.751074e-15
## ENSG00000287059.1 RP11-14A10.1 4.650657e-19 3.270032e-15
## ENSG00000249036.1 RP11-625I7.1 7.607712e-12 4.011927e-08
## ENSG00000241111.1 PRICKLE2-AS1 4.856196e-10 2.048732e-06
## ENSG00000261618.2 LINC02605 1.173850e-08 4.126866e-05
## ENSG00000205611.5 LINC01597 3.003928e-08 9.052122e-05
## ENSG00000165617.15 DACT1 4.962809e-08 1.308569e-04
## ENSG00000157985.19 AGAP1 2.708789e-06 6.348799e-03
## ENSG00000287763.1 RP11-153P14.1 6.047789e-06 1.275721e-02
mean(abs(dge$stat))
## [1] 0.8323054
mvf_lo_eos_adj <- dge
dim(subset(mvf_lo_eos,padj<0.05))
## [1] 33 52
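The covariate-adjusted fits above report rows that "did not converge in beta". One way to follow DESeq2's own hint, sketched here, is to run the steps of `DESeq()` separately so that `maxit` can be raised in `nbinomWaldTest()`, and then drop any rows that still fail to converge. The helper name `refit_with_maxit` and the value 500 are illustrative choices and not part of this report. Note also that this step-wise route does not perform the outlier replacement that `DESeq()` may do.

```r
# Hypothetical helper: rerun the Wald test with more iterations.
# Assumes `dds` is a DESeqDataSet like the ones built in this report
# and that the DESeq2 and S4Vectors packages are installed.
refit_with_maxit <- function(dds, maxit = 500) {
  dds <- DESeq2::estimateSizeFactors(dds)
  dds <- DESeq2::estimateDispersions(dds)
  dds <- DESeq2::nbinomWaldTest(dds, maxit = maxit)
  # betaConv is FALSE for rows whose coefficients did not converge
  conv <- S4Vectors::mcols(dds)$betaConv
  # which() also discards rows where betaConv is NA
  dds[which(conv), ]
}

# Possible usage (not run here):
# res <- refit_with_maxit(dds, maxit = 500)
# z <- DESeq2::results(res)
```

Whether the non-converged genes matter depends on whether any of them sit near the top of the table, so it may be worth checking their identities before deciding to refit.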
No correction for treatment group.
# load chromosome2gene table
chr2gene <- read.table("../ref/chr2gene.tsv")
xyg <- subset(chr2gene,V1=="chrX" | V1=="chrY")
mx <- xpod1
dim(mx)
## [1] 60649 109
mx <- mx[which(! sapply(strsplit(rownames(mx)," "),"[[",1) %in% xyg$V2),]
dim(mx)
## [1] 57660 109
ss2 <- as.data.frame(cbind(ss_pod1,sscell_pod1))
ss2 <- subset(ss2,crp_group==1)
mx <- mx[,colnames(mx) %in% rownames(ss2)]
mx <- mx[which(rowMeans(mx)>10),]
dim(mx)
## [1] 20659 55
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD )
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 122 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000234551.2 LINC01309 34.12360 2.4651830 0.1761162 13.997477
## ENSG00000249036.1 RP11-625I7.1 21.88654 -1.9476850 0.1672608 -11.644596
## ENSG00000223078.1 RNU2-55P 13.46670 2.4858941 0.2197302 11.313394
## ENSG00000251199.6 RP11-400D2.2 25.27517 1.2521909 0.1550861 8.074167
## ENSG00000241111.1 PRICKLE2-AS1 10.02828 1.7763327 0.2567832 6.917637
## ENSG00000287763.1 RP11-153P14.1 69.51778 -2.6560497 0.4655662 -5.704988
## ENSG00000287059.1 RP11-14A10.1 22.95559 1.3682563 0.2444267 5.597818
## ENSG00000261618.2 LINC02605 31.80415 0.8652806 0.1596389 5.420235
## ENSG00000078114.19 NEBL 33.02077 3.1015394 0.5786268 5.360172
## ENSG00000242741.2 LINC02005 15.47253 0.9387608 0.1777857 5.280294
## pvalue padj
## ENSG00000234551.2 LINC01309 1.615029e-44 3.336487e-40
## ENSG00000249036.1 RP11-625I7.1 2.444769e-31 2.525325e-27
## ENSG00000223078.1 RNU2-55P 1.126456e-29 7.757148e-26
## ENSG00000251199.6 RP11-400D2.2 6.793902e-16 3.508881e-12
## ENSG00000241111.1 PRICKLE2-AS1 4.592394e-12 1.897485e-08
## ENSG00000287763.1 RP11-153P14.1 1.163512e-08 4.006167e-05
## ENSG00000287059.1 RP11-14A10.1 2.170659e-08 6.406234e-05
## ENSG00000261618.2 LINC02605 5.952074e-08 1.537049e-04
## ENSG00000078114.19 NEBL 8.314272e-08 1.908495e-04
## ENSG00000242741.2 LINC02005 1.289770e-07 2.664537e-04
mean(abs(dge$stat))
## [1] 0.7233669
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ wound_typeOP + duration_sx + ethnicityCAT + ageCS + sexD )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 8 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000234551.2 LINC01309 34.12360 2.5895969 0.1983491 13.055751
## ENSG00000249036.1 RP11-625I7.1 21.88654 -1.8468544 0.1885919 -9.792862
## ENSG00000223078.1 RNU2-55P 13.46670 2.4049572 0.2521087 9.539366
## ENSG00000251199.6 RP11-400D2.2 25.27517 1.2983374 0.1804491 7.195034
## ENSG00000241111.1 PRICKLE2-AS1 10.02828 2.0733786 0.2923354 7.092465
## ENSG00000261618.2 LINC02605 31.80415 0.9719705 0.1888472 5.146862
## ENSG00000184385.2 UMODL1-AS1 10.70454 4.5617023 0.9045180 5.043241
## ENSG00000205611.5 LINC01597 51.31977 0.8279942 0.1641945 5.042766
## ENSG00000287059.1 RP11-14A10.1 22.95559 1.3892825 0.2782894 4.992223
## ENSG00000242741.2 LINC02005 15.47253 0.9523263 0.1973762 4.824930
## pvalue padj
## ENSG00000234551.2 LINC01309 5.892606e-39 1.217353e-34
## ENSG00000249036.1 RP11-625I7.1 1.208266e-22 1.248078e-18
## ENSG00000223078.1 RNU2-55P 1.437083e-21 9.896229e-18
## ENSG00000251199.6 RP11-400D2.2 6.244523e-13 3.225140e-09
## ENSG00000241111.1 PRICKLE2-AS1 1.317439e-12 5.443395e-09
## ENSG00000261618.2 LINC02605 2.648805e-07 9.120278e-04
## ENSG00000184385.2 UMODL1-AS1 4.577127e-07 1.184926e-03
## ENSG00000205611.5 LINC01597 4.588514e-07 1.184926e-03
## ENSG00000287059.1 RP11-14A10.1 5.968836e-07 1.370113e-03
## ENSG00000242741.2 LINC02005 1.400521e-06 2.893337e-03
mean(abs(dge$stat))
## [1] 0.802444
mvf_lo_pod1 <- dge
# model with clinical and cell covariates
# Monocytes.C NK T.CD8.Memory T.CD4.Naive Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + sexD )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 24 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000234551.2 LINC01309 34.12360 2.6736069 0.2200505 12.149972
## ENSG00000249036.1 RP11-625I7.1 21.88654 -1.9946353 0.1980806 -10.069818
## ENSG00000223078.1 RNU2-55P 13.46670 2.0483749 0.2508718 8.165027
## ENSG00000251199.6 RP11-400D2.2 25.27517 1.2757453 0.2026393 6.295645
## ENSG00000241111.1 PRICKLE2-AS1 10.02828 1.8227198 0.3176453 5.738224
## ENSG00000287763.1 RP11-153P14.1 69.51778 -2.9784871 0.5306885 -5.612496
## ENSG00000287059.1 RP11-14A10.1 22.95559 1.3525084 0.2781883 4.861846
## ENSG00000261618.2 LINC02605 31.80415 0.9841545 0.2160895 4.554384
## ENSG00000165617.15 DACT1 136.99579 -1.0197244 0.2519796 -4.046853
## ENSG00000284692.2 RP1-58B11.2 13.53876 1.2539524 0.3136024 3.998543
## pvalue padj
## ENSG00000234551.2 LINC01309 5.738502e-34 1.185517e-29
## ENSG00000249036.1 RP11-625I7.1 7.511698e-24 7.759208e-20
## ENSG00000223078.1 RNU2-55P 3.213630e-16 2.213013e-12
## ENSG00000251199.6 RP11-400D2.2 3.061249e-10 1.581059e-06
## ENSG00000241111.1 PRICKLE2-AS1 9.567436e-09 3.953073e-05
## ENSG00000287763.1 RP11-153P14.1 1.994291e-08 6.866676e-05
## ENSG00000287059.1 RP11-14A10.1 1.162964e-06 3.432238e-03
## ENSG00000261618.2 LINC02605 5.253923e-06 1.356760e-02
## ENSG00000165617.15 DACT1 5.191081e-05 1.191584e-01
## ENSG00000284692.2 RP1-58B11.2 6.373373e-05 1.307941e-01
mean(abs(dge$stat))
## [1] 0.8050376
mvf_lo_pod1_adj <- dge
dim(subset(mvf_lo_pod1,padj<0.05))
## [1] 16 61
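The same fit, transform, bind and sort sequence is repeated for each model and subset above. As a sketch only, it could be wrapped in one function so that the three designs differ only in the formula passed in. The name `run_de` is hypothetical; the calls inside are the ones already used in this report, written with explicit namespaces so the definition does not depend on what is attached.

```r
# Hypothetical wrapper around the steps repeated in this report.
# mx: count matrix, ss2: sample sheet, design: model formula.
run_de <- function(mx, ss2, design) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = mx, colData = ss2,
                                        design = design)
  res <- DESeq2::DESeq(dds)
  z <- DESeq2::results(res)
  # variance-stabilised expression, appended to the statistics as in the report
  vsd <- DESeq2::vst(dds, blind = FALSE)
  zz <- cbind(as.data.frame(z), SummarizedExperiment::assay(vsd))
  # most significant genes first
  zz[order(zz$pvalue), ]
}

# Possible usage (not run here):
# mvf_lo_pod1 <- run_de(mx, ss2,
#   ~ wound_typeOP + duration_sx + ethnicityCAT + ageCS + sexD)
# head(mvf_lo_pod1[, 1:6], 10)
```

Because the returned table is already sorted by p-value, the extra `order()` inside `head()` would not be needed with this helper.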
No correction for treatment group.
#load chromosome2gene table
chr2gene <- read.table("../ref/chr2gene.tsv")
xyg <- subset(chr2gene,V1=="chrX" | V1=="chrY")
mx <- xt0
dim(mx)
## [1] 60649 111
mx <- mx[which(! sapply(strsplit(rownames(mx)," "),"[[",1) %in% xyg$V2),]
dim(mx)
## [1] 57660 111
ss2 <- as.data.frame(cbind(ss_t0,sscell_t0))
ss2 <- subset(ss2,crp_group==4)
mx <- mx[,colnames(mx) %in% rownames(ss2)]
mx <- mx[which(rowMeans(mx)>10),]
dim(mx)
## [1] 21177 55
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD )
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 291 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000234551.2 LINC01309 41.09906 2.7475538 0.1703948 16.124634
## ENSG00000249036.1 RP11-625I7.1 22.89777 -1.6712038 0.1890118 -8.841796
## ENSG00000223078.1 RNU2-55P 15.68780 2.0376671 0.2319994 8.783072
## ENSG00000287059.1 RP11-14A10.1 30.25752 1.8377734 0.2273394 8.083833
## ENSG00000251199.6 RP11-400D2.2 34.79657 1.1193965 0.1559826 7.176420
## ENSG00000241111.1 PRICKLE2-AS1 12.71155 1.9977146 0.2785771 7.171138
## ENSG00000282826.2 FRG1CP 568.54614 0.5373063 0.1038772 5.172515
## ENSG00000029534.21 ANK1 308.08237 -0.6663338 0.1370848 -4.860743
## ENSG00000118492.18 ADGB 28.31195 0.6425526 0.1324436 4.851519
## ENSG00000205611.5 LINC01597 72.91149 0.7235389 0.1492394 4.848175
## pvalue padj
## ENSG00000234551.2 LINC01309 1.712703e-58 3.626992e-54
## ENSG00000249036.1 RP11-625I7.1 9.419198e-19 9.973518e-15
## ENSG00000223078.1 RNU2-55P 1.590687e-18 1.122866e-14
## ENSG00000287059.1 RP11-14A10.1 6.276204e-16 3.322779e-12
## ENSG00000251199.6 RP11-400D2.2 7.156033e-13 2.625132e-09
## ENSG00000241111.1 PRICKLE2-AS1 7.437689e-13 2.625132e-09
## ENSG00000282826.2 FRG1CP 2.309639e-07 6.987318e-04
## ENSG00000029534.21 ANK1 1.169460e-06 2.638709e-03
## ENSG00000118492.18 ADGB 1.225196e-06 2.638709e-03
## ENSG00000205611.5 LINC01597 1.246026e-06 2.638709e-03
mean(abs(dge$stat))
## [1] 0.8218259
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ wound_typeOP + duration_sx + ethnicityCAT + ageCS + sexD )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 6 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000234551.2 LINC01309 41.09906 2.7234360 0.1838595 14.812594
## ENSG00000223078.1 RNU2-55P 15.68780 2.0820131 0.2407429 8.648286
## ENSG00000249036.1 RP11-625I7.1 22.89777 -1.6615975 0.2070612 -8.024668
## ENSG00000287059.1 RP11-14A10.1 30.25752 1.8229839 0.2395041 7.611493
## ENSG00000251199.6 RP11-400D2.2 34.79657 1.1448244 0.1615118 7.088176
## ENSG00000241111.1 PRICKLE2-AS1 12.71155 1.9854604 0.2902379 6.840804
## ENSG00000160789.24 LMNA 1114.84849 -0.7007711 0.1364879 -5.134309
## ENSG00000205611.5 LINC01597 72.91149 0.7420386 0.1454348 5.102207
## ENSG00000259719.6 LINC02284 54.28891 -0.9905106 0.1944177 -5.094755
## ENSG00000163735.7 CXCL5 255.03703 -1.6090605 0.3224991 -4.989348
## pvalue padj
## ENSG00000234551.2 LINC01309 1.214613e-49 2.572185e-45
## ENSG00000223078.1 RNU2-55P 5.227859e-18 5.535519e-14
## ENSG00000249036.1 RP11-625I7.1 1.018008e-15 7.186116e-12
## ENSG00000287059.1 RP11-14A10.1 2.709486e-14 1.434469e-10
## ENSG00000251199.6 RP11-400D2.2 1.358908e-12 5.755518e-09
## ENSG00000241111.1 PRICKLE2-AS1 7.874971e-12 2.779471e-08
## ENSG00000160789.24 LMNA 2.831819e-07 8.216514e-04
## ENSG00000205611.5 LINC01597 3.357159e-07 8.216514e-04
## ENSG00000259719.6 LINC02284 3.491931e-07 8.216514e-04
## ENSG00000163735.7 CXCL5 6.058330e-07 1.282973e-03
mean(abs(dge$stat))
## [1] 0.815457
mvf_hi_t0 <- dge
# model with clinical and cell covariates
# Monocytes.C NK T.CD8.Memory T.CD4.Naive Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + sexD )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 31 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000234551.2 LINC01309 41.09906 2.6846929 0.1911398 14.045707
## ENSG00000249036.1 RP11-625I7.1 22.89777 -1.6363919 0.1700257 -9.624382
## ENSG00000223078.1 RNU2-55P 15.68780 2.1673734 0.2405782 9.009017
## ENSG00000287059.1 RP11-14A10.1 30.25752 1.9148333 0.2435354 7.862649
## ENSG00000241111.1 PRICKLE2-AS1 12.71155 2.1716924 0.2840957 7.644228
## ENSG00000251199.6 RP11-400D2.2 34.79657 1.1927663 0.1641208 7.267611
## ENSG00000160789.24 LMNA 1114.84849 -0.7762545 0.1363735 -5.692123
## ENSG00000259719.6 LINC02284 54.28891 -1.0524868 0.1892147 -5.562395
## ENSG00000154917.11 RAB6B 155.08204 -0.8860746 0.1625378 -5.451499
## ENSG00000119326.15 CTNNAL1 57.45744 -0.9210509 0.1717614 -5.362386
## pvalue padj
## ENSG00000234551.2 LINC01309 8.184765e-45 1.699648e-40
## ENSG00000249036.1 RP11-625I7.1 6.308485e-22 6.550100e-18
## ENSG00000223078.1 RNU2-55P 2.079120e-19 1.439167e-15
## ENSG00000287059.1 RP11-14A10.1 3.760924e-15 1.952483e-11
## ENSG00000241111.1 PRICKLE2-AS1 2.102020e-14 8.730108e-11
## ENSG00000251199.6 RP11-400D2.2 3.659007e-13 1.266382e-09
## ENSG00000160789.24 LMNA 1.254694e-08 3.722140e-05
## ENSG00000259719.6 LINC02284 2.660978e-08 6.907234e-05
## ENSG00000154917.11 RAB6B 4.994700e-08 1.152444e-04
## ENSG00000119326.15 CTNNAL1 8.212991e-08 1.676947e-04
mean(abs(dge$stat))
## [1] 0.9985743
mvf_hi_t0_adj <- dge
dim(subset(mvf_hi_t0,padj<0.05))
## [1] 71 61
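Several fits in this report print "rows did not converge in beta", and the message itself suggests a larger `maxit` for `nbinomWaldTest`. Below is a minimal sketch of that rerun. It uses simulated counts from `makeExampleDESeqDataSet()` rather than the study data, the value `maxit = 500` is an arbitrary assumption, and the block does nothing if DESeq2 is not installed.

```r
# Sketch only: simulated counts, not the study data.
n_unconverged <- NA_integer_
if (requireNamespace("DESeq2", quietly = TRUE)) {
  suppressPackageStartupMessages(library("DESeq2"))
  set.seed(1)
  dds_demo <- makeExampleDESeqDataSet(n = 500, m = 8)
  # Run the DESeq() steps one at a time so that maxit can be raised
  dds_demo <- estimateSizeFactors(dds_demo)
  dds_demo <- estimateDispersions(dds_demo, quiet = TRUE)
  dds_demo <- nbinomWaldTest(dds_demo, maxit = 500, quiet = TRUE)
  # betaConv is FALSE for rows that still failed to converge
  n_unconverged <- sum(!mcols(dds_demo)$betaConv, na.rm = TRUE)
}
n_unconverged
```

If rows still fail to converge after raising `maxit`, one option is to drop them from the results table using the same `betaConv` column.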
Treatment group is not included as a covariate in the models below.
# load chromosome-to-gene table
chr2gene <- read.table("../ref/chr2gene.tsv")
xyg <- subset(chr2gene,V1=="chrX" | V1=="chrY")
mx <- xeos
dim(mx)
## [1] 60649 98
mx <- mx[which(! sapply(strsplit(rownames(mx)," "),"[[",1) %in% xyg$V2),]
dim(mx)
## [1] 57660 98
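The filter above removes chrX and chrY genes by matching the gene ID, which is the part of each row name before the first space, against the chromosome-to-gene table. Here is a small sketch of the same logic on toy data. The gene IDs, symbols and the two-column layout of `chr2gene_demo` are made up for illustration.

```r
# Toy chromosome-to-gene table: V1 = chromosome, V2 = gene ID (assumed layout)
chr2gene_demo <- data.frame(
  V1 = c("chr1", "chrX", "chrY", "chr2"),
  V2 = c("ENSG01.1", "ENSG02.3", "ENSG03.2", "ENSG04.5"),
  stringsAsFactors = FALSE
)
# Toy count matrix whose row names are "<gene ID> <symbol>"
mx_demo <- matrix(1:8, nrow = 4,
  dimnames = list(c("ENSG01.1 A", "ENSG02.3 B", "ENSG03.2 C", "ENSG04.5 D"),
                  c("s1", "s2")))
xyg_demo <- subset(chr2gene_demo, V1 %in% c("chrX", "chrY"))
# Gene ID is the text before the first space in the row name
gene_id <- sapply(strsplit(rownames(mx_demo), " "), "[[", 1)
# Keep only rows whose gene ID is not on a sex chromosome
mx_auto <- mx_demo[!gene_id %in% xyg_demo$V2, , drop = FALSE]
rownames(mx_auto)
```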
ss2 <- as.data.frame(cbind(ss_eos,sscell_eos))
ss2 <- subset(ss2,crp_group==4)
mx <- mx[,colnames(mx) %in% rownames(ss2)]
mx <- mx[which(rowMeans(mx)>10),]
dim(mx)
## [1] 21199 52
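The subsetting above keeps the count columns that appear in the sample sheet, but it relies on both objects already being in the same sample order. A hedged sketch of making that alignment explicit before the low-count filter is shown here. All object names and values are toy examples, not the study data.

```r
# Toy sample sheet (3 samples) and count matrix (4 samples, different order)
ss_demo <- data.frame(sexD = c(1, 2, 2), row.names = c("s3", "s1", "s2"))
mx_demo <- matrix(c(0, 50, 20, 5, 40, 30, 1, 60, 10, 3, 70, 25), nrow = 3,
  dimnames = list(c("g1", "g2", "g3"), c("s1", "s2", "s3", "s4")))
# Samples present in both objects, in sample-sheet order
shared <- intersect(rownames(ss_demo), colnames(mx_demo))
ss_demo <- ss_demo[shared, , drop = FALSE]
mx_keep <- mx_demo[, shared, drop = FALSE]
# Fail early if the two objects are not in the same order
stopifnot(identical(colnames(mx_keep), rownames(ss_demo)))
# Same expression filter as above: mean count greater than 10
mx_keep <- mx_keep[which(rowMeans(mx_keep) > 10), , drop = FALSE]
dim(mx_keep)
```

Checking the order explicitly means a mismatch is caught before the model is fitted rather than inside `DESeqDataSetFromMatrix`.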
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD )
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 124 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000234551.2 LINC01309 33.33921 2.4403190 0.1856270 13.146361
## ENSG00000287059.1 RP11-14A10.1 29.01227 1.5746345 0.2048465 7.686900
## ENSG00000223078.1 RNU2-55P 15.44212 1.6837808 0.2307396 7.297320
## ENSG00000241111.1 PRICKLE2-AS1 11.99614 1.8361849 0.2656031 6.913266
## ENSG00000249036.1 RP11-625I7.1 20.48683 -1.3690932 0.2206460 -6.204930
## ENSG00000251199.6 RP11-400D2.2 29.81371 1.3818661 0.2242179 6.163049
## ENSG00000142606.16 MMEL1 41.01197 1.2360933 0.2570226 4.809278
## ENSG00000261618.2 LINC02605 38.40968 0.6575011 0.1371833 4.792867
## ENSG00000182263.14 FIGN 14.80449 3.0725635 0.6525952 4.708223
## ENSG00000164821.5 DEFA4 396.33649 2.4120317 0.5537353 4.355929
## pvalue padj
## ENSG00000234551.2 LINC01309 1.785629e-39 3.785355e-35
## ENSG00000287059.1 RP11-14A10.1 1.507434e-14 1.597805e-10
## ENSG00000223078.1 RNU2-55P 2.935551e-13 2.074358e-09
## ENSG00000241111.1 PRICKLE2-AS1 4.736200e-12 2.510068e-08
## ENSG00000249036.1 RP11-625I7.1 5.472112e-10 2.320066e-06
## ENSG00000251199.6 RP11-400D2.2 7.135740e-10 2.521176e-06
## ENSG00000142606.16 MMEL1 1.514764e-06 4.356779e-03
## ENSG00000261618.2 LINC02605 1.644145e-06 4.356779e-03
## ENSG00000182263.14 FIGN 2.498862e-06 5.885930e-03
## ENSG00000164821.5 DEFA4 1.325038e-05 2.808948e-02
mean(abs(dge$stat))
## [1] 0.7524278
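The message about "numeric variables with integer values" appears because `sexD` is stored as a number. Later sections of this report select females with `sexD == 1` and males with `sexD == 2`. With only two codes one unit apart, the numeric fit should describe the same comparison, but converting to a factor makes the contrast explicit and silences the message. This is a minimal sketch on a toy sample sheet; the labels "female" and "male" are chosen here for illustration.

```r
# Toy sample sheet with sex coded 1/2 as in the report
ss_demo <- data.frame(sexD = c(1L, 2L, 2L, 1L), row.names = paste0("s", 1:4))
# Convert to a factor with the first level as the reference
ss_demo$sexD <- factor(ss_demo$sexD, levels = c(1, 2),
                       labels = c("female", "male"))
levels(ss_demo$sexD)
table(ss_demo$sexD)
```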
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ wound_typeOP + duration_sx + ethnicityCAT + ageCS + sexD )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
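Two of the messages above have simple remedies: centre and scale numeric covariates, and use only safe characters in factor levels. The output does not show which covariates trigger them, so the sketch below uses hypothetical columns `covar` and `group` on a toy data frame.

```r
# Toy covariates (hypothetical names and values)
ss_demo <- data.frame(
  covar = c(45, 60, 72, 51),
  group = c("Group A", "Group-B", "Group A", "Group-B"),
  stringsAsFactors = FALSE
)
# Centre and scale the numeric covariate to mean 0 and sd 1
ss_demo$covar_z <- as.numeric(scale(ss_demo$covar))
# Replace spaces and other unsafe characters in the levels with '.'
ss_demo$group_safe <- factor(make.names(ss_demo$group))
levels(ss_demo$group_safe)
```

The rescaled and renamed columns would then replace the originals in the design formula.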
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000234551.2 LINC01309 33.33921 2.3569271 0.1869109 12.609898
## ENSG00000287059.1 RP11-14A10.1 29.01227 1.5794570 0.2151250 7.342044
## ENSG00000223078.1 RNU2-55P 15.44212 1.6681236 0.2412151 6.915503
## ENSG00000241111.1 PRICKLE2-AS1 11.99614 1.8381605 0.2758089 6.664617
## ENSG00000249036.1 RP11-625I7.1 20.48683 -1.4387781 0.2362987 -6.088810
## ENSG00000251199.6 RP11-400D2.2 29.81371 1.4085373 0.2348460 5.997706
## ENSG00000279319.1 RP11-693M3.1 16.33692 1.0589761 0.2254952 4.696226
## ENSG00000261618.2 LINC02605 38.40968 0.6611660 0.1467102 4.506612
## ENSG00000282826.2 FRG1CP 514.33435 0.4766021 0.1125323 4.235247
## ENSG00000142606.16 MMEL1 41.01197 1.0664933 0.2523226 4.226705
## pvalue padj
## ENSG00000234551.2 LINC01309 1.862331e-36 3.947956e-32
## ENSG00000287059.1 RP11-14A10.1 2.103563e-13 2.229672e-09
## ENSG00000223078.1 RNU2-55P 4.662056e-12 3.294364e-08
## ENSG00000241111.1 PRICKLE2-AS1 2.653562e-11 1.406321e-07
## ENSG00000249036.1 RP11-625I7.1 1.137527e-09 4.822889e-06
## ENSG00000251199.6 RP11-400D2.2 2.001239e-09 7.070711e-06
## ENSG00000279319.1 RP11-693M3.1 2.650129e-06 8.025727e-03
## ENSG00000261618.2 LINC02605 6.587084e-06 1.745495e-02
## ENSG00000282826.2 FRG1CP 2.283009e-05 5.027092e-02
## ENSG00000142606.16 MMEL1 2.371382e-05 5.027092e-02
mean(abs(dge$stat))
## [1] 0.7620071
mvf_hi_eos <- dge
# model with clinical and cell covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + sexD )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 21 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000234551.2 LINC01309 33.33921 2.3190333 0.19888030 11.660448
## ENSG00000287059.1 RP11-14A10.1 29.01227 1.6504563 0.21782313 7.577048
## ENSG00000223078.1 RNU2-55P 15.44212 1.7428476 0.23923642 7.285043
## ENSG00000241111.1 PRICKLE2-AS1 11.99614 1.8924611 0.28170825 6.717805
## ENSG00000249036.1 RP11-625I7.1 20.48683 -1.5987534 0.25234023 -6.335705
## ENSG00000251199.6 RP11-400D2.2 29.81371 1.4028800 0.24878909 5.638832
## ENSG00000282826.2 FRG1CP 514.33435 0.4860188 0.09781564 4.968723
## ENSG00000261618.2 LINC02605 38.40968 0.6865628 0.15310432 4.484281
## ENSG00000205611.5 LINC01597 72.82931 0.7217618 0.16444591 4.389053
## ENSG00000149531.15 FRG1BP 65.48543 0.8062481 0.18386032 4.385112
## pvalue padj
## ENSG00000234551.2 LINC01309 2.029715e-31 4.302794e-27
## ENSG00000287059.1 RP11-14A10.1 3.535059e-14 3.746985e-10
## ENSG00000223078.1 RNU2-55P 3.215679e-13 2.272306e-09
## ENSG00000241111.1 PRICKLE2-AS1 1.844822e-11 9.777095e-08
## ENSG00000249036.1 RP11-625I7.1 2.362579e-10 1.001686e-06
## ENSG00000251199.6 RP11-400D2.2 1.712071e-08 6.049032e-05
## ENSG00000282826.2 FRG1CP 6.739532e-07 2.041019e-03
## ENSG00000261618.2 LINC02605 7.316020e-06 1.938654e-02
## ENSG00000205611.5 LINC01597 1.138455e-05 2.457517e-02
## ENSG00000149531.15 FRG1BP 1.159261e-05 2.457517e-02
mean(abs(dge$stat))
## [1] 0.7764473
mvf_hi_eos_adj <- dge
dim(subset(mvf_hi_eos,padj<0.05))
## [1] 8 58
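In the `dim()` output above, the first number is the count of genes with `padj<0.05` and the second is the number of columns (six statistics columns plus one per sample). The count refers to `mvf_hi_eos`, the clinical-covariate model, not to `mvf_hi_eos_adj`. The helper below, `count_sig`, is a hypothetical sketch that returns only the gene count and could be applied to both tables. It is shown on toy adjusted p-values.

```r
# Count genes below an FDR threshold, ignoring NA padj values
count_sig <- function(dge, alpha = 0.05) {
  sum(dge$padj < alpha, na.rm = TRUE)
}
# Toy results table: three of the five rows pass the threshold
dge_demo <- data.frame(padj = c(0.001, 0.04, 0.2, NA, 0.049))
count_sig(dge_demo)
# subset() also drops rows where the condition is NA, so the counts agree
nrow(subset(dge_demo, padj < 0.05))
```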
Treatment group is not included as a covariate in the models below.
# load chromosome-to-gene table
chr2gene <- read.table("../ref/chr2gene.tsv")
xyg <- subset(chr2gene,V1=="chrX" | V1=="chrY")
mx <- xpod1
dim(mx)
## [1] 60649 109
mx <- mx[which(! sapply(strsplit(rownames(mx)," "),"[[",1) %in% xyg$V2),]
dim(mx)
## [1] 57660 109
ss2 <- as.data.frame(cbind(ss_pod1,sscell_pod1))
ss2 <- subset(ss2,crp_group==4)
mx <- mx[,colnames(mx) %in% rownames(ss2)]
mx <- mx[which(rowMeans(mx)>10),]
dim(mx)
## [1] 20547 54
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD )
## converting counts to integer mode
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 231 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000234551.2 LINC01309 28.88719 2.3303369 0.2131694 10.931854
## ENSG00000251199.6 RP11-400D2.2 24.04062 1.5721928 0.1783430 8.815555
## ENSG00000249036.1 RP11-625I7.1 16.02735 -1.8716484 0.2140172 -8.745319
## ENSG00000287059.1 RP11-14A10.1 17.45351 1.2737217 0.2194248 5.804820
## ENSG00000223078.1 RNU2-55P 10.63492 1.3736437 0.2454948 5.595408
## ENSG00000142606.16 MMEL1 42.71184 1.1433146 0.2127386 5.374269
## ENSG00000280384.1 RP4-695O20.1 14.18123 0.9356208 0.1807317 5.176848
## ENSG00000162069.16 BICDL2 33.70854 -2.7279023 0.5323684 -5.124087
## ENSG00000254873.1 RP11-770J1.5 48.30434 1.8877549 0.3853291 4.899071
## ENSG00000268758.7 ADGRE4P 473.39915 -1.2894914 0.2855855 -4.515255
## pvalue padj
## ENSG00000234551.2 LINC01309 8.117291e-28 1.667779e-23
## ENSG00000251199.6 RP11-400D2.2 1.190934e-18 1.223447e-14
## ENSG00000249036.1 RP11-625I7.1 2.223873e-18 1.523056e-14
## ENSG00000287059.1 RP11-14A10.1 6.443507e-09 3.309708e-05
## ENSG00000223078.1 RNU2-55P 2.201039e-08 9.044510e-05
## ENSG00000142606.16 MMEL1 7.689389e-08 2.633103e-04
## ENSG00000280384.1 RP4-695O20.1 2.256658e-07 6.623612e-04
## ENSG00000162069.16 BICDL2 2.989829e-07 7.678627e-04
## ENSG00000254873.1 RP11-770J1.5 9.629062e-07 2.198208e-03
## ENSG00000268758.7 ADGRE4P 6.324051e-06 1.299339e-02
mean(abs(dge$stat))
## [1] 0.6861829
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ wound_typeOP + duration_sx + ethnicityCAT + ageCS + sexD )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 5 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000234551.2 LINC01309 28.88719 2.385083 0.2268342 10.514655
## ENSG00000251199.6 RP11-400D2.2 24.04062 1.570040 0.1886671 8.321744
## ENSG00000249036.1 RP11-625I7.1 16.02735 -1.776728 0.2305346 -7.706991
## ENSG00000287059.1 RP11-14A10.1 17.45351 1.416739 0.2211889 6.405110
## ENSG00000223078.1 RNU2-55P 10.63492 1.447740 0.2598982 5.570413
## ENSG00000280384.1 RP4-695O20.1 14.18123 0.920274 0.1922327 4.787291
## ENSG00000255398.3 HCAR3 337.50562 -2.477792 0.5269334 -4.702287
## ENSG00000162069.16 BICDL2 50.61023 -2.985381 0.6528841 -4.572605
## ENSG00000137261.15 KIAA0319 44.83620 -2.267737 0.5122779 -4.426770
## ENSG00000288700.1 RP11-22E12.2 31.68989 -2.035255 0.4605020 -4.419644
## pvalue padj
## ENSG00000234551.2 LINC01309 7.395101e-26 1.519471e-21
## ENSG00000251199.6 RP11-400D2.2 8.667817e-17 8.904882e-13
## ENSG00000249036.1 RP11-625I7.1 1.288186e-14 8.822788e-11
## ENSG00000287059.1 RP11-14A10.1 1.502613e-10 7.718545e-07
## ENSG00000223078.1 RNU2-55P 2.541368e-08 1.044350e-04
## ENSG00000280384.1 RP4-695O20.1 1.690472e-06 5.789023e-03
## ENSG00000255398.3 HCAR3 2.572637e-06 7.551426e-03
## ENSG00000162069.16 BICDL2 4.816974e-06 1.237180e-02
## ENSG00000137261.15 KIAA0319 9.565469e-06 2.031348e-02
## ENSG00000288700.1 RP11-22E12.2 9.886349e-06 2.031348e-02
mean(abs(dge$stat))
## [1] 0.6998276
mvf_hi_pod1 <- dge
# model with clinical and cell covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + sexD )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 27 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000234551.2 LINC01309 28.88719 2.4115310 0.2449942 9.843215
## ENSG00000251199.6 RP11-400D2.2 24.04062 1.6172506 0.2030531 7.964666
## ENSG00000249036.1 RP11-625I7.1 16.02735 -1.7745609 0.2596696 -6.833918
## ENSG00000287059.1 RP11-14A10.1 17.45351 1.5247645 0.2436644 6.257643
## ENSG00000223078.1 RNU2-55P 10.63492 1.3888619 0.2622136 5.296682
## ENSG00000175084.13 DES 28.27529 -1.6131153 0.3231908 -4.991216
## ENSG00000287763.1 RP11-153P14.1 50.70110 -2.4542507 0.5217216 -4.704138
## ENSG00000142606.16 MMEL1 42.71184 0.9755700 0.2083536 4.682281
## ENSG00000254873.1 RP11-770J1.5 48.30434 1.6729181 0.3625609 4.614171
## ENSG00000280384.1 RP4-695O20.1 14.18123 0.9225726 0.2086620 4.421374
## pvalue padj
## ENSG00000234551.2 LINC01309 7.332945e-23 1.506700e-18
## ENSG00000251199.6 RP11-400D2.2 1.656702e-15 1.702013e-11
## ENSG00000249036.1 RP11-625I7.1 8.262641e-12 5.659083e-08
## ENSG00000287059.1 RP11-14A10.1 3.908403e-10 2.007649e-06
## ENSG00000223078.1 RNU2-55P 1.179260e-07 4.846052e-04
## ENSG00000175084.13 DES 6.000027e-07 2.054709e-03
## ENSG00000287763.1 RP11-153P14.1 2.549403e-06 7.286498e-03
## ENSG00000142606.16 MMEL1 2.837007e-06 7.286498e-03
## ENSG00000254873.1 RP11-770J1.5 3.946675e-06 9.010259e-03
## ENSG00000280384.1 RP4-695O20.1 9.807504e-06 2.015148e-02
mean(abs(dge$stat))
## [1] 0.8264306
mvf_hi_pod1_adj <- dge
dim(subset(mvf_hi_pod1,padj<0.05))
## [1] 18 60
Females only (n=16), T0 versus POD1
ss2 <- merge(sscell,ss,by=0)
rownames(ss2) <- ss2$Row.names
ss2 <- subset(ss2,crp_group==4 & timepoint != "EOS" & sexD == 1 )
mx <- xx[,colnames(xx) %in% rownames(ss2)]
mx <- mx[which(rowMeans(mx)>10),]
dim(mx)
## [1] 21567 33
table(chr2gene[match(sapply(strsplit(rownames(mx)," "),"[[",1),chr2gene$V2),1])
##
## chr1 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19 chr2 chr20
## 2142 809 1139 1125 375 806 739 1104 1352 336 1548 1412 545
## chr21 chr22 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chrM chrX chrY
## 266 600 1147 760 931 1087 1110 724 805 19 654 32
ss2 <- ss2[which(rownames(ss2) %in% colnames(mx)),]
ss2 <- ss2[order(rownames(ss2)),]
ss2$timepoint <- factor(ss2$timepoint,levels=c("T0","POD1"))
# sex-chromosome gene filter is disabled here (chrX/chrY genes are retained, see table above)
#dim(mx)
#mx <- mx[which(! sapply(strsplit(rownames(mx)," "),"[[",1) %in% xyg$V2),]
#dim(mx)
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ PG_number + timepoint )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000108950.12 FAM20A 1418.5179 3.869053 0.19174287 20.17834
## ENSG00000014257.16 ACP3 869.0445 1.338041 0.07937339 16.85755
## ENSG00000156414.19 TDRD9 896.8816 2.319293 0.13761906 16.85299
## ENSG00000132170.24 PPARG 123.3816 3.213308 0.20483828 15.68705
## ENSG00000168615.13 ADAM9 1574.2542 1.624515 0.10366969 15.67010
## ENSG00000161944.16 ASGR2 2906.2035 1.620646 0.10347070 15.66285
## ENSG00000169385.3 RNASE2 1250.4443 2.346518 0.15249733 15.38727
## ENSG00000164125.16 GASK1B 3608.4252 1.654531 0.10827604 15.28067
## ENSG00000183019.7 MCEMP1 4357.8963 2.863233 0.18810447 15.22151
## ENSG00000203710.12 CR1 11629.4813 2.374281 0.16726003 14.19515
## pvalue padj
## ENSG00000108950.12 FAM20A 1.517611e-90 3.273032e-86
## ENSG00000014257.16 ACP3 9.233298e-64 7.170255e-60
## ENSG00000156414.19 TDRD9 9.973926e-64 7.170255e-60
## ENSG00000132170.24 PPARG 1.854888e-55 9.757214e-52
## ENSG00000168615.13 ADAM9 2.421816e-55 9.757214e-52
## ENSG00000161944.16 ASGR2 2.714484e-55 9.757214e-52
## ENSG00000169385.3 RNASE2 1.992547e-53 6.139037e-50
## ENSG00000164125.16 GASK1B 1.028698e-52 2.773242e-49
## ENSG00000183019.7 MCEMP1 2.546033e-52 6.101143e-49
## ENSG00000203710.12 CR1 9.817353e-46 2.117308e-42
mean(abs(dge$stat))
## [1] 2.247041
surgfemale <- dge
dim(subset(surgfemale,padj<0.05))
## [1] 8135 39
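The design `~ PG_number + timepoint` treats participant as a blocking factor, so the `timepoint` coefficient is a within-participant POD1 versus T0 effect. The sketch below uses base R `model.matrix` on a toy sample sheet to show the columns such a design produces. The participant IDs `P1` to `P3` are made up.

```r
# Toy paired layout: three participants, each sampled at T0 and POD1
coldata_demo <- data.frame(
  PG_number = factor(rep(c("P1", "P2", "P3"), each = 2)),
  timepoint = factor(rep(c("T0", "POD1"), times = 3),
                     levels = c("T0", "POD1"))
)
# One column per non-reference participant, plus the POD1 effect
mm <- model.matrix(~ PG_number + timepoint, data = coldata_demo)
colnames(mm)
```

Because `timepoint` is the last term in the formula and T0 is the reference level, `results()` reports POD1 versus T0 by default.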
# model with clinical and cell covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ PG_number + Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + timepoint )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## 2 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000165092.13 ALDH1A1 556.0900 -2.9820987 0.4109257 -7.257027
## ENSG00000108950.12 FAM20A 1418.5179 2.5446023 0.3561134 7.145483
## ENSG00000152518.8 ZFP36L2 21784.6356 -1.3510264 0.2412505 -5.600098
## ENSG00000161944.16 ASGR2 2906.2035 1.1293842 0.2188673 5.160131
## ENSG00000132170.24 PPARG 123.3816 2.4161730 0.5013513 4.819322
## ENSG00000204642.14 HLA-F 11602.3630 -0.6359361 0.1322543 -4.808434
## ENSG00000135218.19 CD36 10743.9529 1.0554441 0.2204909 4.786793
## ENSG00000156414.19 TDRD9 896.8816 1.5403556 0.3221443 4.781570
## ENSG00000203710.12 CR1 11629.4813 1.5053257 0.3160264 4.763290
## ENSG00000019169.11 MARCO 1204.4460 1.3706841 0.2912783 4.705754
## pvalue padj
## ENSG00000165092.13 ALDH1A1 3.956894e-13 5.721273e-09
## ENSG00000108950.12 FAM20A 8.967968e-13 6.483393e-09
## ENSG00000152518.8 ZFP36L2 2.142310e-08 1.032522e-04
## ENSG00000161944.16 ASGR2 2.467767e-07 8.920360e-04
## ENSG00000132170.24 PPARG 1.440472e-06 3.059869e-03
## ENSG00000204642.14 HLA-F 1.521174e-06 3.059869e-03
## ENSG00000135218.19 CD36 1.694679e-06 3.059869e-03
## ENSG00000156414.19 TDRD9 1.739316e-06 3.059869e-03
## ENSG00000203710.12 CR1 1.904614e-06 3.059869e-03
## ENSG00000019169.11 MARCO 2.529292e-06 3.657104e-03
mean(abs(dge$stat))
## [1] 0.7913256
surgfemale_adj <- dge
dim(subset(surgfemale_adj,padj<0.05))
## [1] 41 39
(dim(subset(surgfemale,padj<0.05))[1] - dim(subset(surgfemale_adj,padj<0.05))[1]) / dim(subset(surgfemale,padj<0.05))[1]
## [1] 0.99496
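The ratio above is the proportion of significant genes that are no longer significant after adding the cell-type covariates. The same arithmetic can be written as a small helper. The name `prop_lost` is hypothetical, and the inputs are the two gene counts reported above.

```r
# Proportion of base-model hits that are lost in the adjusted model
prop_lost <- function(n_base, n_adj) {
  (n_base - n_adj) / n_base
}
# 8135 significant genes in the base model, 41 after adjustment
prop_lost(8135, 41)
```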
Males only (n=38), T0 versus POD1
ss2 <- merge(sscell,ss,by=0)
rownames(ss2) <- ss2$Row.names
ss2 <- subset(ss2,crp_group==4 & timepoint != "EOS" & sexD == 2 )
mx <- xx[,colnames(xx) %in% rownames(ss2)]
mx <- mx[which(rowMeans(mx)>10),]
table(chr2gene[match(sapply(strsplit(rownames(mx)," "),"[[",1),chr2gene$V2),1])
##
## chr1 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19 chr2 chr20
## 2146 816 1148 1130 380 814 743 1109 1354 334 1553 1416 548
## chr21 chr22 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chrM chrX chrY
## 264 606 1150 758 937 1084 1120 730 805 18 647 48
ss2 <- ss2[which(rownames(ss2) %in% colnames(mx)),]
ss2 <- ss2[order(rownames(ss2)),]
ss2$timepoint <- factor(ss2$timepoint,levels=c("T0","POD1"))
# sex-chromosome gene filter is disabled here (chrX/chrY genes are retained, see table above)
#dim(mx)
#mx <- mx[which(! sapply(strsplit(rownames(mx)," "),"[[",1) %in% xyg$V2),]
#dim(mx)
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ PG_number + timepoint )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## 1 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000108950.12 FAM20A 1528.14262 3.829099 0.16185989 23.65687
## ENSG00000132170.24 PPARG 149.75730 3.301257 0.15532787 21.25348
## ENSG00000170439.7 METTL7B 166.27177 4.936676 0.24257968 20.35074
## ENSG00000121316.11 PLBD1 15706.49246 2.007735 0.11104993 18.07957
## ENSG00000163221.9 S100A12 16539.12795 3.113329 0.18300044 17.01268
## ENSG00000174705.13 SH3PXD2B 504.64277 3.214712 0.19136611 16.79875
## ENSG00000137869.15 CYP19A1 81.26793 6.512124 0.39413773 16.52246
## ENSG00000168615.13 ADAM9 1617.51324 1.541855 0.09517653 16.19995
## ENSG00000166033.13 HTRA1 134.28398 2.514197 0.15543931 16.17478
## ENSG00000169385.3 RNASE2 1315.10460 2.148848 0.13303322 16.15272
## pvalue padj
## ENSG00000108950.12 FAM20A 1.002886e-123 2.172051e-119
## ENSG00000132170.24 PPARG 3.061301e-100 3.315083e-96
## ENSG00000170439.7 METTL7B 4.573255e-92 3.301585e-88
## ENSG00000121316.11 PLBD1 4.616370e-73 2.499533e-69
## ENSG00000163221.9 S100A12 6.613770e-65 2.864821e-61
## ENSG00000174705.13 SH3PXD2B 2.492307e-63 8.996398e-60
## ENSG00000137869.15 CYP19A1 2.528798e-61 7.824100e-58
## ENSG00000168615.13 ADAM9 5.047116e-59 1.366380e-55
## ENSG00000166033.13 HTRA1 7.596758e-59 1.828118e-55
## ENSG00000169385.3 RNASE2 1.086669e-58 2.353508e-55
mean(abs(dge$stat))
## [1] 3.040762
surgmale <- dge
dim(subset(surgmale,padj<0.05))
## [1] 11793 82
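The base design `~ PG_number + timepoint` treats each patient as a blocking factor, so the timepoint coefficient is estimated from within-patient (POD1 vs T0) differences. As a minimal sketch, assuming a hypothetical three-patient sample sheet (`toy` is invented for illustration and is not part of this study), the corresponding design matrix can be inspected with base R:

```r
# Hypothetical sample sheet: 3 patients, each sampled at T0 and POD1
toy <- data.frame(
  PG_number = rep(c("PG1", "PG2", "PG3"), each = 2),
  timepoint = factor(rep(c("T0", "POD1"), times = 3), levels = c("T0", "POD1"))
)

# One intercept, one column per additional patient, one column for POD1 vs T0
design <- model.matrix(~ PG_number + timepoint, data = toy)
colnames(design)
```

With many patients most columns of the design are patient indicators, which is expected for a paired analysis.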
# model with clinical and cell covariates
# Monocytes.C NK T.CD8.Memory T.CD4.Naive Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ PG_number + Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + timepoint )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000108950.12 FAM20A 1528.1426 2.7390970 0.2890367 9.476642
## ENSG00000163221.9 S100A12 16539.1280 1.7317637 0.2119146 8.171988
## ENSG00000132170.24 PPARG 149.7573 2.1957935 0.2737136 8.022231
## ENSG00000137959.17 IFI44L 1295.0650 -2.1478366 0.2917842 -7.361045
## ENSG00000088827.13 SIGLEC1 1468.6407 -2.0825545 0.2913658 -7.147559
## ENSG00000170439.7 METTL7B 166.2718 3.3486625 0.4788782 6.992723
## ENSG00000183019.7 MCEMP1 5938.1573 1.5733238 0.2251231 6.988727
## ENSG00000165092.13 ALDH1A1 518.0640 -2.2354764 0.3218900 -6.944846
## ENSG00000174705.13 SH3PXD2B 504.6428 2.3935512 0.3493225 6.851982
## ENSG00000116574.6 RHOU 1117.3491 0.9279911 0.1416458 6.551490
## pvalue padj
## ENSG00000108950.12 FAM20A 2.625986e-21 5.246195e-17
## ENSG00000163221.9 S100A12 3.033483e-16 3.030146e-12
## ENSG00000132170.24 PPARG 1.038414e-15 6.915143e-12
## ENSG00000137959.17 IFI44L 1.824764e-13 9.113785e-10
## ENSG00000088827.13 SIGLEC1 8.833456e-13 3.529496e-09
## ENSG00000170439.7 METTL7B 2.696019e-12 7.916782e-09
## ENSG00000183019.7 MCEMP1 2.773925e-12 7.916782e-09
## ENSG00000165092.13 ALDH1A1 3.788738e-12 9.461427e-09
## ENSG00000174705.13 SH3PXD2B 7.283397e-12 1.616752e-08
## ENSG00000116574.6 RHOU 5.696589e-11 1.138065e-07
mean(abs(dge$stat))
## [1] 0.9878333
surgmale_adj <- dge
dim(subset(surgmale_adj,padj<0.05))
## [1] 487 82
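The DESeq2 message above recommends centring and scaling numeric covariates before they enter the design. A minimal sketch of that step, assuming a hypothetical covariate table (`toy_ss` and its values are invented; only the column names mirror the design formula):

```r
# Hypothetical covariate table; values are made up for illustration
toy_ss <- data.frame(
  Monocytes.C    = c(12.1, 8.4, 15.3, 10.2),
  Neutrophils.LD = c(55.0, 61.2, 48.7, 58.9)
)
cellcols <- c("Monocytes.C", "Neutrophils.LD")

# scale() centres each column to mean 0 and rescales it to SD 1
toy_ss[, cellcols] <- scale(toy_ss[, cellcols])

round(colMeans(toy_ss), 10)
apply(toy_ss, 2, sd)
```

The scaled columns could then replace the raw ones in the sample sheet passed as `colData`. This is a reparameterisation of the covariates, so it is intended to help convergence rather than to change what the timepoint coefficient means.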
table(chr2gene[match(sapply(strsplit(rownames(surgmale_adj)," "),"[[",1),chr2gene$V2),1])
##
## chr1 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19 chr2 chr20
## 2146 816 1148 1130 380 814 743 1109 1354 334 1553 1416 548
## chr21 chr22 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chrM chrX chrY
## 264 606 1150 758 937 1084 1120 730 805 18 647 48
(dim(subset(surgmale,padj<0.05))[1] - dim(subset(surgmale_adj,padj<0.05))[1]) / dim(subset(surgmale,padj<0.05))[1]
## [1] 0.9587043
surgmale_up <- rownames(subset(surgmale,padj<0.05 & log2FoldChange >0))
surgmale_dn <- rownames(subset(surgmale,padj<0.05 & log2FoldChange <0))
surgfemale_up <- rownames(subset(surgfemale,padj<0.05 & log2FoldChange >0))
surgfemale_dn <- rownames(subset(surgfemale,padj<0.05 & log2FoldChange <0))
v1 <- list("male_up"=surgmale_up, "male_dn"=surgmale_dn,
"female_up"=surgfemale_up,"female_dn"=surgfemale_dn)
plot(euler(v1),quantities = TRUE)
common=3541+3402
uniq=1700+472+684+3114
common/(common+uniq) #54% common
## [1] 0.5376752
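The counts above were copied by hand from the Euler diagram. As a hedged alternative, the same proportion could be derived directly from the gene lists with set operations. The sketch below uses hypothetical gene sets in place of `surgmale_up`, `surgmale_dn`, `surgfemale_up` and `surgfemale_dn`:

```r
# Hypothetical gene sets standing in for the up/down lists of each sex
male_up   <- c("g1", "g2", "g3", "g4")
male_dn   <- c("g5", "g6", "g7")
female_up <- c("g1", "g2", "g8")
female_dn <- c("g5", "g6", "g9", "g10")

# Genes changing in the same direction in both sexes
common <- length(intersect(male_up, female_up)) +
          length(intersect(male_dn, female_dn))

# All genes significant in at least one sex
total <- length(union(c(male_up, male_dn), c(female_up, female_dn)))

common / total
```

Note one assumption: a gene that is up in one sex and down in the other is counted once in `total` and not in `common`, which may differ slightly from summing the diagram regions by hand.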
TODO: Look at cell composition by infection status.
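One possible way to approach this TODO is a rank-sum test of each cell-type estimate between infection groups. This is only a sketch on simulated data: `toy`, its values and the choice of the Wilcoxon test are assumptions, not the analysis used in this report.

```r
set.seed(1)
# Hypothetical data: infection status and two cell-type estimates
toy <- data.frame(
  infec       = factor(rep(c(0, 1), times = c(20, 8))),
  Monocytes.C = rnorm(28, mean = 10, sd = 2),
  NK          = rnorm(28, mean = 5, sd = 1)
)
celltypes <- c("Monocytes.C", "NK")

# Wilcoxon rank-sum test of each cell type, infected vs not infected
pvals <- sapply(celltypes, function(ct) {
  wilcox.test(toy[[ct]][toy$infec == 1], toy[[ct]][toy$infec == 0])$p.value
})

# Adjust for the number of cell types tested
p.adjust(pvals, method = "fdr")
```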
At T0, CCL3 is positively associated with infection, while CD177 is higher in patients without infection. CCL3 is involved in macrophage activation, while CD177 is a neutrophil activator. After correction for cell types and clinical covariates, CXCL2, CCL3 and PRRG3 remain associated with infection.
mx <- xt0f
ss2 <- as.data.frame(cbind(ss_t0,sscell_t0))
ss2$infec <- factor(infec[match(ss2$PG_number,infec$PG_number),"infection30d"])
table(ss2$infec)
##
## 0 1
## 90 21
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ infec )
## converting counts to integer mode
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 304 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000277632.2 CCL3 559.74862 2.1662530 0.3345312 6.475489
## ENSG00000081041.9 CXCL2 53.13096 2.1990827 0.3779392 5.818615
## ENSG00000232810.4 TNF 428.02933 1.2897952 0.2419915 5.329919
## ENSG00000204936.10 CD177 172.96072 -2.5210703 0.4954814 -5.088123
## ENSG00000115590.14 IL1R2 713.50209 -2.5648448 0.5141818 -4.988206
## ENSG00000177606.8 JUN 4050.31490 1.5953388 0.3206681 4.975047
## ENSG00000123358.20 NR4A1 1665.75695 1.3430362 0.2847870 4.715932
## ENSG00000137331.12 IER3 881.33195 0.9049066 0.1921988 4.708180
## ENSG00000145632.15 PLK2 64.29472 0.9055198 0.1969684 4.597284
## ENSG00000146122.17 DAAM2 54.32940 -2.1255210 0.4886383 -4.349887
## pvalue padj
## ENSG00000277632.2 CCL3 9.450545e-11 2.072977e-06
## ENSG00000081041.9 CXCL2 5.933706e-09 6.507792e-05
## ENSG00000232810.4 TNF 9.825632e-08 7.184175e-04
## ENSG00000204936.10 CD177 3.616246e-07 1.983059e-03
## ENSG00000115590.14 IL1R2 6.094254e-07 2.384767e-03
## ENSG00000177606.8 JUN 6.523183e-07 2.384767e-03
## ENSG00000123358.20 NR4A1 2.406071e-06 6.852998e-03
## ENSG00000137331.12 IER3 2.499384e-06 6.852998e-03
## ENSG00000145632.15 PLK2 4.280331e-06 1.043212e-02
## ENSG00000146122.17 DAAM2 1.362079e-05 2.987720e-02
mean(abs(dge$stat))
## [1] 0.8396572
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + crp_group + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 27 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000130032.17 PRRG3 37.78635 1.4726086 0.2445894 6.020737
## ENSG00000277632.2 CCL3 559.74862 1.7221417 0.3518844 4.894056
## ENSG00000081041.9 CXCL2 53.13096 1.8603134 0.3958621 4.699397
## ENSG00000287970.1 CH17-98J9.1 15.30396 2.2967264 0.4953308 4.636753
## ENSG00000237973.1 MTCO1P12 33.56076 1.0869755 0.2385261 4.557051
## ENSG00000110436.13 SLC1A2 29.47504 0.9824034 0.2157723 4.552965
## ENSG00000109321.11 AREG 132.29571 2.0856862 0.4588181 4.545780
## ENSG00000079215.15 SLC1A3 219.79787 1.2256380 0.2719961 4.506087
## ENSG00000154099.18 DNAAF1 32.52378 0.8387759 0.1875360 4.472612
## ENSG00000276085.1 CCL3L1 308.64747 1.7359990 0.3916430 4.432606
## pvalue padj
## ENSG00000130032.17 PRRG3 1.736247e-09 3.808457e-05
## ENSG00000277632.2 CCL3 9.877874e-07 1.083356e-02
## ENSG00000081041.9 CXCL2 2.609306e-06 1.715072e-02
## ENSG00000287970.1 CH17-98J9.1 3.539251e-06 1.715072e-02
## ENSG00000237973.1 MTCO1P12 5.187685e-06 1.715072e-02
## ENSG00000110436.13 SLC1A2 5.289518e-06 1.715072e-02
## ENSG00000109321.11 AREG 5.473218e-06 1.715072e-02
## ENSG00000079215.15 SLC1A3 6.603414e-06 1.810574e-02
## ENSG00000154099.18 DNAAF1 7.726981e-06 1.883237e-02
## ENSG00000276085.1 CCL3L1 9.310094e-06 2.042169e-02
mean(abs(dge$stat))
## [1] 0.7424039
infec_t0 <- dge
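The messages above flag two things: an integer-valued numeric variable that DESeq2 will model as a continuous trend, and factor levels containing characters other than letters, numbers, `_` and `.`. If such a variable is meant to be categorical, it could be converted explicitly, and level names could be sanitised with `make.names()`. A minimal sketch, assuming hypothetical values (the `toy` table and its level labels are invented; only the column names mirror the design):

```r
# Hypothetical clinical columns; values are invented for illustration
toy <- data.frame(
  crp_group    = c(1, 2, 4, 1),
  ethnicityCAT = c("Group A", "Group B/C", "Group A", "Group B/C"),
  stringsAsFactors = FALSE
)

# Treat the integer-coded group as categorical rather than as a linear trend
toy$crp_group <- factor(toy$crp_group)

# Replace spaces and slashes in level names with safe characters
toy$ethnicityCAT <- factor(toy$ethnicityCAT)
levels(toy$ethnicityCAT) <- make.names(levels(toy$ethnicityCAT))
levels(toy$ethnicityCAT)
```

Whether a variable such as `crp_group` should be a factor or a numeric trend is a modelling decision; the sketch only shows the mechanics.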
# model with clinical and cell covariates
# Monocytes.C NK T.CD8.Memory T.CD4.Naive Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + crp_group + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 17 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000081041.9 CXCL2 53.13096 2.6792765 0.4027935 6.651737
## ENSG00000277632.2 CCL3 559.74862 2.1983492 0.3538976 6.211823
## ENSG00000130032.17 PRRG3 37.78635 1.3506034 0.2517614 5.364616
## ENSG00000154099.18 DNAAF1 32.52378 0.8881684 0.1897045 4.681851
## ENSG00000276085.1 CCL3L1 308.64747 1.7480191 0.3933633 4.443777
## ENSG00000222043.2 AC079305.10 21.70668 0.9359953 0.2240505 4.177608
## ENSG00000166592.12 RRAD 16.82374 0.9340806 0.2265840 4.122446
## ENSG00000112137.18 PHACTR1 249.38474 0.3878710 0.0957853 4.049379
## ENSG00000161835.11 TAMALIN 430.67506 0.6506758 0.1671846 3.891959
## ENSG00000168502.17 MTCL1 43.92701 0.6761268 0.1737423 3.891549
## pvalue padj
## ENSG00000081041.9 CXCL2 2.896543e-11 6.353566e-07
## ENSG00000277632.2 CCL3 5.237324e-10 5.744035e-06
## ENSG00000130032.17 PRRG3 8.112159e-08 5.931340e-04
## ENSG00000154099.18 DNAAF1 2.842964e-06 1.559010e-02
## ENSG00000276085.1 CCL3L1 8.839307e-06 3.877804e-02
## ENSG00000222043.2 AC079305.10 2.945903e-05 1.076973e-01
## ENSG00000166592.12 RRAD 3.748697e-05 1.174681e-01
## ENSG00000112137.18 PHACTR1 5.135372e-05 1.408055e-01
## ENSG00000161835.11 TAMALIN 9.943791e-05 2.184865e-01
## ENSG00000168502.17 MTCL1 9.960634e-05 2.184865e-01
mean(abs(dge$stat))
## [1] 0.7232922
infec_t0_adj <- dge
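To check which genes remain significant once cell-type covariates are added, the significant sets from two result tables (for example `infec_t0` and `infec_t0_adj`) could be compared by row name. A minimal sketch with hypothetical tables, `base_res` and `adj_res`, standing in for the real results:

```r
# Hypothetical result tables with gene names as row names
genes <- c("geneA", "geneB", "geneC", "geneD")
base_res <- data.frame(padj = c(0.001, 0.01, 0.2, 0.03), row.names = genes)
adj_res  <- data.frame(padj = c(0.002, 0.3, 0.04, NA),   row.names = genes)

sig_base <- rownames(subset(base_res, padj < 0.05))
sig_adj  <- rownames(subset(adj_res,  padj < 0.05))

intersect(sig_base, sig_adj)  # significant in both models
setdiff(sig_base, sig_adj)    # no longer significant after adjustment
```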
Plot the RPM (reads per million) of the top genes at T0 as box plots with overlaid beeswarm points.
# make RPM
rpm <- apply(mx,2,function(x) { x/sum(x) *1e6} )
# separate by infection status
rpm_i0 <- rpm[,which(colnames(rpm) %in% rownames(ss2[which(ss2$infec==0),]))]
rpm_i1 <- rpm[,which(colnames(rpm) %in% rownames(ss2[which(ss2$infec==1),]))]
# get sig hits
top <- union(rownames(head(subset(infec_t0,padj<0.05),10)) , rownames(head(subset(infec_t0_adj,padj<0.05),10)) )
g <- top[1]
par(mfrow=c(2,3))
par(mar=c(2.1, 3.1, 2.1, 1.1))
# invisible() suppresses the list of NULL values that lapply() would otherwise print
invisible(lapply(top,function(g) {
g0 <- rpm_i0[which(rownames(rpm_i0) == g),]
g1 <- rpm_i1[which(rownames(rpm_i1) == g),]
gl <- list("Ctrl"=log10(g0+0.1),"Infec"=log10(g1+0.1))
boxplot(gl,cex=0,col="white",ylab="log10(RPM)")
beeswarm(gl,add=TRUE,pch=19)
mtext(g,cex=0.7)
}))
par(mfrow=c(1,1))
par(mar= c(5.1, 4.1, 4.1, 2.1) )
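The RPM values plotted above are raw counts divided by each sample's library size and multiplied by one million. A minimal sketch on a hypothetical count matrix (`counts` is invented) shows the calculation and confirms that each sample sums to one million:

```r
# Hypothetical count matrix: 3 genes x 2 samples
counts <- matrix(c(10, 30, 60, 5, 5, 90), nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))

# Equivalent to apply(counts, 2, function(x) x / sum(x) * 1e6)
rpm <- sweep(counts, 2, colSums(counts), "/") * 1e6
colSums(rpm)

# log10 with a 0.1 pseudocount keeps zero counts finite on the plot axis
log10(rpm + 0.1)
```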
At EOS, infection is associated with NF-kB activation (NFKBIA and IL1B are among the top hits in the base model), but this is less clear after correction for cell types. After cell type correction, ZC3H12A and CD83 remain associated with infection.
mx <- xeosf
ss2 <- as.data.frame(cbind(ss_eos,sscell_eos))
ss2$infec <- factor(infec[match(ss2$PG_number,infec$PG_number),"infection30d"])
table(ss2$infec)
##
## 0 1
## 77 21
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ infec )
## converting counts to integer mode
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 140 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000081041.9 CXCL2 45.457083 2.5452648 0.3389850 7.508487
## ENSG00000100906.11 NFKBIA 7744.851242 1.2083811 0.2045627 5.907143
## ENSG00000177606.8 JUN 2333.003081 1.5914940 0.2882386 5.521446
## ENSG00000125968.9 ID1 50.805092 1.9987675 0.3880398 5.150934
## ENSG00000112149.10 CD83 184.858033 1.2212714 0.2387950 5.114309
## ENSG00000125538.12 IL1B 539.354039 1.4830131 0.3087587 4.803146
## ENSG00000162772.17 ATF3 99.087468 1.3333849 0.2833832 4.705237
## ENSG00000132972.19 RNF17 9.554713 -4.8037268 1.0336382 -4.647397
## ENSG00000181649.8 PHLDA2 10.653718 1.3053781 0.2838003 4.599635
## ENSG00000183496.6 MEX3B 89.095978 0.5351826 0.1163839 4.598424
## pvalue padj
## ENSG00000081041.9 CXCL2 5.981471e-14 1.319931e-09
## ENSG00000100906.11 NFKBIA 3.480915e-09 3.840668e-05
## ENSG00000177606.8 JUN 3.362211e-08 2.473130e-04
## ENSG00000125968.9 ID1 2.591926e-07 1.389744e-03
## ENSG00000112149.10 CD83 3.148919e-07 1.389744e-03
## ENSG00000125538.12 IL1B 1.561916e-06 5.744468e-03
## ENSG00000162772.17 ATF3 2.535715e-06 7.993661e-03
## ENSG00000132972.19 RNF17 3.361503e-06 9.272286e-03
## ENSG00000181649.8 PHLDA2 4.232309e-06 9.393914e-03
## ENSG00000183496.6 MEX3B 4.256997e-06 9.393914e-03
mean(abs(dge$stat))
## [1] 0.9416072
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + crp_group + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 19 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000081041.9 CXCL2 45.45708 1.9177576 0.3443618 5.569019
## ENSG00000100906.11 NFKBIA 7744.85124 1.1001093 0.2196977 5.007377
## ENSG00000112149.10 CD83 184.85803 1.2101339 0.2539642 4.764979
## ENSG00000155090.15 KLF10 2270.44476 0.8164591 0.1841295 4.434156
## ENSG00000183496.6 MEX3B 89.09598 0.5545294 0.1252629 4.426924
## ENSG00000278196.3 IGLV2-8 167.59369 -1.4846410 0.3388031 -4.382017
## ENSG00000181649.8 PHLDA2 10.65372 1.2575257 0.2994339 4.199677
## ENSG00000177606.8 JUN 2333.00308 1.2431176 0.3028638 4.104543
## ENSG00000105697.9 HAMP 44.89965 0.9849616 0.2415030 4.078466
## ENSG00000107719.9 PALD1 80.25012 0.9632821 0.2407615 4.000980
## pvalue padj
## ENSG00000081041.9 CXCL2 2.561770e-08 0.0005653057
## ENSG00000100906.11 NFKBIA 5.517680e-07 0.0060879323
## ENSG00000112149.10 CD83 1.888734e-06 0.0138928949
## ENSG00000155090.15 KLF10 9.243361e-06 0.0421861780
## ENSG00000183496.6 MEX3B 9.558657e-06 0.0421861780
## ENSG00000278196.3 IGLV2-8 1.175855e-05 0.0432459737
## ENSG00000181649.8 PHLDA2 2.672963e-05 0.0842632443
## ENSG00000177606.8 JUN 4.051140e-05 0.1111535256
## ENSG00000105697.9 HAMP 4.533383e-05 0.1111535256
## ENSG00000107719.9 PALD1 6.308058e-05 0.1375554202
mean(abs(dge$stat))
## [1] 0.8005392
infec_eos <- dge
# model with clinical and cell covariates
# Monocytes.C NK T.CD8.Memory T.CD4.Naive Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + crp_group + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 17 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000203811.1 H3C14 21.12624 0.8409487 0.14963233 5.620100
## ENSG00000203852.3 H3C15 21.12624 0.8409487 0.14963233 5.620100
## ENSG00000163874.11 ZC3H12A 1241.05455 0.4046799 0.08262408 4.897845
## ENSG00000112149.10 CD83 184.85803 1.1174654 0.22913969 4.876787
## ENSG00000166716.10 ZNF592 3052.71249 0.2779405 0.05725090 4.854780
## ENSG00000163545.11 NUAK2 1213.80073 0.3931730 0.08265567 4.756757
## ENSG00000135540.11 NHSL1 81.38326 0.4925112 0.10577313 4.656298
## ENSG00000205189.12 ZBTB10 263.45971 0.5202030 0.11327291 4.592475
## ENSG00000183496.6 MEX3B 89.09598 0.5839444 0.12822548 4.554043
## ENSG00000155090.15 KLF10 2270.44476 0.7291764 0.16041129 4.545667
## pvalue padj
## ENSG00000203811.1 H3C14 1.908466e-08 0.0002105706
## ENSG00000203852.3 H3C15 1.908466e-08 0.0002105706
## ENSG00000163874.11 ZC3H12A 9.689361e-07 0.0053190451
## ENSG00000112149.10 CD83 1.078280e-06 0.0053190451
## ENSG00000166716.10 ZNF592 1.205203e-06 0.0053190451
## ENSG00000163545.11 NUAK2 1.967275e-06 0.0072353102
## ENSG00000135540.11 NHSL1 3.219464e-06 0.0101491291
## ENSG00000205189.12 ZBTB10 4.380195e-06 0.0115985264
## ENSG00000183496.6 MEX3B 5.262450e-06 0.0115985264
## ENSG00000155090.15 KLF10 5.476145e-06 0.0115985264
mean(abs(dge$stat))
## [1] 0.8969433
infec_eos_adj <- dge
Plot the RPM (reads per million) of the top genes at EOS as box plots with overlaid beeswarm points.
# make RPM
rpm <- apply(mx,2,function(x) { x/sum(x) *1e6} )
# separate by infection status
rpm_i0 <- rpm[,which(colnames(rpm) %in% rownames(ss2[which(ss2$infec==0),]))]
rpm_i1 <- rpm[,which(colnames(rpm) %in% rownames(ss2[which(ss2$infec==1),]))]
# get sig hits
top <- union(rownames(head(subset(infec_eos,padj<0.05),10)) , rownames(head(subset(infec_eos_adj,padj<0.05),10)) )
g <- top[1]
par(mfrow=c(2,3))
par(mar=c(2.1, 3.1, 2.1, 1.1))
# invisible() suppresses the list of NULL values that lapply() would otherwise print
invisible(lapply(top,function(g) {
g0 <- rpm_i0[which(rownames(rpm_i0) == g),]
g1 <- rpm_i1[which(rownames(rpm_i1) == g),]
gl <- list("Ctrl"=log10(g0+0.1),"Infec"=log10(g1+0.1))
boxplot(gl,cex=0,col="white",ylab="log10(RPM)")
beeswarm(gl,add=TRUE,pch=19)
mtext(g,cex=0.7)
}))
par(mfrow=c(1,1))
par(mar= c(5.1, 4.1, 4.1, 2.1) )
mx <- xpod1f
ss2 <- as.data.frame(cbind(ss_pod1,sscell_pod1))
ss2$infec <- factor(infec[match(ss2$PG_number,infec$PG_number),"infection30d"])
table(ss2$infec)
##
## 0 1
## 90 19
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ infec )
## converting counts to integer mode
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 132 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000183598.4 H3C13 14.00786 2.4730926 0.42424324 5.829421
## ENSG00000101292.8 PROKR2 11.87666 1.3793099 0.23850677 5.783106
## ENSG00000274276.4 CBSL 21.26459 2.9221416 0.51504791 5.673534
## ENSG00000240098.3 RN7SL351P 21.82756 1.3063648 0.24582354 5.314238
## ENSG00000272282.1 LINC02084 149.90716 -0.8521057 0.16940446 -5.030007
## ENSG00000253766.1 RP11-804N13.1 18.05579 0.7600895 0.15368667 4.945709
## ENSG00000203814.6 H2BC18 153.69664 1.8605421 0.37723414 4.932062
## ENSG00000114993.17 RTKN 36.98959 -0.8192192 0.16951098 -4.832838
## ENSG00000245750.10 DRAIC 63.92483 1.0817504 0.22623781 4.781475
## ENSG00000120053.12 GOT1 217.93193 -0.3333793 0.07019228 -4.749516
## pvalue padj
## ENSG00000183598.4 H3C13 5.561991e-09 7.663033e-05
## ENSG00000101292.8 PROKR2 7.333397e-09 7.663033e-05
## ENSG00000274276.4 CBSL 1.398817e-08 9.744627e-05
## ENSG00000240098.3 RN7SL351P 1.071046e-07 5.595946e-04
## ENSG00000272282.1 LINC02084 4.904610e-07 2.050029e-03
## ENSG00000253766.1 RP11-804N13.1 7.586732e-07 2.429245e-03
## ENSG00000203814.6 H2BC18 8.136617e-07 2.429245e-03
## ENSG00000114993.17 RTKN 1.346001e-06 3.516258e-03
## ENSG00000245750.10 DRAIC 1.740137e-06 4.040792e-03
## ENSG00000120053.12 GOT1 2.039045e-06 4.261399e-03
mean(abs(dge$stat))
## [1] 1.39089
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + crp_group + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 32 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000183598.4 H3C13 14.00786 2.5169037 0.43246793 5.819862
## ENSG00000110436.13 SLC1A2 29.63018 1.3167429 0.24648776 5.342022
## ENSG00000263006.6 ROCK1P1 96.76662 -2.5091103 0.47412086 -5.292132
## ENSG00000272282.1 LINC02084 149.90716 -0.9347387 0.17964375 -5.203291
## ENSG00000114993.17 RTKN 36.98959 -0.9381823 0.18398572 -5.099213
## ENSG00000120053.12 GOT1 217.93193 -0.3637241 0.07595466 -4.788700
## ENSG00000101292.8 PROKR2 11.87666 1.1987355 0.25254711 4.746582
## ENSG00000100505.14 TRIM9 24.99684 1.2583687 0.26679635 4.716589
## ENSG00000274276.4 CBSL 21.26459 2.6082167 0.55726258 4.680409
## ENSG00000002726.21 AOC1 20.67108 1.5866839 0.34485261 4.601049
## pvalue padj
## ENSG00000183598.4 H3C13 5.889618e-09 NA
## ENSG00000110436.13 SLC1A2 9.191575e-08 0.001138439
## ENSG00000263006.6 ROCK1P1 1.208983e-07 0.001138439
## ENSG00000272282.1 LINC02084 1.957899e-07 0.001229104
## ENSG00000114993.17 RTKN 3.410692e-07 0.001605839
## ENSG00000120053.12 GOT1 1.678653e-06 0.006322816
## ENSG00000101292.8 PROKR2 2.068834e-06 NA
## ENSG00000100505.14 TRIM9 2.398313e-06 0.007527906
## ENSG00000274276.4 CBSL 2.863033e-06 0.007702786
## ENSG00000002726.21 AOC1 4.203675e-06 0.008706667
mean(abs(dge$stat))
## [1] 1.220618
infec_pod1 <- dge
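The convergence message above suggests a larger `maxit`. Here is a minimal sketch of how that could be tried, assuming a `dds` object like the one built above. It runs the three steps that `DESeq()` wraps and passes a larger `maxit` to `nbinomWaldTest()`. The helper name `refit_with_more_iterations`, the value 500 and the simulated example data are illustrative assumptions, not part of the original analysis. Unlike `DESeq()`, this sketch does not do the outlier replacement step.

```r
# Hypothetical helper (not in the original report): run the three steps that
# DESeq() wraps, but give the Wald test more iterations to converge.
refit_with_more_iterations <- function(dds, maxit = 500) {
  dds <- DESeq2::estimateSizeFactors(dds)
  dds <- DESeq2::estimateDispersions(dds)
  dds <- DESeq2::nbinomWaldTest(dds, maxit = maxit)
  # Rows that still fail to converge are flagged FALSE in betaConv
  conv <- S4Vectors::mcols(dds)$betaConv
  message(sum(!conv, na.rm = TRUE), " rows still not converged")
  dds
}

# Demonstration on simulated counts; skipped when DESeq2 is not installed.
if (requireNamespace("DESeq2", quietly = TRUE)) {
  toy <- DESeq2::makeExampleDESeqDataSet(n = 500, m = 8)
  toy <- refit_with_more_iterations(toy)
}
```

Rows that remain unconverged could then be dropped with the `betaConv` flag before calling `results()`.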
# model with clinical and cell covariates
# Monocytes.C NK T.CD8.Memory T.CD4.Naive Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + crp_group + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 11 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000107719.9 PALD1 53.63349 1.3069650 0.2802571 4.663451
## ENSG00000270550.1 IGHV3-30 240.99832 -1.2196819 0.2760123 -4.418940
## ENSG00000110436.13 SLC1A2 29.63018 1.1149531 0.2608810 4.273799
## ENSG00000171236.10 LRG1 1365.81495 -0.6649035 0.1730714 -3.841787
## ENSG00000263006.6 ROCK1P1 96.76662 -1.8170221 0.4806323 -3.780483
## ENSG00000246560.2 UBE2D3-AS1 23.77403 -0.4975790 0.1322363 -3.762803
## ENSG00000120262.10 CCDC170 141.74081 0.4625635 0.1242062 3.724159
## ENSG00000278196.3 IGLV2-8 119.81421 -1.1859894 0.3200049 -3.706161
## ENSG00000137563.13 GGH 64.90089 -0.4969402 0.1342343 -3.702035
## ENSG00000211663.2 IGLV3-19 97.65317 -1.0127961 0.2751574 -3.680788
## pvalue padj
## ENSG00000107719.9 PALD1 3.109507e-06 0.06627293
## ENSG00000270550.1 IGHV3-30 9.918614e-06 0.10569771
## ENSG00000110436.13 SLC1A2 1.921703e-05 0.13652420
## ENSG00000171236.10 LRG1 1.221418e-04 0.47559855
## ENSG00000263006.6 ROCK1P1 1.565244e-04 0.47559855
## ENSG00000246560.2 UBE2D3-AS1 1.680196e-04 0.47559855
## ENSG00000120262.10 CCDC170 1.959676e-04 0.47559855
## ENSG00000278196.3 IGLV2-8 2.104247e-04 0.47559855
## ENSG00000137563.13 GGH 2.138768e-04 0.47559855
## ENSG00000211663.2 IGLV3-19 2.325143e-04 0.47559855
mean(abs(dge$stat))
## [1] 0.8319592
infec_pod1_adj <- dge
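The DESeq2 message above recommends centring and scaling numeric design variables. Here is a minimal base R sketch on a toy sample sheet. The column names echo the cell-type covariates in the design, but the values and the helper name `scale_covariates` are made up for illustration.

```r
# Toy sample sheet with invented values; only the column names echo the report.
toy_ss <- data.frame(
  Monocytes.C    = c(12.1, 8.4, 15.3, 9.9, 11.0),
  Neutrophils.LD = c(55.2, 61.8, 47.5, 66.0, 58.3),
  infec          = factor(c(0, 1, 0, 1, 0))
)

# Hypothetical helper: z-score the named numeric columns, leave the rest alone.
scale_covariates <- function(ss, cols) {
  for (col in cols) {
    ss[[col]] <- as.numeric(scale(ss[[col]]))  # (x - mean) / sd
  }
  ss
}

toy_scaled <- scale_covariates(toy_ss, c("Monocytes.C", "Neutrophils.LD"))
round(colMeans(toy_scaled[, 1:2]), 10)  # means are zero after centring
apply(toy_scaled[, 1:2], 2, sd)         # standard deviations are one
```

Applying the same step to `ss2` before `DESeqDataSetFromMatrix()` should silence the message and may help the GLM converge; the coefficients for the scaled covariates are then per standard deviation rather than per original unit.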
Look at the RPM of the top genes as box plots, split by infection status.
# make RPM
rpm <- apply(mx,2,function(x) { x/sum(x) *1e6} )
# separate by infection status
rpm_i0 <- rpm[,which(colnames(rpm) %in% rownames(ss2[which(ss2$infec==0),]))]
rpm_i1 <- rpm[,which(colnames(rpm) %in% rownames(ss2[which(ss2$infec==1),]))]
# get sig hits
top <- union(rownames(head(subset(infec_pod1,padj<0.05),10)) , rownames(head(subset(infec_pod1_adj,padj<0.05),10)) )
g <- top[1]
par(mfrow=c(2,3))
par(mar=c(2.1, 3.1, 2.1, 1.1))
invisible(lapply(top,function(g) {
g0 <- rpm_i0[which(rownames(rpm_i0) == g),]
g1 <- rpm_i1[which(rownames(rpm_i1) == g),]
gl <- list("Ctrl"=log10(g0+0.1),"Infec"=log10(g1+0.1))
boxplot(gl,cex=0,col="white",ylab="log10(RPM)")
beeswarm(gl,add=TRUE,pch=19)
mtext(g,cex=0.7)
}))
par(mfrow=c(1,1))
par(mar= c(5.1, 4.1, 4.1, 2.1) )
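The RPM step above divides each column by its library size. An equivalent vectorised form uses `sweep()`, shown here on a small invented count matrix (`toy_mx` is a stand-in for `mx`), together with a check that every column sums to one million.

```r
# Invented 3-gene x 2-sample count matrix standing in for mx
toy_mx <- matrix(c(10, 30, 60,
                    5,  5, 90),
                 nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2")))

# Divide each column by its library size, then scale to reads per million
toy_rpm <- sweep(toy_mx, 2, colSums(toy_mx), "/") * 1e6

# Same calculation written with apply(), as in the report
toy_rpm_apply <- apply(toy_mx, 2, function(x) { x / sum(x) * 1e6 })

colSums(toy_rpm)  # each column totals one million
all.equal(toy_rpm, toy_rpm_apply, check.attributes = FALSE)
```

Note that RPM here is computed from raw library sizes, so it is a display scale for the plots rather than the normalisation DESeq2 uses internally.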
mx <- xt0f
ss2 <- as.data.frame(cbind(ss_t0,sscell_t0))
ss2$infec <- factor(infec[match(ss2$PG_number,infec$PG_number),"infection30d"])
ss2 <- subset(ss2,crp_group==4)
table(ss2$infec)
##
## 0 1
## 40 15
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ infec )
## converting counts to integer mode
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 251 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000081041.9 CXCL2 54.67578 2.4412085 0.4042774 6.038448
## ENSG00000277632.2 CCL3 566.52696 2.2413827 0.3723372 6.019765
## ENSG00000204936.10 CD177 132.67404 -2.7945103 0.5647378 -4.948332
## ENSG00000125538.12 IL1B 566.87608 1.4468628 0.3030643 4.774111
## ENSG00000122877.17 EGR2 185.00839 1.9815033 0.4184339 4.735523
## ENSG00000276085.1 CCL3L1 255.05022 2.0480682 0.4365353 4.691644
## ENSG00000271614.1 ATP2B1-AS1 335.81186 0.8540205 0.1881685 4.538595
## ENSG00000162772.17 ATF3 210.49710 1.7356235 0.3830255 4.531352
## ENSG00000145632.15 PLK2 67.63251 1.1523297 0.2602033 4.428575
## ENSG00000278196.3 IGLV2-8 176.52078 -1.6259988 0.3704989 -4.388674
## pvalue padj
## ENSG00000081041.9 CXCL2 1.556031e-09 1.841376e-05
## ENSG00000277632.2 CCL3 1.746705e-09 1.841376e-05
## ENSG00000204936.10 CD177 7.485213e-07 5.260608e-03
## ENSG00000125538.12 IL1B 1.805026e-06 9.213347e-03
## ENSG00000122877.17 EGR2 2.184914e-06 9.213347e-03
## ENSG00000276085.1 CCL3L1 2.710190e-06 9.523608e-03
## ENSG00000271614.1 ATP2B1-AS1 5.663039e-06 1.544596e-02
## ENSG00000162772.17 ATF3 5.860733e-06 1.544596e-02
## ENSG00000145632.15 PLK2 9.485759e-06 2.186665e-02
## ENSG00000278196.3 IGLV2-8 1.140439e-05 2.186665e-02
mean(abs(dge$stat))
## [1] 0.7545652
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 14 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000234200.2 U82671.8 9.544245 -30.0000000 3.4122477 -8.791859
## ENSG00000277632.2 CCL3 566.526963 2.1308743 0.3959351 5.381878
## ENSG00000081041.9 CXCL2 54.675784 2.3432113 0.4458406 5.255715
## ENSG00000276085.1 CCL3L1 255.050222 2.3337190 0.4600686 5.072546
## ENSG00000154099.18 DNAAF1 30.756835 1.0192608 0.2126929 4.792171
## ENSG00000125538.12 IL1B 566.876075 1.4998324 0.3335427 4.496672
## ENSG00000263089.1 RP11-166P13.4 30.702829 0.5983402 0.1366917 4.377298
## ENSG00000164047.6 CAMP 706.158203 -1.9909703 0.4795642 -4.151624
## ENSG00000229989.4 MIR181A1HG 21.621435 0.8412098 0.2043464 4.116586
## ENSG00000188396.4 DYNLT4 73.796111 0.8026334 0.1953743 4.108183
## pvalue padj
## ENSG00000234200.2 U82671.8 1.471055e-18 3.226760e-14
## ENSG00000277632.2 CCL3 7.371268e-08 8.084438e-04
## ENSG00000081041.9 CXCL2 1.474504e-07 1.078108e-03
## ENSG00000276085.1 CCL3L1 3.925277e-07 2.152524e-03
## ENSG00000154099.18 DNAAF1 1.649864e-06 7.237952e-03
## ENSG00000125538.12 IL1B 6.902519e-06 2.523446e-02
## ENSG00000263089.1 RP11-166P13.4 1.201595e-05 3.765284e-02
## ENSG00000164047.6 CAMP 3.301247e-05 8.611395e-02
## ENSG00000229989.4 MIR181A1HG 3.845253e-05 8.611395e-02
## ENSG00000188396.4 DYNLT4 3.987840e-05 8.611395e-02
mean(abs(dge$stat))
## [1] 0.7837256
infec_hi_t0 <- dge
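The recurring note about factor levels, and the prompt about integer-valued variables, can both be dealt with before building the `dds` object. Here is a minimal sketch on an invented sample sheet. The level strings are hypothetical examples, `clean_levels` is an illustrative helper, and whether `wound_typeOP` should really be treated as categorical is an assumption.

```r
# Invented sample sheet; the real levels in ss2 are not shown in this report
toy_sheet <- data.frame(
  ethnicityCAT = c("Non-European", "European", "Non-European", "European"),
  wound_typeOP = c(1L, 2L, 1L, 3L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: convert to factor and make the levels syntactically safe
clean_levels <- function(x) {
  f <- factor(x)
  levels(f) <- make.names(levels(f))
  f
}

toy_sheet$ethnicityCAT <- clean_levels(toy_sheet$ethnicityCAT)
toy_sheet$wound_typeOP <- clean_levels(toy_sheet$wound_typeOP)
levels(toy_sheet$ethnicityCAT)
levels(toy_sheet$wound_typeOP)
```

`make.names()` replaces unsafe characters with `.` and prefixes levels that start with a digit with `X`, so coefficient names change accordingly.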
# model with clinical and cell covariates
# Monocytes.C NK T.CD8.Memory T.CD4.Naive Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 37 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000081041.9 CXCL2 54.67578 2.9887595 0.5656907 5.283381
## ENSG00000222043.2 AC079305.10 22.44175 1.2690003 0.2578361 4.921732
## ENSG00000263089.1 RP11-166P13.4 30.70283 0.6578750 0.1455111 4.521134
## ENSG00000154099.18 DNAAF1 30.75683 1.0395823 0.2349983 4.423787
## ENSG00000225450.1 RP3-508I15.14 20.11515 0.8892843 0.2030298 4.380069
## ENSG00000230184.1 SMYD3-IT1 15.13075 0.9215237 0.2136594 4.313051
## ENSG00000260103.2 RP11-10O17.1 50.62430 0.7192927 0.1673209 4.298881
## ENSG00000271614.1 ATP2B1-AS1 335.81186 0.7270082 0.1774110 4.097876
## ENSG00000188396.4 DYNLT4 73.79611 0.8149057 0.1999687 4.075165
## ENSG00000125538.12 IL1B 566.87608 1.1953484 0.3113801 3.838872
## pvalue padj
## ENSG00000081041.9 CXCL2 1.268215e-07 0.00278183
## ENSG00000222043.2 AC079305.10 8.578154e-07 0.00940809
## ENSG00000263089.1 RP11-166P13.4 6.150928e-06 0.04497353
## ENSG00000154099.18 DNAAF1 9.698570e-06 0.05204815
## ENSG00000225450.1 RP3-508I15.14 1.186418e-05 0.05204815
## ENSG00000230184.1 SMYD3-IT1 1.610172e-05 0.05379173
## ENSG00000260103.2 RP11-10O17.1 1.716627e-05 0.05379173
## ENSG00000271614.1 ATP2B1-AS1 4.169581e-05 0.11206758
## ENSG00000188396.4 DYNLT4 4.598168e-05 0.11206758
## ENSG00000125538.12 IL1B 1.236007e-04 0.27111815
mean(abs(dge$stat))
## [1] 0.761354
infec_hi_t0_adj <- dge
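The same steps (build the dataset, fit, extract results, append VST values, sort by p-value) are repeated for every contrast in this report. Here is a sketch of a wrapper that would reduce the repetition. The function name `run_de` and the simulated demonstration data are assumptions for illustration; the calls inside mirror the ones used above.

```r
# Hypothetical wrapper around the steps repeated in this report
run_de <- function(counts, coldata, design) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = counts,
                                        colData = coldata,
                                        design = design)
  res <- DESeq2::DESeq(dds)
  z <- DESeq2::results(res)
  vsd <- DESeq2::vst(dds, blind = FALSE)
  zz <- cbind(as.data.frame(z), SummarizedExperiment::assay(vsd))
  zz[order(zz$pvalue), ]
}

# Demonstration on simulated counts; skipped when DESeq2 is not installed.
if (requireNamespace("DESeq2", quietly = TRUE)) {
  toy <- DESeq2::makeExampleDESeqDataSet(n = 3000, m = 8)
  toy_counts <- DESeq2::counts(toy)
  toy_coldata <- data.frame(condition = factor(rep(c("A", "B"), each = 4)),
                            row.names = colnames(toy_counts))
  toy_dge <- run_de(toy_counts, toy_coldata, ~ condition)
  head(toy_dge[, 1:6], 3)
}
```

With such a wrapper, each contrast would reduce to a single call, for example `run_de(mx, ss2, ~ infec)`.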
mx <- xeosf
ss2 <- as.data.frame(cbind(ss_eos,sscell_eos))
ss2$infec <- factor(infec[match(ss2$PG_number,infec$PG_number),"infection30d"])
ss2 <- subset(ss2,crp_group==4)
table(ss2$infec)
##
## 0 1
## 36 16
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ infec )
## converting counts to integer mode
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 157 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000081041.9 CXCL2 59.38579 2.2675913 0.4119587 5.504415
## ENSG00000233608.4 TWIST2 31.85175 2.4148913 0.5301118 4.555438
## ENSG00000100906.11 NFKBIA 8328.93207 1.2218830 0.2758840 4.428974
## ENSG00000211679.2 IGLC3 717.15136 -1.8750700 0.4673205 -4.012385
## ENSG00000270972.1 RP11-326C3.15 90.07297 1.3355253 0.3412391 3.913753
## ENSG00000164683.18 HEY1 11.35677 1.5525717 0.3972293 3.908503
## ENSG00000163734.4 CXCL3 32.53411 1.0431076 0.2788941 3.740157
## ENSG00000137331.12 IER3 1035.52928 0.9442025 0.2525552 3.738599
## ENSG00000203811.1 H3C14 25.08126 1.0987773 0.2961847 3.709770
## ENSG00000203852.3 H3C15 25.08126 1.0987773 0.2961847 3.709770
## pvalue padj
## ENSG00000081041.9 CXCL2 3.703965e-08 0.000817354
## ENSG00000233608.4 TWIST2 5.227662e-06 0.057679408
## ENSG00000100906.11 NFKBIA 9.468247e-06 0.069645265
## ENSG00000211679.2 IGLC3 6.010832e-05 0.331602548
## ENSG00000270972.1 RP11-326C3.15 9.087261e-05 0.341559968
## ENSG00000164683.18 HEY1 9.286989e-05 0.341559968
## ENSG00000163734.4 CXCL3 1.839053e-04 0.457774341
## ENSG00000137331.12 IER3 1.850487e-04 0.457774341
## ENSG00000203811.1 H3C14 2.074475e-04 0.457774341
## ENSG00000203852.3 H3C15 2.074475e-04 0.457774341
mean(abs(dge$stat))
## [1] 0.7518441
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 5 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000081041.9 CXCL2 59.38579 2.0807484 0.4097794 5.077728
## ENSG00000100906.11 NFKBIA 8328.93207 1.2324683 0.2822394 4.366747
## ENSG00000233608.4 TWIST2 31.85175 2.1222317 0.5269734 4.027208
## ENSG00000211679.2 IGLC3 717.15136 -1.8493501 0.4775844 -3.872300
## ENSG00000158714.11 SLAMF8 139.06390 -0.5474946 0.1429130 -3.830963
## ENSG00000203814.6 H2BC18 155.89198 1.6969900 0.4522577 3.752263
## ENSG00000181649.8 PHLDA2 11.78556 1.2185954 0.3315817 3.675098
## ENSG00000278196.3 IGLV2-8 136.83437 -1.3444988 0.3664133 -3.669350
## ENSG00000137331.12 IER3 1035.52928 0.9396829 0.2570760 3.655272
## ENSG00000203811.1 H3C14 25.08126 1.1390341 0.3116235 3.655161
## pvalue padj
## ENSG00000081041.9 CXCL2 3.819752e-07 0.008429047
## ENSG00000100906.11 NFKBIA 1.261105e-05 0.139144070
## ENSG00000233608.4 TWIST2 5.644302e-05 0.415176056
## ENSG00000211679.2 IGLC3 1.078131e-04 0.469157816
## ENSG00000158714.11 SLAMF8 1.276425e-04 0.469157816
## ENSG00000203814.6 H2BC18 1.752454e-04 0.469157816
## ENSG00000181649.8 PHLDA2 2.377583e-04 0.469157816
## ENSG00000278196.3 IGLV2-8 2.431677e-04 0.469157816
## ENSG00000137331.12 IER3 2.569090e-04 0.469157816
## ENSG00000203811.1 H3C14 2.570207e-04 0.469157816
mean(abs(dge$stat))
## [1] 0.7599704
infec_hi_eos <- dge
# model with clinical and cell covariates
# Monocytes.C NK T.CD8.Memory T.CD4.Naive Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 22 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000137331.12 IER3 1035.52928 0.7141676 0.15437720 4.626121
## ENSG00000203811.1 H3C14 25.08126 0.7919624 0.18066746 4.383536
## ENSG00000203852.3 H3C15 25.08126 0.7919624 0.18066746 4.383536
## ENSG00000166716.10 ZNF592 3122.46108 0.2702272 0.06886622 3.923944
## ENSG00000136244.12 IL6 13.47291 1.5966335 0.42245821 3.779388
## ENSG00000211821.2 TRDV2 32.69504 -1.7812902 0.47211784 -3.772978
## ENSG00000196652.12 ZKSCAN5 402.77747 0.2149270 0.05831216 3.685800
## ENSG00000253651.1 SOD1P3 25.72541 -0.7099419 0.19470290 -3.646283
## ENSG00000281383.1 CH507-513H4.5 12.92813 -1.1564546 0.31791056 -3.637673
## ENSG00000273018.7 FAM106A 605.82214 -0.8674392 0.23918780 -3.626603
## pvalue padj
## ENSG00000137331.12 IER3 3.725777e-06 0.08221671
## ENSG00000203811.1 H3C14 1.167681e-05 0.08589073
## ENSG00000203852.3 H3C15 1.167681e-05 0.08589073
## ENSG00000166716.10 ZNF592 8.711086e-05 0.48056882
## ENSG00000136244.12 IL6 1.572143e-04 0.55011063
## ENSG00000211821.2 TRDV2 1.613106e-04 0.55011063
## ENSG00000196652.12 ZKSCAN5 2.279850e-04 0.55011063
## ENSG00000253651.1 SOD1P3 2.660606e-04 0.55011063
## ENSG00000281383.1 CH507-513H4.5 2.751128e-04 0.55011063
## ENSG00000273018.7 FAM106A 2.871745e-04 0.55011063
mean(abs(dge$stat))
## [1] 0.7913034
infec_hi_eos_adj <- dge
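Nothing of much interest was found at EOS in the high-CRP group. CXCL2 was the only gene with padj < 0.05 in the base and clinical covariate models, and no gene passed that threshold after also adjusting for cell-type proportions.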
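One way to back a statement like this with numbers is to count the genes passing an FDR threshold in each result table. Here is a minimal sketch on invented tables. `toy_results` stands in for objects such as `infec_hi_eos` and `infec_hi_eos_adj`, and `count_sig` is an illustrative helper.

```r
# Invented result tables standing in for the dge objects saved above
toy_results <- list(
  base     = data.frame(padj = c(0.0008, 0.058, 0.070, NA)),
  clinical = data.frame(padj = c(0.0084, 0.139, 0.415, 0.469)),
  adjusted = data.frame(padj = c(0.082, 0.086, 0.086, 0.481))
)

# Hypothetical helper: count genes passing an FDR cut-off, ignoring NA padj
count_sig <- function(dge, fdr = 0.05) {
  sum(dge$padj < fdr, na.rm = TRUE)
}

sapply(toy_results, count_sig)
```

The `na.rm = TRUE` matters because DESeq2 leaves `padj` as `NA` for some genes, as seen in a few rows of the tables above.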
mx <- xpod1f
ss2 <- as.data.frame(cbind(ss_pod1,sscell_pod1))
ss2$infec <- factor(infec[match(ss2$PG_number,infec$PG_number),"infection30d"])
ss2 <- subset(ss2,crp_group==4)
table(ss2$infec)
##
## 0 1
## 39 15
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ infec )
## converting counts to integer mode
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 134 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000263006.6 ROCK1P1 143.51840 -2.8457150 0.6127849 -4.643905
## ENSG00000114993.17 RTKN 33.45040 -0.9727041 0.2246229 -4.330387
## ENSG00000133067.18 LGR6 332.98785 -1.0190232 0.2389664 -4.264295
## ENSG00000272282.1 LINC02084 133.42431 -0.8483765 0.1995052 -4.252403
## ENSG00000163564.15 PYHIN1 778.61688 -0.7263181 0.1722034 -4.217790
## ENSG00000188011.5 RTP5 110.63679 -1.2773824 0.3113386 -4.102871
## ENSG00000101292.8 PROKR2 14.88931 1.2783959 0.3141276 4.069671
## ENSG00000162062.15 TEDC2 12.78794 -1.0962136 0.2693922 -4.069211
## ENSG00000161249.21 DMKN 25.84487 -0.9221472 0.2284010 -4.037404
## ENSG00000205176.3 REXO1L1P 35.31082 -2.0578515 0.5150296 -3.995599
## pvalue padj
## ENSG00000263006.6 ROCK1P1 3.418846e-06 0.07286245
## ENSG00000114993.17 RTKN 1.488477e-05 0.10515679
## ENSG00000133067.18 LGR6 2.005344e-05 0.10515679
## ENSG00000272282.1 LINC02084 2.114893e-05 0.10515679
## ENSG00000163564.15 PYHIN1 2.467079e-05 0.10515679
## ENSG00000188011.5 RTP5 4.080540e-05 0.11460022
## ENSG00000101292.8 PROKR2 4.707960e-05 0.11460022
## ENSG00000162062.15 TEDC2 4.717255e-05 0.11460022
## ENSG00000161249.21 DMKN 5.404593e-05 0.11460022
## ENSG00000205176.3 REXO1L1P 6.453100e-05 0.11460022
mean(abs(dge$stat))
## [1] 0.9575974
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 11 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000272282.1 LINC02084 133.42431 -0.9863098 0.19601677 -5.031762
## ENSG00000274226.5 TBC1D3H 18.51873 15.6901825 3.32880289 4.713461
## ENSG00000163564.15 PYHIN1 778.61688 -0.7954948 0.17713864 -4.490803
## ENSG00000114993.17 RTKN 33.45040 -1.0353321 0.23289568 -4.445476
## ENSG00000274611.4 TBC1D3 56.78775 -14.8321478 3.35408796 -4.422111
## ENSG00000242516.2 LINC00960 319.14668 0.6196175 0.14078957 4.401018
## ENSG00000145649.8 GZMA 1281.54145 -0.8266697 0.19163297 -4.313817
## ENSG00000287644.1 CTD-2267D19.8 50.47381 1.0383331 0.24105860 4.307389
## ENSG00000164366.4 CCDC127 367.82694 -0.2970838 0.07000958 -4.243474
## ENSG00000214940.8 NPIPA8 119.02853 -3.2783231 0.77394460 -4.235863
## pvalue padj
## ENSG00000272282.1 LINC02084 4.859917e-07 0.01035794
## ENSG00000274226.5 TBC1D3H 2.435444e-06 0.02595331
## ENSG00000163564.15 PYHIN1 7.095507e-06 0.03827248
## ENSG00000114993.17 RTKN 8.769762e-06 0.03827248
## ENSG00000274611.4 TBC1D3 9.774142e-06 0.03827248
## ENSG00000242516.2 LINC00960 1.077440e-05 0.03827248
## ENSG00000145649.8 GZMA 1.604594e-05 0.04400948
## ENSG00000287644.1 CTD-2267D19.8 1.651930e-05 0.04400948
## ENSG00000164366.4 CCDC127 2.200859e-05 0.04852456
## ENSG00000214940.8 NPIPA8 2.276759e-05 0.04852456
mean(abs(dge$stat))
## [1] 1.055163
infec_hi_pod1 <- dge
# model with clinical and cell-type covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## 29 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000211675.2 IGLC1 434.36001 -1.2567577 0.26755308 -4.697228
## ENSG00000137563.13 GGH 74.09123 -0.6996762 0.16333393 -4.283716
## ENSG00000214940.8 NPIPA8 119.02853 -3.6034008 0.87843696 -4.102060
## ENSG00000014914.21 MTMR11 1006.38773 0.4366826 0.11421380 3.823378
## ENSG00000286482.1 RP3-395C13.2 58.43179 0.6167469 0.16419987 3.756074
## ENSG00000000460.17 C1orf112 186.09589 0.3118072 0.08349367 3.734501
## ENSG00000143228.13 NUF2 31.00317 -0.6012651 0.16638822 -3.613628
## ENSG00000242516.2 LINC00960 319.14668 0.5545684 0.15669239 3.539217
## ENSG00000225101.6 OR52K3P 332.46030 0.6931745 0.19617302 3.533485
## ENSG00000001630.17 CYP51A1 327.64662 -0.5637684 0.16018014 -3.519590
## pvalue padj
## ENSG00000211675.2 IGLC1 2.637165e-06 0.05620589
## ENSG00000137563.13 GGH 1.837974e-05 0.19586366
## ENSG00000214940.8 NPIPA8 4.094886e-05 0.29091434
## ENSG00000014914.21 MTMR11 1.316355e-04 0.66811827
## ENSG00000286482.1 RP3-395C13.2 1.725994e-04 0.66811827
## ENSG00000000460.17 C1orf112 1.880875e-04 0.66811827
## ENSG00000143228.13 NUF2 3.019420e-04 0.75627722
## ENSG00000242516.2 LINC00960 4.013152e-04 0.75627722
## ENSG00000225101.6 OR52K3P 4.101189e-04 0.75627722
## ENSG00000001630.17 CYP51A1 4.322141e-04 0.75627722
mean(abs(dge$stat))
## [1] 0.7980182
infec_hi_pod1_adj <- dge
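The adjusted model above triggers DESeq2's message about numeric design variables with a mean or standard deviation larger than 5, and recommends centering and scaling them. That message appears only once the cell-type covariates are added, which suggests (but does not prove) that one or more of those columns is responsible. Here is a minimal sketch of centering and scaling with base R on a toy sample sheet. The values are invented, and `ss_toy`, `num_cols` and `ss_scaled` are hypothetical names.

```r
# Toy sample sheet: column names mirror two design covariates, values are made up
ss_toy <- data.frame(
  Monocytes.C    = c(8.2, 11.5, 6.9, 14.0, 9.7, 10.3),
  Neutrophils.LD = c(52.1, 60.3, 48.7, 71.5, 66.0, 55.4)
)

num_cols <- c("Monocytes.C", "Neutrophils.LD")
ss_scaled <- ss_toy

# scale() centers to mean 0 and scales to sd 1; it returns a one-column matrix,
# so as.numeric() is used to get a plain numeric vector back
ss_scaled[num_cols] <- lapply(ss_toy[num_cols], function(x) as.numeric(scale(x)))

# Quick check: means should be ~0 and standard deviations 1
round(colMeans(ss_scaled[num_cols]), 10)
sapply(ss_scaled[num_cols], sd)
```

With the real data the same idea could be applied to the numeric columns of `ss2` before building the dataset. Scaling changes the units of those covariate coefficients but not what is being tested for `infec`.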
# T0 samples: infection vs no infection within the crp_group==1 subset
mx <- xt0f
ss2 <- as.data.frame(cbind(ss_t0,sscell_t0))
ss2$infec <- factor(infec[match(ss2$PG_number,infec$PG_number),"infection30d"])
table(ss2$infec)
##
## 0 1
## 90 21
ss2 <- subset(ss2,crp_group==1)
table(ss2$infec)
##
## 0 1
## 50 6
mx <- mx[,colnames(mx) %in% rownames(ss2)]
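The `%in%` filter keeps the right samples but does not reorder them, and DESeq2 expects the columns of the count matrix and the rows of the sample sheet to be in the same order. Here is a minimal sketch of an explicit alignment check on toy objects. The sample names, `mx_toy`, `ss_toy` and `shared` are hypothetical.

```r
# Toy count matrix (genes x samples) and sample sheet with different sample order
mx_toy <- matrix(1:12, nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 c("S3", "S1", "S2", "S4")))
ss_toy <- data.frame(infec = factor(c(0, 1, 0)),
                     row.names = c("S1", "S2", "S3"))

# Keep only samples present in both objects, in sample-sheet order
shared <- rownames(ss_toy)[rownames(ss_toy) %in% colnames(mx_toy)]
mx_toy <- mx_toy[, shared, drop = FALSE]
ss_toy <- ss_toy[shared, , drop = FALSE]

# Fail early if the two objects are not aligned
stopifnot(identical(colnames(mx_toy), rownames(ss_toy)))
```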
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ infec )
## converting counts to integer mode
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 287 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE
## ENSG00000260287.4 TBC1D3G 28.40074 -22.6109180 4.0339406
## ENSG00000137563.13 GGH 62.69110 1.1709886 0.2328459
## ENSG00000274827.4 LINC01297 92.16914 1.2945366 0.2671907
## ENSG00000140488.16 CELF6 230.87659 -0.6575787 0.1586643
## ENSG00000099139.14 PCSK5 1203.88986 1.5452196 0.3805552
## ENSG00000225096.3 XXbac-BPG55C20.7 22.75294 1.2127085 0.3089678
## ENSG00000243708.11 PLA2G4B 1104.99936 -0.5151795 0.1313353
## ENSG00000205176.3 REXO1L1P 17.38896 1.5705138 0.4015593
## ENSG00000232810.4 TNF 441.02353 1.9155598 0.4949741
## ENSG00000242686.4 PDE6B-AS1 23.59407 -1.5222134 0.4013725
## stat pvalue padj
## ENSG00000260287.4 TBC1D3G -5.605169 2.080520e-08 0.0004553011
## ENSG00000137563.13 GGH 5.029028 4.929737e-07 0.0053941178
## ENSG00000274827.4 LINC01297 4.844991 1.266173e-06 0.0092363132
## ENSG00000140488.16 CELF6 -4.144465 3.406078e-05 0.1863465053
## ENSG00000099139.14 PCSK5 4.060434 4.898150e-05 0.2143822300
## ENSG00000225096.3 XXbac-BPG55C20.7 3.925031 8.671847e-05 0.2513932114
## ENSG00000243708.11 PLA2G4B -3.922628 8.758828e-05 0.2513932114
## ENSG00000205176.3 REXO1L1P 3.911038 9.190028e-05 0.2513932114
## ENSG00000232810.4 TNF 3.870021 1.088262e-04 0.2646169227
## ENSG00000242686.4 PDE6B-AS1 -3.792521 1.491258e-04 0.3173303951
mean(abs(dge$stat))
## [1] 0.7832468
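TBC1D3G has a log2 fold change of about -22.6 in this base model, far larger in magnitude than the other top genes. One possible explanation, which is an assumption and not something confirmed in this report, is that the gene has zero counts in every sample of the small infected group (n=6 here). Here is a minimal sketch of a check on a toy count matrix. `zero_fraction_by_group`, `counts_toy` and `group_toy` are hypothetical names.

```r
# Fraction of samples with a zero count, per gene and per group
# counts: genes x samples matrix; group: factor with one entry per sample
zero_fraction_by_group <- function(counts, group) {
  sapply(levels(group), function(g) {
    rowMeans(counts[, group == g, drop = FALSE] == 0)
  })
}

# Toy data: geneX is zero in every sample of group 1, geneY is never zero
counts_toy <- rbind(
  geneX = c(30, 0, 45, 28, 0, 0),
  geneY = c(10, 12, 9, 11, 14, 8)
)
group_toy <- factor(c(0, 0, 0, 0, 1, 1))

zero_fraction_by_group(counts_toy, group_toy)
```

With the real data this could be applied to `mx` and `ss2$infec`, looking at the rows with extreme fold changes. A gene that is all-zero in one group would be worth treating with caution.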
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## 21 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000237973.1 MTCO1P12 40.32129 3.002411 0.4119019 7.289141
## ENSG00000260287.4 TBC1D3G 42.10428 -29.665053 5.2096228 -5.694280
## ENSG00000274611.4 TBC1D3 82.13231 -29.166989 5.4345099 -5.366995
## ENSG00000152463.15 OLAH 21.52114 4.020448 0.7659512 5.248960
## ENSG00000239887.6 C1orf226 24.28151 2.631563 0.5103541 5.156348
## ENSG00000274827.4 LINC01297 92.16914 1.501144 0.2973802 5.047895
## ENSG00000130032.17 PRRG3 43.99543 2.456920 0.4944207 4.969290
## ENSG00000174705.13 SH3PXD2B 180.42091 1.863691 0.4011284 4.646121
## ENSG00000096060.15 FKBP5 6876.05993 1.399670 0.3153332 4.438702
## ENSG00000198929.13 NOS1AP 14.55122 2.218786 0.5050922 4.392834
## pvalue padj
## ENSG00000237973.1 MTCO1P12 3.119365e-13 6.842328e-09
## ENSG00000260287.4 TBC1D3G 1.238934e-08 1.358801e-04
## ENSG00000274611.4 TBC1D3 8.005912e-08 5.853656e-04
## ENSG00000152463.15 OLAH 1.529599e-07 8.387938e-04
## ENSG00000239887.6 C1orf226 2.518132e-07 1.104705e-03
## ENSG00000274827.4 LINC01297 4.467048e-07 1.633078e-03
## ENSG00000130032.17 PRRG3 6.719846e-07 2.105712e-03
## ENSG00000174705.13 SH3PXD2B 3.382342e-06 9.273960e-03
## ENSG00000096060.15 FKBP5 9.050294e-06 2.180410e-02
## ENSG00000198929.13 NOS1AP 1.118825e-05 2.180410e-02
mean(abs(dge$stat))
## [1] 0.7658495
infec_lo_t0 <- dge
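The clinical-covariate fit above reports that 21 rows did not converge and suggests a larger `maxit` for `nbinomWaldTest`. Here is a minimal sketch of such a refit, assuming `dds` is a DESeqDataSet built as above and that DESeq2 is installed. `refit_with_maxit` is a hypothetical helper name and 500 is an arbitrary choice. The function is only defined here, not run.

```r
# Hypothetical helper: run the DESeq2 steps one by one so that maxit can be raised
refit_with_maxit <- function(dds, maxit = 500) {
  dds <- DESeq2::estimateSizeFactors(dds)
  dds <- DESeq2::estimateDispersions(dds)
  # nbinomWaldTest() uses maxit = 100 by default
  DESeq2::nbinomWaldTest(dds, maxit = maxit)
}

# Usage (not run):
# res <- refit_with_maxit(dds)
# table(S4Vectors::mcols(res)$betaConv, useNA = "ifany")
```

Rows that still fail to converge could then be inspected or excluded using the `betaConv` column.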
# model with clinical and cell-type covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## 35 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000237973.1 MTCO1P12 40.321285 2.9361037 0.4322990 6.791836
## ENSG00000130032.17 PRRG3 43.995433 2.4165029 0.5073411 4.763073
## ENSG00000280064.1 RP11-205M5.3 192.566181 -4.2862829 0.9480431 -4.521190
## ENSG00000174705.13 SH3PXD2B 180.420909 1.5718498 0.3863045 4.068940
## ENSG00000074803.20 SLC12A1 84.816349 -4.2756070 1.0595137 -4.035443
## ENSG00000185201.17 IFITM2 5170.750055 1.0596384 0.2667563 3.972309
## ENSG00000183873.18 SCN5A 22.637470 1.6069010 0.4062225 3.955717
## ENSG00000102962.5 CCL22 9.737015 1.3544661 0.3439459 3.938021
## ENSG00000274827.4 LINC01297 92.169136 1.1883708 0.3018729 3.936659
## ENSG00000251661.3 RP11-326C3.11 202.018525 -0.8809665 0.2263599 -3.891885
## pvalue padj
## ENSG00000237973.1 MTCO1P12 1.107154e-11 2.428543e-07
## ENSG00000130032.17 PRRG3 1.906667e-06 2.091137e-02
## ENSG00000280064.1 RP11-205M5.3 6.149298e-06 4.496161e-02
## ENSG00000174705.13 SH3PXD2B 4.722745e-05 2.013724e-01
## ENSG00000074803.20 SLC12A1 5.449943e-05 2.013724e-01
## ENSG00000185201.17 IFITM2 7.117918e-05 2.013724e-01
## ENSG00000183873.18 SCN5A 7.630555e-05 2.013724e-01
## ENSG00000102962.5 CCL22 8.215634e-05 2.013724e-01
## ENSG00000274827.4 LINC01297 8.262375e-05 2.013724e-01
## ENSG00000251661.3 RP11-326C3.11 9.946850e-05 2.142215e-01
mean(abs(dge$stat))
## [1] 0.7297472
infec_lo_t0_adj <- dge
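The same steps (build the dataset, fit, extract results, variance-stabilise, sort by p-value) are repeated for every model in this section. Here is a sketch of a wrapper that mirrors those steps, assuming DESeq2 and SummarizedExperiment are installed. `fit_infec_model` is a hypothetical name, and the function is only defined here, not run.

```r
# Hypothetical wrapper around the repeated model-fitting steps
# counts: genes x samples count matrix; coldata: sample sheet; design: model formula
fit_infec_model <- function(counts, coldata, design) {
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = counts,
                                        colData = coldata,
                                        design = design)
  res <- DESeq2::DESeq(dds)
  z <- DESeq2::results(res)
  vsd <- DESeq2::vst(dds, blind = FALSE)
  # Combine test statistics with variance-stabilised expression values
  zz <- cbind(as.data.frame(z), SummarizedExperiment::assay(vsd))
  zz[order(zz$pvalue), ]
}

# Usage (not run):
# infec_lo_t0 <- fit_infec_model(mx, ss2,
#   ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + infec)
```

Keeping the steps in one place would make it harder for the three models per timepoint to drift apart.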
# EOS samples: infection vs no infection within the crp_group==1 subset
mx <- xeosf
ss2 <- as.data.frame(cbind(ss_eos,sscell_eos))
ss2$infec <- factor(infec[match(ss2$PG_number,infec$PG_number),"infection30d"])
table(ss2$infec)
##
## 0 1
## 77 21
ss2 <- subset(ss2,crp_group==1)
table(ss2$infec)
##
## 0 1
## 41 5
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ infec )
## converting counts to integer mode
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 104 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000260287.4 TBC1D3G 61.94712 -23.0988811 3.1581305 -7.314100
## ENSG00000268734.1 CTB-61M7.2 23.54191 -22.0540131 3.1126254 -7.085341
## ENSG00000273513.1 TBC1D3K 28.79301 -22.3280408 3.2081270 -6.959837
## ENSG00000284554.2 CTA-150C2.22 15.49624 -21.4867745 3.4122270 -6.296995
## ENSG00000274611.4 TBC1D3 96.90177 -23.9534141 4.7239603 -5.070621
## ENSG00000112149.10 CD83 193.46983 2.1178279 0.4497498 4.708903
## ENSG00000081041.9 CXCL2 28.29751 2.6857651 0.6118343 4.389694
## ENSG00000123358.20 NR4A1 473.18397 1.5297318 0.3531655 4.331488
## ENSG00000269937.1 RP11-20I23.8 397.39661 0.9816369 0.2282927 4.299905
## ENSG00000261613.2 RP11-20I23.13 231.27242 0.8820845 0.2167940 4.068768
## pvalue padj
## ENSG00000260287.4 TBC1D3G 2.591127e-13 5.711622e-09
## ENSG00000268734.1 CTB-61M7.2 1.387021e-12 1.528705e-08
## ENSG00000273513.1 TBC1D3K 3.406667e-12 2.503105e-08
## ENSG00000284554.2 CTA-150C2.22 3.034719e-10 1.672358e-06
## ENSG00000274611.4 TBC1D3 3.965189e-07 1.748093e-03
## ENSG00000112149.10 CD83 2.490541e-06 9.149834e-03
## ENSG00000081041.9 CXCL2 1.135104e-05 3.574441e-02
## ENSG00000123358.20 NR4A1 1.481053e-05 4.080857e-02
## ENSG00000269937.1 RP11-20I23.8 1.708714e-05 4.185021e-02
## ENSG00000261613.2 RP11-20I23.13 4.726228e-05 1.041802e-01
mean(abs(dge$stat))
## [1] 0.7858115
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## 6 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000275954.5 TBC1D3F 33.51534 -29.883726 2.6369270 -11.332784
## ENSG00000259948.2 RP11-326A19.5 19.78023 -23.914298 3.0688020 -7.792714
## ENSG00000260287.4 TBC1D3G 61.94712 -29.395781 4.0841725 -7.197487
## ENSG00000267303.1 CTD-2369P2.12 31.24200 -20.463790 2.9549062 -6.925360
## ENSG00000268734.1 CTB-61M7.2 23.54191 -24.973470 4.1729640 -5.984588
## ENSG00000284554.2 CTA-150C2.22 15.49624 -29.083720 4.9560115 -5.868372
## ENSG00000273513.1 TBC1D3K 28.79301 -23.718876 4.1649549 -5.694870
## ENSG00000107719.9 PALD1 108.35203 2.675816 0.4857488 5.508642
## ENSG00000274611.4 TBC1D3 96.90177 -30.000000 5.5380902 -5.417030
## ENSG00000259661.1 AC068831.15 24.17674 -29.953491 5.5405222 -5.406258
## pvalue padj
## ENSG00000275954.5 TBC1D3F 9.028801e-30 1.992386e-25
## ENSG00000259948.2 RP11-326A19.5 6.558479e-15 7.236297e-11
## ENSG00000260287.4 TBC1D3G 6.133219e-13 4.511391e-09
## ENSG00000267303.1 CTD-2369P2.12 4.348676e-12 2.399056e-08
## ENSG00000268734.1 CTB-61M7.2 2.169383e-09 9.574356e-06
## ENSG00000284554.2 CTA-150C2.22 4.400946e-09 1.618595e-05
## ENSG00000273513.1 TBC1D3K 1.234661e-08 3.892182e-05
## ENSG00000107719.9 PALD1 3.616131e-08 9.974646e-05
## ENSG00000274611.4 TBC1D3 6.059722e-08 1.420126e-04
## ENSG00000259661.1 AC068831.15 6.435520e-08 1.420126e-04
mean(abs(dge$stat))
## [1] 0.8521819
infec_lo_eos <- dge
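The DESeq2 note about factor levels containing characters other than letters, numbers, '_' and '.' is harmless, but it can be silenced by cleaning the levels before building the dataset. Here is a minimal sketch with base R `make.names()` on a toy factor. The level names are invented, since the report does not show which design variable triggers the note.

```r
# Toy factor with levels containing '-', '/' and a space (invented values)
eth_toy <- factor(c("Non-Hispanic", "Asian/Pacific", "Non-Hispanic", "Other group"))

# make.names() replaces characters that are not valid in R names with '.'
levels(eth_toy) <- make.names(levels(eth_toy))
levels(eth_toy)
```

Here the three levels become `Asian.Pacific`, `Non.Hispanic` and `Other.group`. The same approach could be applied to the character columns of `ss2` after converting them to factors.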
# model with clinical and cell-type covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## 34 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000107719.9 PALD1 108.35203 2.7072742 0.5116337 5.291431
## ENSG00000179820.16 MYADM 2493.44944 0.9060031 0.1919774 4.719322
## ENSG00000205189.12 ZBTB10 299.56466 1.1745843 0.2663775 4.409473
## ENSG00000106571.15 GLI3 14.71645 1.7690416 0.4074469 4.341772
## ENSG00000152217.20 SETBP1 444.67981 0.9087408 0.2102917 4.321334
## ENSG00000232021.7 LEF1-AS1 131.81316 -1.1189064 0.2604249 -4.296465
## ENSG00000127124.16 HIVEP3 733.20898 0.9356917 0.2329499 4.016707
## ENSG00000204381.12 LAYN 41.75720 -2.1081428 0.5316181 -3.965521
## ENSG00000139610.2 CELA1 21.39799 -1.9637641 0.4998412 -3.928776
## ENSG00000165655.17 ZNF503 95.07212 1.0592072 0.2726987 3.884166
## pvalue padj
## ENSG00000107719.9 PALD1 1.213631e-07 0.002678121
## ENSG00000179820.16 MYADM 2.366317e-06 0.026108757
## ENSG00000205189.12 ZBTB10 1.036224e-05 0.063826322
## ENSG00000106571.15 GLI3 1.413384e-05 0.063826322
## ENSG00000152217.20 SETBP1 1.550891e-05 0.063826322
## ENSG00000232021.7 LEF1-AS1 1.735433e-05 0.063826322
## ENSG00000127124.16 HIVEP3 5.901696e-05 0.186046741
## ENSG00000204381.12 LAYN 7.323586e-05 0.196012808
## ENSG00000139610.2 CELA1 8.537936e-05 0.196012808
## ENSG00000165655.17 ZNF503 1.026817e-04 0.196012808
mean(abs(dge$stat))
## [1] 0.7993698
infec_lo_eos_adj <- dge
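The mean absolute test statistic is printed after each model as a rough measure of overall signal. Here is a minimal sketch of collecting that value into one table so the models can be compared side by side. The data frames are random stand-ins, and `results_toy` and `summary_tab` are hypothetical names.

```r
# Toy stand-ins for the stored result tables (random values, not real results)
set.seed(1)
results_toy <- list(
  clinical      = data.frame(stat = rnorm(100)),
  clinical_cell = data.frame(stat = rnorm(100, sd = 0.8))
)

# One row per model with the mean absolute Wald statistic
summary_tab <- data.frame(
  model = names(results_toy),
  mean_abs_stat = sapply(results_toy, function(d) mean(abs(d$stat), na.rm = TRUE)),
  row.names = NULL
)
summary_tab
```

With the real objects the list could hold `infec_lo_t0`, `infec_lo_t0_adj`, `infec_lo_eos` and `infec_lo_eos_adj`.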
# POD1 samples: infection vs no infection within the crp_group==1 subset
mx <- xpod1f
ss2 <- as.data.frame(cbind(ss_pod1,sscell_pod1))
ss2$infec <- factor(infec[match(ss2$PG_number,infec$PG_number),"infection30d"])
table(ss2$infec)
##
## 0 1
## 90 19
ss2 <- subset(ss2,crp_group==1)
table(ss2$infec)
##
## 0 1
## 51 4
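After subsetting, only 4 infected samples remain at POD1, while the fullest design used in this report has ten covariates plus `infec`. Here is a minimal sketch of checking how many coefficients a design estimates relative to the smallest group, using base R `model.matrix()` on a toy sample sheet. The values are invented, and `ss_toy` and `X` are hypothetical names.

```r
# Toy sample sheet with the same group sizes as above (51 vs 4), invented covariates
ss_toy <- data.frame(
  sexD  = factor(rep(c("F", "M"), length.out = 55)),
  ageCS = seq(-1, 1, length.out = 55),
  infec = factor(c(rep(0, 51), rep(1, 4)))
)

X <- model.matrix(~ sexD + ageCS + infec, data = ss_toy)

ncol(X)                     # coefficients estimated per gene, including the intercept
qr(X)$rank == ncol(X)       # TRUE if the design is full rank
min(table(ss_toy$infec))    # size of the smallest infection group
```

When the number of coefficients approaches the size of the smallest group, the estimates for `infec` are likely to be unstable, so the results of the adjusted models are best interpreted with that in mind.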
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ infec )
## converting counts to integer mode
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 83 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000263244.2 RP11-473I1.9 136.22535 -22.023769 2.6787476 -8.221666
## ENSG00000275302.2 CCL4 434.08075 2.677717 0.4420782 6.057112
## ENSG00000260287.4 TBC1D3G 37.77301 -22.309321 3.8418428 -5.806932
## ENSG00000253797.2 UTP14C 15.75173 -21.093285 3.8247900 -5.514887
## ENSG00000277632.2 CCL3 340.81892 3.646567 0.6877949 5.301824
## ENSG00000118503.15 TNFAIP3 3568.09110 1.596738 0.3018511 5.289821
## ENSG00000081041.9 CXCL2 80.72454 4.040491 0.7679762 5.261219
## ENSG00000138738.11 PRDM5 60.80353 2.245809 0.4301535 5.220949
## ENSG00000115604.12 IL18R1 653.78759 1.682109 0.3237240 5.196119
## ENSG00000163661.4 PTX3 101.95976 1.676977 0.3240238 5.175474
## pvalue padj
## ENSG00000263244.2 RP11-473I1.9 2.006952e-16 4.271598e-12
## ENSG00000275302.2 CCL4 1.385870e-09 1.474843e-05
## ENSG00000260287.4 TBC1D3G 6.362784e-09 4.514183e-05
## ENSG00000253797.2 UTP14C 3.490041e-08 1.857051e-04
## ENSG00000277632.2 CCL3 1.146515e-07 4.343210e-04
## ENSG00000118503.15 TNFAIP3 1.224359e-07 4.343210e-04
## ENSG00000081041.9 CXCL2 1.431032e-07 4.351154e-04
## ENSG00000138738.11 PRDM5 1.780087e-07 4.735920e-04
## ENSG00000115604.12 IL18R1 2.034916e-07 4.812351e-04
## ENSG00000163661.4 PTX3 2.273337e-07 4.838569e-04
mean(abs(dge$stat))
## [1] 1.010522
# model with clinical covariates
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## 18 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000284931.1 CTD-2643I7.4 182.253678 -19.687232 2.2731674 -8.660705
## ENSG00000263244.2 RP11-473I1.9 136.225348 -24.271770 2.9269177 -8.292604
## ENSG00000261759.1 RP11-626G11.3 17.177769 -16.471967 2.3549384 -6.994648
## ENSG00000107719.9 PALD1 66.882232 3.580906 0.5359352 6.681603
## ENSG00000273513.1 TBC1D3K 21.954589 -30.000000 4.7368710 -6.333295
## ENSG00000260287.4 TBC1D3G 37.773009 -29.994660 4.8750658 -6.152668
## ENSG00000242534.3 IGKV2D-28 8.309503 -29.976588 5.2334295 -5.727905
## ENSG00000254614.2 AP003068.23 433.793525 -1.990339 0.3648507 -5.455215
## ENSG00000183598.4 H3C13 7.695684 4.293642 0.7944061 5.404845
## ENSG00000273025.1 RP11-106M3.5 8.374942 -19.257691 3.8184518 -5.043324
## pvalue padj
## ENSG00000284931.1 CTD-2643I7.4 4.688548e-18 9.992702e-14
## ENSG00000263244.2 RP11-473I1.9 1.107954e-16 1.180692e-12
## ENSG00000261759.1 RP11-626G11.3 2.659248e-12 1.889218e-08
## ENSG00000107719.9 PALD1 2.363428e-11 1.259293e-07
## ENSG00000273513.1 TBC1D3K 2.399805e-10 1.022941e-06
## ENSG00000260287.4 TBC1D3G 7.619029e-10 2.706406e-06
## ENSG00000242534.3 IGKV2D-28 1.016783e-08 3.095815e-05
## ENSG00000254614.2 AP003068.23 4.891360e-08 1.303119e-04
## ENSG00000183598.4 H3C13 6.486450e-08 1.536063e-04
## ENSG00000273025.1 RP11-106M3.5 4.575129e-07 9.750972e-04
mean(abs(dge$stat))
## [1] 0.8394566
infec_lo_pod1 <- dge
# model with clinical and cell-type covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 24 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000107719.9 PALD1 66.88223 3.5670498 0.5651323 6.311884
## ENSG00000254614.2 AP003068.23 433.79352 -2.0091426 0.3923548 -5.120729
## ENSG00000183246.8 RIMBP3C 13.03160 -6.7488223 1.6053959 -4.203837
## ENSG00000260238.6 PMF1-BGLAP 123.59233 -1.0347448 0.2465450 -4.196982
## ENSG00000137869.15 CYP19A1 19.03905 -4.2589501 1.0316283 -4.128377
## ENSG00000096696.15 DSP 80.75129 -5.7470014 1.4246909 -4.033858
## ENSG00000255200.1 PGAM1P8 82.13391 -1.5327583 0.3818111 -4.014442
## ENSG00000146232.17 NFKBIE 978.84088 -0.8316586 0.2107895 -3.945447
## ENSG00000279662.1 RP11-609N14.4 38.13266 1.1351499 0.2916021 3.892804
## ENSG00000171236.10 LRG1 623.62715 -1.2953112 0.3394857 -3.815510
## pvalue padj
## ENSG00000107719.9 PALD1 2.756589e-10 5.875119e-06
## ENSG00000254614.2 AP003068.23 3.043562e-07 3.243372e-03
## ENSG00000183246.8 RIMBP3C 2.624281e-05 1.441270e-01
## ENSG00000260238.6 PMF1-BGLAP 2.704960e-05 1.441270e-01
## ENSG00000137869.15 CYP19A1 3.653332e-05 1.557269e-01
## ENSG00000096696.15 DSP 5.486838e-05 1.814243e-01
## ENSG00000255200.1 PGAM1P8 5.958666e-05 1.814243e-01
## ENSG00000146232.17 NFKBIE 7.965124e-05 2.122008e-01
## ENSG00000279662.1 RP11-609N14.4 9.909221e-05 2.346614e-01
## ENSG00000171236.10 LRG1 1.359018e-04 2.524306e-01
mean(abs(dge$stat))
## [1] 0.894695
infec_lo_pod1_adj <- dge
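The `DESeqDataSetFromMatrix` messages above recommend centering and scaling numeric covariates whose mean or standard deviation exceeds 5. The following is a minimal base R sketch of that step on a toy data frame. The column names mirror terms in the design formula, but the values are invented, and it is an assumption that `ss2` stores these covariates as plain numeric columns.

```r
# Toy sample sheet; values are invented for illustration only.
ss_toy <- data.frame(
  duration_sx    = c(120, 95, 210, 160, 75, 180),
  Neutrophils.LD = c(55.2, 61.8, 47.5, 70.1, 66.3, 58.9)
)

# Center to mean 0 and scale to SD 1, keeping plain numeric vectors
# (scale() returns a one-column matrix, so as.numeric() drops the attributes).
num_cols  <- c("duration_sx", "Neutrophils.LD")
ss_scaled <- ss_toy
ss_scaled[num_cols] <- lapply(ss_toy[num_cols], function(v) as.numeric(scale(v)))

col_means <- colMeans(ss_scaled)
col_sds   <- apply(ss_scaled, 2, sd)
round(col_means, 10)
col_sds
```

Scaling changes the units of the covariate coefficients but not the test of the `infec` term, so it mainly helps with GLM convergence.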
CCL3 seems quite robust.
mx <- xt0f
ss2 <- as.data.frame(cbind(ss_t0,sscell_t0))
ss2$infec <- factor(infec[match(ss2$PG_number,infec$PG_number),"infection30d"])
table(ss2$infec)
##
## 0 1
## 90 21
ss2 <- subset(ss2,treatment_group==1)
table(ss2$infec)
##
## 0 1
## 43 7
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ infec )
## converting counts to integer mode
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 83 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000260287.4 TBC1D3G 23.44166 -22.5667772 3.4251829 -6.588488
## ENSG00000137331.12 IER3 1016.45739 1.6800322 0.2919380 5.754758
## ENSG00000147576.17 ADHFE1 934.81114 -0.6596945 0.1267333 -5.205377
## ENSG00000277632.2 CCL3 860.85616 2.9825889 0.5933781 5.026456
## ENSG00000182013.18 PNMA8A 11.92564 2.2127877 0.4552821 4.860256
## ENSG00000161905.13 ALOX15 23.63455 1.9571138 0.4039780 4.844604
## ENSG00000167766.19 ZNF83 2629.92337 -0.6390342 0.1392730 -4.588357
## ENSG00000234709.2 UPF3AP3 21.14844 0.9677018 0.2137213 4.527869
## ENSG00000232810.4 TNF 532.70238 2.0738894 0.4711473 4.401786
## ENSG00000099860.9 GADD45B 2291.47951 1.0463573 0.2385711 4.385935
## pvalue padj
## ENSG00000260287.4 TBC1D3G 4.443274e-11 9.746322e-07
## ENSG00000137331.12 IER3 8.676609e-09 9.516071e-05
## ENSG00000147576.17 ADHFE1 1.936035e-07 1.415564e-03
## ENSG00000277632.2 CCL3 4.996276e-07 2.739833e-03
## ENSG00000182013.18 PNMA8A 1.172340e-06 4.637947e-03
## ENSG00000161905.13 ALOX15 1.268643e-06 4.637947e-03
## ENSG00000167766.19 ZNF83 4.467481e-06 1.399917e-02
## ENSG00000234709.2 UPF3AP3 5.958154e-06 1.633651e-02
## ENSG00000232810.4 TNF 1.073636e-05 2.533245e-02
## ENSG00000099860.9 GADD45B 1.154887e-05 2.533245e-02
mean(abs(dge$stat))
## [1] 0.8864522
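The top hit above, TBC1D3G, has a log2 fold change of about -22.6 at a baseMean of about 23. One possible explanation, not verified here, is that the gene is detected in none of the 7 infected samples, which produces a very large fold change estimate. The sketch below shows how such genes could be flagged. The count matrix and grouping are toy values; in the report the corresponding inputs would presumably be `mx` and `ss2$infec`.

```r
# Toy count matrix: 2 genes x 8 samples; values are invented.
counts_toy <- rbind(
  geneA = c(40, 0, 55, 31, 0, 0, 0, 0),
  geneB = c(12, 15, 9, 20, 14, 11, 18, 16)
)
colnames(counts_toy) <- paste0("s", 1:8)
group_toy <- factor(c(0, 0, 0, 0, 1, 1, 1, 1))

# For each gene, count the samples with a non-zero count in each group.
nonzero_by_group <- t(apply(counts_toy, 1, function(x) tapply(x > 0, group_toy, sum)))
nonzero_by_group

# Flag genes that are detected in no sample of at least one group.
flagged <- rownames(nonzero_by_group)[apply(nonzero_by_group == 0, 1, any)]
flagged
```

Genes flagged this way are worth inspecting by eye before they are interpreted as biological signal.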
# model with clinical covariates
# including crp_group in the model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + crp_group + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 34 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000274611.4 TBC1D3 61.02093 -29.8751994 5.0803431 -5.880548
## ENSG00000273117.1 INSIG1-DT 109.77167 1.0347456 0.2115610 4.891004
## ENSG00000233673.7 ANAPC1P1 32.40291 -2.0654713 0.4283281 -4.822171
## ENSG00000167766.19 ZNF83 2629.92337 -0.8179630 0.1696786 -4.820660
## ENSG00000147576.17 ADHFE1 934.81114 -0.6784607 0.1420030 -4.777792
## ENSG00000237754.1 RP11-521C10.1 13.72225 -1.7214553 0.3762302 -4.575537
## ENSG00000279359.1 RP11-36D19.9 48.17870 2.5914247 0.5818684 4.453627
## ENSG00000284606.1 RP11-556O5.7 154.73434 -0.8365505 0.1905379 -4.390467
## ENSG00000234420.7 ZNF37BP 1994.48390 -0.6039811 0.1377019 -4.386149
## ENSG00000054598.9 FOXC1 29.68356 2.1647702 0.4973820 4.352329
## pvalue padj
## ENSG00000274611.4 TBC1D3 4.089113e-09 0.0000896947
## ENSG00000273117.1 INSIG1-DT 1.003227e-06 0.0077751224
## ENSG00000233673.7 ANAPC1P1 1.420039e-06 0.0077751224
## ENSG00000167766.19 ZNF83 1.430839e-06 0.0077751224
## ENSG00000147576.17 ADHFE1 1.772310e-06 0.0077751224
## ENSG00000237754.1 RP11-521C10.1 4.749993e-06 0.0173651821
## ENSG00000279359.1 RP11-36D19.9 8.443181e-06 0.0264573100
## ENSG00000284606.1 RP11-556O5.7 1.131074e-05 0.0281195177
## ENSG00000234420.7 ZNF37BP 1.153753e-05 0.0281195177
## ENSG00000054598.9 FOXC1 1.346988e-05 0.0295461927
mean(abs(dge$stat))
## [1] 1.094275
infec_a_t0 <- dge
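The design messages above ask whether integer-valued numeric covariates were meant to be factors. The output does not say which variable triggered the message, so the sketch below uses `crp_group` purely as an example, with an assumed integer coding of 1 to 3. It shows how the number of fitted coefficients changes between the two encodings.

```r
# Toy sample sheet; the integer coding of crp_group is assumed, not taken from the data.
ss_toy <- data.frame(crp_group = c(1L, 2L, 3L, 1L, 2L, 3L))

# As a numeric column the model fits one slope per unit increase.
n_numeric <- ncol(model.matrix(~ crp_group, data = ss_toy))

# As a factor each non-reference level gets its own coefficient.
ss_toy$crp_group <- factor(ss_toy$crp_group)
n_factor <- ncol(model.matrix(~ crp_group, data = ss_toy))

c(numeric = n_numeric, factor = n_factor)
```

Which encoding is appropriate depends on whether the groups are ordered and roughly evenly spaced. With this few infected samples, the extra coefficients of the factor encoding also cost degrees of freedom.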
# model with clinical and cell covariates (crp_group included)
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + crp_group + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 64 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000185201.17 IFITM2 4666.12989 1.4240423 0.2766663 5.147148
## ENSG00000167766.19 ZNF83 2629.92337 -0.6483684 0.1400962 -4.628023
## ENSG00000147576.17 ADHFE1 934.81114 -0.6560167 0.1479958 -4.432671
## ENSG00000255439.6 RP11-196G11.1 124.80644 1.0393206 0.2390768 4.347225
## ENSG00000237754.1 RP11-521C10.1 13.72225 -1.6625727 0.3841827 -4.327557
## ENSG00000277632.2 CCL3 860.85616 3.1002874 0.7179283 4.318380
## ENSG00000214413.9 BBIP1 996.88114 -0.5444700 0.1266365 -4.299473
## ENSG00000269711.1 CTD-3214H19.16 167.76977 1.7714729 0.4122681 4.296895
## ENSG00000099860.9 GADD45B 2291.47951 1.0125171 0.2390706 4.235222
## ENSG00000001631.16 KRIT1 1422.42000 -0.5380767 0.1274059 -4.223325
## pvalue padj
## ENSG00000185201.17 IFITM2 2.644764e-07 0.003776723
## ENSG00000167766.19 ZNF83 3.691735e-06 0.026358989
## ENSG00000147576.17 ADHFE1 9.307297e-06 0.030379214
## ENSG00000255439.6 RP11-196G11.1 1.378711e-05 0.030379214
## ENSG00000237754.1 RP11-521C10.1 1.507722e-05 NA
## ENSG00000277632.2 CCL3 1.571784e-05 0.030379214
## ENSG00000214413.9 BBIP1 1.712049e-05 0.030379214
## ENSG00000269711.1 CTD-3214H19.16 1.732069e-05 0.030379214
## ENSG00000099860.9 GADD45B 2.283265e-05 0.030379214
## ENSG00000001631.16 KRIT1 2.407242e-05 0.030379214
mean(abs(dge$stat))
## [1] 1.105629
infec_a_t0_adj <- dge
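CCL3 seems quite robust: at T0 it stays in the top 10 in both the base model and the model with clinical and cell covariates, with a log2 fold change of about 3.0 and 3.1 respectively, although it is not in the top 10 for the clinical covariate model.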
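A robustness check like this can be made explicit by pulling the same gene out of each model's result table. The sketch below does so with two small tables whose rows are copied from the printed T0 output above. `lookup_gene` is a hypothetical helper introduced here, not part of the original analysis.

```r
# Rows copied from the printed T0 results (base model and clinical + cell model).
base_toy <- data.frame(
  log2FoldChange = c(2.9825889, -0.6390342),
  padj           = c(2.739833e-03, 1.399917e-02),
  row.names      = c("ENSG00000277632.2 CCL3", "ENSG00000167766.19 ZNF83")
)
adj_toy <- data.frame(
  log2FoldChange = c(3.1002874, -0.6483684),
  padj           = c(0.030379214, 0.026358989),
  row.names      = c("ENSG00000277632.2 CCL3", "ENSG00000167766.19 ZNF83")
)

# Hypothetical helper: fetch one gene symbol from a named list of result tables.
# Row names are assumed to have the form "<Ensembl ID> <symbol>".
lookup_gene <- function(symbol, tables) {
  rows <- lapply(tables, function(tb) {
    hit <- tb[endsWith(rownames(tb), paste0(" ", symbol)),
              c("log2FoldChange", "padj"), drop = FALSE]
    stopifnot(nrow(hit) == 1)  # expect exactly one match per table
    hit
  })
  out <- do.call(rbind, rows)
  rownames(out) <- names(tables)
  out
}

ccl3 <- lookup_gene("CCL3", list(base = base_toy, cell_adj = adj_toy))
ccl3
```

In the report the adjusted tables are available as `infec_a_t0` and `infec_a_t0_adj`. The base-model `dge` is overwritten by the next model before it is saved, so it would need its own assignment to be included.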
mx <- xeosf
ss2 <- as.data.frame(cbind(ss_eos,sscell_eos))
ss2$infec <- factor(infec[match(ss2$PG_number,infec$PG_number),"infection30d"])
table(ss2$infec)
##
## 0 1
## 77 21
ss2 <- subset(ss2,treatment_group==1)
table(ss2$infec)
##
## 0 1
## 35 7
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ infec )
## converting counts to integer mode
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 101 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000100906.11 NFKBIA 9734.50485 1.9018267 0.2801957 6.787494
## ENSG00000167173.19 C15orf39 2198.74076 1.1854728 0.1754536 6.756617
## ENSG00000081041.9 CXCL2 53.31381 3.7502477 0.5670770 6.613296
## ENSG00000044574.9 HSPA5 7137.27302 1.4389376 0.2276968 6.319535
## ENSG00000125968.9 ID1 57.20937 3.4112787 0.5672157 6.014077
## ENSG00000125538.12 IL1B 436.12157 3.0466361 0.5084909 5.991526
## ENSG00000105697.9 HAMP 58.47161 2.5690836 0.4306887 5.965059
## ENSG00000185650.10 ZFP36L1 6744.81308 0.9258669 0.1563981 5.919937
## ENSG00000162772.17 ATF3 76.10013 2.0524677 0.3561658 5.762675
## ENSG00000163251.4 FZD5 85.90925 2.1821102 0.3871234 5.636730
## pvalue padj
## ENSG00000100906.11 NFKBIA 1.140979e-11 1.467806e-07
## ENSG00000167173.19 C15orf39 1.412506e-11 1.467806e-07
## ENSG00000081041.9 CXCL2 3.758564e-11 2.603808e-07
## ENSG00000044574.9 HSPA5 2.623513e-10 1.363112e-06
## ENSG00000125968.9 ID1 1.809150e-09 7.200661e-06
## ENSG00000125538.12 IL1B 2.078813e-09 7.200661e-06
## ENSG00000105697.9 HAMP 2.445451e-09 7.260544e-06
## ENSG00000185650.10 ZFP36L1 3.220658e-09 8.366868e-06
## ENSG00000162772.17 ATF3 8.279089e-09 1.911826e-05
## ENSG00000163251.4 FZD5 1.733091e-08 3.475487e-05
mean(abs(dge$stat))
## [1] 1.330536
# model with clinical covariates
# including crp_group in the model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + crp_group + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 27 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000132972.19 RNF17 13.11765 -18.8470748 2.0399879 -9.238817
## ENSG00000167173.19 C15orf39 2198.74076 1.2877505 0.1865780 6.901942
## ENSG00000100906.11 NFKBIA 9734.50485 1.7301002 0.3252893 5.318651
## ENSG00000162772.17 ATF3 76.10013 2.1917122 0.4165951 5.261014
## ENSG00000128203.7 ASPHD2 196.20728 0.7694590 0.1510066 5.095531
## ENSG00000162783.11 IER5 2200.29728 0.9752990 0.1924133 5.068770
## ENSG00000126003.7 PLAGL2 959.67879 0.9304217 0.1873070 4.967362
## ENSG00000155090.15 KLF10 2110.66056 1.6217528 0.3285993 4.935350
## ENSG00000081041.9 CXCL2 53.31381 3.2453213 0.6641432 4.886478
## ENSG00000185650.10 ZFP36L1 6744.81308 0.8335282 0.1711427 4.870369
## pvalue padj
## ENSG00000132972.19 RNF17 2.492379e-20 NA
## ENSG00000167173.19 C15orf39 5.129622e-12 6.271989e-08
## ENSG00000100906.11 NFKBIA 1.045394e-07 6.391017e-04
## ENSG00000162772.17 ATF3 1.432635e-07 NA
## ENSG00000128203.7 ASPHD2 3.477647e-07 1.223907e-03
## ENSG00000162783.11 IER5 4.003947e-07 1.223907e-03
## ENSG00000126003.7 PLAGL2 6.786985e-07 1.630411e-03
## ENSG00000155090.15 KLF10 8.000710e-07 1.630411e-03
## ENSG00000081041.9 CXCL2 1.026557e-06 NA
## ENSG00000185650.10 ZFP36L1 1.113899e-06 1.754794e-03
mean(abs(dge$stat))
## [1] 1.129203
infec_a_eos <- dge
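Three of the top 10 genes above (RNF17, ATF3 and CXCL2) have a p-value but an adjusted p-value of NA. In DESeq2 this pattern most likely reflects independent filtering: genes with a low mean normalized count are excluded from the multiple-testing correction, and only their `padj` is set to NA. The base R sketch below illustrates the idea with invented p-values and a fixed cut-off. DESeq2 chooses its threshold from the data, so this is an illustration of the principle rather than its exact procedure.

```r
# Invented p-values and mean counts for six genes.
pvalue   <- c(1e-8, 2e-6, 5e-4, 0.01, 0.20, 0.70)
baseMean <- c(13, 2200, 76, 960, 53, 6700)

# Illustrative fixed cut-off; DESeq2 picks its threshold automatically.
keep <- baseMean >= 100

# Adjust only the genes that pass the filter; the rest keep padj = NA.
padj <- rep(NA_real_, length(pvalue))
padj[keep] <- p.adjust(pvalue[keep], method = "BH")
data.frame(baseMean, pvalue, padj)
```

Filtering can be switched off with the `independentFiltering` argument of `results()`, at the cost of a heavier multiple-testing burden.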
# model with clinical and cell covariates (crp_group included)
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + crp_group + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 50 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000167173.19 C15orf39 2198.74076 1.1001375 0.1618267 6.798246
## ENSG00000128203.7 ASPHD2 196.20728 0.7344183 0.1474197 4.981819
## ENSG00000126003.7 PLAGL2 959.67879 0.7834552 0.1586995 4.936720
## ENSG00000162783.11 IER5 2200.29728 0.9946933 0.2029167 4.901978
## ENSG00000159403.18 C1R 88.12334 0.7921169 0.1670592 4.741535
## ENSG00000183779.7 ZNF703 228.78586 1.3184889 0.2781708 4.739854
## ENSG00000138744.16 NAAA 2419.06871 0.7627905 0.1611521 4.733358
## ENSG00000256039.2 LINC02446 229.89248 -1.6703850 0.3567986 -4.681591
## ENSG00000100906.11 NFKBIA 9734.50485 1.1479049 0.2469534 4.648266
## ENSG00000164086.10 DUSP7 1004.95339 0.6072267 0.1326262 4.578483
## pvalue padj
## ENSG00000167173.19 C15orf39 1.059005e-11 1.249520e-07
## ENSG00000128203.7 ASPHD2 6.298919e-07 2.798624e-03
## ENSG00000126003.7 PLAGL2 7.944723e-07 2.798624e-03
## ENSG00000162783.11 IER5 9.487667e-07 2.798624e-03
## ENSG00000159403.18 C1R 2.121050e-06 NA
## ENSG00000183779.7 ZNF703 2.138721e-06 4.342736e-03
## ENSG00000138744.16 NAAA 2.208358e-06 4.342736e-03
## ENSG00000256039.2 LINC02446 2.846567e-06 4.798092e-03
## ENSG00000100906.11 NFKBIA 3.347376e-06 4.936960e-03
## ENSG00000164086.10 DUSP7 4.683604e-06 6.140205e-03
mean(abs(dge$stat))
## [1] 0.9829517
infec_a_eos_adj <- dge
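The note repeated throughout the output above says that some factor levels contain characters other than letters, numbers, `_` and `.`. It is only a message, but it can be silenced by cleaning the levels before the DESeq2 object is built. The sketch below uses base R `make.names()` on a toy factor. The level labels are invented, since the actual levels in `ss2` are not shown.

```r
# Toy factor with levels containing a hyphen and a slash (invented labels).
wound_toy <- factor(c("clean-contaminated", "clean", "dirty/infected", "clean"))

# make.names() replaces each invalid character with "."
levels(wound_toy) <- make.names(levels(wound_toy))
levels(wound_toy)

# Check that only safe characters remain.
safe <- all(grepl("^[A-Za-z0-9_.]+$", levels(wound_toy)))
safe
```

Renaming levels changes the coefficient names reported by DESeq2, so any downstream code that refers to them by name would need the same change.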
mx <- xpod1f
ss2 <- as.data.frame(cbind(ss_pod1,sscell_pod1))
ss2$infec <- factor(infec[match(ss2$PG_number,infec$PG_number),"infection30d"])
table(ss2$infec)
##
## 0 1
## 90 19
ss2 <- subset(ss2,treatment_group==1)
table(ss2$infec)
##
## 0 1
## 43 6
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ infec )
## converting counts to integer mode
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 87 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000263244.2 RP11-473I1.9 96.95970 -24.0528429 2.6558198 -9.056655
## ENSG00000183598.4 H3C13 16.79863 4.1384840 0.5559402 7.444118
## ENSG00000167536.14 DHRS13 630.19092 2.0049535 0.2875645 6.972187
## ENSG00000138738.11 PRDM5 93.01623 2.6786805 0.3850836 6.956100
## ENSG00000173757.10 STAT5B 9447.69305 0.9030826 0.1320213 6.840429
## ENSG00000197324.9 LRP10 8557.80925 1.3644779 0.2004622 6.806658
## ENSG00000182541.18 LIMK2 4132.98151 1.4101831 0.2115824 6.664936
## ENSG00000169180.12 XPO6 8932.98053 1.5194701 0.2293696 6.624550
## ENSG00000100505.14 TRIM9 25.92048 2.3314459 0.3562837 6.543791
## ENSG00000089351.15 GRAMD1A 5423.79264 1.1032986 0.1686106 6.543470
## pvalue padj
## ENSG00000263244.2 RP11-473I1.9 1.345128e-19 2.864719e-15
## ENSG00000183598.4 H3C13 9.759450e-14 1.039235e-09
## ENSG00000167536.14 DHRS13 3.120512e-12 1.862531e-08
## ENSG00000138738.11 PRDM5 3.498204e-12 1.862531e-08
## ENSG00000173757.10 STAT5B 7.895652e-12 3.363074e-08
## ENSG00000197324.9 LRP10 9.989212e-12 3.545671e-08
## ENSG00000182541.18 LIMK2 2.647805e-11 8.055759e-08
## ENSG00000169180.12 XPO6 3.483072e-11 9.272373e-08
## ENSG00000100505.14 TRIM9 5.997847e-11 1.280113e-07
## ENSG00000089351.15 GRAMD1A 6.010765e-11 1.280113e-07
mean(abs(dge$stat))
## [1] 1.394202
# model with clinical covariates
# including crp_group in the model
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + crp_group + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 34 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000183598.4 H3C13 16.79863 4.6624411 0.6164893 7.562890
## ENSG00000089351.15 GRAMD1A 5423.79264 1.3137004 0.2044049 6.426953
## ENSG00000197324.9 LRP10 8557.80925 1.5022401 0.2436527 6.165496
## ENSG00000167536.14 DHRS13 630.19092 2.1052288 0.3439057 6.121529
## ENSG00000136156.15 ITM2B 15392.40802 1.0002585 0.1660492 6.023867
## ENSG00000100505.14 TRIM9 25.92048 2.5550712 0.4286519 5.960713
## ENSG00000173757.10 STAT5B 9447.69305 0.9290043 0.1559647 5.956505
## ENSG00000100368.14 CSF2RB 8859.06475 1.5426925 0.2636278 5.851783
## ENSG00000160710.18 ADAR 11737.71868 0.6009997 0.1035677 5.802964
## ENSG00000068383.19 INPP5A 516.82621 0.9573477 0.1658020 5.774042
## pvalue padj
## ENSG00000183598.4 H3C13 3.942104e-14 7.912985e-10
## ENSG00000089351.15 GRAMD1A 1.301870e-10 1.306622e-06
## ENSG00000197324.9 LRP10 7.026244e-10 4.650995e-06
## ENSG00000167536.14 DHRS13 9.268161e-10 4.650995e-06
## ENSG00000136156.15 ITM2B 1.702978e-09 6.836777e-06
## ENSG00000100505.14 TRIM9 2.511396e-09 7.389443e-06
## ENSG00000173757.10 STAT5B 2.576899e-09 7.389443e-06
## ENSG00000100368.14 CSF2RB 4.863322e-09 1.220268e-05
## ENSG00000160710.18 ADAR 6.515278e-09 1.453124e-05
## ENSG00000068383.19 INPP5A 7.739192e-09 1.553488e-05
mean(abs(dge$stat))
## [1] 1.34301
infec_a_pod1 <- dge
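The POD1 subset contains only 6 infected patients out of 49, while the adjusted designs estimate many coefficients. A quick sanity check is to compare the number of model-matrix columns with the sample counts and to confirm that the matrix is full rank. The sketch below does this on a toy sample sheet with invented values. The column names mirror the design formula but nothing here is taken from `ss2`.

```r
# Toy sample sheet with 12 samples; all values are invented.
ss_toy <- data.frame(
  sexD  = factor(rep(c("F", "M"), 6)),
  ageCS = as.numeric(scale(c(34, 51, 47, 62, 29, 58, 44, 39, 66, 53, 41, 37))),
  infec = factor(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1))
)

# Build the design matrix and compare its size with the available samples.
mm        <- model.matrix(~ sexD + ageCS + infec, data = ss_toy)
n_coef    <- ncol(mm)
full_rank <- qr(mm)$rank == n_coef

c(samples = nrow(mm), coefficients = n_coef,
  smallest_group = unname(min(table(ss_toy$infec))))
full_rank
```

When the number of coefficients approaches the size of the smallest group, estimates for the `infec` term become unstable, which may contribute to the non-convergence messages.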
# model with clinical and cell covariates (crp_group included)
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx , colData = ss2,
design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + crp_group + infec )
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 49 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000260903.3 XKR7 37.17335 1.2606218 0.2438369 5.169939
## ENSG00000173706.14 HEG1 656.38340 0.8946055 0.1784842 5.012239
## ENSG00000165028.12 NIPSNAP3B 111.36215 -1.2701025 0.2753142 -4.613284
## ENSG00000179981.11 TSHZ1 949.99984 0.5436810 0.1206120 4.507685
## ENSG00000104341.17 LAPTM4B 80.97473 1.3055591 0.3067875 4.255582
## ENSG00000233695.2 GAS6-AS1 813.60121 -1.5766908 0.3777491 -4.173910
## ENSG00000205571.14 SMN2 451.86820 -2.2984493 0.5722870 -4.016253
## ENSG00000071205.12 ARHGAP10 210.82833 0.7773440 0.1944938 3.996755
## ENSG00000186715.11 MST1L 49.85308 -1.7227578 0.4326843 -3.981558
## ENSG00000175764.16 TTLL11 241.50442 0.6236036 0.1594900 3.909986
## pvalue padj
## ENSG00000260903.3 XKR7 2.341702e-07 0.00499087
## ENSG00000173706.14 HEG1 5.380041e-07 0.00573324
## ENSG00000165028.12 NIPSNAP3B 3.963570e-06 0.02815852
## ENSG00000179981.11 TSHZ1 6.553873e-06 0.03492067
## ENSG00000104341.17 LAPTM4B 2.085063e-05 0.08887790
## ENSG00000233695.2 GAS6-AS1 2.994159e-05 0.10635750
## ENSG00000205571.14 SMN2 5.913089e-05 0.16213246
## ENSG00000071205.12 ARHGAP10 6.421675e-05 0.16213246
## ENSG00000186715.11 MST1L 6.846489e-05 0.16213246
## ENSG00000175764.16 TTLL11 9.230140e-05 0.18252081
mean(abs(dge$stat))
## [1] 0.9468285
infec_a_pod1_adj <- dge
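The "rows did not converge in beta" message above suggests rerunning the Wald test with a larger `maxit`. Here is a minimal sketch of how that could be done by calling the DESeq2 steps separately instead of using the `DESeq()` wrapper. The simulated dataset, the `_demo` object names and the value `maxit = 500` are assumptions for illustration only and were not used in this report.

```r
library(DESeq2)

# Simulated counts stand in for the real count matrix (assumption for illustration).
set.seed(1)
dds_demo <- makeExampleDESeqDataSet(n = 500, m = 12)

# DESeq() runs these three steps internally; calling them separately
# exposes the maxit argument of nbinomWaldTest() (default is 100).
dds_demo <- estimateSizeFactors(dds_demo)
dds_demo <- estimateDispersions(dds_demo)
dds_demo <- nbinomWaldTest(dds_demo, maxit = 500)

# Tabulate convergence; rows that still fail could be dropped before results().
table(mcols(dds_demo)$betaConv, useNA = "ifany")
res_demo <- results(dds_demo[which(mcols(dds_demo)$betaConv), ])
```

Whether the non-converged rows matter here depends on whether any of them are among the genes of interest, so checking `betaConv` for the top hits may be enough.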
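CXCL2 seems quite robust at t0 in treatment group 2: it is the top hit in the base model (log2FC 1.94, padj 0.0025) and ranks second after adjusting for clinical covariates (log2FC 1.81, padj 0.010). It is not among the top 10 genes once cell-type covariates are added.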
mx <- xt0f
ss2 <- as.data.frame(cbind(ss_t0,sscell_t0))
ss2$infec <- factor(infec[match(ss2$PG_number,infec$PG_number),"infection30d"])
table(ss2$infec)
##
## 0 1
## 90 21
ss2 <- subset(ss2,treatment_group==2)
table(ss2$infec)
##
## 0 1
## 47 14
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ infec)
## converting counts to integer mode
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 396 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000081041.9 CXCL2 35.63507 1.9439307 0.3665574 5.303209
## ENSG00000234745.13 HLA-B 62101.52290 -0.9532047 0.2093888 -4.552320
## ENSG00000281383.1 CH507-513H4.5 15.70766 -1.7158317 0.4215774 -4.070027
## ENSG00000146242.9 TPBG 42.16738 1.0944994 0.2693250 4.063862
## ENSG00000204936.10 CD177 87.28790 -2.0084278 0.5006215 -4.011869
## ENSG00000232117.2 LINC00384 11.61340 -1.1159899 0.2801978 -3.982864
## ENSG00000112149.10 CD83 411.63539 1.2198721 0.3066732 3.977759
## ENSG00000287087.1 CTD-2081C10.8 11.44057 0.7842126 0.1973394 3.973928
## ENSG00000008438.5 PGLYRP1 479.87684 -2.0009651 0.5116740 -3.910625
## ENSG00000162772.17 ATF3 160.05841 1.3211995 0.3410898 3.873466
## pvalue padj
## ENSG00000081041.9 CXCL2 1.137846e-07 0.002495865
## ENSG00000234745.13 HLA-B 5.305747e-06 0.058190781
## ENSG00000281383.1 CH507-513H4.5 4.700759e-05 0.193842544
## ENSG00000146242.9 TPBG 4.826743e-05 0.193842544
## ENSG00000204936.10 CD177 6.024001e-05 0.193842544
## ENSG00000232117.2 LINC00384 6.808968e-05 0.193842544
## ENSG00000112149.10 CD83 6.956769e-05 0.193842544
## ENSG00000287087.1 CTD-2081C10.8 7.069708e-05 0.193842544
## ENSG00000008438.5 PGLYRP1 9.205770e-05 0.224365070
## ENSG00000162772.17 ATF3 1.072986e-04 0.234109215
mean(abs(dge$stat))
## [1] 1.012181
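Many genes in the table above share the same padj value (0.193842544). This is expected with the Benjamini-Hochberg adjustment that DESeq2 applies by default. A small sketch with base R `p.adjust()` shows where the ties come from; the p-values below are made up for illustration.

```r
# Made-up p-values, already sorted from smallest to largest.
p <- c(0.001, 0.03, 0.031, 0.032, 0.5)

# BH multiplies each p-value by n / rank, then takes a running minimum
# from the largest p-value downwards so adjusted values never decrease with rank.
padj <- p.adjust(p, method = "BH")
padj

# The same calculation written out by hand.
n <- length(p)
raw <- p * n / seq_len(n)
by_hand <- rev(cummin(rev(raw)))
all.equal(padj, by_hand)
```

The second to fourth p-values end up with the same adjusted value because the running minimum carries the smallest scaled value back to the lower ranks.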
# model with clinical covariates
# including crp_group in the model
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + crp_group + infec)
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 15 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000130032.17 PRRG3 42.866897 1.6572563 0.3153272 5.255672
## ENSG00000081041.9 CXCL2 35.635073 1.8135859 0.3700844 4.900466
## ENSG00000278599.6 TBC1D3E 8.722472 17.2282728 3.6374850 4.736314
## ENSG00000234745.13 HLA-B 62101.522900 -1.0339314 0.2234914 -4.626269
## ENSG00000183260.7 ABHD16B 92.010146 -2.6572265 0.6007539 -4.423153
## ENSG00000286666.1 RP11-301G21.2 23.242360 0.6818548 0.1645395 4.144018
## ENSG00000276107.1 THBS1-IT1 9.083385 2.6551134 0.6415470 4.138611
## ENSG00000110436.13 SLC1A2 32.480826 1.2773138 0.3109516 4.107758
## ENSG00000232117.2 LINC00384 11.613398 -1.1541619 0.2818112 -4.095514
## ENSG00000287087.1 CTD-2081C10.8 11.440574 0.8546777 0.2088369 4.092561
## pvalue padj
## ENSG00000130032.17 PRRG3 1.474852e-07 0.003235088
## ENSG00000081041.9 CXCL2 9.560962e-07 0.010485985
## ENSG00000278599.6 TBC1D3E 2.176398e-06 0.015913096
## ENSG00000234745.13 HLA-B 3.723124e-06 0.020416683
## ENSG00000183260.7 ABHD16B 9.727079e-06 0.042672697
## ENSG00000286666.1 RP11-301G21.2 3.412726e-05 0.093582511
## ENSG00000276107.1 THBS1-IT1 3.494155e-05 0.093582511
## ENSG00000110436.13 SLC1A2 3.995187e-05 0.093582511
## ENSG00000232117.2 LINC00384 4.212327e-05 0.093582511
## ENSG00000287087.1 CTD-2081C10.8 4.266356e-05 0.093582511
mean(abs(dge$stat))
## [1] 1.052494
infec_b_t0 <- dge
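The repeated note about factor levels containing characters other than letters, numbers, '_' and '.' is harmless, but it can be silenced by cleaning the levels before building the DESeq2 object. A minimal sketch in base R follows. The column name, the example levels and the helper `safe_levels()` are hypothetical and are not taken from the real sample sheet.

```r
# Toy sample sheet; the column name and levels are hypothetical examples.
ss_demo <- data.frame(
  wound_type = c("clean-contaminated", "clean", "clean-contaminated"),
  stringsAsFactors = FALSE
)

# Replace every character other than letters, digits, '_' and '.' with '_'.
safe_levels <- function(x) {
  f <- factor(x)
  levels(f) <- gsub("[^A-Za-z0-9_.]", "_", levels(f))
  f
}

ss_demo$wound_type <- safe_levels(ss_demo$wound_type)
levels(ss_demo$wound_type)
```

If two original levels differ only in the replaced characters they would be merged by this approach, so it is worth checking the level counts before and after.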
# model with clinical and cell covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
    Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + crp_group + infec)
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 43 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000130032.17 PRRG3 42.86690 1.5875029 0.3591955 4.419607
## ENSG00000025434.19 NR1H3 257.06365 -0.4481272 0.1115340 -4.017852
## ENSG00000065923.10 SLC9A7 517.06274 0.4089127 0.1051076 3.890421
## ENSG00000234745.13 HLA-B 62101.52290 -0.8637873 0.2227917 -3.877107
## ENSG00000284554.2 CTA-150C2.22 11.74190 -9.6088060 2.4784090 -3.877006
## ENSG00000203668.3 CHML 556.47066 0.4844837 0.1251990 3.869708
## ENSG00000116194.13 ANGPTL1 13.35445 1.1580486 0.2996963 3.864074
## ENSG00000274134.1 MIR6774 17.89025 0.9233534 0.2430764 3.798615
## ENSG00000165949.12 IFI27 104.98135 -2.1383061 0.5643889 -3.788710
## ENSG00000281162.2 LINC01127 418.70683 -0.6907730 0.1826310 -3.782342
## pvalue padj
## ENSG00000130032.17 PRRG3 9.888033e-06 0.216894
## ENSG00000025434.19 NR1H3 5.873116e-05 0.340781
## ENSG00000065923.10 SLC9A7 1.000703e-04 0.340781
## ENSG00000234745.13 HLA-B 1.057060e-04 0.340781
## ENSG00000284554.2 CTA-150C2.22 1.057498e-04 0.340781
## ENSG00000203668.3 CHML 1.089656e-04 0.340781
## ENSG00000116194.13 ANGPTL1 1.115113e-04 0.340781
## ENSG00000274134.1 MIR6774 1.455071e-04 0.340781
## ENSG00000165949.12 IFI27 1.514314e-04 0.340781
## ENSG00000281162.2 LINC01127 1.553595e-04 0.340781
mean(abs(dge$stat))
## [1] 0.8981908
infec_b_t0_adj <- dge
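DESeq2 recommends centering and scaling numeric covariates with a large mean or standard deviation, which applies to the cell-type proportions in the model above. A minimal sketch with base R `scale()` follows. The column names mirror the design formula, but the values and the `ss_demo` object are made up for illustration.

```r
# Toy covariates; column names mirror the design but the values are made up.
ss_demo <- data.frame(
  Monocytes.C    = c(20.9, 18.2, 25.1, 22.4),
  Neutrophils.LD = c(3.2, 1.1, 6.4, 2.8)
)

num_cols <- c("Monocytes.C", "Neutrophils.LD")

# scale() centers each column to mean 0 and scales it to SD 1;
# as.numeric() drops the matrix attributes that scale() returns.
ss_demo[num_cols] <- lapply(ss_demo[num_cols], function(x) as.numeric(scale(x)))

round(colMeans(ss_demo[num_cols]), 10)
sapply(ss_demo[num_cols], sd)
```

Scaling changes the units of the covariate coefficients but should not change the interpretation of the `infec` coefficient, which is the contrast of interest.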
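CXCL2 also appears at EOS in treatment group 2, but less strongly: it ranks sixth in the base model (log2FC 1.59, padj 0.11) and is not among the top 10 genes in either covariate-adjusted model.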
mx <- xeosf
ss2 <- as.data.frame(cbind(ss_eos,sscell_eos))
ss2$infec <- factor(infec[match(ss2$PG_number,infec$PG_number),"infection30d"])
table(ss2$infec)
##
## 0 1
## 77 21
ss2 <- subset(ss2,treatment_group==2)
table(ss2$infec)
##
## 0 1
## 42 14
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ infec)
## converting counts to integer mode
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 150 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000281383.1 CH507-513H4.5 15.11952 -1.5596348 0.3445350 -4.526782
## ENSG00000281741.2 CH17-118O6.6 37.17409 -1.9855820 0.4741593 -4.187584
## ENSG00000273221.1 RP5-1180E21.5 81.37678 -0.8909263 0.2145724 -4.152100
## ENSG00000154099.18 DNAAF1 32.13411 0.8499840 0.2056066 4.134030
## ENSG00000225840.2 AC010970.2 16407.05996 -1.1705777 0.2837503 -4.125379
## ENSG00000081041.9 CXCL2 39.75527 1.5920647 0.3885345 4.097615
## ENSG00000280614.1 CH507-513H4.4 14151.94639 -1.1096045 0.2717873 -4.082621
## ENSG00000280800.1 CH507-513H4.6 14151.94639 -1.1096045 0.2717873 -4.082621
## ENSG00000281181.1 CH507-513H4.3 14151.94639 -1.1096045 0.2717873 -4.082621
## ENSG00000269737.2 RP11-345P4.6 233.89006 0.8001306 0.2043683 3.915141
## pvalue padj
## ENSG00000281383.1 CH507-513H4.5 5.988877e-06 0.1091839
## ENSG00000281741.2 CH17-118O6.6 2.819395e-05 0.1091839
## ENSG00000273221.1 RP5-1180E21.5 3.294378e-05 0.1091839
## ENSG00000154099.18 DNAAF1 3.564568e-05 0.1091839
## ENSG00000225840.2 AC010970.2 3.701246e-05 0.1091839
## ENSG00000081041.9 CXCL2 4.174295e-05 0.1091839
## ENSG00000280614.1 CH507-513H4.4 4.453054e-05 0.1091839
## ENSG00000280800.1 CH507-513H4.6 4.453054e-05 0.1091839
## ENSG00000281181.1 CH507-513H4.3 4.453054e-05 0.1091839
## ENSG00000269737.2 RP11-345P4.6 9.035137e-05 0.1993784
mean(abs(dge$stat))
## [1] 0.8730035
# model with clinical covariates
# including crp_group in the model
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + crp_group + infec)
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 6 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000281383.1 CH507-513H4.5 15.11952 -1.5796688 0.36836297 -4.288348
## ENSG00000286219.1 NOTCH2NLC 1082.05581 0.4138437 0.10731975 3.856174
## ENSG00000107719.9 PALD1 91.84364 1.1870472 0.31322499 3.789759
## ENSG00000273221.1 RP5-1180E21.5 81.37678 -0.8633386 0.22911710 -3.768111
## ENSG00000281741.2 CH17-118O6.6 37.17409 -1.9488395 0.52020549 -3.746288
## ENSG00000269737.2 RP11-345P4.6 233.89006 0.7876876 0.21762814 3.619420
## ENSG00000035141.8 FAM136A 931.14355 -0.2470166 0.06841642 -3.610487
## ENSG00000164687.11 FABP5 137.10393 -0.5587196 0.15776894 -3.541379
## ENSG00000225840.2 AC010970.2 16407.05996 -1.0565297 0.29924408 -3.530662
## ENSG00000173915.16 ATP5MK 540.05479 -0.4281250 0.12149361 -3.523848
## pvalue padj
## ENSG00000281383.1 CH507-513H4.5 1.800066e-05 0.3972205
## ENSG00000286219.1 NOTCH2NLC 1.151754e-04 0.6737808
## ENSG00000107719.9 PALD1 1.507936e-04 0.6737808
## ENSG00000273221.1 RP5-1180E21.5 1.644877e-04 0.6737808
## ENSG00000281741.2 CH17-118O6.6 1.794706e-04 0.6737808
## ENSG00000269737.2 RP11-345P4.6 2.952644e-04 0.6737808
## ENSG00000035141.8 FAM136A 3.056227e-04 0.6737808
## ENSG00000164687.11 FABP5 3.980413e-04 0.6737808
## ENSG00000225840.2 AC010970.2 4.145211e-04 0.6737808
## ENSG00000173915.16 ATP5MK 4.253286e-04 0.6737808
mean(abs(dge$stat))
## [1] 0.8610267
infec_b_eos <- dge
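DESeq2 also asks whether integer-valued numeric variables in the design were meant to be factors. The difference matters because a numeric variable is fitted as a single slope, while a factor gets one coefficient per non-reference level. A minimal sketch with base R `model.matrix()` follows; the variable `grp` is a hypothetical integer-coded covariate and not one of the real design columns.

```r
# Hypothetical integer-coded covariate with three groups.
grp <- c(1L, 2L, 3L, 1L, 2L, 3L)

# As a number: one slope column, which assumes a linear trend across 1 -> 2 -> 3.
mm_numeric <- model.matrix(~ grp)

# As a factor: one indicator column per non-reference level.
mm_factor <- model.matrix(~ factor(grp))

colnames(mm_numeric)
colnames(mm_factor)
```

For a covariate with only two distinct values the two codings fit the same model, so the choice only matters for variables with three or more levels.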
# model with clinical and cell covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
    Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + crp_group + infec)
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 50 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000281741.2 CH17-118O6.6 37.17409 -2.5980798 0.54862114 -4.735654
## ENSG00000143507.18 DUSP10 291.39486 0.3943481 0.09982852 3.950254
## ENSG00000203811.1 H3C14 21.14449 0.7947711 0.20119669 3.950219
## ENSG00000203852.3 H3C15 21.14449 0.7947711 0.20119669 3.950219
## ENSG00000271383.8 NBPF19 10978.51599 0.5342483 0.13755661 3.883843
## ENSG00000256427.2 RP11-118B22.2 128.34067 -0.7733993 0.20016420 -3.863824
## ENSG00000286219.1 NOTCH2NLC 1082.05581 0.4591214 0.11929688 3.848562
## ENSG00000154099.18 DNAAF1 32.13411 0.8247610 0.22924963 3.597655
## ENSG00000237248.5 LINC00987 156.01886 -0.6388614 0.17992509 -3.550708
## ENSG00000164850.15 GPER1 311.74828 -1.1288952 0.32071826 -3.519897
## pvalue padj
## ENSG00000281741.2 CH17-118O6.6 2.183500e-06 0.04818329
## ENSG00000143507.18 DUSP10 7.806818e-05 0.37455018
## ENSG00000203811.1 H3C14 7.807960e-05 0.37455018
## ENSG00000203852.3 H3C15 7.807960e-05 0.37455018
## ENSG00000271383.8 NBPF19 1.028184e-04 0.37455018
## ENSG00000256427.2 RP11-118B22.2 1.116255e-04 0.37455018
## ENSG00000286219.1 NOTCH2NLC 1.188132e-04 0.37455018
## ENSG00000154099.18 DNAAF1 3.210995e-04 0.81114477
## ENSG00000237248.5 LINC00987 3.841968e-04 0.81114477
## ENSG00000164850.15 GPER1 4.317148e-04 0.81114477
mean(abs(dge$stat))
## [1] 0.7381023
infec_b_eos_adj <- dge
mx <- xpod1f
ss2 <- as.data.frame(cbind(ss_pod1,sscell_pod1))
ss2$infec <- factor(infec[match(ss2$PG_number,infec$PG_number),"infection30d"])
table(ss2$infec)
##
## 0 1
## 90 19
ss2 <- subset(ss2,treatment_group==2)
table(ss2$infec)
##
## 0 1
## 47 13
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ infec)
## converting counts to integer mode
res <- DESeq(dds)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
## -- replacing outliers and refitting for 376 genes
## -- DESeq argument 'minReplicatesForReplace' = 7
## -- original counts are preserved in counts(dds)
## estimating dispersions
## fitting model and testing
z <- results(res)
vsd <- vst(dds, blind=FALSE)
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000129535.13 NRL 57.14780 -0.5609943 0.11187913 -5.014289
## ENSG00000120053.12 GOT1 218.89056 -0.4240765 0.08922375 -4.752955
## ENSG00000234426.3 RP11-459O1.2 46.99935 1.8732252 0.40963550 4.572907
## ENSG00000119673.15 ACOT2 232.09990 -0.4556809 0.10389743 -4.385872
## ENSG00000225411.3 RP11-764K9.1 73.11837 0.8918663 0.20961454 4.254792
## ENSG00000196421.9 C20orf204 61.17974 -1.1312499 0.26662111 -4.242912
## ENSG00000133321.11 PLAAT4 965.23264 -0.5970574 0.14308900 -4.172629
## ENSG00000114993.17 RTKN 37.68934 -0.9210491 0.22358050 -4.119541
## ENSG00000287586.1 RP1-77N19.1 13.29168 1.5289914 0.37478338 4.079667
## ENSG00000176571.12 CNBD1 68.78388 0.7885928 0.19437260 4.057119
## pvalue padj
## ENSG00000129535.13 NRL 5.322982e-07 0.01134487
## ENSG00000120053.12 GOT1 2.004648e-06 0.02136253
## ENSG00000234426.3 RP11-459O1.2 4.810026e-06 0.03417203
## ENSG00000119673.15 ACOT2 1.155219e-05 0.06155297
## ENSG00000225411.3 RP11-764K9.1 2.092434e-05 0.07837427
## ENSG00000196421.9 C20orf204 2.206379e-05 0.07837427
## ENSG00000133321.11 PLAAT4 3.011045e-05 0.09167771
## ENSG00000114993.17 RTKN 3.796275e-05 0.09666658
## ENSG00000287586.1 RP1-77N19.1 4.510026e-05 0.09666658
## ENSG00000176571.12 CNBD1 4.968172e-05 0.09666658
mean(abs(dge$stat))
## [1] 1.00484
# model with clinical covariates
# including crp_group in the model
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + crp_group + infec)
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 18 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000129535.13 NRL 57.14780 -0.6161299 0.11710335 -5.261420
## ENSG00000120053.12 GOT1 218.89056 -0.4810951 0.09453293 -5.089180
## ENSG00000272282.1 LINC02084 148.15378 -0.9851775 0.20484157 -4.809461
## ENSG00000263006.6 ROCK1P1 125.27169 -2.6815259 0.57322844 -4.677936
## ENSG00000135069.14 PSAT1 74.18620 -0.8171086 0.17803113 -4.589695
## ENSG00000097021.20 ACOT7 165.93568 -0.7004217 0.15839008 -4.422131
## ENSG00000110436.13 SLC1A2 36.33599 1.4581064 0.34068871 4.279879
## ENSG00000205927.5 OLIG2 10.78384 1.2707752 0.30036315 4.230796
## ENSG00000184221.13 OLIG1 631.09997 0.9568114 0.23161817 4.130986
## ENSG00000116133.13 DHCR24 323.22445 -0.5030564 0.12223204 -4.115586
## pvalue padj
## ENSG00000129535.13 NRL 1.429470e-07 0.003046630
## ENSG00000120053.12 GOT1 3.596144e-07 0.003832231
## ENSG00000272282.1 LINC02084 1.513381e-06 0.010751565
## ENSG00000263006.6 ROCK1P1 2.897771e-06 0.015440047
## ENSG00000135069.14 PSAT1 4.438942e-06 0.018921433
## ENSG00000097021.20 ACOT7 9.773208e-06 0.034716062
## ENSG00000110436.13 SLC1A2 1.869952e-05 0.056934689
## ENSG00000205927.5 OLIG2 2.328657e-05 0.062038342
## ENSG00000184221.13 OLIG1 3.612108e-05 0.082310279
## ENSG00000116133.13 DHCR24 3.861975e-05 0.082310279
mean(abs(dge$stat))
## [1] 0.9990416
infec_b_pod1 <- dge
# model with clinical and cell covariates
# cell-type covariates: Monocytes.C, NK, T.CD8.Memory, T.CD4.Naive, Neutrophils.LD
dds <- DESeqDataSetFromMatrix(countData = mx, colData = ss2,
  design = ~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS +
    Monocytes.C + NK + T.CD8.Memory + T.CD4.Naive + Neutrophils.LD + crp_group + infec)
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
## the design formula contains one or more numeric variables with integer values,
## specifying a model with increasing fold change for higher values.
## did you mean for this to be a factor? if so, first convert
## this variable to a factor using the factor() function
## the design formula contains one or more numeric variables that have mean or
## standard deviation larger than 5 (an arbitrary threshold to trigger this message).
## Including numeric variables with large mean can induce collinearity with the intercept.
## Users should center and scale numeric variables in the design to improve GLM convergence.
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
res <- DESeq(dds)
## estimating size factors
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
## final dispersion estimates
## fitting model and testing
## 23 rows did not converge in beta, labelled in mcols(object)$betaConv. Use larger maxit argument with nbinomWaldTest
z <- results(res)
vsd <- vst(dds, blind=FALSE)
## Note: levels of factors in the design contain characters other than
## letters, numbers, '_' and '.'. It is recommended (but not required) to use
## only letters, numbers, and delimiters '_' or '.', as these are safe characters
## for column names in R. [This is a message, not a warning or an error]
zz <- cbind(as.data.frame(z),assay(vsd))
dge <- as.data.frame(zz[order(zz$pvalue),])
head(dge[order(dge$pvalue),1:6],10)
## baseMean log2FoldChange lfcSE stat
## ENSG00000129535.13 NRL 57.14780 -0.6392066 0.12101568 -5.282015
## ENSG00000183010.17 PYCR1 33.69785 -0.9937114 0.19500612 -5.095796
## ENSG00000120053.12 GOT1 218.89056 -0.4072097 0.09348147 -4.356047
## ENSG00000258056.2 RP11-644F5.11 165.61102 -0.4330740 0.10509836 -4.120654
## ENSG00000135069.14 PSAT1 74.18620 -0.7265525 0.17680194 -4.109415
## ENSG00000000460.17 C1orf112 201.19205 0.3343957 0.08706870 3.840596
## ENSG00000275793.1 RIMBP3 94.73468 -0.7030986 0.18596807 -3.780749
## ENSG00000242611.2 AC093627.8 18.55195 -2.1900148 0.58037179 -3.773469
## ENSG00000246560.2 UBE2D3-AS1 25.20489 -0.5878541 0.15596389 -3.769168
## ENSG00000185875.13 THNSL1 73.66433 -0.5985449 0.16202704 -3.694105
## pvalue padj
## ENSG00000129535.13 NRL 1.277709e-07 0.002723182
## ENSG00000183010.17 PYCR1 3.472788e-07 0.003700777
## ENSG00000120053.12 GOT1 1.324322e-05 0.094084267
## ENSG00000258056.2 RP11-644F5.11 3.777985e-05 0.169081506
## ENSG00000135069.14 PSAT1 3.966628e-05 0.169081506
## ENSG00000000460.17 C1orf112 1.227360e-04 0.382420268
## ENSG00000275793.1 RIMBP3 1.563572e-04 0.382420268
## ENSG00000242611.2 AC093627.8 1.609934e-04 0.382420268
## ENSG00000246560.2 UBE2D3-AS1 1.637927e-04 0.382420268
## ENSG00000185875.13 THNSL1 2.206626e-04 0.382420268
mean(abs(dge$stat))
## [1] 0.8716056
infec_b_pod1_adj <- dge
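The `mean(abs(dge$stat))` values printed for treatment group 2 are easier to compare side by side. The sketch below copies the nine printed values into a small matrix; the object name `mean_abs_stat` and the column labels are assumptions for illustration.

```r
# Values copied from the mean(abs(dge$stat)) outputs for treatment group 2.
mean_abs_stat <- rbind(
  t0   = c(base = 1.012181,  clinical = 1.052494,  clinical_cell = 0.8981908),
  eos  = c(base = 0.8730035, clinical = 0.8610267, clinical_cell = 0.7381023),
  pod1 = c(base = 1.00484,   clinical = 0.9990416, clinical_cell = 0.8716056)
)

round(mean_abs_stat, 3)

# Change in mean |stat| when cell-type covariates are added to the clinical model.
round(mean_abs_stat[, "clinical_cell"] - mean_abs_stat[, "clinical"], 3)
```

At every timepoint the mean absolute Wald statistic is lowest in the model that includes cell-type covariates.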
mx <- dec2
ss2 <- as.data.frame(cbind(ss_t0,sscell_t0))
ss2$infec <- factor(infec[match(ss2$PG_number,infec$PG_number),"infection30d"])
table(ss2$infec)
##
## 0 1
## 90 21
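The next step turns the `infec` factor into a 0/1 numeric. Calling `as.numeric()` directly on a factor returns the internal level codes (1, 2, ...) rather than the labels, so either an offset of 1 or a conversion through `as.character()` is needed. A small base R illustration with made-up values (`infec_f` and `other` are hypothetical names):

```r
# Made-up infection labels stored as a factor, as in ss2$infec
infec_f <- factor(c("0", "1", "1", "0", "0"))

codes      <- as.numeric(infec_f)                # level codes: 1 2 2 1 1
via_offset <- as.numeric(infec_f) - 1            # 0 1 1 0 0
via_labels <- as.numeric(as.character(infec_f))  # 0 1 1 0 0

# Both conversions agree when the levels are exactly "0" and "1"
stopifnot(identical(via_offset, via_labels))

# They differ for other labels, e.g. levels "1" and "2"
other <- factor(c("1", "2", "2"))
other_offset <- as.numeric(other) - 1            # 0 1 1
other_labels <- as.numeric(as.character(other))  # 1 2 2
```

The offset form relies on the factor having exactly the levels 0 and 1 in that order, which the `table()` output above confirms for this data.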
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
ss2$infec <- as.numeric(ss2$infec) -1
design <- model.matrix(~ ss2$infec)
fit <- lmFit(mx, design)
fit <- eBayes(fit, trend=TRUE, robust=TRUE)
bl_t0 <- topTable(fit,number=Inf)
## Removing intercept from test coefficients
bl_t0
## logFC AveExpr t P.Value adj.P.Val
## B.Naive 1.485781854 3.2262095 2.447814145 0.01593594 0.2709109
## Plasmablasts -0.013258764 0.1173774 -1.663095622 0.09910846 0.7976001
## T.CD8.Memory -2.098142898 8.1875958 -1.483562645 0.14075296 0.7976001
## T.CD8.Naive -0.791486921 14.5112581 -0.912231404 0.36361974 0.9539393
## T.gd.Vd2 -0.068599005 2.0044018 -0.838526580 0.40353333 0.9539393
## Monocytes.C 1.361884574 20.8800448 0.755627101 0.45146993 0.9539393
## NK 0.563023333 4.7371188 0.690214470 0.49149693 0.9539393
## Neutrophils.LD -0.868694955 3.2388988 -0.568220212 0.57103073 0.9539393
## mDCs -0.031981746 0.7950039 -0.563555244 0.57419120 0.9539393
## T.gd.non.Vd2 -0.014173517 0.3715104 -0.484466194 0.62900762 0.9539393
## Monocytes.NC.I 0.431889032 10.5254623 0.433847984 0.66523912 0.9539393
## pDCs -0.033540735 0.2066387 -0.422659273 0.67336895 0.9539393
## B.Memory 0.233854949 4.0288477 0.319990801 0.74957557 0.9604696
## T.CD4.Memory -0.202910589 10.6556812 -0.237540456 0.81267450 0.9604696
## Basophils.LD 0.078462071 1.5507565 0.192797372 0.84747317 0.9604696
## T.CD4.Naive -0.034433715 11.0483327 -0.023787443 0.98106478 0.9965077
## MAIT 0.002327032 3.9148616 0.004386756 0.99650774 0.9965077
## B
## B.Naive -3.075068
## Plasmablasts -4.273299
## T.CD8.Memory -4.487441
## T.CD8.Naive -5.011785
## T.gd.Vd2 -5.061622
## Monocytes.C -5.112736
## NK -5.149361
## Neutrophils.LD -5.208894
## mDCs -5.210943
## T.gd.non.Vd2 -5.243129
## Monocytes.NC.I -5.261191
## pDCs -5.264910
## B.Memory -5.294568
## T.CD4.Memory -5.312457
## Basophils.LD -5.319951
## T.CD4.Naive -5.334207
## MAIT -5.334420
subset(bl_t0,P.Value<0.05)
## logFC AveExpr t P.Value adj.P.Val B
## B.Naive 1.485782 3.22621 2.447814 0.01593594 0.2709109 -3.075068
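For a model with only an intercept and a 0/1 predictor, the coefficient that limma labels logFC is the difference between the two group means, on whatever scale the rows of `dec2` are measured. The sketch below shows this with base R `lm` on simulated values (the data and the names `y` and `infec_sim` are made up for illustration). It does not reproduce the empirical Bayes moderation applied by `eBayes`, so the t-statistics and p-values would differ from limma's.

```r
set.seed(1)

# Simulated stand-in for one cell type: 90 uninfected (0) and 21 infected (1)
infec_sim <- rep(c(0, 1), times = c(90, 21))
y <- rnorm(length(infec_sim), mean = 3 + 1.5 * infec_sim, sd = 3)

# Ordinary least squares fit with a single 0/1 predictor
fit_lm <- lm(y ~ infec_sim)
coef_infec <- unname(coef(fit_lm)["infec_sim"])

# Difference in group means, computed directly
diff_means <- mean(y[infec_sim == 1]) - mean(y[infec_sim == 0])

# The slope and the difference in means are the same quantity
stopifnot(isTRUE(all.equal(coef_infec, diff_means)))
```

Once covariates such as sex or age are added, the coefficient becomes an adjusted difference and no longer equals the raw difference in means.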
# model with clinical covariates
ss3 <- ss2[,c("sexD", "wound_typeOP", "duration_sx", "ethnicityCAT", "ageCS", "crp_group", "infec")]
#design <- model.matrix(~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + crp_group + infec, ss2 )
design <- model.matrix(~ sexD + ethnicityCAT + ageCS + infec, ss2 )
fit <- lmFit(mx, design)
fit <- eBayes(fit, trend=TRUE, robust=TRUE)
bl_t0 <- topTable(fit,coef="infec",number=Inf)
bl_t0
## logFC AveExpr t P.Value adj.P.Val
## B.Naive 1.357759364 3.2262095 2.132738308 0.03525085 0.5067739
## T.CD8.Memory -2.245450340 8.1875958 -1.571628947 0.11901209 0.5067739
## mDCs -0.075346792 0.7950039 -1.362257528 0.17599966 0.5067739
## NK 1.063898824 4.7371188 1.329623085 0.18649217 0.5067739
## T.CD8.Naive -1.045819917 14.5112581 -1.297296290 0.19734113 0.5067739
## Plasmablasts -0.010129114 0.1173774 -1.266658258 0.20804926 0.5067739
## Monocytes.C 2.294681743 20.8800448 1.264913664 0.20867162 0.5067739
## T.gd.Vd2 -0.089927045 2.0044018 -1.057315464 0.29276742 0.6221308
## pDCs -0.044848971 0.2066387 -0.537962040 0.59174186 0.9836734
## T.gd.non.Vd2 -0.016060796 0.3715104 -0.531558555 0.59614219 0.9836734
## T.CD4.Naive -0.660438040 11.0483327 -0.444785602 0.65738012 0.9836734
## MAIT -0.173299037 3.9148616 -0.347928916 0.72858243 0.9836734
## B.Memory -0.225996855 4.0288477 -0.316531218 0.75222083 0.9836734
## Monocytes.NC.I -0.211606320 10.5254623 -0.218957471 0.82710354 0.9942229
## T.CD4.Memory 0.066364341 10.6556812 0.077420343 0.93843487 0.9942229
## Basophils.LD 0.004967296 1.5507565 0.011642681 0.99073279 0.9942229
## Neutrophils.LD 0.011251659 3.2388988 0.007257702 0.99422286 0.9942229
## B
## B.Naive -4.558305
## T.CD8.Memory -4.579649
## mDCs -4.586112
## NK -4.587042
## T.CD8.Naive -4.587943
## Plasmablasts -4.588777
## Monocytes.C -4.588824
## T.gd.Vd2 -4.593972
## pDCs -4.602926
## T.gd.non.Vd2 -4.603001
## T.CD4.Naive -4.603925
## MAIT -4.604763
## B.Memory -4.604991
## Monocytes.NC.I -4.605562
## T.CD4.Memory -4.606021
## Basophils.LD -4.606086
## Neutrophils.LD -4.606086
subset(bl_t0,P.Value<0.05)
## logFC AveExpr t P.Value adj.P.Val B
## B.Naive 1.357759 3.22621 2.132738 0.03525085 0.5067739 -4.558305
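The adj.P.Val column is the Benjamini-Hochberg adjustment of P.Value across the 17 cell types (the default adjustment in `topTable`). Recomputing it in base R from the p-values printed for the covariate model above shows why the top seven cell types share the same adjusted value: the procedure enforces monotonicity, so each adjusted value is capped by the smallest scaled p-value further down the ranking. The vector `p` is copied from the printed table.

```r
# P.Value column of the T0 covariate model, copied from the output above
p <- c(
  B.Naive = 0.03525085, T.CD8.Memory = 0.11901209, mDCs = 0.17599966,
  NK = 0.18649217, T.CD8.Naive = 0.19734113, Plasmablasts = 0.20804926,
  Monocytes.C = 0.20867162, T.gd.Vd2 = 0.29276742, pDCs = 0.59174186,
  T.gd.non.Vd2 = 0.59614219, T.CD4.Naive = 0.65738012, MAIT = 0.72858243,
  B.Memory = 0.75222083, Monocytes.NC.I = 0.82710354,
  T.CD4.Memory = 0.93843487, Basophils.LD = 0.99073279,
  Neutrophils.LD = 0.99422286
)

# Benjamini-Hochberg adjustment over all 17 tests
adj <- p.adjust(p, method = "BH")

# Unadjusted scaling p * n / rank, before the monotonicity step
scaled <- p * length(p) / rank(p)

print(round(cbind(P.Value = p, scaled = scaled, adj.P.Val = adj), 7))
```

In the `scaled` column the minimum among the first seven rows belongs to Monocytes.C (0.20867162 × 17 / 7), and that value is carried up to every cell type ranked above it.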
# EOS samples: test each cell type in dec2 against 30-day infection status
mx <- dec2
ss2 <- as.data.frame(cbind(ss_eos,sscell_eos))
ss2$infec <- factor(infec[match(ss2$PG_number,infec$PG_number),"infection30d"])
table(ss2$infec)
##
## 0 1
## 77 21
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
ss2$infec <- as.numeric(ss2$infec) -1
design <- model.matrix(~ ss2$infec)
fit <- lmFit(mx, design)
fit <- eBayes(fit, trend=TRUE, robust=TRUE)
bl_eos <- topTable(fit,number=Inf)
## Removing intercept from test coefficients
bl_eos
## logFC AveExpr t P.Value adj.P.Val
## T.CD8.Naive -1.954288396 15.1249961 -2.0747393 0.04054616 0.4169905
## B.Naive 0.928996873 2.5375035 1.7812462 0.07787020 0.4169905
## MAIT -0.957805426 3.6323260 -1.7544315 0.08238137 0.4169905
## mDCs -0.079644169 0.5948358 -1.6694567 0.09811542 0.4169905
## B.Memory -0.868682889 3.1195397 -1.5103534 0.13406726 0.4558287
## Plasmablasts -0.011988478 0.1201618 -1.3388267 0.18362474 0.4739035
## Monocytes.NC.I 0.849164716 6.2272662 1.1990483 0.23330730 0.4739035
## Neutrophils.LD 2.775590100 7.0969781 1.1750892 0.24273932 0.4739035
## T.CD4.Memory -1.076926445 10.7921585 -1.1547997 0.25089010 0.4739035
## T.gd.Vd2 -0.093444647 1.9156645 -0.8782520 0.38188595 0.6031229
## NK -1.129549232 7.3243506 -0.8628550 0.39025601 0.6031229
## Monocytes.C 1.442358484 23.6779806 0.5947583 0.55333163 0.7838865
## pDCs -0.007956471 0.1671414 -0.3536932 0.72430410 0.8661475
## T.CD4.Naive 0.458724399 9.1757025 0.3316219 0.74086011 0.8661475
## Basophils.LD -0.041957134 1.1171489 -0.2386489 0.81186014 0.8661475
## T.gd.non.Vd2 0.005999223 0.3327072 0.2265726 0.82121281 0.8661475
## T.CD8.Memory -0.238590506 7.0435384 -0.1689825 0.86614747 0.8661475
## B
## T.CD8.Naive -4.535140
## B.Naive -4.555385
## MAIT -4.557100
## mDCs -4.562384
## B.Memory -4.571642
## Plasmablasts -4.580669
## Monocytes.NC.I -4.587275
## Neutrophils.LD -4.588339
## T.CD4.Memory -4.589223
## T.gd.Vd2 -4.599805
## NK -4.600313
## Monocytes.C -4.607740
## pDCs -4.612111
## T.CD4.Naive -4.612401
## Basophils.LD -4.613418
## T.gd.non.Vd2 -4.613526
## T.CD8.Memory -4.613963
subset(bl_eos,P.Value<0.05)
## logFC AveExpr t P.Value adj.P.Val B
## T.CD8.Naive -1.954288 15.125 -2.074739 0.04054616 0.4169905 -4.53514
# model with clinical covariates
ss3 <- ss2[,c("sexD", "wound_typeOP", "duration_sx", "ethnicityCAT", "ageCS", "crp_group", "infec")]
#design <- model.matrix(~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + crp_group + infec, ss2 )
design <- model.matrix(~ sexD + ethnicityCAT + ageCS + infec, ss2 )
fit <- lmFit(mx, design)
fit <- eBayes(fit, trend=TRUE, robust=TRUE)
topTable(fit,coef="infec")
## logFC AveExpr t P.Value adj.P.Val B
## T.CD8.Naive -1.95984109 15.1249961 -2.147611 0.03422174 0.3160929 -4.531220
## mDCs -0.09047923 0.5948358 -1.825721 0.07095105 0.3160929 -4.553452
## B.Memory -1.02027865 3.1195397 -1.756456 0.08214808 0.3160929 -4.557830
## B.Naive 0.91403437 2.5375035 1.698184 0.09266010 0.3160929 -4.561397
## MAIT -0.91804390 3.6323260 -1.696558 0.09296851 0.3160929 -4.561495
## Plasmablasts -0.01373720 0.1201618 -1.486451 0.14038426 0.3630459 -4.573448
## NK -1.89747742 7.3243506 -1.452794 0.14948948 0.3630459 -4.575230
## Monocytes.C 2.85942899 23.6779806 1.194302 0.23525535 0.4687720 -4.587648
## Monocytes.NC.I 0.83666376 6.2272662 1.133862 0.25963060 0.4687720 -4.590223
## T.CD4.Memory -1.05025753 10.7921585 -1.096065 0.27574822 0.4687720 -4.591768
bl_eos <- topTable(fit,coef="infec",number=Inf)
bl_eos
## logFC AveExpr t P.Value adj.P.Val
## T.CD8.Naive -1.959841094 15.1249961 -2.1476108 0.03422174 0.3160929
## mDCs -0.090479231 0.5948358 -1.8257213 0.07095105 0.3160929
## B.Memory -1.020278651 3.1195397 -1.7564562 0.08214808 0.3160929
## B.Naive 0.914034370 2.5375035 1.6981842 0.09266010 0.3160929
## MAIT -0.918043901 3.6323260 -1.6965580 0.09296851 0.3160929
## Plasmablasts -0.013737200 0.1201618 -1.4864508 0.14038426 0.3630459
## NK -1.897477424 7.3243506 -1.4527941 0.14948948 0.3630459
## Monocytes.C 2.859428993 23.6779806 1.1943017 0.23525535 0.4687720
## Monocytes.NC.I 0.836663758 6.2272662 1.1338622 0.25963060 0.4687720
## T.CD4.Memory -1.050257528 10.7921585 -1.0960653 0.27574822 0.4687720
## Neutrophils.LD 2.455781819 7.0969781 0.9948028 0.32233389 0.4981524
## T.gd.Vd2 -0.097351154 1.9156645 -0.8704465 0.38619302 0.5471068
## T.gd.non.Vd2 0.009793401 0.3327072 0.3756348 0.70800381 0.7488790
## pDCs -0.008619337 0.1671414 -0.3675820 0.71398108 0.7488790
## T.CD4.Naive 0.505644379 9.1757025 0.3550411 0.72332511 0.7488790
## Basophils.LD -0.062377239 1.1171489 -0.3415051 0.73345773 0.7488790
## T.CD8.Memory -0.462883963 7.0435384 -0.3210237 0.74887903 0.7488790
## B
## T.CD8.Naive -4.531220
## mDCs -4.553452
## B.Memory -4.557830
## B.Naive -4.561397
## MAIT -4.561495
## Plasmablasts -4.573448
## NK -4.575230
## Monocytes.C -4.587648
## Monocytes.NC.I -4.590223
## T.CD4.Memory -4.591768
## Neutrophils.LD -4.595662
## T.gd.Vd2 -4.599948
## T.gd.non.Vd2 -4.611435
## pDCs -4.611547
## T.CD4.Naive -4.611717
## Basophils.LD -4.611894
## T.CD8.Memory -4.612149
subset(bl_eos,P.Value<0.05)
## logFC AveExpr t P.Value adj.P.Val B
## T.CD8.Naive -1.959841 15.125 -2.147611 0.03422174 0.3160929 -4.53122
# POD1 samples: test each cell type in dec2 against 30-day infection status
mx <- dec2
ss2 <- as.data.frame(cbind(ss_pod1,sscell_pod1))
ss2$infec <- factor(infec[match(ss2$PG_number,infec$PG_number),"infection30d"])
table(ss2$infec)
##
## 0 1
## 90 19
mx <- mx[,colnames(mx) %in% rownames(ss2)]
# base model
ss2$infec <- as.numeric(ss2$infec) -1
design <- model.matrix(~ ss2$infec)
fit <- lmFit(mx, design)
fit <- eBayes(fit, trend=TRUE, robust=TRUE)
bl_pod1 <- topTable(fit,number=Inf) # POD1 base model
## Removing intercept from test coefficients
bl_pod1
## logFC AveExpr t P.Value adj.P.Val
## Neutrophils.LD 7.383561154 6.04021961 2.8906243 0.004635183 0.04559221
## mDCs -0.147177994 0.56031255 -2.8402997 0.005363790 0.04559221
## Monocytes.NC.I -1.562991987 6.80694095 -2.2609764 0.025712050 0.12205786
## T.CD8.Memory -1.836842825 4.94705245 -2.2162038 0.028719497 0.12205786
## B.Naive -1.130456941 2.67567424 -2.1160802 0.036575644 0.12435719
## T.gd.Vd2 -0.126542498 1.96550576 -2.0202968 0.045763056 0.12966199
## MAIT -0.753010120 2.56410707 -1.9040002 0.059501973 0.14450479
## Plasmablasts -0.009622197 0.11220385 -1.8034825 0.074027871 0.15730923
## T.CD8.Naive -1.304776589 13.87974798 -1.7275268 0.086857286 0.16406376
## T.CD4.Naive -1.832981825 5.63243153 -1.6236129 0.107297960 0.17316669
## NK -0.694609309 2.51929775 -1.6017813 0.112049036 0.17316669
## pDCs -0.017440865 0.08142436 -1.3547992 0.178263669 0.25254020
## B.Memory -0.689658924 2.49671698 -1.2954699 0.197848592 0.25872508
## Monocytes.C 2.332479245 35.58800977 0.8822890 0.379529170 0.46085685
## Basophils.LD 0.089728502 0.60330084 0.5905224 0.556054451 0.63019504
## T.CD4.Memory 0.297015802 13.12416487 0.3966971 0.692353280 0.73562536
## T.gd.non.Vd2 0.003327371 0.40288944 0.1541699 0.877755986 0.87775599
## B
## Neutrophils.LD -2.138812
## mDCs -2.262089
## Monocytes.NC.I -3.555972
## T.CD8.Memory -3.645021
## B.Naive -3.838286
## T.gd.Vd2 -4.015499
## MAIT -4.220450
## Plasmablasts -4.388462
## T.CD8.Naive -4.509743
## T.CD4.Naive -4.667674
## NK -4.699673
## pDCs -5.032772
## B.Memory -5.104804
## Monocytes.C -5.519100
## Basophils.LD -5.718212
## T.CD4.Memory -5.807159
## T.gd.non.Vd2 -5.869351
subset(bl_pod1,P.Value<0.05)
## logFC AveExpr t P.Value adj.P.Val B
## Neutrophils.LD 7.3835612 6.0402196 2.890624 0.004635183 0.04559221 -2.138812
## mDCs -0.1471780 0.5603126 -2.840300 0.005363790 0.04559221 -2.262089
## Monocytes.NC.I -1.5629920 6.8069409 -2.260976 0.025712050 0.12205786 -3.555972
## T.CD8.Memory -1.8368428 4.9470524 -2.216204 0.028719497 0.12205786 -3.645021
## B.Naive -1.1304569 2.6756742 -2.116080 0.036575644 0.12435719 -3.838286
## T.gd.Vd2 -0.1265425 1.9655058 -2.020297 0.045763056 0.12966199 -4.015499
# model with clinical covariates
ss3 <- ss2[,c("sexD", "wound_typeOP", "duration_sx", "ethnicityCAT", "ageCS", "crp_group", "infec")]
#design <- model.matrix(~ sexD + wound_typeOP + duration_sx + ethnicityCAT + ageCS + crp_group + infec, ss2 )
design <- model.matrix(~ sexD + ethnicityCAT + ageCS + infec, ss2 )
fit <- lmFit(mx, design)
fit <- eBayes(fit, trend=TRUE, robust=TRUE)
bl_pod1 <- topTable(fit,coef="infec",number=Inf)
subset(bl_pod1,P.Value<0.05)
## logFC AveExpr t P.Value adj.P.Val B
## Neutrophils.LD 8.1537013 6.0402196 3.042204 0.002969841 0.04113201 -1.799746
## mDCs -0.1533597 0.5603126 -2.878483 0.004839061 0.04113201 -2.220296
## T.CD8.Naive -1.7651999 13.8797480 -2.539088 0.012574308 0.06425516 -3.030962
## T.CD8.Memory -2.0884487 4.9470524 -2.456531 0.015661482 0.06425516 -3.214747
## NK -1.0169202 2.5192977 -2.369188 0.019647626 0.06425516 -3.403322
## MAIT -0.9129348 2.5641071 -2.312760 0.022678293 0.06425516 -3.521910
## B.Naive -1.2049898 2.6756742 -2.214346 0.028958021 0.07032662 -3.722578
## T.gd.Vd2 -0.1332896 1.9655058 -2.119442 0.036401229 0.07735261 -3.908594
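The three timepoint blocks above repeat the same steps with hand-edited object names, which makes it easy to overwrite one result with another. One alternative is to wrap the steps in a function and keep the results in a list named by timepoint. The sketch below is only an illustration of that pattern: it uses base R `lm` on simulated data in place of limma, and `sim`, `fit_infec` and `bl` are hypothetical names that do not appear in the analysis.

```r
set.seed(42)
timepoints <- c("t0", "eos", "pod1")

# Simulated stand-in for the per-timepoint infection labels and cell-type matrix
sim <- lapply(setNames(timepoints, timepoints), function(tp) {
  infec <- rep(c(0, 1), times = c(40, 10))
  mx <- rbind(B.Naive = rnorm(50, mean = 3), NK = rnorm(50, mean = 5))
  list(infec = infec, mx = mx)
})

# Fit one linear model per row (cell type) and collect the infec coefficient
fit_infec <- function(mx, infec) {
  res <- t(apply(mx, 1, function(y) {
    s <- summary(lm(y ~ infec))$coefficients["infec", ]
    c(estimate = s[["Estimate"]], t = s[["t value"]], P.Value = s[["Pr(>|t|)"]])
  }))
  res <- as.data.frame(res)
  res$adj.P.Val <- p.adjust(res$P.Value, method = "BH")
  res[order(res$P.Value), ]
}

# One result table per timepoint, addressed by name rather than by variable
bl <- lapply(sim, function(d) fit_infec(d$mx, d$infec))
print(names(bl))
print(bl$pod1)
```

With this layout, `bl$t0`, `bl$eos` and `bl$pod1` cannot be confused with each other, and a change to the model only has to be made in one place.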
Save the workspace and record the session information for reproducibility.
save.image("qc_dge_infec.Rdata") #should be "qc.Rdata"
sessionInfo()
## R version 4.5.0 (2025-04-11)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.5 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0 LAPACK version 3.10.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Australia/Melbourne
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] beeswarm_0.4.0 limma_3.64.0
## [3] eulerr_7.0.2 MASS_7.3-65
## [5] mitch_1.20.0 DESeq2_1.48.0
## [7] SummarizedExperiment_1.38.0 Biobase_2.68.0
## [9] MatrixGenerics_1.20.0 matrixStats_1.5.0
## [11] GenomicRanges_1.60.0 GenomeInfoDb_1.44.0
## [13] IRanges_2.42.0 S4Vectors_0.46.0
## [15] BiocGenerics_0.54.0 generics_0.1.3
## [17] dplyr_1.1.4 WGCNA_1.73
## [19] fastcluster_1.2.6 dynamicTreeCut_1.63-1
## [21] reshape2_1.4.4 gplots_3.2.0
##
## loaded via a namespace (and not attached):
## [1] RColorBrewer_1.1-3 rstudioapi_0.17.1 jsonlite_2.0.0
## [4] magrittr_2.0.3 farver_2.1.2 rmarkdown_2.29
## [7] vctrs_0.6.5 memoise_2.0.1.9000 base64enc_0.1-3
## [10] htmltools_0.5.8.1 S4Arrays_1.8.0 SparseArray_1.8.0
## [13] Formula_1.2-5 sass_0.4.10 bslib_0.9.0
## [16] KernSmooth_2.23-26 htmlwidgets_1.6.4 plyr_1.8.9
## [19] echarts4r_0.4.5 impute_1.82.0 cachem_1.1.0
## [22] mime_0.13 lifecycle_1.0.4 iterators_1.0.14
## [25] pkgconfig_2.0.3 Matrix_1.7-3 R6_2.6.1
## [28] fastmap_1.2.0 GenomeInfoDbData_1.2.14 shiny_1.10.0
## [31] digest_0.6.37 colorspace_2.1-1 GGally_2.2.1
## [34] AnnotationDbi_1.70.0 Hmisc_5.2-3 RSQLite_2.3.9
## [37] polyclip_1.10-7 httr_1.4.7 abind_1.4-8
## [40] compiler_4.5.0 bit64_4.6.0-1 doParallel_1.0.17
## [43] htmlTable_2.4.3 backports_1.5.0 BiocParallel_1.42.0
## [46] DBI_1.2.3 ggstats_0.9.0 DelayedArray_0.34.1
## [49] gtools_3.9.5 caTools_1.18.3 tools_4.5.0
## [52] foreign_0.8-90 httpuv_1.6.16 nnet_7.3-20
## [55] glue_1.8.0 promises_1.3.2 polylabelr_0.3.0
## [58] grid_4.5.0 checkmate_2.3.2 cluster_2.1.8.1
## [61] gtable_0.3.6 preprocessCore_1.70.0 tidyr_1.3.1
## [64] data.table_1.17.0 xml2_1.3.8 XVector_0.48.0
## [67] foreach_1.5.2 pillar_1.10.2 stringr_1.5.1
## [70] later_1.4.2 splines_4.5.0 lattice_0.22-7
## [73] survival_3.8-3 bit_4.6.0 tidyselect_1.2.1
## [76] GO.db_3.21.0 locfit_1.5-9.12 Biostrings_2.76.0
## [79] knitr_1.50 gridExtra_2.3 svglite_2.1.3
## [82] xfun_0.52 statmod_1.5.0 stringi_1.8.7
## [85] UCSC.utils_1.4.0 yaml_2.3.10 statnet.common_4.11.0
## [88] kableExtra_1.4.0 evaluate_1.0.3 codetools_0.2-20
## [91] tcltk_4.5.0 tibble_3.2.1 cli_3.6.5
## [94] rpart_4.1.24 xtable_1.8-4 systemfonts_1.2.2
## [97] jquerylib_0.1.4 network_1.19.0 dichromat_2.0-0.1
## [100] Rcpp_1.0.14 coda_0.19-4.1 png_0.1-8
## [103] parallel_4.5.0 ggplot2_3.5.2 blob_1.2.4
## [106] bitops_1.0-9 viridisLite_0.4.2 scales_1.4.0
## [109] purrr_1.0.4 crayon_1.5.3 rlang_1.1.6
## [112] KEGGREST_1.48.0