Source: https://github.com/markziemann/GeneNameErrors2020
View the reports: http://ziemann-lab.net/public/gene_name_errors/
Gene name errors result when data are imported improperly into MS Excel and other spreadsheet programs (Zeeberg et al, 2004). Certain gene names like MARCH3, SEPT2 and DEC1 are converted into date format. These errors are surprisingly common in supplementary data files in the field of genomics (Ziemann et al, 2016). This could be considered a small error because it only affects a small number of genes; however, it is symptomatic of poor data processing methods. The purpose of this script is to identify gene name errors present in supplementary files of PubMed Central articles published in the previous month.
library("XML")
library("jsonlite")
library("xml2")
library("reutils")
library("readxl")
Here I will be getting PubMed Central IDs for the previous month.
Start by working out the date to use in the PubMed Central search.
CURRENT_MONTH=format(Sys.time(), "%m")
CURRENT_YEAR=format(Sys.time(), "%Y")
if (CURRENT_MONTH == "01") {
PREV_YEAR=as.character(as.numeric(format(Sys.time(), "%Y"))-1)
PREV_MONTH="12"
} else {
PREV_YEAR=CURRENT_YEAR
PREV_MONTH=sprintf("%02d", as.numeric(format(Sys.time(), "%m"))-1)
}
DATE=paste(PREV_YEAR,"/",PREV_MONTH,sep="")
DATE
## [1] "2021/11"
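As an alternative sketch, the same previous-month string can be derived with date arithmetic instead of string handling: step back one day from the first day of the current month and format the result. The helper name `prev_month_tag` is hypothetical and not part of the original script. It handles the January rollover without a special case and always returns a zero-padded month.

```r
# Hypothetical helper (not in the original script): returns the previous
# month as "YYYY/MM" for a given date.
prev_month_tag <- function(today = Sys.Date()) {
  # First day of the month that contains `today`
  first_of_month <- as.Date(format(today, "%Y-%m-01"))
  # One day earlier is the last day of the previous month
  format(first_of_month - 1, "%Y/%m")
}

prev_month_tag(as.Date("2021-12-15"))  # a mid-December date
prev_month_tag(as.Date("2022-01-03"))  # January rolls back to the prior year
```

Calling `prev_month_tag()` with no argument uses today's date, so it could stand in for the `DATE` value built above.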
Let’s see how many PMC IDs we have in the past month.
QUERY ='((genom*[Abstract]))'
ESEARCH_RES <- esearch(term=QUERY, db = "pmc", rettype = "uilist", retmode = "xml", retstart = 0,
retmax = 5000000, usehistory = TRUE, webenv = NULL, querykey = NULL, sort = NULL, field = NULL,
datetype = NULL, reldate = NULL, mindate = DATE, maxdate = DATE)
pmc <- efetch(ESEARCH_RES,retmode="text",rettype="uilist",outfile="pmcids.txt")
## Retrieving UIDs 1 to 500
## Retrieving UIDs 501 to 1000
## Retrieving UIDs 1001 to 1500
## Retrieving UIDs 1501 to 2000
## Retrieving UIDs 2001 to 2500
## Retrieving UIDs 2501 to 3000
## Retrieving UIDs 3001 to 3500
pmc <- read.table(pmc)
pmc <- paste("PMC",pmc$V1,sep="")
NUM_ARTICLES=length(pmc)
NUM_ARTICLES
## [1] 3152
writeLines(pmc,con="pmc.txt")
Now run the bash script. Note that false positives can occur (~1.5%) and these results have not been verified by a human.
Here are some definitions:
NUM_XLS = Number of supplementary Excel files in this set of PMC articles.
NUM_XLS_ARTICLES = Number of articles matching the PubMed Central search which have supplementary Excel files.
GENELISTS = The gene lists found in the Excel files. Each Excel file is counted once even if it has multiple gene lists.
NUM_GENELISTS = The number of Excel files with gene lists.
NUM_GENELIST_ARTICLES = The number of PMC articles with supplementary Excel gene lists.
ERROR_GENELISTS = Files suspected to contain gene name errors. The dates and five-digit numbers indicate transmogrified gene names.
NUM_ERROR_GENELISTS = Number of Excel gene lists with errors.
NUM_ERROR_GENELIST_ARTICLES = Number of articles with supplementary Excel gene name errors.
ERROR_PROPORTION = This is the proportion of articles with Excel gene lists that have errors.
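The bash script itself is not shown in this report, so the following is only a minimal sketch of the kind of pattern matching implied by the ERROR_GENELISTS definition: values that look like a day-month date (such as "2-Mar") or a five-digit number. The function name `looks_like_date_error` and both regular expressions are assumptions for illustration, not the rules used by `gene_names.sh`.

```r
# Hypothetical check (an assumption, not the logic of gene_names.sh):
# flag values that resemble Excel-converted gene names.
looks_like_date_error <- function(x) {
  month_abbr <- "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"
  # Day-month strings such as "2-Mar" or "10-Sep"
  day_month <- grepl(paste0("^[0-9]{1,2}-(", month_abbr, ")$"), x,
                     ignore.case = TRUE)
  # Five-digit numbers such as "43892" (possible Excel date serials)
  serial <- grepl("^[0-9]{5}$", x)
  day_month | serial
}

flags <- looks_like_date_error(c("MARCH3", "2-Mar", "43892", "TP53", "1-Sep"))
flags
```

A rule this simple would also flag any genuine five-digit identifier in a gene column, which is one way false positives could arise.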
system("./gene_names.sh pmc.txt")
results <- readLines("results.txt")
XLS <- results[grep("XLS",results,ignore.case=TRUE)]
NUM_XLS = length(XLS)
NUM_XLS
## [1] 3675
NUM_XLS_ARTICLES = length(unique(sapply(strsplit(XLS," "),"[[",1)))
NUM_XLS_ARTICLES
## [1] 688
GENELISTS <- XLS[lapply(strsplit(XLS," "),length)>2]
#GENELISTS
NUM_GENELISTS <- length(unique(sapply(strsplit(GENELISTS," "),"[[",2)))
NUM_GENELISTS
## [1] 484
NUM_GENELIST_ARTICLES <- length(unique(sapply(strsplit(GENELISTS," "),"[[",1)))
NUM_GENELIST_ARTICLES
## [1] 236
ERROR_GENELISTS <- XLS[lapply(strsplit(XLS," "),length)>3]
#ERROR_GENELISTS
NUM_ERROR_GENELISTS = length(ERROR_GENELISTS)
NUM_ERROR_GENELISTS
## [1] 195
GENELIST_ERROR_ARTICLES <- unique(sapply(strsplit(ERROR_GENELISTS," "),"[[",1))
GENELIST_ERROR_ARTICLES
## [1] "PMC8629385" "PMC8616896" "PMC8610377" "PMC8609283" "PMC8609233"
## [6] "PMC8610329" "PMC8614379" "PMC8582303" "PMC8605608" "PMC8604288"
## [11] "PMC8603481" "PMC8602823" "PMC8602260" "PMC8599854" "PMC8598162"
## [16] "PMC8575900" "PMC8599367" "PMC8581057" "PMC8591166" "PMC8590766"
## [21] "PMC8556363" "PMC8585772" "PMC8560584" "PMC8581662" "PMC8581465"
## [26] "PMC8626465" "PMC8574973" "PMC8575309" "PMC8561166" "PMC8552660"
## [31] "PMC8571097" "PMC8603206" "PMC8567755" "PMC8567320" "PMC8566369"
## [36] "PMC8608327" "PMC8581407" "PMC8564494" "PMC8564051" "PMC8563320"
## [41] "PMC8561425" "PMC8556884" "PMC8546856" "PMC8602306" "PMC8599678"
## [46] "PMC8566521" "PMC8580886" "PMC8597285" "PMC8596853" "PMC8581766"
## [51] "PMC8592943" "PMC8586148" "PMC8560912" "PMC8524723" "PMC8588809"
## [56] "PMC8567280" "PMC8577010" "PMC8576023" "PMC8575036" "PMC8573080"
## [61] "PMC8559480" "PMC8552790" "PMC8570510" "PMC8548210" "PMC8560872"
## [66] "PMC8531462" "PMC8564479" "PMC8563994" "PMC8558579" "PMC8485671"
NUM_ERROR_GENELIST_ARTICLES <- length(GENELIST_ERROR_ARTICLES)
NUM_ERROR_GENELIST_ARTICLES
## [1] 70
ERROR_PROPORTION = NUM_ERROR_GENELIST_ARTICLES / NUM_GENELIST_ARTICLES
ERROR_PROPORTION
## [1] 0.2966102
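To make the field-count logic above easier to follow, here is a small sketch on made-up lines. The PMC IDs and file names are hypothetical, and the layout (article ID, file path, species, error count, error values) is inferred from the code and output in this report rather than from the bash script. Lines with more than two fields are treated as gene lists, and lines with more than three fields as gene lists with errors.

```r
# Toy stand-in for results.txt; every line here is hypothetical.
toy <- c(
  "PMC0000001 /pmc/articles/PMC0000001/bin/a.xlsx",
  "PMC0000001 /pmc/articles/PMC0000001/bin/b.xlsx Hsapiens",
  "PMC0000002 /pmc/articles/PMC0000002/bin/c.xlsx Hsapiens 2 43892 44084",
  "PMC0000003 /pmc/articles/PMC0000003/bin/d.xlsx Mmusculus 1 2-Mar"
)

# Number of space-separated fields on each line
n_fields <- lengths(strsplit(toy, " "))

toy_genelists <- toy[n_fields > 2]  # a species was detected
toy_errors    <- toy[n_fields > 3]  # an error count and values follow

# First field of each line is the article ID
article_of <- function(x) vapply(strsplit(x, " "), `[[`, character(1), 1)

toy_error_proportion <-
  length(unique(article_of(toy_errors))) /
  length(unique(article_of(toy_genelists)))
toy_error_proportion
```

Here three articles have gene lists and two of them have errors, so the proportion is 2/3. The same division on the real counts above is 70 / 236.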
Here you can have a look at all the gene lists detected in the past month, as well as those with errors. The dates are obvious errors; they are commonly dates in September, March, December and October. The five-digit numbers are also dates, stored in Excel's internal format as the number of days since the start of 1900. If you put these numbers into Excel and format the cells as dates, most of them will also map to dates in September, March, December and October.
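The conversion can also be checked in R without opening Excel. This is a minimal sketch that assumes the files use the Windows Excel 1900 date system, for which R's `as.Date` documentation suggests the origin "1899-12-30". The helper name `excel_serial_to_date` is hypothetical, and the three serial numbers are taken from the output below.

```r
# Hypothetical helper: convert Excel date serials to R Dates,
# assuming the Windows Excel 1900 date system.
excel_serial_to_date <- function(serial) {
  as.Date(as.numeric(serial), origin = "1899-12-30")
}

serials <- c(43892, 44084, 44531)
dates <- excel_serial_to_date(serials)

# ISO dates plus numeric month and day, which avoids locale-dependent names
data.frame(
  serial = serials,
  date   = format(dates, "%Y-%m-%d"),
  month  = as.integer(format(dates, "%m")),
  day    = as.integer(format(dates, "%d"))
)
```

These three values fall on 2 March 2020, 10 September 2020 and 1 December 2021, which would be consistent with gene symbols such as MARCH2, SEPT10 and DEC1 having been typed or pasted in those years.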
#GENELISTS
ERROR_GENELISTS
## [1] "PMC8629385 /pmc/articles/PMC8629385/bin/pgen.1009910.s020.xlsx Hsapiens 2 43892 43892"
## [2] "PMC8629385 /pmc/articles/PMC8629385/bin/pgen.1009910.s021.xlsx Hsapiens 1 43901"
## [3] "PMC8629385 /pmc/articles/PMC8629385/bin/pgen.1009910.s022.xlsx Hsapiens 6 43892 43898 43892 43893 43901 43900"
## [4] "PMC8629385 /pmc/articles/PMC8629385/bin/pgen.1009910.s026.xlsx Hsapiens 1 43899"
## [5] "PMC8616896 /pmc/articles/PMC8616896/bin/41598_2021_2343_MOESM4_ESM.xlsx Hsapiens 8 44448 44445 44441 44261 44440 44257 44446 44450"
## [6] "PMC8616896 /pmc/articles/PMC8616896/bin/41598_2021_2343_MOESM4_ESM.xlsx Hsapiens 1 44261"
## [7] "PMC8616896 /pmc/articles/PMC8616896/bin/41598_2021_2343_MOESM4_ESM.xlsx Hsapiens 1 44446"
## [8] "PMC8616896 /pmc/articles/PMC8616896/bin/41598_2021_2343_MOESM4_ESM.xlsx Ggallus 1 44261"
## [9] "PMC8616896 /pmc/articles/PMC8616896/bin/41598_2021_2343_MOESM4_ESM.xlsx Ggallus 1 44446"
## [10] "PMC8610377 /pmc/articles/PMC8610377/bin/mmc3.xlsx Hsapiens 10 37681 39508 40422 38777 39326 40787 39142 37316 36951 37865"
## [11] "PMC8610377 /pmc/articles/PMC8610377/bin/mmc3.xlsx Hsapiens 2 38412 37865"
## [12] "PMC8610377 /pmc/articles/PMC8610377/bin/mmc3.xlsx Hsapiens 2 38412 37865"
## [13] "PMC8610377 /pmc/articles/PMC8610377/bin/mmc3.xlsx Hsapiens 10 37681 39508 40422 38777 39326 40787 39142 37316 36951 37865"
## [14] "PMC8610377 /pmc/articles/PMC8610377/bin/mmc3.xlsx Hsapiens 1 37865"
## [15] "PMC8610377 /pmc/articles/PMC8610377/bin/mmc3.xlsx Hsapiens 1 38412"
## [16] "PMC8609283 /pmc/articles/PMC8609283/bin/mmc11.xlsx Hsapiens 26 37500 39326 39692 38596 37316 40057 39142 40787 37865 37316 40603 38412 38961 36951 36951 37681 38047 38777 39508 39873 37135 40422 37226 40238 41153 41883"
## [17] "PMC8609283 /pmc/articles/PMC8609283/bin/mmc11.xlsx Hsapiens 27 37226 36951 37316 36951 40238 40603 37316 37681 38047 38412 38777 39142 39508 39873 37135 40422 40787 41153 41883 37500 37865 38231 38596 38961 39326 39692 40057"
## [18] "PMC8609283 /pmc/articles/PMC8609283/bin/mmc11.xlsx Hsapiens 17 36951 36951 37681 38047 38412 38777 39142 39873 40787 37500 37865 38231 38596 38961 39326 39692 40057"
## [19] "PMC8609283 /pmc/articles/PMC8609283/bin/mmc2.xlsx Hsapiens 121 37316 37316 36951 36951 39508 39508 38412 38412 39873 37135 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 40057 40057 40057 40057 40057 37316 37316 37316 37316 37316 37316 40422 39142 39142 39142 38047 37500 37500 37500 37500 37500 37500 37500 37500 37500 37500 37500 38596 38596 38596 38596 38596 38596 38596 38596 38596 38596 37865 37865 37865 37865 37865 37865 40787 40787 40787 40787 40787 36951 38777 38777 40603 40603 37681 39692 39692 39692 39692 39692 39692 39692 39326 39326 39326 39326 39326 39326 39326 39326 38961 38961 38961 38961"
## [20] "PMC8609283 /pmc/articles/PMC8609283/bin/mmc2.xlsx Hsapiens 97 37316 37316 36951 39508 38412 38412 39873 37135 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 38231 40057 40057 40057 40057 37316 37316 37316 37316 40422 39142 39142 39142 38047 37500 37500 37500 37500 37500 37500 37500 37500 38596 38596 38596 38596 38596 38596 37865 37865 37865 40787 40787 40787 36951 38777 40603 40603 37681 39692 39692 39692 39692 39692 39692 39692 39326 39326 39326 39326 39326 39326 39326 38961 38961"
## [21] "PMC8609283 /pmc/articles/PMC8609283/bin/mmc2.xlsx Hsapiens 21 36951 38412 40057 40057 40422 39142 39142 39142 37500 37500 37500 37500 38596 37865 40787 40787 39692 39692 39326 39326 38961"
## [22] "PMC8609283 /pmc/articles/PMC8609283/bin/mmc2.xlsx Hsapiens 19 38412 40057 40057 39142 37500 37500 37500 38596 38596 37865 40787 40787 37681 39692 39326 39326 39326 39326 38961"
## [23] "PMC8609283 /pmc/articles/PMC8609283/bin/mmc4.xlsx Hsapiens 26 38596 37865 39326 39508 37316 38961 40787 40057 39692 37500 37316 36951 39142 38412 38777 40603 37135 40422 37681 38047 39873 36951 37226 40238 41153 41883"
## [24] "PMC8609283 /pmc/articles/PMC8609283/bin/mmc4.xlsx Hsapiens 21 38596 37865 39508 37500 38961 36951 39326 37316 37316 40787 38412 40057 38777 39142 40422 39692 40603 37681 38047 39873 37135"
## [25] "PMC8609283 /pmc/articles/PMC8609283/bin/mmc4.xlsx Hsapiens 21 38596 37865 39508 37500 38961 36951 39326 37316 37316 40787 38412 40057 38777 39142 40422 39692 40603 37681 38047 39873 37135"
## [26] "PMC8609283 /pmc/articles/PMC8609283/bin/mmc5.xlsx Hsapiens 22 37316 40603 37681 38047 38412 38777 39873 37135 37500 38596 37316 37865 39326 39692 39142 40787 38961 36951 36951 39508 40422 40057"
## [27] "PMC8609283 /pmc/articles/PMC8609283/bin/mmc5.xlsx Hsapiens 23 36951 37316 36951 40603 37316 37681 38047 38412 38777 39142 39508 39873 37135 40422 40787 37500 37865 38231 38596 38961 39326 39692 40057"
## [28] "PMC8609283 /pmc/articles/PMC8609283/bin/mmc5.xlsx Hsapiens 17 36951 36951 37681 38047 38412 38777 39142 39873 40787 37500 37865 38231 38596 38961 39326 39692 40057"
## [29] "PMC8609283 /pmc/articles/PMC8609283/bin/mmc5.xlsx Hsapiens 22 38596 37865 39326 38231 37316 39508 38961 40787 37316 37500 40057 38412 38777 39142 40603 38047 39873 39692 37135 40422 36951 37681"
## [30] "PMC8609283 /pmc/articles/PMC8609283/bin/mmc9.xlsx Hsapiens 10 38596 39508 37865 38231 40057 39692 38412 40787 37500 36951"
## [31] "PMC8609233 /pmc/articles/PMC8609233/bin/mmc2.xlsx Mmusculus 2 44082 44075"
## [32] "PMC8609233 /pmc/articles/PMC8609233/bin/mmc5.xlsx Mmusculus 2 43531 43529"
## [33] "PMC8609233 /pmc/articles/PMC8609233/bin/mmc6.xlsx Mmusculus 1 43531"
## [34] "PMC8609233 /pmc/articles/PMC8609233/bin/mmc7.xlsx Rnorvegicus 2 43531 43529"
## [35] "PMC8610329 /pmc/articles/PMC8610329/bin/mmc3.xlsx Hsapiens 1 44084"
## [36] "PMC8610329 /pmc/articles/PMC8610329/bin/mmc3.xlsx Hsapiens 1 44084"
## [37] "PMC8610329 /pmc/articles/PMC8610329/bin/mmc3.xlsx Hsapiens 1 44084"
## [38] "PMC8610329 /pmc/articles/PMC8610329/bin/mmc3.xlsx Hsapiens 1 44084"
## [39] "PMC8614379 zip/Supplemental_File5.xlsx Hsapiens 7 43901 43893 43898 43899 44077 44075 44084"
## [40] "PMC8614379 zip/Supplemental_File5.xlsx Hsapiens 4 43891 43893 43897 44080"
## [41] "PMC8582303 /pmc/articles/PMC8582303/bin/peerj-09-12353-s001.xlsx Hsapiens 1 44256"
## [42] "PMC8582303 /pmc/articles/PMC8582303/bin/peerj-09-12353-s001.xlsx Hsapiens 1 44256"
## [43] "PMC8605608 /pmc/articles/PMC8605608/bin/13073_2021_1002_MOESM2_ESM.xlsx Hsapiens 29 44089 44089 43898 43898 43898 43898 43895 43895 43895 43895 43897 43897 43897 43897 43897 43897 43897 43897 43897 43896 43896 43896 43896 43896 43896 43896 43896 43893 43893"
## [44] "PMC8604288 /pmc/articles/PMC8604288/bin/pgen.1009875.s017.xlsx Scerevisiae 1 45413"
## [45] "PMC8603481 /pmc/articles/PMC8603481/bin/12915_2021_1165_MOESM9_ESM.xlsx Rnorvegicus 1 44449"
## [46] "PMC8602823 /pmc/articles/PMC8602823/bin/DataSheet_2.xlsx Hsapiens 1 44257"
## [47] "PMC8602823 /pmc/articles/PMC8602823/bin/DataSheet_2.xlsx Hsapiens 1 44257"
## [48] "PMC8602823 /pmc/articles/PMC8602823/bin/DataSheet_2.xlsx Hsapiens 1 44257"
## [49] "PMC8602823 /pmc/articles/PMC8602823/bin/DataSheet_2.xlsx Hsapiens 1 44257"
## [50] "PMC8602260 /pmc/articles/PMC8602260/bin/41467_2021_26871_MOESM3_ESM.xlsx Hsapiens 57 43899 43899 44080 43892 44083 43892 44083 43892 44083 44083 44083 43892 44083 44083 44083 43892 44083 44085 44085 44083 44076 43899 43899 44082 43896 43898 44088 43893 43899 44083 44083 44081 43891 43898 44083 43896 44080 44080 44083 43896 43896 43899 43899 44083 44085 43891 44084 44083 44083 43898 43893 44085 44083 44083 44080 44083 44083"
## [51] "PMC8602260 /pmc/articles/PMC8602260/bin/41467_2021_26871_MOESM3_ESM.xlsx Hsapiens 55 44088 43897 43900 44078 44083 43900 43901 44078 43900 44078 43895 43897 43897 44086 44088 44088 44088 44088 44088 44088 44088 44088 44088 44088 44088 44088 44088 44088 43900 43900 44088 44088 44088 44088 44088 44075 44075 43899 43900 43901 43894 43900 44084 43895 44077 43897 43894 44076 44085 43899 43891 43897 43901 44083 43901"
## [52] "PMC8599854 /pmc/articles/PMC8599854/bin/41467_2021_26821_MOESM4_ESM.xlsx Hsapiens 33 43711 43711 43711 43525 43531 43531 43531 43717 43800 43717 43800 43531 43531 43531 43712 43525 43525 43525 43525 43525 43525 43712 43717 43711 43525 43712 43717 43711 43525 43712 43717 43711 43525"
## [53] "PMC8599854 /pmc/articles/PMC8599854/bin/41467_2021_26821_MOESM4_ESM.xlsx Hsapiens 33 43711 43711 43711 43525 43531 43531 43531 43717 43800 43717 43800 43531 43531 43531 43712 43525 43525 43525 43525 43525 43525 43712 43717 43711 43525 43712 43717 43711 43525 43712 43717 43711 43525"
## [54] "PMC8598162 /pmc/articles/PMC8598162/bin/elife-65110-fig1-data1.xlsx Mmusculus 19 44075 44081 43891 44078 44080 44082 44079 43897 44084 43898 43895 44083 43899 44085 44089 44076 44077 43892 43896"
## [55] "PMC8598162 /pmc/articles/PMC8598162/bin/elife-65110-fig5-data1.xlsx Mmusculus 27 43891 43891 43900 43901 43892 43892 43893 43894 43895 43896 43897 43898 43899 44075 44084 44085 44086 44088 44076 44076 44077 44078 44079 44080 44081 44082 44083"
## [56] "PMC8575900 /pmc/articles/PMC8575900/bin/41467_2021_26787_MOESM5_ESM.xlsx Hsapiens 2 40969 41156"
## [57] "PMC8575900 /pmc/articles/PMC8575900/bin/41467_2021_26787_MOESM5_ESM.xlsx Hsapiens 7 41163 41163 41163 41158 41162 40975 41161"
## [58] "PMC8575900 /pmc/articles/PMC8575900/bin/41467_2021_26787_MOESM5_ESM.xlsx Ggallus 1 41161"
## [59] "PMC8575900 /pmc/articles/PMC8575900/bin/41467_2021_26787_MOESM5_ESM.xlsx Hsapiens 1 41244"
## [60] "PMC8599367 /pmc/articles/PMC8599367/bin/Table2.XLSX Hsapiens 4 44451 44256 44257 44447"
## [61] "PMC8581057 /pmc/articles/PMC8581057/bin/mmc3.xlsx Ggallus 11 44256 44448 44444 44262 44445 44257 44258 44449 44264 44260 44441"
## [62] "PMC8591166 /pmc/articles/PMC8591166/bin/Table_2.XLSX Hsapiens 76 43897 44078 44080 43900 44081 43898 43897 43897 43899 43891 44081 44085 44077 44082 44078 44082 44083 44082 44078 44078 44082 44078 44083 44088 44085 44082 44082 44084 44085 44082 44077 44076 43893 44085 44077 43892 43897 44082 44083 44080 43891 43897 43898 43897 43894 44078 43896 44083 44083 43897 44084 43897 44082 43895 44086 44081 43901 44075 44083 44083 44078 44085 43892 44078 44085 43893 44081 44085 44079 43897 44080 44086 44081 44085 44080 44078"
## [63] "PMC8591166 /pmc/articles/PMC8591166/bin/Table_2.XLSX Ggallus 76 44262 44259 44446 44256 44258 44443 44440 44453 44260 44262 44262 44262 44443 44443 44441 44451 44451 44263 44263 44445 44445 44445 44445 44444 44446 44446 44446 44446 44264 44442 44442 44442 44450 44450 44450 44450 44450 44450 44450 44450 44258 44447 44447 44447 44447 44447 44447 44447 44447 44447 44257 44449 44449 44257 44256 44262 44262 44262 44262 44448 44448 44448 44448 44448 44448 44448 44443 44443 44443 44443 44443 44443 44265 44261 44266 44262"
## [64] "PMC8591166 /pmc/articles/PMC8591166/bin/Table_3.XLSX Hsapiens 2 44262 44443"
## [65] "PMC8591166 /pmc/articles/PMC8591166/bin/Table_3.XLSX Hsapiens 1 44078"
## [66] "PMC8591166 /pmc/articles/PMC8591166/bin/Table_3.XLSX Hsapiens 1 43897"
## [67] "PMC8590766 /pmc/articles/PMC8590766/bin/12885_2021_8974_MOESM9_ESM.xlsx Mmusculus 58 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446 44453 44446"
## [68] "PMC8556363 /pmc/articles/PMC8556363/bin/41467_2021_26489_MOESM5_ESM.xlsx Hsapiens 15 44443 44447 44441 44446 44261 44258 44448 44264 44450 44257 44265 44256 44440 44260 44262"
## [69] "PMC8585772 /pmc/articles/PMC8585772/bin/DataSheet2.XLSX Hsapiens 28 44256 44266 44445 44453 44259 44265 44261 44440 44262 44443 44448 44442 44263 44444 44447 44451 44257 44449 44450 44256 44258 44441 44257 44446 44260 44531 44454 44264"
## [70] "PMC8560584 /pmc/articles/PMC8560584/bin/molce-44-10-746-supple2.xlsx Athaliana 1 44076"
## [71] "PMC8581662 /pmc/articles/PMC8581662/bin/Table_1.XLSX Hsapiens 11 43164 43160 43161 43162 43163 43165 43166 43167 43168 43169 43170"
## [72] "PMC8581465 /pmc/articles/PMC8581465/bin/Table_1.xlsx Mmusculus 3 44259 44258 44260"
## [73] "PMC8581465 /pmc/articles/PMC8581465/bin/Table_1.xlsx Mmusculus 2 44258 44260"
## [74] "PMC8581465 /pmc/articles/PMC8581465/bin/Table_1.xlsx Rnorvegicus 1 44259"
## [75] "PMC8581465 /pmc/articles/PMC8581465/bin/Table_2.xlsx Mmusculus 17 43530 43527 43535 43714 43717 43717 43715 43718 43525 43528 43712 43716 43717 43530 43712 43535 43717"
## [76] "PMC8626465 /pmc/articles/PMC8626465/bin/41598_2021_2463_MOESM5_ESM.xlsx Hsapiens 1 37834"
## [77] "PMC8626465 /pmc/articles/PMC8626465/bin/41598_2021_2463_MOESM5_ESM.xlsx Hsapiens 1 37469"
## [78] "PMC8574973 /pmc/articles/PMC8574973/bin/supplementary_table1_bbab272.xlsx Hsapiens 2 38961 40057"
## [79] "PMC8575309 /pmc/articles/PMC8575309/bin/ppat.1010013.s005.xlsx Hsapiens 26 43900 44087 43901 44078 44079 44076 43896 44080 44083 44075 44081 43892 44077 43895 43891 44085 43899 43894 44086 43893 44088 43898 44084 43897 44166 43892"
## [80] "PMC8575309 /pmc/articles/PMC8575309/bin/ppat.1010013.s005.xlsx Hsapiens 1 2-Mar"
## [81] "PMC8575309 /pmc/articles/PMC8575309/bin/ppat.1010013.s007.xlsx Hsapiens 13 40422 37500 40787 39692 39326 41883 38961 41153 37135 38231 40057 38596 37865"
## [82] "PMC8561166 /pmc/articles/PMC8561166/bin/mmc2.xlsx Hsapiens 9 43527 43527 43716 43722 43715 43800 43800 43800 43800"
## [83] "PMC8552660 /pmc/articles/PMC8552660/bin/MSB-17-e10260-s009.xlsx Hsapiens 5 7-Mar 1-Mar 6-Mar 5-Mar 2-Mar"
## [84] "PMC8552660 /pmc/articles/PMC8552660/bin/MSB-17-e10260-s009.xlsx Hsapiens 10 44080 44076 44081 43891 44082 43892 44085 43895 44084 44083"
## [85] "PMC8571097 /pmc/articles/PMC8571097/bin/41588_2021_941_MOESM3_ESM.xlsx Hsapiens 1 44262"
## [86] "PMC8571097 /pmc/articles/PMC8571097/bin/41588_2021_941_MOESM3_ESM.xlsx Hsapiens 26 44257 44256 44263 44260 44264 44451 44440 44443 44265 44448 44257 44449 44262 44259 44441 44442 44450 44256 44261 44266 44258 44447 44446 44453 44531 44445"
## [87] "PMC8603206 /pmc/articles/PMC8603206/bin/mmc4.xlsx Hsapiens 4 40787 39326 37500 40057"
## [88] "PMC8603206 /pmc/articles/PMC8603206/bin/mmc4.xlsx Hsapiens 31 40787 40057 39326 40057 37500 40787 39326 39326 40787 40057 37500 39326 37500 37500 37500 37500 39326 37500 40057 40787 39326 40057 40057 40787 37500 37500 40057 40057 37500 37500 40057"
## [89] "PMC8603206 /pmc/articles/PMC8603206/bin/mmc4.xlsx Hsapiens 8 39692 37865 40422 40787 39326 37500 40057 40057"
## [90] "PMC8603206 /pmc/articles/PMC8603206/bin/mmc4.xlsx Hsapiens 101 40787 40787 39692 40787 40057 39326 39326 40057 40057 40422 40787 37500 40787 40787 40057 37500 39326 39326 40057 39692 40787 40787 39326 40787 40057 40057 40057 39692 37500 40422 40057 40787 37500 38412 40057 39326 39326 39326 39326 40057 40422 37500 37500 39692 37500 37500 40787 40057 37500 37500 37500 37500 39326 39326 40787 40057 37865 37500 37500 37500 40057 40057 40057 40057 39692 40057 40787 40057 40057 40057 39692 39692 40422 39326 40057 40422 37500 39326 40057 40787 40057 40787 38961 37500 40422 37500 37500 38961 39326 37500 37500 40057 37500 40422 40422 37500 39326 40057 37500 40057 40057"
## [91] "PMC8567755 /pmc/articles/PMC8567755/bin/Table_1.xlsx Hsapiens 11 44256 44266 44256 44266 44266 44256 44266 44256 44256 44256 44266"
## [92] "PMC8567320 /pmc/articles/PMC8567320/bin/Table3.XLSX Hsapiens 27 44266 44453 44256 44259 44261 44262 44450 44441 44448 44265 44449 44443 44451 44444 44445 44263 44258 44440 44442 44257 44260 44264 44454 44447 44256 44257 44446"
## [93] "PMC8566369 /pmc/articles/PMC8566369/bin/Table5.XLSX Hsapiens 28 44454 44257 44256 44449 44262 44259 44441 44450 44256 44261 44266 44258 44447 44446 44453 44531 44263 44260 44264 44451 44440 44443 44265 44448 44257 44444 44442 44445"
## [94] "PMC8608327 /pmc/articles/PMC8608327/bin/pone.0259674.s015.xls Hsapiens 33 40787 40787 39692 39692 40787 40057 39692 40787 39326 37500 37500 37500 40057 37500 42248 37500 39326 40057 40057 39326 37135 38231 41153 38231 37865 38961 38961 38231 38961 38961 40422 38961 40422"
## [95] "PMC8581407 /pmc/articles/PMC8581407/bin/thnov11p9884s9.xlsx Hsapiens 1 40978"
## [96] "PMC8564494 /pmc/articles/PMC8564494/bin/DataSheet1.xlsx Hsapiens 1 39517"
## [97] "PMC8564051 /pmc/articles/PMC8564051/bin/Data_Sheet_2.xlsx Mmusculus 24 44445 44446 44256 44440 44263 44453 44450 44454 44262 44260 44258 44257 44451 44261 44442 44266 44265 44448 44443 44447 44264 44449 44259 44441"
## [98] "PMC8564051 /pmc/articles/PMC8564051/bin/Data_Sheet_4.xlsx Mmusculus 24 44445 44446 44256 44440 44263 44453 44450 44454 44262 44260 44258 44257 44451 44261 44442 44266 44265 44448 44443 44447 44264 44449 44259 44441"
## [99] "PMC8563320 /pmc/articles/PMC8563320/bin/noab102_suppl_supplementary_table_s3.xlsx Hsapiens 14 44089 43898 43900 43892 44084 43892 44083 43901 44081 44088 44078 43897 44080 43895"
## [100] "PMC8563320 /pmc/articles/PMC8563320/bin/noab102_suppl_supplementary_table_s3.xlsx Hsapiens 11 44081 44078 43896 44077 44075 43894 43891 43899 43892 43893 44080"
## [101] "PMC8563320 /pmc/articles/PMC8563320/bin/noab102_suppl_supplementary_table_s3.xlsx Hsapiens 13 43894 43899 44084 44078 43896 43892 44080 43901 44085 44083 44082 43900 43893"
## [102] "PMC8561425 /pmc/articles/PMC8561425/bin/supplementary_tables_ddab181.xlsx Hsapiens 2 43897 43896"
## [103] "PMC8561425 /pmc/articles/PMC8561425/bin/supplementary_tables_ddab181.xlsx Hsapiens 1 43895"
## [104] "PMC8561425 /pmc/articles/PMC8561425/bin/supplementary_tables_ddab181.xlsx Hsapiens 1 43895"
## [105] "PMC8556884 /pmc/articles/PMC8556884/bin/12876_2021_1940_MOESM2_ESM.xlsx Hsapiens 1 43714"
## [106] "PMC8556884 /pmc/articles/PMC8556884/bin/12876_2021_1940_MOESM3_ESM.xlsx Hsapiens 3 43710 43526 43714"
## [107] "PMC8556884 /pmc/articles/PMC8556884/bin/12876_2021_1940_MOESM4_ESM.xlsx Hsapiens 1 43714"
## [108] "PMC8546856 /pmc/articles/PMC8546856/bin/mmc2.xlsx Mmusculus 1 44084"
## [109] "PMC8602306 /pmc/articles/PMC8602306/bin/41467_2021_26949_MOESM4_ESM.xlsx Hsapiens 24 43722 43709 43531 43525 43718 43712 43711 43526 43527 43530 43719 43534 43717 43529 43714 43533 43710 43720 43532 43713 43528 43800 43535 43716"
## [110] "PMC8602306 /pmc/articles/PMC8602306/bin/41467_2021_26949_MOESM6_ESM.xlsx Hsapiens 27 38596 41883 38047 38412 39326 39692 37226 40787 37681 40057 39873 39508 38231 37316 37500 38777 39142 37135 40603 38961 40422 37316 36951 36951 37865 40238 41153"
## [111] "PMC8602306 /pmc/articles/PMC8602306/bin/41467_2021_26949_MOESM6_ESM.xlsx Hsapiens 27 41883 37226 38961 39873 39692 40422 40057 39508 39326 37500 40787 36951 40603 36951 37865 38777 38412 39142 37681 37135 37316 38047 37316 38596 38231 40238 41153"
## [112] "PMC8602306 /pmc/articles/PMC8602306/bin/41467_2021_26949_MOESM6_ESM.xlsx Hsapiens 27 44257 44442 44443 44257 44446 44445 44262 44450 44264 44451 44259 44256 44261 44453 44447 44263 44441 44531 44265 44258 44440 44266 44448 44444 44256 44449 44260"
## [113] "PMC8599678 /pmc/articles/PMC8599678/bin/41598_2021_1850_MOESM2_ESM.xls Hsapiens 6 44263 44445 44257 44442 44264 44531"
## [114] "PMC8599678 /pmc/articles/PMC8599678/bin/41598_2021_1850_MOESM4_ESM.xls Hsapiens 1 44259"
## [115] "PMC8599678 /pmc/articles/PMC8599678/bin/41598_2021_1850_MOESM4_ESM.xls Hsapiens 6 44263 44445 44257 44442 44264 44531"
## [116] "PMC8599678 /pmc/articles/PMC8599678/bin/41598_2021_1850_MOESM7_ESM.xls Hsapiens 1 44259"
## [117] "PMC8566521 /pmc/articles/PMC8566521/bin/41467_2021_26567_MOESM4_ESM.xlsx Mmusculus 3 44083 43892 43898"
## [118] "PMC8566521 /pmc/articles/PMC8566521/bin/41467_2021_26567_MOESM6_ESM.xlsx Mmusculus 9 44083 44083 44078 43895 44083 44083 44081 44083 43893"
## [119] "PMC8580886 /pmc/articles/PMC8580886/bin/mmc2.xls Hsapiens 3 43891 43891 44086"
## [120] "PMC8597285 /pmc/articles/PMC8597285/bin/12864_2021_8129_MOESM1_ESM.xlsx Hsapiens 1 41883"
## [121] "PMC8596853 /pmc/articles/PMC8596853/bin/41586_2021_4103_MOESM4_ESM.xlsx Hsapiens 1 44263"
## [122] "PMC8581766 /pmc/articles/PMC8581766/bin/ANA-90-455-s005.xlsx Hsapiens 17 37135 38961 39692 37500 38596 38412 40238 37681 39142 40787 38231 40422 37316 41153 39326 39508 37865"
## [123] "PMC8581766 /pmc/articles/PMC8581766/bin/ANA-90-455-s006.xlsx Hsapiens 17 44082 43898 44076 44078 44079 44085 44080 43897 43900 44086 44084 44081 44075 43895 43892 43893 44077"
## [124] "PMC8592943 /pmc/articles/PMC8592943/bin/DataSheet2.XLSX Hsapiens 1 44166"
## [125] "PMC8592943 /pmc/articles/PMC8592943/bin/DataSheet2.XLSX Hsapiens 1 44166"
## [126] "PMC8592943 /pmc/articles/PMC8592943/bin/DataSheet2.XLSX Hsapiens 1 44166"
## [127] "PMC8592943 /pmc/articles/PMC8592943/bin/DataSheet2.XLSX Hsapiens 1 44166"
## [128] "PMC8592943 /pmc/articles/PMC8592943/bin/DataSheet2.XLSX Hsapiens 4 43894 43901 43899 43897"
## [129] "PMC8592943 /pmc/articles/PMC8592943/bin/DataSheet2.XLSX Hsapiens 3 43894 43901 43897"
## [130] "PMC8586148 /pmc/articles/PMC8586148/bin/41467_2021_26850_MOESM11_ESM.xlsx Scerevisiae 2 44081 43922"
## [131] "PMC8560912 /pmc/articles/PMC8560912/bin/41467_2021_26502_MOESM12_ESM.xlsx Hsapiens 23 44261 44442 44448 44259 44257 44258 44256 44441 44256 44450 44443 44257 44440 44263 44264 44262 44445 44447 44453 44260 44446 44449 44444"
## [132] "PMC8560912 /pmc/articles/PMC8560912/bin/41467_2021_26502_MOESM12_ESM.xlsx Hsapiens 23 44447 44442 44443 44444 44257 44445 44263 44259 44258 44448 44260 44256 44440 44262 44449 44261 44453 44441 44446 44257 44264 44256 44450"
## [133] "PMC8560912 /pmc/articles/PMC8560912/bin/41467_2021_26502_MOESM12_ESM.xlsx Hsapiens 23 44256 44444 44445 44443 44442 44258 44257 44264 44448 44262 44256 44447 44257 44450 44441 44261 44446 44259 44440 44449 44260 44453 44263"
## [134] "PMC8560912 /pmc/articles/PMC8560912/bin/41467_2021_26502_MOESM7_ESM.xlsx Hsapiens 3 44076 43899 43895"
## [135] "PMC8560912 /pmc/articles/PMC8560912/bin/41467_2021_26502_MOESM7_ESM.xlsx Hsapiens 1 44084"
## [136] "PMC8560912 /pmc/articles/PMC8560912/bin/41467_2021_26502_MOESM7_ESM.xlsx Hsapiens 9 44076 43891 43899 44166 44077 44081 44078 43900 44088"
## [137] "PMC8560912 /pmc/articles/PMC8560912/bin/41467_2021_26502_MOESM7_ESM.xlsx Hsapiens 5 44083 43892 43891 43900 44085"
## [138] "PMC8560912 /pmc/articles/PMC8560912/bin/41467_2021_26502_MOESM7_ESM.xlsx Hsapiens 7 44076 43893 43891 43896 43901 44086 44075"
## [139] "PMC8560912 /pmc/articles/PMC8560912/bin/41467_2021_26502_MOESM7_ESM.xlsx Hsapiens 5 44088 43901 43896 43894 43899"
## [140] "PMC8560912 /pmc/articles/PMC8560912/bin/41467_2021_26502_MOESM7_ESM.xlsx Hsapiens 5 44082 44166 43896 43893 43901"
## [141] "PMC8560912 /pmc/articles/PMC8560912/bin/41467_2021_26502_MOESM7_ESM.xlsx Hsapiens 3 43895 44085 43891"
## [142] "PMC8560912 /pmc/articles/PMC8560912/bin/41467_2021_26502_MOESM7_ESM.xlsx Hsapiens 3 44083 43901 43896"
## [143] "PMC8560912 /pmc/articles/PMC8560912/bin/41467_2021_26502_MOESM8_ESM.xlsx Hsapiens 20 44257 44261 44447 44256 44440 44442 44446 44259 44441 44443 44454 44450 44444 44449 44451 44448 44262 44445 44453 44260"
## [144] "PMC8560912 /pmc/articles/PMC8560912/bin/41467_2021_26502_MOESM8_ESM.xlsx Hsapiens 20 44257 44259 44450 44443 44448 44261 44262 44454 44453 44449 44446 44442 44441 44451 44440 44447 44260 44256 44445 44444"
## [145] "PMC8560912 /pmc/articles/PMC8560912/bin/41467_2021_26502_MOESM8_ESM.xlsx Hsapiens 20 44261 44442 44257 44256 44440 44447 44259 44449 44441 44446 44445 44454 44453 44450 44451 44448 44262 44260 44444 44443"
## [146] "PMC8560912 /pmc/articles/PMC8560912/bin/41467_2021_26502_MOESM8_ESM.xlsx Ggallus 10 44080 44077 44079 44075 43892 44082 43893 44076 44078 44081"
## [147] "PMC8560912 /pmc/articles/PMC8560912/bin/41467_2021_26502_MOESM8_ESM.xlsx Hsapiens 20 44257 44261 44447 44256 44440 44442 44446 44259 44441 44443 44454 44450 44444 44449 44451 44448 44262 44445 44453 44260"
## [148] "PMC8560912 /pmc/articles/PMC8560912/bin/41467_2021_26502_MOESM8_ESM.xlsx Hsapiens 12 44440 44445 44448 44447 44441 44444 44450 44449 44443 44446 44454 44262"
## [149] "PMC8560912 /pmc/articles/PMC8560912/bin/41467_2021_26502_MOESM8_ESM.xlsx Hsapiens 7 43893 44077 44079 43892 44080 44078 44082"
## [150] "PMC8560912 /pmc/articles/PMC8560912/bin/41467_2021_26502_MOESM8_ESM.xlsx Hsapiens 12 44075 43891 44079 44080 43893 44078 43899 44077 43898 44084 43896 44076"
## [151] "PMC8560912 /pmc/articles/PMC8560912/bin/41467_2021_26502_MOESM8_ESM.xlsx Hsapiens 12 43892 44077 44080 43893 43899 44079 44075 44082 44085 43891 43898 44084"
## [152] "PMC8524723 /pmc/articles/PMC8524723/bin/41467_2021_26282_MOESM4_ESM.xlsx Hsapiens 9 44446 44448 44440 44445 44441 44261 44262 44260 44450"
## [153] "PMC8524723 /pmc/articles/PMC8524723/bin/41467_2021_26282_MOESM8_ESM.xlsx Hsapiens 20 44256 44450 44440 44441 44445 44443 44260 44262 44448 44447 44446 44265 44256 44257 44264 44258 44263 44257 44261 44444"
## [154] "PMC8588809 /pmc/articles/PMC8588809/bin/NIHMS1746756-supplement-Supplemental_tables.xlsx Hsapiens 27 43710 43712 43526 43530 43531 43528 43723 43715 43711 43526 43716 43717 43719 43533 43714 43713 43800 43525 43525 43535 43527 43529 43532 43709 43718 43720 43722"
## [155] "PMC8588809 /pmc/articles/PMC8588809/bin/NIHMS1746756-supplement-Supplemental_tables.xlsx Hsapiens 24 43891 44076 43896 44089 43894 43892 44079 43897 43892 44084 44081 43895 44083 43901 44078 44075 43891 43899 43893 43898 44080 44085 44077 44082"
## [156] "PMC8567280 /pmc/articles/PMC8567280/bin/EMBR-22-e53014-s005.xlsx Hsapiens 2 39508 37316"
## [157] "PMC8567280 /pmc/articles/PMC8567280/bin/EMBR-22-e53014-s008.xlsx Hsapiens 1 36923"
## [158] "PMC8577010 /pmc/articles/PMC8577010/bin/12864_2021_8141_MOESM1_ESM.xlsx Hsapiens 20 44257 44442 44256 44445 44257 44261 44449 44440 44444 44263 44264 44443 44447 44258 44262 44441 44450 44446 44256 44260"
## [159] "PMC8576023 /pmc/articles/PMC8576023/bin/41598_2021_1410_MOESM6_ESM.xlsx Hsapiens 1 44454"
## [160] "PMC8576023 /pmc/articles/PMC8576023/bin/41598_2021_1410_MOESM6_ESM.xlsx Hsapiens 1 44454"
## [161] "PMC8575036 /pmc/articles/PMC8575036/bin/supptable1_tcgapancancer_cox_regression_bbab256.xlsx Hsapiens 2 38047 41153"
## [162] "PMC8575036 /pmc/articles/PMC8575036/bin/supptable2_ppicomutation_tcgadata_bbab256.xlsx Hsapiens 2 39326 37500"
## [163] "PMC8575036 /pmc/articles/PMC8575036/bin/supptable5_livercancer_dnamethyl_cox_regression_bbab256.xlsx Hsapiens 1 37681"
## [164] "PMC8573080 /pmc/articles/PMC8573080/bin/Table1.XLSX Hsapiens 1 37135"
## [165] "PMC8559480 /pmc/articles/PMC8559480/bin/CAM4-10-7831-s002.xlsx Hsapiens 1 43901"
## [166] "PMC8552790 /pmc/articles/PMC8552790/bin/peerj-09-12369-s003.xlsx Hsapiens 4 44261 44450 44450 44446"
## [167] "PMC8552790 /pmc/articles/PMC8552790/bin/peerj-09-12369-s003.xlsx Hsapiens 5 44261 44446 44446 44448 44448"
## [168] "PMC8552790 /pmc/articles/PMC8552790/bin/peerj-09-12369-s011.xlsx Hsapiens 664 44447 44448 44450 44262 44447 44448 44262 44448 44262 44447 44262 44445 44258 44262 44448 44447 44260 44264 44448 44447 44450 44446 44449 44262 44441 44446 44441 44262 44262 44442 44261 44441 44259 44262 44441 44448 44261 44447 44448 44262 44450 44262 44450 44262 44260 44454 44446 44264 44441 44262 44446 44447 44441 44256 44263 44441 44263 44261 44261 44263 44264 44450 44441 44261 44446 44447 44262 44262 44263 44264 44442 44441 44261 44454 44441 44262 44454 44262 44450 44262 44450 44262 44261 44447 44261 44441 44444 44260 44442 44262 44260 44264 44441 44444 44261 44442 44262 44442 44260 44448 44441 44262 44261 44442 44441 44447 44262 44261 44450 44260 44441 44450 44441 44261 44261 44447 44450 44441 44258 44261 44441 44441 44261 44445 44260 44257 44262 44441 44261 44264 44260 44454 44446 44264 44454 44262 44441 44450 44448 44449 44441 44442 44441 44262 44260 44262 44441 44447 44448 44264 44450 44264 44263 44261 44441 44263 44262 44264 44446 44264 44262 44450 44447 44446 44262 44258 44261 44442 44262 44441 44450 44262 44441 44449 44261 44441 44256 44450 44262 44263 44261 44448 44441 44260 44446 44262 44447 44261 44450 44447 44261 44264 44450 44263 44446 44454 44454 44261 44446 44441 44260 44442 44262 44261 44450 44261 44450 44260 44442 44262 44258 44258 44263 44259 44260 44449 44256 44441 44261 44447 44260 44446 44448 44441 44450 44448 44261 44450 44441 44446 44447 44261 44450 44256 44450 44441 44256 44256 44441 44450 44262 44261 44446 44264 44259 44262 44261 44446 44264 44262 44262 44258 44442 44442 44454 44262 44441 44260 44446 44263 44442 44441 44261 44448 44261 44444 44260 44441 44256 44261 44446 44262 44263 44442 44256 44447 44441 44260 44256 44257 44447 44441 44450 44260 44441 44260 44445 44261 44262 44264 44259 44263 44447 44262 44442 44261 44441 44450 44448 44447 44262 44260 44454 44454 44441 44450 44262 44442 44261 44454 44449 44448 44262 44447 44441 44263 44446 44261 
44446 44263 44261 44264 44448 44261 44447 44262 44448 44264 44259 44258 44264 44263 44260 44262 44441 44261 44260 44441 44441 44259 44262 44448 44442 44261 44441 44263 44441 44450 44262 44264 44262 44261 44446 44260 44261 44454 44261 44449 44454 44450 44262 44261 44449 44259 44262 44261 44446 44442 44264 44447 44441 44450 44260 44261 44261 44450 44261 44441 44256 44441 44441 44446 44261 44260 44262 44263 44441 44450 44260 44446 44450 44260 44446 44447 44449 44263 44261 44450 44441 44261 44262 44444 44441 44450 44261 44260 44447 44441 44256 44261 44450 44441 44260 44454 44441 44261 44261 44261 44446 44442 44262 44264 44442 44441 44450 44261 44450 44261 44442 44450 44264 44446 44444 44441 44454 44446 44261 44441 44442 44447 44262 44441 44261 44443 44260 44450 44442 44441 44256 44454 44447 44448 44441 44261 44450 44264 44450 44262 44263 44261 44446 44450 44261 44264 44454 44450 44443 44441 44446 44261 44454 44262 44261 44262 44443 44261 44450 44256 44450 44264 44262 44450 44261 44441 44263 44261 44261 44445 44263 44446 44441 44257 44262 44454 44264 44264 44441 44263 44447 44441 44261 44441 44448 44445 44261 44444 44262 44262 44450 44446 44261 44441 44447 44261 44441 44440 44446 44262 44441 44450 44264 44261 44450 44263 44450 44449 44260 44256 44261 44449 44261 44262 44446 44262 44258 44449 44441 44261 44442 44261 44441 44262 44261 44446 44450 44262 44261 44441 44454 44256 44441 44441 44261 44453 44261 44445 44261 44442 44447 44453 44261 44445 44257 44446 44448 44441 44442 44450 44258 44261 44441 44454 44441 44264 44264 44261 44446 44261 44441 44447 44441 44448 44260 44262 44261 44441 44261 44262 44261 44261 44263 44264 44444 44449 44262 44262 44450 44261 44261 44262 44262 44261 44261 44441 44441 44261 44256 44447 44447 44261 44261 44262 44264 44454 44264 44261 44441 44261 44447 44441 44264 44441 44261 44264 44448 44262 44263 44261 44446 44450 44441 44261 44256 44450 44443 44447 44261 44448 44256 44261 44448 44261 44257 44441 44261 44446 44441 44454 44264 44441 44450 
44262 44264 44263 44448 44263 44262 44264 44262 44448 44441 44262 44264 44447"
## [169] "PMC8570510 /pmc/articles/PMC8570510/bin/pgen.1009865.s019.xlsx Hsapiens 4 43893 43891 43892 44078"
## [170] "PMC8570510 /pmc/articles/PMC8570510/bin/pgen.1009865.s021.xlsx Hsapiens 41 44257 44257 44257 44259 44258 44258 44256 44257 44258 44443 44446 44450 44446 44264 44261 44440 44441 44531 44263 44264 44263 44446 44256 44260 44261 44449 44450 44450 44440 44263 44449 44446 44446 44531 44531 44441 44446 44446 44446 44531 44440"
## [171] "PMC8548210 /pmc/articles/PMC8548210/bin/LSA-2021-01127_TableS1.xlsx Hsapiens 3 44531 44531 44263"
## [172] "PMC8560872 /pmc/articles/PMC8560872/bin/41421_2021_337_MOESM2_ESM.xlsx Hsapiens 43 43530 43717 43717 43710 43719 43710 43710 43529 43717 43717 43717 43717 43717 43710 43717 43710 43530 43718 43717 43717 43717 43525 43719 43717 43526 43717 43717 43710 43718 43717 43717 43530 43717 43719 43717 43718 43710 43710 43717 43530 43718 43717 43718"
## [173] "PMC8560872 /pmc/articles/PMC8560872/bin/41421_2021_337_MOESM2_ESM.xlsx Hsapiens 22 43713 43713 43526 43526 43717 43717 43719 43719 43715 43718 43718 43716 43716 43716 43710 43531 43531 43714 43714 43714 43529 43530"
## [174] "PMC8560872 /pmc/articles/PMC8560872/bin/41421_2021_337_MOESM2_ESM.xlsx Hsapiens 58 43525 43710 43530 43717 43717 43710 43531 43719 43531 43531 43710 43719 43710 43717 43717 43710 43710 43717 43710 43525 43530 43717 43710 43710 43717 43710 43525 43526 43527 43719 43719 43717 43526 43710 43719 43717 43717 43710 43718 43710 43531 43717 43717 43530 43530 43717 43719 43717 43717 43530 43718 43710 43710 43530 43526 43718 43717 43718"
## [175] "PMC8560872 /pmc/articles/PMC8560872/bin/41421_2021_337_MOESM2_ESM.xlsx Hsapiens 13 43527 43526 43526 43717 43719 43718 43718 43718 43525 43710 43531 43531 43530"
## [176] "PMC8560872 /pmc/articles/PMC8560872/bin/41421_2021_337_MOESM2_ESM.xlsx Hsapiens 10 43717 43717 43719 43718 43718 43718 43525 43710 43529 43530"
## [177] "PMC8560872 /pmc/articles/PMC8560872/bin/41421_2021_337_MOESM2_ESM.xlsx Hsapiens 55 43525 43715 43719 43715 43717 43717 43715 43715 43710 43531 43719 43531 43531 43710 43531 43719 43710 43529 43529 43717 43717 43710 43710 43717 43710 43525 43717 43710 43717 43710 43717 43525 43526 43719 43710 43719 43717 43719 43717 43717 43710 43715 43531 43717 43717 43717 43719 43717 43717 43710 43715 43710 43526 43717 43717"
## [178] "PMC8560872 /pmc/articles/PMC8560872/bin/41421_2021_337_MOESM2_ESM.xlsx Hsapiens 11 43526 43526 43717 43719 43715 43525 43710 43531 43531 43529 43529"
## [179] "PMC8560872 /pmc/articles/PMC8560872/bin/41421_2021_337_MOESM2_ESM.xlsx Hsapiens 46 43525 43530 43717 43717 43531 43719 43531 43531 43710 43719 43710 43529 43717 43717 43710 43717 43525 43530 43717 43710 43717 43710 43525 43719 43710 43719 43717 43710 43719 43717 43710 43710 43531 43717 43717 43530 43530 43717 43719 43717 43530 43710 43715 43530 43717 43718"
## [180] "PMC8560872 /pmc/articles/PMC8560872/bin/41421_2021_337_MOESM2_ESM.xlsx Hsapiens 9 43717 43719 43718 43525 43710 43531 43531 43529 43530"
## [181] "PMC8560872 /pmc/articles/PMC8560872/bin/41421_2021_337_MOESM2_ESM.xlsx Hsapiens 68 43525 43715 43715 43530 43710 43717 43717 43715 43715 43710 43531 43719 43531 43531 43710 43531 43719 43710 43717 43717 43710 43710 43717 43710 43525 43530 43717 43717 43717 43710 43717 43710 43525 43526 43719 43710 43717 43526 43710 43717 43714 43714 43717 43710 43533 43710 43715 43531 43714 43717 43714 43717 43714 43530 43530 43717 43719 43717 43530 43710 43715 43714 43710 43710 43530 43526 43717 43717"
## [182] "PMC8560872 /pmc/articles/PMC8560872/bin/41421_2021_337_MOESM2_ESM.xlsx Hsapiens 14 43526 43526 43717 43719 43715 43525 43710 43533 43531 43531 43714 43714 43714 43530"
## [183] "PMC8560872 /pmc/articles/PMC8560872/bin/41421_2021_337_MOESM2_ESM.xlsx Hsapiens 85 43525 43715 43713 43719 43715 43530 43717 43717 43717 43717 43715 43715 43710 43531 43719 43719 43531 43531 43710 43531 43719 43719 43710 43529 43717 43717 43717 43713 43710 43719 43710 43717 43710 43525 43530 43717 43710 43716 43713 43710 43717 43526 43716 43719 43719 43710 43713 43717 43526 43719 43717 43714 43714 43716 43716 43717 43710 43718 43715 43531 43714 43717 43714 43717 43714 43530 43717 43714 43719 43719 43716 43717 43717 43717 43716 43718 43710 43716 43714 43710 43717 43530 43526 43718 43717"
## [184] "PMC8560872 /pmc/articles/PMC8560872/bin/41421_2021_337_MOESM8_ESM.xlsx Hsapiens 5 2021-03-01 2021-03-08 2021-03-05 2021-09-09 2021-03-07"
## [185] "PMC8531462 /pmc/articles/PMC8531462/bin/mmc4.xlsx Hsapiens 23 44531 44266 44531 44266 44266 44266 44259 44264 44266 44266 44259 44266 44259 44265 44266 44264 44259 44266 44266 44259 44264 44258 44266"
## [186] "PMC8564479 /pmc/articles/PMC8564479/bin/Table_1.xlsx Hsapiens 1 44443"
## [187] "PMC8564479 /pmc/articles/PMC8564479/bin/Table_1.xlsx Hsapiens 1 44443"
## [188] "PMC8564479 /pmc/articles/PMC8564479/bin/Table_1.xlsx Hsapiens 1 44443"
## [189] "PMC8563994 /pmc/articles/PMC8563994/bin/DataSheet_2.xlsx Hsapiens 15 43525 43535 43526 43528 43529 43530 43531 43532 43719 43710 43711 43713 43714 43715 43716"
## [190] "PMC8563994 /pmc/articles/PMC8563994/bin/DataSheet_2.xlsx Hsapiens 1 43711"
## [191] "PMC8558579 /pmc/articles/PMC8558579/bin/esab036_suppl_supplementary_data.xlsx Hsapiens 6 44261 44449 44256 44262 44260 44451"
## [192] "PMC8558579 /pmc/articles/PMC8558579/bin/esab036_suppl_supplementary_data.xlsx Hsapiens 6 44261 44449 44256 44262 44260 44451"
## [193] "PMC8558579 /pmc/articles/PMC8558579/bin/esab036_suppl_supplementary_data.xlsx Hsapiens 5 44261 44449 44256 44262 44451"
## [194] "PMC8485671 /pmc/articles/PMC8485671/bin/2019_238147_GRASSI_SUPPL_TABLES.xlsx Hsapiens 19 37681 38412 39142 40787 38961 39508 37500 38777 37316 38596 37226 36951 37316 36951 42248 40422 39326 39692 40057"
## [195] "PMC8485671 /pmc/articles/PMC8485671/bin/2019_238147_GRASSI_SUPPL_TABLES.xlsx Hsapiens 190 40787 40787 40787 38961 40787 40787 38961 40787 38961 38777 37500 37500 38777 38777 40787 39142 39142 39142 39142 39142 38777 40787 39142 40787 38777 37500 37500 37500 38961 38777 38777 38777 39508 38961 38961 38961 38961 38961 38961 38961 38961 38961 38961 37681 37681 38777 38777 38777 38777 38777 38777 38777 38777 38777 40787 40787 40787 40787 40787 40787 40787 40787 40787 40787 40787 40787 40787 37500 37500 37500 37500 37500 37500 37500 39142 39142 39142 39142 39142 39142 40422 40422 40422 37316 37316 40057 39508 38961 38777 40787 40787 40787 38412 40787 40787 40057 39326 40787 40787 37226 39326 39692 37500 37500 38961 39326 38777 38777 38777 38777 40787 40787 40787 40787 40787 40787 40787 40787 40787 40787 40787 40787 40787 40787 40787 40787 36951 36951 38596 37500 37500 37500 37500 37500 37500 37500 39142 39142 39142 39142 39142 40422 40422 40422 40422 37316 40057 40057 38412 38412 39508 39508 39508 39508 39508 42248 36951 37316 37316 37500 37500 37500 38961 40787 40787 39142 38961 38961 39326 39326 38777 38777 38777 40787 40787 40787 40787 37500 37500 39142 39142 39142 37316 37316 40057 40057 40057 40057 38412 39508"
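The five-digit values in the listing above look like Excel date serial numbers, which is what a gene name such as SEPT1 or MARCH1 becomes once the cell has been converted to a date. As a minimal sketch (the helper name `excel_serial_to_date` is hypothetical, and the Windows 1900 date system is assumed), the serials can be turned back into calendar dates to make the pattern easier to see:

```r
# Convert Excel serial numbers to R Dates.
# Assumption: the workbook uses the Windows 1900 date system, for which
# "1899-12-30" is the conventional origin for serials after February 1900.
excel_serial_to_date <- function(serial) {
  as.Date(as.numeric(serial), origin = "1899-12-30")
}

# A few serials copied from the listing above
serials <- c(44075, 43891, 44256, 44440)
dates <- excel_serial_to_date(serials)
format(dates, "%Y-%m-%d")
# "2020-09-01" "2020-03-01" "2021-03-01" "2021-09-01"
```

Most of these values fall on the first few days of March or September, which is consistent with names like MARCH1 or SEPT1 having been read as "1-Mar" or "1-Sep". The year presumably reflects when the spreadsheet was edited, not anything about the gene.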
Let’s investigate the errors in more detail.
# By species
SPECIES <- sapply(strsplit(ERROR_GENELISTS," "),"[[",3)
table(SPECIES)
## SPECIES
## Athaliana Ggallus Hsapiens Mmusculus Rnorvegicus Scerevisiae
## 1 6 169 14 3 2
par(mar=c(5,12,4,2))
barplot(table(SPECIES),horiz=TRUE,las=1)
par(mar=c(5,5,4,2))
# Number of affected Excel files per paper
DIST <- table(sapply(strsplit(ERROR_GENELISTS," "),"[[",1))
DIST
##
## PMC8485671 PMC8524723 PMC8531462 PMC8546856 PMC8548210 PMC8552660 PMC8552790
## 2 2 1 1 1 2 3
## PMC8556363 PMC8556884 PMC8558579 PMC8559480 PMC8560584 PMC8560872 PMC8560912
## 1 3 3 1 1 13 21
## PMC8561166 PMC8561425 PMC8563320 PMC8563994 PMC8564051 PMC8564479 PMC8564494
## 1 3 3 2 2 3 1
## PMC8566369 PMC8566521 PMC8567280 PMC8567320 PMC8567755 PMC8570510 PMC8571097
## 1 2 2 1 1 2 2
## PMC8573080 PMC8574973 PMC8575036 PMC8575309 PMC8575900 PMC8576023 PMC8577010
## 1 1 3 3 4 2 1
## PMC8580886 PMC8581057 PMC8581407 PMC8581465 PMC8581662 PMC8581766 PMC8582303
## 1 1 1 4 1 2 2
## PMC8585772 PMC8586148 PMC8588809 PMC8590766 PMC8591166 PMC8592943 PMC8596853
## 1 1 2 1 5 6 1
## PMC8597285 PMC8598162 PMC8599367 PMC8599678 PMC8599854 PMC8602260 PMC8602306
## 1 2 1 4 2 2 4
## PMC8602823 PMC8603206 PMC8603481 PMC8604288 PMC8605608 PMC8608327 PMC8609233
## 4 4 1 1 1 1 4
## PMC8609283 PMC8610329 PMC8610377 PMC8614379 PMC8616896 PMC8626465 PMC8629385
## 15 4 6 2 5 2 4
summary(as.numeric(DIST))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 1.000 2.000 2.786 3.000 21.000
hist(DIST,main="Number of affected Excel files per paper")
# PMC Articles with the most errors
DIST_DF <- as.data.frame(DIST)
DIST_DF <- DIST_DF[order(-DIST_DF$Freq),,drop=FALSE]
head(DIST_DF,20)
## Var1 Freq
## 14 PMC8560912 21
## 64 PMC8609283 15
## 13 PMC8560872 13
## 48 PMC8592943 6
## 66 PMC8610377 6
## 47 PMC8591166 5
## 68 PMC8616896 5
## 33 PMC8575900 4
## 39 PMC8581465 4
## 53 PMC8599678 4
## 56 PMC8602306 4
## 57 PMC8602823 4
## 58 PMC8603206 4
## 63 PMC8609233 4
## 65 PMC8610329 4
## 70 PMC8629385 4
## 7 PMC8552790 3
## 9 PMC8556884 3
## 10 PMC8558579 3
## 16 PMC8561425 3
MOST_ERR_FILES = as.character(DIST_DF[1,1])
MOST_ERR_FILES
## [1] "PMC8560912"
# Number of errors per paper
NERR <- as.numeric(sapply(strsplit(ERROR_GENELISTS," "),"[[",4))
names(NERR) <- sapply(strsplit(ERROR_GENELISTS," "),"[[",1)
NERR <-tapply(NERR, names(NERR), sum)
NERR
## PMC8485671 PMC8524723 PMC8531462 PMC8546856 PMC8548210 PMC8552660 PMC8552790
## 209 29 23 1 3 15 673
## PMC8556363 PMC8556884 PMC8558579 PMC8559480 PMC8560584 PMC8560872 PMC8560912
## 15 5 17 1 1 439 243
## PMC8561166 PMC8561425 PMC8563320 PMC8563994 PMC8564051 PMC8564479 PMC8564494
## 9 4 38 16 48 3 1
## PMC8566369 PMC8566521 PMC8567280 PMC8567320 PMC8567755 PMC8570510 PMC8571097
## 28 12 3 27 11 45 27
## PMC8573080 PMC8574973 PMC8575036 PMC8575309 PMC8575900 PMC8576023 PMC8577010
## 1 2 5 40 11 2 20
## PMC8580886 PMC8581057 PMC8581407 PMC8581465 PMC8581662 PMC8581766 PMC8582303
## 3 11 1 23 11 34 2
## PMC8585772 PMC8586148 PMC8588809 PMC8590766 PMC8591166 PMC8592943 PMC8596853
## 28 2 51 58 156 11 1
## PMC8597285 PMC8598162 PMC8599367 PMC8599678 PMC8599854 PMC8602260 PMC8602306
## 1 46 4 14 66 112 105
## PMC8602823 PMC8603206 PMC8603481 PMC8604288 PMC8605608 PMC8608327 PMC8609233
## 4 144 1 1 29 33 7
## PMC8609283 PMC8610329 PMC8610377 PMC8614379 PMC8616896 PMC8626465 PMC8629385
## 490 4 26 11 12 2 10
hist(NERR,main="Number of errors per PMC article")
NERR_DF <- as.data.frame(NERR)
NERR_DF <- NERR_DF[order(-NERR_DF$NERR),,drop=FALSE]
head(NERR_DF,20)
## NERR
## PMC8552790 673
## PMC8609283 490
## PMC8560872 439
## PMC8560912 243
## PMC8485671 209
## PMC8591166 156
## PMC8603206 144
## PMC8602260 112
## PMC8602306 105
## PMC8599854 66
## PMC8590766 58
## PMC8588809 51
## PMC8564051 48
## PMC8598162 46
## PMC8570510 45
## PMC8575309 40
## PMC8563320 38
## PMC8581766 34
## PMC8608327 33
## PMC8524723 29
MOST_ERR = rownames(NERR_DF)[1]
MOST_ERR
## [1] "PMC8552790"
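The per-paper summaries above all rely on the same step: splitting each space-delimited record and picking fields by position. The following is a small sketch of that idea on made-up records (the IDs, paths and the `parsed` data frame are illustrative and not part of the original script):

```r
# Toy records in the same layout as ERROR_GENELISTS:
# "<PMC ID> <file path> <species> <error count> <values...>"
# The IDs and paths are invented for illustration.
toy <- c(
  "PMC0000001 /path/a.xlsx Hsapiens 2 44075 43891",
  "PMC0000001 /path/b.xlsx Hsapiens 1 44256",
  "PMC0000002 /path/c.xlsx Mmusculus 3 44440 44441 44442"
)

# Split on single spaces and pick fields by position
fields <- strsplit(toy, " ")
parsed <- data.frame(
  pmc     = sapply(fields, "[[", 1),
  species = sapply(fields, "[[", 3),
  nerr    = as.numeric(sapply(fields, "[[", 4)),
  stringsAsFactors = FALSE
)

# Records per paper, and total errors per paper
files_per_paper  <- table(parsed$pmc)
errors_per_paper <- tapply(parsed$nerr, parsed$pmc, sum)
errors_per_paper
```

Positional splitting is simple, but it assumes that file paths contain no spaces; a path with a space would shift every later field by one. Note also that the same file can appear in several records in the listing, so the per-paper count may be closer to "affected gene lists" than "affected files".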
GENELIST_ERROR_ARTICLES <- gsub("PMC","",GENELIST_ERROR_ARTICLES)
### JSON PARSING is more reliable than XML
ARTICLES <- esummary( GENELIST_ERROR_ARTICLES , db="pmc" , retmode = "json" )
ARTICLE_DATA <- reutils::content(ARTICLES,as= "parsed")
ARTICLE_DATA <- ARTICLE_DATA$result
ARTICLE_DATA <- ARTICLE_DATA[2:length(ARTICLE_DATA)]
JOURNALS <- unlist(lapply(ARTICLE_DATA,function(x) {x$fulljournalname} ))
JOURNALS_TABLE <- table(JOURNALS)
JOURNALS_TABLE <- JOURNALS_TABLE[order(-JOURNALS_TABLE)]
length(JOURNALS_TABLE)
## [1] 46
NUM_JOURNALS=length(JOURNALS_TABLE)
par(mar=c(5,25,4,2))
barplot(head(JOURNALS_TABLE,10), horiz=TRUE, las=1,
xlab="Articles with gene name errors in supp files",
main="Top journals this month")
Congrats to our Journal of the Month winner!
JOURNAL_WINNER <- names(head(JOURNALS_TABLE,1))
JOURNAL_WINNER
## [1] "Nature Communications"
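The journal tally above depends on the shape of the esummary JSON: the `result` element holds a `uids` vector followed by one record per article, which is why the first element is dropped before extracting `fulljournalname`. Here is a rough offline sketch of that step, using a hand-written JSON string in place of a live NCBI response (the UIDs and journal names are invented, and only the fields needed here are included):

```r
library(jsonlite)

# Hand-written stand-in for an esummary JSON response (illustrative only)
toy_json <- '{
  "header": {"type": "esummary", "version": "0.3"},
  "result": {
    "uids": ["111", "222", "333"],
    "111": {"uid": "111", "fulljournalname": "Journal A"},
    "222": {"uid": "222", "fulljournalname": "Journal B"},
    "333": {"uid": "333", "fulljournalname": "Journal A"}
  }
}'

res <- fromJSON(toy_json)$result

# "uids" is a vector of identifiers, not an article record, so drop it by name
records <- res[setdiff(names(res), "uids")]

# One journal name per article, then count and rank
journals <- vapply(records, function(x) x$fulljournalname, character(1))
journal_counts <- sort(table(journals), decreasing = TRUE)
names(journal_counts)[1]
```

Dropping `uids` by name instead of by position is a small design choice: it keeps working even if the order of elements in the response were to change.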
There are two categories:
Paper with the most supplementary files affected by gene name errors (MOST_ERR_FILES)
Paper with the most gene names converted to dates (MOST_ERR)
Sometimes, one paper can win both categories. Congrats to our winners.
MOST_ERR_FILES <- gsub("PMC","",MOST_ERR_FILES)
ARTICLES <- esummary( MOST_ERR_FILES , db="pmc" , retmode = "json" )
ARTICLE_DATA <- reutils::content(ARTICLES,as= "parsed")
ARTICLE_DATA <- ARTICLE_DATA[2]
ARTICLE_DATA
## $result
## $result$uids
## [1] "8560912"
##
## $result$`8560912`
## $result$`8560912`$uid
## [1] "8560912"
##
## $result$`8560912`$pubdate
## [1] "2021 Nov 1"
##
## $result$`8560912`$epubdate
## [1] "2021 Nov 1"
##
## $result$`8560912`$printpubdate
## [1] ""
##
## $result$`8560912`$source
## [1] "Nat Commun"
##
## $result$`8560912`$authors
## name authtype
## 1 Lehmann BD Author
## 2 Colaprico A Author
## 3 Silva TC Author
## 4 Chen J Author
## 5 An H Author
## 6 Ban Y Author
## 7 Huang H Author
## 8 Wang L Author
## 9 James JL Author
## 10 Balko JM Author
## 11 Gonzalez-Ericsson PI Author
## 12 Sanders ME Author
## 13 Zhang B Author
## 14 Pietenpol JA Author
## 15 Chen XS Author
##
## $result$`8560912`$title
## [1] "Multi-omics analysis identifies therapeutic vulnerabilities in triple-negative breast cancer subtypes"
##
## $result$`8560912`$volume
## [1] "12"
##
## $result$`8560912`$issue
## [1] ""
##
## $result$`8560912`$pages
## [1] "6276"
##
## $result$`8560912`$articleids
## idtype value
## 1 pmid 34725325
## 2 doi 10.1038/s41467-021-26502-6
## 3 pmcid PMC8560912
##
## $result$`8560912`$fulljournalname
## [1] "Nature Communications"
##
## $result$`8560912`$sortdate
## [1] "2021/11/01 00:00"
##
## $result$`8560912`$pmclivedate
## [1] "2021/11/15"
MOST_ERR <- gsub("PMC","",MOST_ERR)
ARTICLE_DATA <- esummary(MOST_ERR,db = "pmc" , retmode = "json" )
ARTICLE_DATA <- reutils::content(ARTICLE_DATA,as= "parsed")
ARTICLE_DATA
## $header
## $header$type
## [1] "esummary"
##
## $header$version
## [1] "0.3"
##
##
## $result
## $result$uids
## [1] "8552790"
##
## $result$`8552790`
## $result$`8552790`$uid
## [1] "8552790"
##
## $result$`8552790`$pubdate
## [1] "2021 Oct 25"
##
## $result$`8552790`$epubdate
## [1] "2021 Oct 25"
##
## $result$`8552790`$printpubdate
## [1] ""
##
## $result$`8552790`$source
## [1] "PeerJ"
##
## $result$`8552790`$authors
## name authtype
## 1 Nie R Author
## 2 Niu W Author
## 3 Tang T Author
## 4 Zhang J Author
## 5 Zhang X Author
##
## $result$`8552790`$title
## [1] "Integrating microRNA expression, miRNA-mRNA regulation network and signal pathway: a novel strategy for lung cancer biomarker discovery"
##
## $result$`8552790`$volume
## [1] "9"
##
## $result$`8552790`$issue
## [1] ""
##
## $result$`8552790`$pages
## [1] "e12369"
##
## $result$`8552790`$articleids
## idtype value
## 1 pmid 34754623
## 2 doi 10.7717/peerj.12369
## 3 pmcid PMC8552790
##
## $result$`8552790`$fulljournalname
## [1] "PeerJ"
##
## $result$`8552790`$sortdate
## [1] "2021/10/25 00:00"
##
## $result$`8552790`$pmclivedate
## [1] "2021/11/08"
Next, plot the trend over the past 6-12 months using the previously published reports.
url <- "http://ziemann-lab.net/public/gene_name_errors/"
doc <- htmlParse(url)
links <- xpathSApply(doc, "//a/@href")
links <- links[grep("html",links)]
links
## href href href
## "Report_2021-02.html" "Report_2021-03.html" "Report_2021-04.html"
## href href href
## "Report_2021-05.html" "Report_2021-06.html" "Report_2021-07.html"
## href href href
## "Report_2021-08.html" "Report_2021-09.html" "Report_2021-10.html"
## href
## "Report_2021-11.html"
unlink("online_files/",recursive=TRUE)
dir.create("online_files")
sapply(links, function(mylink) {
download.file(paste(url,mylink,sep=""),destfile=paste("online_files/",mylink,sep=""))
} )
## href href href href href href href href href href
## 0 0 0 0 0 0 0 0 0 0
myfilelist <- list.files("online_files/",full.names=TRUE)
trends <- sapply(myfilelist, function(myfilename) {
x <- readLines(myfilename)
# Num XL gene list articles
NUM_GENELIST_ARTICLES <- x[grep("NUM_GENELIST_ARTICLES",x)[3]+1]
NUM_GENELIST_ARTICLES <- sapply(strsplit(NUM_GENELIST_ARTICLES," "),"[[",3)
NUM_GENELIST_ARTICLES <- sapply(strsplit(NUM_GENELIST_ARTICLES,"<"),"[[",1)
NUM_GENELIST_ARTICLES <- as.numeric(NUM_GENELIST_ARTICLES)
# number of affected articles
NUM_ERROR_GENELIST_ARTICLES <- x[grep("NUM_ERROR_GENELIST_ARTICLES",x)[3]+1]
NUM_ERROR_GENELIST_ARTICLES <- sapply(strsplit(NUM_ERROR_GENELIST_ARTICLES," "),"[[",3)
NUM_ERROR_GENELIST_ARTICLES <- sapply(strsplit(NUM_ERROR_GENELIST_ARTICLES,"<"),"[[",1)
NUM_ERROR_GENELIST_ARTICLES <- as.numeric(NUM_ERROR_GENELIST_ARTICLES)
# Error proportion
ERROR_PROPORTION <- x[grep("ERROR_PROPORTION",x)[3]+1]
ERROR_PROPORTION <- sapply(strsplit(ERROR_PROPORTION," "),"[[",3)
ERROR_PROPORTION <- sapply(strsplit(ERROR_PROPORTION,"<"),"[[",1)
ERROR_PROPORTION <- as.numeric(ERROR_PROPORTION)
# number of journals
NUM_JOURNALS <- x[grep('JOURNALS_TABLE',x)[3]+1]
NUM_JOURNALS <- sapply(strsplit(NUM_JOURNALS," "),"[[",3)
NUM_JOURNALS <- sapply(strsplit(NUM_JOURNALS,"<"),"[[",1)
NUM_JOURNALS <- as.numeric(NUM_JOURNALS)
res <- c(NUM_GENELIST_ARTICLES,NUM_ERROR_GENELIST_ARTICLES,ERROR_PROPORTION,NUM_JOURNALS)
return(res)
})
colnames(trends) <- sapply(strsplit(colnames(trends),"_"),"[[",3)
colnames(trends) <- gsub(".html","",colnames(trends),fixed=TRUE)
trends <- as.data.frame(trends)
rownames(trends) <- c("NUM_GENELIST_ARTICLES","NUM_ERROR_GENELIST_ARTICLES","ERROR_PROPORTION","NUM_JOURNALS")
trends <- t(trends)
trends <- as.data.frame(trends)
CURRENT_RES <- c(NUM_GENELIST_ARTICLES,NUM_ERROR_GENELIST_ARTICLES,ERROR_PROPORTION,NUM_JOURNALS)
trends <- rbind(trends,CURRENT_RES)
paste(CURRENT_YEAR,CURRENT_MONTH,sep="-")
## [1] "2021-12"
rownames(trends)[nrow(trends)] <- paste(CURRENT_YEAR,CURRENT_MONTH,sep="-")
plot(trends$NUM_GENELIST_ARTICLES, xaxt = "n" , type="b" , main="Number of articles with Excel gene lists per month",
ylab="number of articles", xlab="month")
axis(1, at=1:nrow(trends), labels=rownames(trends))
plot(trends$NUM_ERROR_GENELIST_ARTICLES, xaxt = "n" , type="b" , main="Number of articles with gene name errors per month",
ylab="number of articles", xlab="month")
axis(1, at=1:nrow(trends), labels=rownames(trends))
plot(trends$ERROR_PROPORTION, xaxt = "n" , type="b" , main="Proportion of articles with Excel gene list affected by errors",
ylab="proportion", xlab="month")
axis(1, at=1:nrow(trends), labels=rownames(trends))
plot(trends$NUM_JOURNALS, xaxt = "n" , type="b" , main="Number of journals with affected articles",
ylab="number of journals", xlab="month")
axis(1, at=1:nrow(trends), labels=rownames(trends))
unlink("online_files/",recursive=TRUE)
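The trend code above recovers each metric from a rendered report by locating the third line that mentions a variable name and reading the number printed on the next line. The same idea can be written as one reusable function, sketched below. The helper `extract_metric` and the toy lines are hypothetical, and the HTML layout is assumed to match what knitr typically emits for printed output:

```r
# Hypothetical helper: find the n-th line mentioning `name` and parse the
# number printed on the following line, e.g. "<pre><code>## [1] 3448</code></pre>".
extract_metric <- function(lines, name, occurrence = 3) {
  hits <- grep(name, lines, fixed = TRUE)
  if (length(hits) < occurrence) return(NA_real_)
  out <- lines[hits[occurrence] + 1]
  if (is.na(out)) return(NA_real_)
  # Keep only the number that follows the "[1]" index marker
  m <- regmatches(out, regexpr("\\[1\\][[:space:]]+[0-9.eE+-]+", out))
  if (length(m) == 0) return(NA_real_)
  as.numeric(sub("^\\[1\\][[:space:]]+", "", m))
}

# Made-up lines imitating a rendered report
toy_lines <- c(
  "<pre class=\"r\"><code>NUM_X=length(a)",
  "NUM_X &lt;- 1",
  "NUM_X</code></pre>",
  "<pre><code>## [1] 3448</code></pre>"
)

extract_metric(toy_lines, "NUM_X")
extract_metric(toy_lines, "MISSING")
```

Returning `NA` when the expected line is absent means that a change in report layout shows up as a gap in the trend plot instead of an error part-way through the loop.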
Zeeberg, B.R., Riss, J., Kane, D.W. et al. Mistaken Identifiers: Gene name errors can be introduced inadvertently when using Excel in bioinformatics. BMC Bioinformatics 5, 80 (2004). https://doi.org/10.1186/1471-2105-5-80
Ziemann, M., Eren, Y. & El-Osta, A. Gene name errors are widespread in the scientific literature. Genome Biol 17, 177 (2016). https://doi.org/10.1186/s13059-016-1044-7
sessionInfo()
## R version 4.1.2 (2021-11-01)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.3 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
##
## locale:
## [1] LC_CTYPE=en_AU.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_AU.UTF-8 LC_COLLATE=en_AU.UTF-8
## [5] LC_MONETARY=en_AU.UTF-8 LC_MESSAGES=en_AU.UTF-8
## [7] LC_PAPER=en_AU.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] readxl_1.3.1 reutils_0.2.3 xml2_1.3.2 jsonlite_1.7.2 XML_3.99-0.8
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.7 knitr_1.36 magrittr_2.0.1 R6_2.5.1
## [5] rlang_0.4.12 fastmap_1.1.0 stringr_1.4.0 highr_0.9
## [9] tools_4.1.2 xfun_0.28 jquerylib_0.1.4 htmltools_0.5.2
## [13] yaml_2.2.1 digest_0.6.28 assertthat_0.2.1 sass_0.4.0
## [17] bitops_1.0-7 RCurl_1.98-1.5 evaluate_0.14 rmarkdown_2.11
## [21] stringi_1.7.5 compiler_4.1.2 bslib_0.3.1 cellranger_1.1.0