Source: https://github.com/markziemann/GeneNameErrors2020

View the reports: http://ziemann-lab.net/public/gene_name_errors/

Intro

Gene name errors result when data are imported improperly into MS Excel and other spreadsheet programs (Zeeberg et al, 2004). Certain gene names such as MARCH3, SEPT2 and DEC1 are converted into dates. These errors are surprisingly common in supplementary data files in the field of genomics (Ziemann et al, 2016). They could be considered minor because only a small number of genes are affected, but they are symptomatic of poor data processing methods. The purpose of this script is to identify gene name errors in the supplementary files of PubMed Central articles from the previous month.
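As a rough illustration of what a converted gene name looks like, the sketch below flags strings that resemble Excel date renderings. The helper name `is_date_like` and its patterns are assumptions made for this example; they are not taken from the screening script used in this report.

```r
# Hypothetical helper (not part of gene_names.sh): TRUE for strings that look
# like dates produced by a spreadsheet, e.g. "3-Mar" or "2023/09/05".
is_date_like <- function(x) {
  months <- "(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)"
  day_month <- paste0("^[0-9]{1,2}-", months, "$")  # "1-Mar", "02-Sep"
  ymd <- "^[0-9]{4}/[0-9]{2}/[0-9]{2}$"              # "2023/09/05"
  grepl(day_month, x, ignore.case = TRUE) | grepl(ymd, x)
}

flags <- is_date_like(c("MARCH3", "SEPT2", "3-Mar", "02-Sep", "2023/09/05", "TP53"))
flags
```

Intact gene symbols such as MARCH3 and TP53 are not flagged, while the date-like strings are. Five-digit serial numbers are deliberately left out of this sketch because plain numbers are much harder to tell apart from legitimate values.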

library("XML")
library("jsonlite")
library("xml2")
library("reutils")
library("readxl")
library("RCurl")

Get PMC IDs

Here I will be getting PubMed Central IDs for the previous month.

Start with figuring out the date to search PubMed Central.

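# Current month and year as strings, e.g. "07" and "2023"
CURRENT_MONTH=format(Sys.time(), "%m")
CURRENT_YEAR=format(Sys.time(), "%Y")

# In January, the previous month is December of the previous year
if (CURRENT_MONTH == "01") {
  PREV_YEAR=as.character(as.numeric(format(Sys.time(), "%Y"))-1)
  PREV_MONTH="12"
} else {
  PREV_YEAR=CURRENT_YEAR
  PREV_MONTH=as.character(as.numeric(format(Sys.time(), "%m"))-1)
}

# Year/month string used to build the search date range (month is not zero-padded)
DATE=paste(PREV_YEAR,"/",PREV_MONTH,sep="")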
DATE
## [1] "2023/6"
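The code above builds the search range from a year/month string and a fixed last day of 31. As an alternative, the following minimal sketch derives the real first and last day of the previous month using base R date arithmetic. The function name `prev_month_range` is hypothetical, and this is not what the report itself runs.

```r
# Hypothetical helper: first and last day of the month before `today`,
# formatted as YYYY/MM/DD with zero-padded month and day.
prev_month_range <- function(today = Sys.Date()) {
  first_of_this_month <- as.Date(format(today, "%Y-%m-01"))
  last_of_prev <- first_of_this_month - 1   # one day back lands in the previous month
  first_of_prev <- as.Date(format(last_of_prev, "%Y-%m-01"))
  c(mindate = format(first_of_prev, "%Y/%m/%d"),
    maxdate = format(last_of_prev, "%Y/%m/%d"))
}

prev_month_range(as.Date("2023-07-15"))  # June 2023, which has 30 days
prev_month_range(as.Date("2024-01-10"))  # rolls back to December 2023
```

Subtracting one day from the first of the current month handles the January rollover and the varying month lengths without any special cases.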

Let’s see how many PMC IDs we have in the past month.

QUERY ='((genom*[Abstract]))'

ESEARCH_RES <- esearch(term=QUERY, db = "pmc", rettype = "uilist", retmode = "xml", retstart = 0, 
  retmax = 5000000, usehistory = TRUE, webenv = NULL, querykey = NULL, sort = NULL, field = NULL, 
  datetype = NULL, reldate = NULL, 
  mindate = paste(DATE,"/1",sep="") , maxdate = paste(DATE,"/31",sep=""))

pmc <- efetch(ESEARCH_RES,retmode="text",rettype="uilist",outfile="pmcids.txt")
## Retrieving UIDs 1 to 500
## Retrieving UIDs 501 to 1000
## Retrieving UIDs 1001 to 1500
## Retrieving UIDs 1501 to 2000
## Retrieving UIDs 2001 to 2500
## Retrieving UIDs 2501 to 3000
## Retrieving UIDs 3001 to 3500
pmc <- read.table(pmc)
pmc <- paste("PMC",pmc$V1,sep="")
NUM_ARTICLES=length(pmc)
NUM_ARTICLES
## [1] 3262
writeLines(pmc,con="pmc.txt")

Run the screen

Now run the bash script gene_names.sh on the list of PMC IDs. Note that false positives can occur (~1.5%) and that these results have not been verified by a human.

Here are some definitions:

  • NUM_XLS = Number of supplementary Excel files in this set of PMC articles.

  • NUM_XLS_ARTICLES = Number of articles matching the PubMed Central search which have supplementary Excel files.

  • GENELISTS = The gene lists found in the Excel files. Each Excel file is counted once even if it has multiple gene lists.

  • NUM_GENELISTS = The number of Excel files with gene lists.

  • NUM_GENELIST_ARTICLES = The number of PMC articles with supplementary Excel gene lists.

  • ERROR_GENELISTS = Files suspected to contain gene name errors. The dates and five-digit numbers indicate transmogrified gene names.

  • NUM_ERROR_GENELISTS = Number of Excel gene lists with errors.

  • NUM_ERROR_GENELIST_ARTICLES = Number of articles with supplementary Excel gene name errors.

  • ERROR_PROPORTION = The proportion of articles with Excel gene lists that have errors.
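The definitions above can be illustrated on a toy results vector. The line layout assumed here (article ID, file path, then species, error count and error strings when present) is inferred from the output shown later in this report. The article IDs and file names are invented for the example.

```r
# Toy results lines (made-up IDs and files; layout is an assumption)
toy_results <- c(
  "PMC0000001 zip/table_s1.xlsx",                        # Excel file, no gene list
  "PMC0000001 zip/table_s2.xlsx Hsapiens",               # gene list, no errors
  "PMC0000002 zip/table_s1.xlsx Hsapiens 2 1-Mar 44986", # gene list with errors
  "PMC0000003 zip/notes.docx"                            # not an Excel file
)

# Keep Excel files only, then count space-separated fields per line
toy_xls <- toy_results[grep("XLS", toy_results, ignore.case = TRUE)]
fields <- strsplit(toy_xls, " ", fixed = TRUE)
n_fields <- lengths(fields)

toy_genelists <- toy_xls[n_fields > 2]  # a species field is present
toy_errors <- toy_xls[n_fields > 3]     # error count and error strings are present

# Article ID is the first field of each line
article_id <- function(x) vapply(strsplit(x, " ", fixed = TRUE), `[[`, character(1), 1)

toy_error_proportion <- length(unique(article_id(toy_errors))) /
  length(unique(article_id(toy_genelists)))
toy_error_proportion
```

Here two articles have gene lists and one of them has errors, so the toy proportion is 0.5.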

system("./gene_names.sh pmc.txt")
results <- readLines("results.txt")

XLS <- results[grep("XLS",results,ignore.case=TRUE)]
NUM_XLS = length(XLS)
NUM_XLS
## [1] 5626
NUM_XLS_ARTICLES = length(unique(sapply(strsplit(XLS," "),"[[",1)))
NUM_XLS_ARTICLES
## [1] 1161
GENELISTS <- XLS[lapply(strsplit(XLS," "),length)>2]
#GENELISTS

NUM_GENELISTS <- length(unique(sapply(strsplit(GENELISTS," "),"[[",2)))
NUM_GENELISTS
## [1] 588
NUM_GENELIST_ARTICLES <- length(unique(sapply(strsplit(GENELISTS," "),"[[",1)))
NUM_GENELIST_ARTICLES
## [1] 291
ERROR_GENELISTS <- XLS[lapply(strsplit(XLS," "),length)>3]
#ERROR_GENELISTS

NUM_ERROR_GENELISTS = length(ERROR_GENELISTS)
NUM_ERROR_GENELISTS
## [1] 188
GENELIST_ERROR_ARTICLES <- unique(sapply(strsplit(ERROR_GENELISTS," "),"[[",1))
GENELIST_ERROR_ARTICLES
##  [1] "PMC10302091" "PMC10299661" "PMC10297931" "PMC10297722" "PMC10297614"
##  [6] "PMC10294482" "PMC10290958" "PMC10287941" "PMC10287762" "PMC10285042"
## [11] "PMC10279740" "PMC10267701" "PMC10266228" "PMC10265687" "PMC10263356"
## [16] "PMC10261190" "PMC10258790" "PMC10253224" "PMC10253099" "PMC10252000"
## [21] "PMC10250210" "PMC10242354" "PMC10242145" "PMC10241889" "PMC10237035"
## [26] "PMC10233889" "PMC10229598" "PMC10307879" "PMC10306090" "PMC10300554"
## [31] "PMC10300496" "PMC10300174" "PMC10297900" "PMC10294383" "PMC10292886"
## [36] "PMC10291128" "PMC10289466" "PMC10288324" "PMC10286233" "PMC10284603"
## [41] "PMC10282049" "PMC10281998" "PMC10277924" "PMC10275808" "PMC10273709"
## [46] "PMC10266224" "PMC10265161" "PMC10261899" "PMC10253790" "PMC10252847"
## [51] "PMC10250199" "PMC10249681" "PMC10245500" "PMC10243159" "PMC10241222"
## [56] "PMC10239147" "PMC10234573" "PMC10234209" "PMC10230138" "PMC10229423"
NUM_ERROR_GENELIST_ARTICLES <- length(GENELIST_ERROR_ARTICLES) 
NUM_ERROR_GENELIST_ARTICLES
## [1] 60
ERROR_PROPORTION = NUM_ERROR_GENELIST_ARTICLES / NUM_GENELIST_ARTICLES
ERROR_PROPORTION
## [1] 0.2061856

Look at the errors detected

Here you can look at the gene lists detected in the past month, as well as those with errors. The dates are obvious errors; they are commonly dates in September, March, December and October. The five-digit numbers are dates in Excel's internal format, which stores a date as the number of days since the start of 1900. If you enter these numbers into Excel and format the cells as dates, most of them also map to dates in September, March, December and October.
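The mapping from five-digit numbers to dates can also be checked in R without opening Excel. This is a minimal sketch assuming the Excel 1900 date system; the helper name `excel_serial_to_date` is hypothetical, and the three serial numbers are taken from the error lists in this report.

```r
# Hypothetical helper: convert Excel (1900 date system) serial numbers to dates.
# The origin 1899-12-30 compensates for Excel treating 1900 as a leap year,
# so the conversion is only valid for serials from 1 March 1900 onwards.
excel_serial_to_date <- function(serial) {
  as.Date(as.numeric(serial), origin = "1899-12-30")
}

serials <- c("44986", "45170", "45261")
serial_dates <- format(excel_serial_to_date(serials), "%Y-%m-%d")
serial_dates
## [1] "2023-03-01" "2023-09-01" "2023-12-01"
```

These three values fall on the first of March, September and December 2023, which is consistent with gene names such as MARCH1, SEPT1 and DEC1 having been converted to dates.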

#GENELISTS

ERROR_GENELISTS
##   [1] "PMC10302091 zip/S1_GSE153434_all_genes_expression.xlsx Hsapiens 28 45261 44986 44987 44986 44995 44996 44987 44988 44989 44990 44991 44992 44993 44994 45184 45170 45179 45180 45181 45183 45171 45172 45173 45174 45175 45176 45177 45178"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
##   [2] "PMC10302091 zip/S1_GSE153434_all_genes_expression.xlsx Hsapiens 29 45176 45261 44986 44987 44986 44995 44996 44987 44988 44989 44990 44991 44992 44993 44994 45184 45170 45179 45180 45181 45183 45171 45172 45173 45174 45175 45176 45177 45178"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
##   [3] "PMC10302091 zip/S3_GSE52093_genes_expression.xlsx Hsapiens 27 45178 45177 45176 45175 45174 45173 45172 45171 45183 45181 45180 45179 45170 44994 44993 44992 44991 44990 44989 44988 44987 44996 44995 44986 44987 44986 45261"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
##   [4] "PMC10302091 zip/S6_DEGs.xlsx Hsapiens 23 44988 44987 45178 44986 44989 45179 45173 45175 44990 44986 45171 45176 44987 45180 45177 45170 44993 45174 45184 44992 44994 44991 45172"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
##   [5] "PMC10299661 zip/TableS7_The_distribution_of_DMCs_in_genes_and_the_functional_enrichment_for_genes_that_harbored_DMCs.xlsx Hsapiens 6 44996 44996 44996 44996 44996 44996"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
##   [6] "PMC10299661 zip/TableS10_List_of_identified_differentially_methylated_and_expressed_genes_DMEGs_.xlsx Hsapiens 1 44986"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
##   [7] "PMC10299661 zip/TableS10_List_of_identified_differentially_methylated_and_expressed_genes_DMEGs_.xlsx Hsapiens 1 44986"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
##   [8] "PMC10299661 zip/TableS10_List_of_identified_differentially_methylated_and_expressed_genes_DMEGs_.xlsx Hsapiens 2 44986 44986"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
##   [9] "PMC10297931 zip/Zhang_et_al-Supplementary_Table_4-IJMB-1June23.xlsx Hsapiens 3 42992 42984 44989"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
##  [10] "PMC10297931 zip/Zhang_et_al-Supplementary_Table_4-IJMB-1June23.xlsx Hsapiens 3 45261 44989 42983"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
##  [11] "PMC10297931 zip/Zhang_et_al-Supplementary_Table_4-IJMB-1June23.xlsx Hsapiens 1 45175"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
##  [12] "PMC10297931 zip/Zhang_et_al-Supplementary_Table_4-IJMB-1June23.xlsx Hsapiens 6 44989 44996 44995 44993 42983 44987"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
##  [13] "PMC10297931 zip/Zhang_et_al-Supplementary_Table_4-IJMB-1June23.xlsx Hsapiens 1 45175"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
##  [14] "PMC10297931 zip/Zhang_et_al-Supplementary_Table_4-IJMB-1June23.xlsx Hsapiens 6 44989 44996 44995 44993 42983 44987"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
##  [15] "PMC10297931 zip/Zhang_et_al-Supplementary_Table_4-IJMB-1June23.xlsx Hsapiens 3 45180 45175 44986"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
##  [16] "PMC10297931 zip/Zhang_et_al-Supplementary_Table_4-IJMB-1June23.xlsx Hsapiens 6 42984 44989 44996 44993 42983 44987"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
##  [17] "PMC10297931 zip/Zhang_et_al-Supplementary_Table_4-IJMB-1June23.xlsx Hsapiens 3 44989 42983 44996"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
##  [18] "PMC10297931 zip/Zhang_et_al-Supplementary_Table_4-IJMB-1June23.xlsx Hsapiens 1 45175"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
##  [19] "PMC10297931 zip/Zhang_et_al-Supplementary_Table_4-IJMB-1June23.xlsx Hsapiens 6 45261 44989 44995 44993 44994 44987"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
##  [20] "PMC10297931 zip/Zhang_et_al-Supplementary_Table_2-IJMB-1June2023.xlsx Hsapiens 2 9-Sep 1-Mar"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
##  [21] "PMC10297931 zip/Zhang_et_al-Supplementary_Table_2-IJMB-1June2023.xlsx Hsapiens 4 10-Mar 3-Mar 8-Sep 6-Sep"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
##  [22] "PMC10297931 zip/Zhang_et_al-Supplementary_Table_1-IJMB-1_June_2023.xlsx Hsapiens 26 45173 45173 45173 44995 44995 45178 45178 45179 45179 45171 45174 44986 44986 44986 44988 44988 45177 45177 45177 45177 45175 45175 45175 45175 45176 44987"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
##  [23] "PMC10297931 zip/Zhang_et_al-Supplementary_Table_1-IJMB-1_June_2023.xlsx Hsapiens 26 45173 45173 45173 44995 44995 45178 45178 45179 45179 45171 45174 44986 44986 44986 44988 44988 45177 45177 45177 45177 45175 45175 45175 45175 45176 44987"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
##  [24] "PMC10297722 zip/cells-2199100-supplementary.xlsx Hsapiens 16 01-Sep 11-Sep 02-Sep 01-Mar 02-Mar 15-Sep 10-Sep 05-Sep 05-Mar 07-Sep 06-Sep 08-Sep 09-Sep 06-Mar 04-Sep 03-Sep"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
##  [25] "PMC10297722 zip/cells-2199100-supplementary.xlsx Hsapiens 1 43897"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
##  [26] "PMC10297614 zip/Supplementary_data_Table_S1.xlsx Hsapiens 5 43532 43718 43530 43719 43525"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
##  [27] "PMC10294482 PMC_DL/PMC10294482/supplementaryfiles/12967_2023_4275_MOESM10_ESM.xlsx Mmusculus 8 2023/03/03 2023/09/05 2023/03/06 2023/09/03 2023/09/06 2023/03/05 2023/09/10 2023/09/11"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
##  [28] "PMC10290958 PMC_DL/PMC10290958/supplementaryfiles/42255_2023_820_MOESM3_ESM.xlsx Hsapiens 1 43715"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
##  [29] "PMC10287941 zip/NAR_Supplemental_Table.xlsx Hsapiens 27 38596 38047 37865 40603 38231 39692 36951 40238 40057 40422 37316 39873 37681 36951 38961 37226 39508 40787 41153 41883 37316 38777 37135 38412 39142 39326 37500"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
##  [30] "PMC10287762 PMC_DL/PMC10287762/supplementaryfiles/42003_2023_5038_MOESM4_ESM.xlsx Athaliana 1 43009"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
##  [31] "PMC10287762 PMC_DL/PMC10287762/supplementaryfiles/42003_2023_5038_MOESM4_ESM.xlsx Athaliana 1 43009"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
##  [32] "PMC10285042 PMC_DL/PMC10285042/supplementaryfiles/mmc2.xlsx Hsapiens 1 44896"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
##  [33] "PMC10279740 PMC_DL/PMC10279740/supplementaryfiles/41598_2023_36900_MOESM4_ESM.xlsx Hsapiens 6 38231 37500 41153 40057 38777 37500"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
##  [34] "PMC10279740 PMC_DL/PMC10279740/supplementaryfiles/41598_2023_36900_MOESM4_ESM.xlsx Hsapiens 10 38231 37500 37500 37500 40057 41153 40057 37500 38777 37500"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
##  [35] "PMC10267701 PMC_DL/PMC10267701/supplementaryfiles/EMBJ-42-e111272-s012.xlsx Hsapiens 7 40422 40787 37135 38231 40422 40787 40787"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
##  [36] "PMC10267701 PMC_DL/PMC10267701/supplementaryfiles/EMBJ-42-e111272-s006.xlsx Hsapiens 1 38596"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
##  [37] "PMC10266228 PMC_DL/PMC10266228/supplementaryfiles/Table_4.xlsx Hsapiens 1 44622"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
##  [38] "PMC10266228 PMC_DL/PMC10266228/supplementaryfiles/Table_3.xlsx Hsapiens 1 44626"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
##  [39] "PMC10265687 PMC_DL/PMC10265687/supplementaryfiles/Table2.XLSX Mmusculus 8 44079 43897 43897 43892 44076 44076 44082 43895"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
##  [40] "PMC10265687 PMC_DL/PMC10265687/supplementaryfiles/Table2.XLSX Mmusculus 1 43897"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
##  [41] "PMC10265687 PMC_DL/PMC10265687/supplementaryfiles/Table2.XLSX Mmusculus 42 44076 44076 44076 44076 44076 44079 44079 44079 44079 44079 43897 43897 43897 43897 43897 43898 43898 43898 43898 43898 43898 43898 44085 44085 44085 44085 44085 44084 44084 43892 43892 43892 43892 43892 43892 44083 44082 43895 43895 43895 43895 43895"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
##  [42] "PMC10265687 PMC_DL/PMC10265687/supplementaryfiles/Table2.XLSX Mmusculus 7 43898 43898 43892 43892 43892 43895 43895"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
##  [43] "PMC10265687 PMC_DL/PMC10265687/supplementaryfiles/Table2.XLSX Mmusculus 1 44083"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
##  [44] "PMC10265687 PMC_DL/PMC10265687/supplementaryfiles/Table3.XLSX Mmusculus 7 44444 44262 44257 44441 44441 44447 44260"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
##  [45] "PMC10265687 PMC_DL/PMC10265687/supplementaryfiles/Table3.XLSX Mmusculus 1 44262"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
##  [46] "PMC10265687 PMC_DL/PMC10265687/supplementaryfiles/Table3.XLSX Mmusculus 4 44263 44263 44257 44257"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
##  [47] "PMC10265687 PMC_DL/PMC10265687/supplementaryfiles/Table3.XLSX Mmusculus 2 44440 44448"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
##  [48] "PMC10265687 PMC_DL/PMC10265687/supplementaryfiles/Table3.XLSX Mmusculus 36 44441 44441 44258 44444 44444 44444 44262 44262 44262 44262 44263 44263 44263 44263 44263 44263 44450 44450 44450 44450 44450 44449 44449 44257 44257 44257 44257 44257 44257 44257 44446 44261 44261 44261 44447 44260"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
##  [49] "PMC10265687 PMC_DL/PMC10265687/supplementaryfiles/Table1.XLSX Mmusculus 21 44075 44081 44082 44084 44078 44077 43895 43898 44076 43897 43893 43891 43896 43894 43899 44080 44085 44083 44079 43892 43892"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
##  [50] "PMC10263356 PMC_DL/PMC10263356/supplementaryfiles/pbio.3002097.s001.xlsx Hsapiens 1 1-Dec"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
##  [51] "PMC10263356 PMC_DL/PMC10263356/supplementaryfiles/pbio.3002097.s001.xlsx Hsapiens 18 4-Mar 2-Sep 10-Sep 9-Sep 10-Mar 8-Sep 2-Mar 1-Dec 4-Sep 3-Sep 3-Mar 1-Mar 6-Mar 11-Mar 12-Sep 1-Mar 5-Mar 6-Sep"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
##  [52] "PMC10261190 PMC_DL/PMC10261190/supplementaryfiles/401_2023_2583_MOESM2_ESM.xlsx Hsapiens 9 39508 38047 39508 37681 36951 40603 36951 38596 38047"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
##  [53] "PMC10261190 PMC_DL/PMC10261190/supplementaryfiles/401_2023_2583_MOESM2_ESM.xlsx Hsapiens 2 38596 40057"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
##  [54] "PMC10261190 PMC_DL/PMC10261190/supplementaryfiles/401_2023_2583_MOESM2_ESM.xlsx Hsapiens 11 40057 42248 42248 39142 40603 38596 38777 37316 40057 41518 38047"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
##  [55] "PMC10258790 PMC_DL/PMC10258790/supplementaryfiles/41467_2023_39191_MOESM13_ESM.xlsx Hsapiens 27 44622 44621 44814 44627 44624 44806 44815 44621 44626 44631 44623 44812 44811 44818 44896 44628 44625 44629 44816 44805 44808 44630 44813 44622 44809 44807 44810"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
##  [56] "PMC10258790 PMC_DL/PMC10258790/supplementaryfiles/41467_2023_39191_MOESM15_ESM.xlsx Hsapiens 13 44622 44814 44627 44815 44623 44812 44811 44625 44629 44805 44630 44622 44807"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
##  [57] "PMC10258790 PMC_DL/PMC10258790/supplementaryfiles/41467_2023_39191_MOESM15_ESM.xlsx Hsapiens 15 44622 44814 44627 44806 44815 44626 44623 44812 44811 44625 44805 44630 44813 44622 44807"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
##  [58] "PMC10258790 PMC_DL/PMC10258790/supplementaryfiles/41467_2023_39191_MOESM15_ESM.xlsx Hsapiens 10 44814 44627 44806 44815 44812 44811 44628 44625 44629 44622"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
##  [59] "PMC10258790 PMC_DL/PMC10258790/supplementaryfiles/41467_2023_39191_MOESM15_ESM.xlsx Hsapiens 14 44814 44627 44806 44815 44626 44623 44811 44628 44625 44629 44805 44630 44622 44807"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
##  [60] "PMC10253224 zip/Supplementary_file_11.xlsx Hsapiens 1 43709"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
##  [61] "PMC10253224 zip/Supplementary_file_11.xlsx Hsapiens 3 43528 43713 43719"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
##  [62] "PMC10253224 zip/Supplementary_file_3.xlsx Hsapiens 7 43709 43714 43528 43713 43719 43525 43525"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
##  [63] "PMC10253224 zip/Supplementary_file_3.xlsx Hsapiens 28 43525 43528 43718 43717 43533 43709 43534 43720 43712 43714 43532 43525 43716 43711 43526 43713 43531 43526 43710 43719 43715 43535 43527 43530 43723 43529 43800 43722"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
##  [64] "PMC10253224 zip/Supplementary_file_3.xlsx Hsapiens 28 43526 43529 43530 43718 43719 43710 43711 43712 43714 43715 43716 43717 43525 43533 43526 43723 43720 43528 43534 43535 43531 43527 43525 43532 43713 43800 43709 43722"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
##  [65] "PMC10253224 zip/Supplementary_file_3.xlsx Hsapiens 28 43525 43528 43718 43533 43717 43534 43720 43713 43526 43711 43712 43716 43526 43710 43719 43715 43532 43535 43531 43525 43709 43714 43722 43723 43530 43527 43800 43529"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
##  [66] "PMC10253224 zip/Supplementary_file_3.xlsx Hsapiens 28 43535 43719 43710 43715 43713 43527 43525 43720 43526 43718 43533 43525 43526 43534 43800 43712 43528 43723 43709 43711 43532 43530 43716 43714 43717 43722 43529 43531"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
##  [67] "PMC10253224 zip/Supplementary_file_3.xlsx Hsapiens 28 43526 43529 43530 43718 43719 43710 43711 43712 43714 43715 43717 43723 43716 43720 43525 43533 43531 43709 43526 43532 43534 43535 43527 43528 43525 43800 43713 43722"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
##  [68] "PMC10253224 zip/Supplementary_file_3.xlsx Hsapiens 28 43526 43529 43530 43718 43719 43710 43712 43714 43715 43717 43711 43723 43716 43720 43525 43533 43531 43526 43709 43532 43534 43528 43527 43535 43525 43713 43722 43800"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
##  [69] "PMC10253099 zip/Table_S3_Differential_Expression_Genes_in_the_atpc1_Mutant.xlsx Athaliana 9 42648 42644 42463 42649 42647 42645 42614 42586 42616"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
##  [70] "PMC10252000 zip/Table_S2,GO_and_KEGG_analyses_of_genes_with_MAMSTR-binding_sites..xlsx Hsapiens 1 44988"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
##  [71] "PMC10250210 zip/supplementary_Table_S2_Summary_of_all_regions_captured_by_CapSTARR-seq_within_all_conditions.xlsx Mmusculus 10 38231 37135 37135 37135 39326 39692 39692 37865 38961 39326"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
##  [72] "PMC10242354 PMC_DL/PMC10242354/supplementaryfiles/CAM4-12-11686-s002.xlsx Hsapiens 3 44083 44080 44079"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
##  [73] "PMC10242354 PMC_DL/PMC10242354/supplementaryfiles/CAM4-12-11686-s002.xlsx Hsapiens 1 44448"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
##  [74] "PMC10242145 PMC_DL/PMC10242145/supplementaryfiles/DataSheet3.XLSX Hsapiens 3 37681 40603 37316"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
##  [75] "PMC10242145 PMC_DL/PMC10242145/supplementaryfiles/DataSheet4.XLSX Hsapiens 49 44257 44257 44256 44256 44448 44448 44448 44448 44448 44448 44448 44448 44448 44448 44265 44265 44265 44443 44263 44263 44263 44260 44262 44262 44262 44261 44261 44257 44257 44258 44258 44258 44447 44440 44449 44442 44442 44264 44264 44256 44256 44256 44256 44256 44256 44256 44256 44256 44445"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
##  [76] "PMC10242145 PMC_DL/PMC10242145/supplementaryfiles/DataSheet4.XLSX Hsapiens 39 44257 44257 44256 44256 44448 44448 44448 44448 44448 44448 44448 44448 44265 44265 44265 44443 44263 44263 44260 44262 44262 44262 44261 44261 44257 44257 44258 44258 44258 44447 44440 44449 44442 44442 44264 44264 44256 44445 44445"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
##  [77] "PMC10242145 PMC_DL/PMC10242145/supplementaryfiles/DataSheet1.XLSX Hsapiens 16 44440 44450 44263 44257 44445 44257 44258 44447 44262 44441 44446 44261 44448 44449 44264 44260"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
##  [78] "PMC10241889 PMC_DL/PMC10241889/supplementaryfiles/41416_2023_2247_MOESM3_ESM.xlsx Hsapiens 128 44816 44628 44811 44809 44629 44808 44808 43435 44809 44807 44816 44814 44813 44812 44627 44626 44631 44814 44622 44813 44805 44813 44628 44630 44816 44627 44813 44630 44805 44813 44621 44812 44806 44630 44808 43435 44621 44628 44808 44624 44813 44629 44624 44629 44626 44806 44813 44621 44814 44622 44815 44623 44813 44621 44808 44813 44818 44625 44810 44627 44812 44628 44623 44621 44621 44806 44808 44623 44813 44813 44622 44809 44807 44624 44806 44807 44622 44813 44813 44627 44630 44621 44622 44622 44630 44810 44626 44625 44813 44815 44812 44813 44808 44809 44806 44806 44812 44631 44815 44627 44808 44628 44813 44808 44813 44810 44628 44630 44631 44812 44805 44621 44813 44818 44813 44808 44809 44808 44813 44809 44627 44625 44813 44818 44811 44808 43435 44811"
##  [79] "PMC10241889 PMC_DL/PMC10241889/supplementaryfiles/41416_2023_2247_MOESM3_ESM.xlsx Hsapiens 128 44813 44805 44628 44630 44813 44627 44630 44816 44627 44813 44805 44629 44813 44621 44812 44808 44806 44630 44808 44628 43435 44621 44628 44808 44624 44813 44629 44624 44816 44629 44626 44806 44813 44621 44814 44622 44815 44814 44807 44623 44813 44626 44621 44808 44813 44818 44625 44810 44627 44812 44628 44623 44621 44621 44809 44806 44808 44623 44813 44813 44622 44809 44807 44624 44806 44622 44807 44814 44622 44813 44813 44627 44630 44621 44622 44622 44630 44810 44813 44811 44626 44625 44813 44815 44812 44813 44808 44816 44809 44806 44631 44806 44812 44631 44815 44809 44627 44808 44628 44813 44808 44812 44810 44813 44628 44630 44631 44812 44805 44621 44813 44818 44813 44808 44808 44809 44808 44813 44809 44627 44625 43435 44813 44818 44811 44808 43435 44811"
##  [80] "PMC10237035 PMC_DL/PMC10237035/supplementaryfiles/mmc6.xlsx Hsapiens 12 39692 39326 39873 37135 38961 38596 40422 41153 36951 37681 42248 38412"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
##  [81] "PMC10237035 PMC_DL/PMC10237035/supplementaryfiles/mmc4.xlsx Hsapiens 2 42248 40422"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
##  [82] "PMC10237035 PMC_DL/PMC10237035/supplementaryfiles/mmc2.xlsx Hsapiens 27 40422 40787 40057 37500 37500 37500 38231 38777 38047 39873 39508 39326 37316 41883 40238 37316 36951 41153 37681 37226 37135 37865 39508 36951 39142 37500 38412"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
##  [83] "PMC10237035 PMC_DL/PMC10237035/supplementaryfiles/mmc2.xlsx Hsapiens 1 38047"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
##  [84] "PMC10237035 PMC_DL/PMC10237035/supplementaryfiles/mmc2.xlsx Hsapiens 1 37316"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
##  [85] "PMC10233889 zip/SArnesen_miRNAs_in_ER_mutant_breast_cancer_paper_NARcancer_submission_sup_table_S2.xlsx Hsapiens 6 44812 44819 44627 44806 44623 44811"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
##  [86] "PMC10233889 zip/SArnesen_miRNAs_in_ER_mutant_breast_cancer_paper_NARcancer_submission_sup_table_S2.xlsx Hsapiens 6 44813 44812 44819 44627 44806 44811"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
##  [87] "PMC10233889 zip/SArnesen_miRNAs_in_ER_mutant_breast_cancer_paper_NARcancer_submission_sup_table_S2.xlsx Hsapiens 2 44629 44621"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
##  [88] "PMC10233889 zip/SArnesen_miRNAs_in_ER_mutant_breast_cancer_paper_NARcancer_submission_sup_table_S2.xlsx Hsapiens 5 44629 44622 44627 44621 44811"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
##  [89] "PMC10229598 PMC_DL/PMC10229598/supplementaryfiles/41467_2023_38800_MOESM4_ESM.xlsx Hsapiens 6 44256 44256 44260 44621 44621 44625"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
##  [90] "PMC10307879 PMC_DL/PMC10307879/supplementaryfiles/41598_2023_36650_MOESM4_ESM.xlsx Hsapiens 2 44819 44811"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
##  [91] "PMC10307879 PMC_DL/PMC10307879/supplementaryfiles/41598_2023_36650_MOESM3_ESM.xlsx Hsapiens 2 44819 44811"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
##  [92] "PMC10306090 PMC_DL/PMC10306090/supplementaryfiles/SC-014-D3SC01737K-s005.xlsx Hsapiens 27 44811 44806 44627 44626 44896 44807 44623 44814 44625 44621 44805 44631 44810 44818 44815 44813 44808 44630 44809 44629 44622 44816 44624 44628 44622 44812 44621"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
##  [93] "PMC10300554 PMC_DL/PMC10300554/supplementaryfiles/mmc5.xlsx Hsapiens 58 40057 40057 40057 37681 39326 40057 39142 39142 40057 40787 40057 37681 40787 37865 40238 40057 40422 40057 40057 37500 38777 37316 39508 38047 40057 40787 40057 37865 40057 39508 40057 38047 38231 40057 37316 37681 36951 39692 40238 40057 40057 36951 36951 40057 40238 36951 38047 40057 37316 36951 36951 40057 40057 40057 37681 40057 38777 37681"                                                                                                                                                                                                                                                                                                                                                                                                                                                           
##  [94] "PMC10300554 PMC_DL/PMC10300554/supplementaryfiles/mmc5.xlsx Hsapiens 60 40057 40057 40057 40057 40057 37316 39142 36951 37681 39326 40787 40787 37316 40057 40057 40787 39692 40057 38777 40057 40238 38777 40057 40057 40057 39508 38231 39142 40057 40057 40057 38047 40057 37681 40057 37865 40238 40057 39508 40057 40057 39873 36951 40057 39508 36951 37316 40422 36951 37681 37865 37681 38047 40057 40057 40238 40057 40057 37500 40057"                                                                                                                                                                                                                                                                                                                                                                                                                                               
##  [95] "PMC10300554 PMC_DL/PMC10300554/supplementaryfiles/mmc5.xlsx Hsapiens 62 37316 39508 39508 39508 37135 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 37316 39142 36951 40787 38777 37681 39692 39326 40057 39142 39142 39873 36951 36951 40057 39508 40238 40057 40057 36951 40057 37681 39508 37865 37316 37316 40057 36951 40057 36951 38231 40057 40057 40057 40057 40057 37316 38777 38777 40422 40057 37865 36951 36951 40057 36951 36951"                                                                                                                                                                                                                                                                                                                                                                                                                                   
##  [96] "PMC10300554 PMC_DL/PMC10300554/supplementaryfiles/mmc5.xlsx Hsapiens 58 37681 37681 40787 39142 39326 39692 40057 40238 36951 40057 40787 40787 40057 39508 40057 38777 40057 40057 40057 38047 40057 39508 37316 37865 40057 38047 40057 40057 40057 40057 39508 40057 40057 38047 40057 36951 40057 40057 40057 40057 40057 37681 39508 40057 40057 40787 37681 40057 40057 40057 40057 40057 38047 37500 38777 38777 37681 39326"                                                                                                                                                                                                                                                                                                                                                                                                                                                           
##  [97] "PMC10300554 PMC_DL/PMC10300554/supplementaryfiles/mmc5.xlsx Ggallus 73 39508 39508 39508 39508 39508 39508 38231 38231 40238 40238 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 37316 40422 39142 39142 38047 38047 38047 37500 37865 37865 36951 36951 36951 40787 40787 40787 40787 40787 38777 38777 38777 38777 38777 37681 37681 37681 37681 37681 39692 39692 39326 39326 39326 39326 39326"                                                                                                                                                                                                                                                                                                                                                                  
##  [98] "PMC10300554 PMC_DL/PMC10300554/supplementaryfiles/mmc5.xlsx Hsapiens 61 40787 39508 39508 37681 40787 39873 38777 36951 36951 39326 36951 36951 40057 40057 37316 36951 40057 40422 40057 39142 36951 36951 40057 40057 36951 40057 39142 40057 40057 40057 40057 39508 36951 40057 37316 37316 40057 38231 40057 40057 40057 38777 40057 37316 40057 39142 39508 41153 40057 40057 40057 40057 39142 39142 39142 39142 36951 36951 36951 37681 39692"                                                                                                                                                                                                                                                                                                                                                                                                                                         
##  [99] "PMC10300554 PMC_DL/PMC10300554/supplementaryfiles/mmc6.xlsx Hsapiens 2 39326 37681"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
## [100] "PMC10300554 PMC_DL/PMC10300554/supplementaryfiles/mmc6.xlsx Hsapiens 2 39326 37681"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
## [101] "PMC10300554 PMC_DL/PMC10300554/supplementaryfiles/mmc6.xlsx Hsapiens 2 39326 39142"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
## [102] "PMC10300554 PMC_DL/PMC10300554/supplementaryfiles/mmc4.xlsx Hsapiens 94 37316 37316 37316 36951 39508 39508 39508 39508 39508 39873 39873 39873 37135 38231 40238 40238 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 37316 40422 40422 39142 39142 38047 38047 38047 38047 37500 37500 37865 37865 36951 36951 36951 36951 36951 36951 36951 36951 36951 36951 36951 36951 36951 40787 40787 40787 40787 40787 40787 40787 40787 38777 38777 38777 38777 37681 37681 37681 37681 37681 37681 39692 39326 39326 39326 39326"                                                                                                                                                                                                                                   
## [103] "PMC10300554 PMC_DL/PMC10300554/supplementaryfiles/mmc4.xlsx Hsapiens 107 37316 37316 37316 36951 39508 39508 39508 39508 39508 39508 38412 39873 39873 39873 37135 40238 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 40057 37316 40422 40422 39142 39142 38047 38047 38047 37500 37500 37865 37865 36951 36951 36951 36951 36951 36951 36951 36951 36951 36951 36951 36951 36951 40787 40787 40787 40787 40787 40787 40787 38777 38777 38777 37681 37681 37681 37681 37681 39692 39326 39326 39326 39326 38047 40238 38777 38777 36951 37316 40057 40057 38231 40238 40057 40787 37681"                                                                                                                                                    
## [104] "PMC10300554 PMC_DL/PMC10300554/supplementaryfiles/mmc4.xlsx Hsapiens 116 40057 39508 36951 40057 39508 39326 40057 37865 40057 38047 38777 37865 40422 40057 36951 36951 37681 39508 39142 40787 37316 37316 40057 40057 39692 40057 39142 36951 40787 36951 36951 40057 36951 39326 36951 36951 36951 38777 40057 40057 40057 36951 36951 36951 36951 40787 38231 36951 40057 39508 38777 40057 40057 40057 40057 39142 37316 39508 40057 37316 40057 37500 40057 40057 40238 40057 39508 39142 40057 40057 39692 37681 40057 36951 37316 39508 40057 39692 40057 37316 38231 37865 40238 36951 40057 38231 39508 40057 41153 39142 37316 36951 39508 40238 40238 40057 40057 37316 40422 40422 38047 38047 38047 37500 36951 36951 36951 36951 40787 40787 38777 38777 38777 37681 39692 39326"                                                                                              
## [105] "PMC10300554 PMC_DL/PMC10300554/supplementaryfiles/mmc3.xlsx Hsapiens 21 39508 38412 39142 39873 39326 38231 37681 37316 37226 40057 40422 36951 36951 37135 39692 37865 40787 40238 38777 37500 37316"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
## [106] "PMC10300554 PMC_DL/PMC10300554/supplementaryfiles/mmc3.xlsx Hsapiens 20 36951 38412 36951 39508 40422 39142 37316 39873 39692 37226 40057 38231 37316 37681 37500 40787 37135 39326 38777 40238"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
## [107] "PMC10300554 PMC_DL/PMC10300554/supplementaryfiles/mmc3.xlsx Hsapiens 20 39142 37316 40422 37135 38412 39873 37681 39508 38777 38231 36951 37316 37500 37226 36951 40057 40787 39692 39326 41153"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
## [108] "PMC10300496 PMC_DL/PMC10300496/supplementaryfiles/mmc2.xlsx Hsapiens 1 44631"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
## [109] "PMC10300496 PMC_DL/PMC10300496/supplementaryfiles/mmc2.xlsx Hsapiens 2 44623 44626"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
## [110] "PMC10300496 PMC_DL/PMC10300496/supplementaryfiles/mmc2.xlsx Hsapiens 2 44623 44626"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
## [111] "PMC10300496 PMC_DL/PMC10300496/supplementaryfiles/mmc2.xlsx Hsapiens 16 44624 44896 44630 44631 44628 44622 44630 44896 44631 44896 44624 44622 44630 44630 44631 44624"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
## [112] "PMC10300174 PMC_DL/PMC10300174/supplementaryfiles/18_2023_4838_MOESM9_ESM.xlsx Hsapiens 62 44819 44630 44630 44630 44630 44630 44630 44630 44630 44813 44813 44813 44813 44813 44813 44813 44813 44813 44813 44813 44813 44814 44814 44814 44814 44624 44624 44624 44815 44815 44815 44626 44623 44810 44810 44819 44813 44813 44624 44624 44815 44812 44813 44813 44813 44813 44813 44813 44813 44814 44626 44812 44621 44625 44815 44812 44813 44813 44813 44815 44815 44626"                                                                                                                                                                                                                                                                                                                                                                                                                
## [113] "PMC10297900 zip/Supplementary_File_2.xlsx Ggallus 3 44446 44446 44445"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
## [114] "PMC10297900 zip/Supplementary_File_2.xlsx Ggallus 1 44260"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
## [115] "PMC10297900 zip/Supplementary_File_1.xlsx Ggallus 2 44446 44446"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
## [116] "PMC10297900 zip/Supplementary_File_1.xlsx Ggallus 1 44260"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
## [117] "PMC10294383 PMC_DL/PMC10294383/supplementaryfiles/12864_2023_9342_MOESM7_ESM.xlsx Hsapiens 3 44263 44444 44258"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
## [118] "PMC10294383 PMC_DL/PMC10294383/supplementaryfiles/12864_2023_9342_MOESM7_ESM.xlsx Hsapiens 5 44993 45181 45179 44986 44988"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
## [119] "PMC10294383 PMC_DL/PMC10294383/supplementaryfiles/12864_2023_9342_MOESM5_ESM.xlsx Ggallus 1 45178"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
## [120] "PMC10294383 PMC_DL/PMC10294383/supplementaryfiles/12864_2023_9342_MOESM5_ESM.xlsx Rnorvegicus 10 44993 45173 45171 44989 45178 45178 45178 45178 45178 45178"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
## [121] "PMC10292886 PMC_DL/PMC10292886/supplementaryfiles/aging-15-204798-s002.xlsx Hsapiens 98 45184 44987 44993 44993 44993 44994 44994 45170 45173 45173 45173 45173 45173 45173 45173 45173 45173 45178 45178 45178 45178 45178 45178 45178 45178 45178 45178 45178 45178 45178 44987 44987 44987 44987 44987 44987 45179 45179 45179 45179 45179 45179 45179 44992 44992 44992 44992 44992 45171 45171 45171 45171 45171 45171 45171 45171 45171 45171 45171 45174 45174 45174 45174 45174 45172 45180 45180 45180 45180 45180 44986 44991 44991 44991 44991 44991 44988 44988 45177 45177 45177 45177 45177 45177 45177 45177 45176 45176 45175 45175 45175 45175 45175 45175 45175 45175 45175 45175"                                                                                                                                                                                           
## [122] "PMC10291128 PMC_DL/PMC10291128/supplementaryfiles/Table_8.xlsx Hsapiens 3 10377 10182 10375"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
## [123] "PMC10289466 PMC_DL/PMC10289466/supplementaryfiles/pone.0287132.s002.xlsx Hsapiens 1 44990"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
## [124] "PMC10288324 PMC_DL/PMC10288324/supplementaryfiles/Table3.XLSX Hsapiens 87 38412 38412 38412 38412 37135 37135 37135 37135 37135 37135 37135 37135 37135 37135 37135 37135 40422 40422 40422 40422 40422 40422 40422 40422 40422 40422 40422 40422 40422 40422 40422 40422 40422 40422 40422 40422 40422 40422 40422 40422 40422 40422 40422 40422 40422 40422 40422 40422 39142 39142 39142 39142 39142 39142 39142 39142 39142 38596 38596 38596 38596 38596 38596 38596 38596 38596 38596 38596 38596 38596 38596 38596 36951 36951 36951 36951 36951 36951 36951 36951 36951 36951 36951 36951 36951 36951 36951"                                                                                                                                                                                                                                                                           
## [125] "PMC10288324 PMC_DL/PMC10288324/supplementaryfiles/Table1.XLSX Hsapiens 8 39142 40422 36951 36951 36951 36951 37135 37316"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
## [126] "PMC10286233 PMC_DL/PMC10286233/supplementaryfiles/EVA-16-1135-s005.xlsx Hsapiens 5 44990 44993 44986 45174 45175"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
## [127] "PMC10286233 PMC_DL/PMC10286233/supplementaryfiles/EVA-16-1135-s003.xlsx Hsapiens 3 45174 44993 44990"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
## [128] "PMC10286233 PMC_DL/PMC10286233/supplementaryfiles/EVA-16-1135-s003.xlsx Hsapiens 1 45174"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
## [129] "PMC10286233 PMC_DL/PMC10286233/supplementaryfiles/EVA-16-1135-s003.xlsx Ggallus 1 44993"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
## [130] "PMC10286233 PMC_DL/PMC10286233/supplementaryfiles/EVA-16-1135-s003.xlsx Hsapiens 1 44990"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
## [131] "PMC10284603 PMC_DL/PMC10284603/supplementaryfiles/elife-83606-supp5.xlsx Mmusculus 73 40787 39692 38412 38412 39142 39142 40787 39692 40787 39692 38961 39692 38412 40787 38777 37316 38961 38961 39142 39142 37316 39508 37316 40787 40057 38596 40057 39508 37500 37316 39326 39142 39508 42248 37316 37500 40057 40238 40057 42248 42248 39873 37865 39142 37500 37681 40057 39508 37135 37500 37500 37316 42248 37316 37865 37316 38412 39508 38231 36951 39508 37500 42248 37316 37681 38231 37500 42248 40787 40422 38961 37500 39142"                                                                                                                                                                                                                                                                                                                                                   
## [132] "PMC10282049 PMC_DL/PMC10282049/supplementaryfiles/18_2023_4833_MOESM2_ESM.xlsx Hsapiens 18 44807 44622 44809 44810 44805 44812 44815 44806 44808 44625 44811 44628 44626 44629 44627 44814 44813 44896"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
## [133] "PMC10282049 PMC_DL/PMC10282049/supplementaryfiles/18_2023_4833_MOESM2_ESM.xlsx Hsapiens 20 45172 45174 44987 44988 45180 45170 45175 45179 44993 45177 44990 45171 44991 44994 45173 44992 45176 45178 45261 44989"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
## [134] "PMC10281998 PMC_DL/PMC10281998/supplementaryfiles/41467_2023_39344_MOESM5_ESM.xlsx Mmusculus 9 44625 44622 44629 44621 44627 44631 44626 44623 44628"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
## [135] "PMC10281998 PMC_DL/PMC10281998/supplementaryfiles/41467_2023_39344_MOESM5_ESM.xlsx Mmusculus 1 44625"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
## [136] "PMC10281998 PMC_DL/PMC10281998/supplementaryfiles/41467_2023_39344_MOESM3_ESM.xlsx Mmusculus 1 44630"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
## [137] "PMC10281998 PMC_DL/PMC10281998/supplementaryfiles/41467_2023_39344_MOESM3_ESM.xlsx Mmusculus 2 44621 44623"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
## [138] "PMC10281998 PMC_DL/PMC10281998/supplementaryfiles/41467_2023_39344_MOESM3_ESM.xlsx Mmusculus 1 44628"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
## [139] "PMC10281998 PMC_DL/PMC10281998/supplementaryfiles/41467_2023_39344_MOESM3_ESM.xlsx Mmusculus 1 44625"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
## [140] "PMC10281998 PMC_DL/PMC10281998/supplementaryfiles/41467_2023_39344_MOESM3_ESM.xlsx Mmusculus 2 44624 44625"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
## [141] "PMC10281998 PMC_DL/PMC10281998/supplementaryfiles/41467_2023_39344_MOESM3_ESM.xlsx Mmusculus 1 44622"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
## [142] "PMC10281998 PMC_DL/PMC10281998/supplementaryfiles/41467_2023_39344_MOESM3_ESM.xlsx Mmusculus 1 44623"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
## [143] "PMC10281998 PMC_DL/PMC10281998/supplementaryfiles/41467_2023_39344_MOESM3_ESM.xlsx Mmusculus 1 44629"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
## [144] "PMC10281998 PMC_DL/PMC10281998/supplementaryfiles/41467_2023_39344_MOESM4_ESM.xlsx Mmusculus 48 44625 44630 44629 44628 44626 44621 44631 44622 44627 44623 44621 44630 44626 44628 44629 44623 44631 44627 44625 44622 44628 44626 44630 44621 44622 44629 44627 44623 44631 44625 44628 44621 44626 44622 44625 44623 44629 44627 44631 44628 44626 44625 44622 44627 44623 44621 44629 44631"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
## [145] "PMC10281998 PMC_DL/PMC10281998/supplementaryfiles/41467_2023_39344_MOESM4_ESM.xlsx Mmusculus 12 44630 44629 44621 44621 44630 44628 44630 44621 44628 44621 44628 44621"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
## [146] "PMC10281998 PMC_DL/PMC10281998/supplementaryfiles/41467_2023_39344_MOESM4_ESM.xlsx Mmusculus 2 44622 44623"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
## [147] "PMC10277924 PMC_DL/PMC10277924/supplementaryfiles/mmc1.xlsx Hsapiens 1 44621"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
## [148] "PMC10277924 PMC_DL/PMC10277924/supplementaryfiles/mmc9.xlsx Hsapiens 18 44810 44622 44812 44814 44805 44810 44622 44812 44805 44814 44810 44622 44805 44812 44814 44623 44625 44622"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
## [149] "PMC10277924 PMC_DL/PMC10277924/supplementaryfiles/mmc8.xlsx Hsapiens 1 44623"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
## [150] "PMC10277924 PMC_DL/PMC10277924/supplementaryfiles/mmc2.xlsx Hsapiens 2 44807 44810"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
## [151] "PMC10275808 PMC_DL/PMC10275808/supplementaryfiles/10689_2022_325_MOESM1_ESM.xlsx Hsapiens 1 41518"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
## [152] "PMC10273709 PMC_DL/PMC10273709/supplementaryfiles/13062_2023_388_MOESM2_ESM.xlsx Mmusculus 1 43900"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
## [153] "PMC10266224 PMC_DL/PMC10266224/supplementaryfiles/Table_3.xlsx Hsapiens 5 44991 45176 45173 44989 45177"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
## [154] "PMC10265161 PMC_DL/PMC10265161/supplementaryfiles/ACEL-22-e13841-s004.xls Hsapiens 1 44621"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
## [155] "PMC10261899 PMC_DL/PMC10261899/supplementaryfiles/mmc3.xlsx Hsapiens 1 43169"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
## [156] "PMC10261899 PMC_DL/PMC10261899/supplementaryfiles/mmc3.xlsx Hsapiens 1 43169"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
## [157] "PMC10253790 zip/Supplementary_Table_S9-Targeted_mRNAs.xlsx Hsapiens 1 44806"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
## [158] "PMC10253790 zip/Supplementary_Table_S9-Targeted_mRNAs.xlsx Ggallus 23 44808 44810 44806 44815 44814 44621 44621 44625 44626 44627 44628 44629 44630 44625 44626 44628 44629 44814 44815 44806 44812 44813 44624"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
## [159] "PMC10253790 zip/Supplementary_Table_S9-Targeted_mRNAs.xlsx Hsapiens 6 44622 44625 44627 44819 44807 44812"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
## [160] "PMC10253790 zip/Supplementary_Table_S9-Targeted_mRNAs.xlsx Hsapiens 6 44630 44623 44814 44806 44810 44812"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
## [161] "PMC10253790 zip/Supplementary_Table_S9-Targeted_mRNAs.xlsx Hsapiens 1 44806"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
## [162] "PMC10253790 zip/Supplementary_Table_S9-Targeted_mRNAs.xlsx Hsapiens 28 44629 44621 44622 44621 44623 44624 44625 44626 44814 44815 44806 44808 44810 44811 44812 44896 44814 44815 44806 44808 44810 44812 44621 44621 44623 44625 44628 44629"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
## [163] "PMC10253790 zip/Supplementary_Table_S9-Targeted_mRNAs.xlsx Hsapiens 1 44806"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
## [164] "PMC10252847 zip/Supplemental_table/Table_S1.xlsx Hsapiens 2 45181 45172"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
## [165] "PMC10252847 zip/Supplemental_table/Table_S1.xlsx Hsapiens 4 44995 44987 45179 45173"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
## [166] "PMC10250199 PMC_DL/PMC10250199/supplementaryfiles/41559_2023_2056_MOESM4_ESM.xlsx Hsapiens 1 40238"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
## [167] "PMC10249681 PMC_DL/PMC10249681/supplementaryfiles/iovs-64-7-12_s006.xlsx Hsapiens 27 42248 37316 36951 40422 39142 38047 37500 40787 36951 38777 40603 37681 39692 39326 41883 37226 39508 38412 39873 41153 38231 40238 40057 37316 38596 37865 38961"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
## [168] "PMC10245500 PMC_DL/PMC10245500/supplementaryfiles/40001_2023_1148_MOESM5_ESM.xlsx Hsapiens 5 45178 45178 44986 44986 44993"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
## [169] "PMC10243159 PMC_DL/PMC10243159/supplementaryfiles/JCMM-27-1493-s001.xlsx Hsapiens 4 44810 44810 44810 44805"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
## [170] "PMC10243159 PMC_DL/PMC10243159/supplementaryfiles/JCMM-27-1493-s001.xlsx Hsapiens 9 44813 44810 44810 44813 44810 44805 44810 44621 44813"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
## [171] "PMC10241222 PMC_DL/PMC10241222/supplementaryfiles/mmc7.xlsx Mmusculus 1 40787"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
## [172] "PMC10239147 PMC_DL/PMC10239147/supplementaryfiles/13062_2023_386_MOESM1_ESM.xlsx Hsapiens 1 44815"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
## [173] "PMC10234573 PMC_DL/PMC10234573/supplementaryfiles/Table_1.xlsx Hsapiens 2 44986 44987"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
## [174] "PMC10234209 zip/Table_S5.xlsx Dmelanogaster 5 38231 37226 37135 38596 37500"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
## [175] "PMC10234209 zip/Table_S5.xlsx Dmelanogaster 5 38231 37135 37226 38596 37500"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
## [176] "PMC10234209 zip/Table_S5.xlsx Dmelanogaster 5 38231 37226 38596 37135 37500"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
## [177] "PMC10234209 zip/Table_S5.xlsx Dmelanogaster 5 37135 37226 38596 38231 37500"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
## [178] "PMC10234209 zip/Table_S5.xlsx Dmelanogaster 5 38231 37500 38596 37135 37226"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
## [179] "PMC10234209 zip/Table_S5.xlsx Dmelanogaster 5 37135 38231 38596 37226 37500"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
## [180] "PMC10234209 zip/Table_S5.xlsx Dmelanogaster 5 37135 38231 38596 37226 37500"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
## [181] "PMC10234209 zip/Table_S5.xlsx Dmelanogaster 5 37226 38231 38596 37135 37500"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
## [182] "PMC10234209 zip/Table_S5.xlsx Dmelanogaster 5 37135 38596 38231 37226 37500"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
## [183] "PMC10234209 zip/Table_S5.xlsx Dmelanogaster 5 37226 38231 38596 37135 37500"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
## [184] "PMC10234209 zip/Table_S5.xlsx Dmelanogaster 5 38596 38231 37226 37500 37135"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
## [185] "PMC10234209 zip/Table_S5.xlsx Dmelanogaster 5 38231 38596 37226 37500 37135"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
## [186] "PMC10234209 zip/Table_S5.xlsx Dmelanogaster 5 38596 38231 37500 37226 37135"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
## [187] "PMC10230138 PMC_DL/PMC10230138/supplementaryfiles/12864_2023_9400_MOESM3_ESM.xls Hsapiens 2 2023/09/11 2023/09/10"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
## [188] "PMC10229423 PMC_DL/PMC10229423/supplementaryfiles/42255_2023_774_MOESM2_ESM.xlsx Hsapiens 22 43164 43352 43344 43350 43161 43349 43353 43355 43345 43349 43346 43347 43160 43354 43353 43166 43349 43167 43161 43168 43347 43345"
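
The five-digit numbers in these records are Excel date serial numbers, which is what remains of a gene symbol once it has been converted to a date and read back as a number. As a minimal sketch, assuming the workbooks use Excel's default 1900 date system (an origin of 1899-12-30 in R terms), serials copied from the listing above can be turned back into calendar dates. The helper name `serial_to_date` is hypothetical and is not part of the original script.

```r
# Hypothetical helper: convert Excel date serials to R Dates.
# Assumption: 1900 date system, for which the R origin is 1899-12-30.
serial_to_date <- function(serial) {
  as.Date(as.numeric(serial), origin = "1899-12-30")
}

# Serials copied from the PMC10234209 and PMC10229423 records above
serials <- c("37135", "37226", "43352")
dates <- serial_to_date(serials)
format(dates, "%Y-%m-%d")
## [1] "2001-09-01" "2001-12-01" "2018-09-09"
```

A September or December date hints at the kind of symbol that was lost, but the original gene name cannot be recovered with certainty from the serial alone.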

In-depth look at the errors

Let’s investigate the errors in more detail.

# By species
SPECIES <- sapply(strsplit(ERROR_GENELISTS," "),"[[",3)
table(SPECIES)
## SPECIES
##     Athaliana Dmelanogaster       Ggallus      Hsapiens     Mmusculus 
##             3            13             8           134            29 
##   Rnorvegicus 
##             1
par(mar=c(5,12,4,2))
barplot(table(SPECIES),horiz=TRUE,las=1)

par(mar=c(5,5,4,2))

# Number of affected Excel files per paper
DIST <- table(sapply(strsplit(ERROR_GENELISTS," "),"[[",1))
DIST
## 
## PMC10229423 PMC10229598 PMC10230138 PMC10233889 PMC10234209 PMC10234573 
##           1           1           1           4          13           1 
## PMC10237035 PMC10239147 PMC10241222 PMC10241889 PMC10242145 PMC10242354 
##           5           1           1           2           4           2 
## PMC10243159 PMC10245500 PMC10249681 PMC10250199 PMC10250210 PMC10252000 
##           2           1           1           1           1           1 
## PMC10252847 PMC10253099 PMC10253224 PMC10253790 PMC10258790 PMC10261190 
##           2           1           9           7           5           3 
## PMC10261899 PMC10263356 PMC10265161 PMC10265687 PMC10266224 PMC10266228 
##           2           2           1          11           1           2 
## PMC10267701 PMC10273709 PMC10275808 PMC10277924 PMC10279740 PMC10281998 
##           2           1           1           4           2          13 
## PMC10282049 PMC10284603 PMC10285042 PMC10286233 PMC10287762 PMC10287941 
##           2           1           1           5           2           1 
## PMC10288324 PMC10289466 PMC10290958 PMC10291128 PMC10292886 PMC10294383 
##           2           1           1           1           1           4 
## PMC10294482 PMC10297614 PMC10297722 PMC10297900 PMC10297931 PMC10299661 
##           1           1           2           4          15           4 
## PMC10300174 PMC10300496 PMC10300554 PMC10302091 PMC10306090 PMC10307879 
##           1           4          15           4           1           2
summary(as.numeric(DIST))
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.000   1.000   2.000   3.133   4.000  15.000
hist(DIST,main="Number of affected Excel files per paper")

# PMC Articles with the most errors
DIST_DF <- as.data.frame(DIST)
DIST_DF <- DIST_DF[order(-DIST_DF$Freq),,drop=FALSE]
head(DIST_DF,20)
##           Var1 Freq
## 53 PMC10297931   15
## 57 PMC10300554   15
## 5  PMC10234209   13
## 36 PMC10281998   13
## 28 PMC10265687   11
## 21 PMC10253224    9
## 22 PMC10253790    7
## 7  PMC10237035    5
## 23 PMC10258790    5
## 40 PMC10286233    5
## 4  PMC10233889    4
## 11 PMC10242145    4
## 34 PMC10277924    4
## 48 PMC10294383    4
## 52 PMC10297900    4
## 54 PMC10299661    4
## 56 PMC10300496    4
## 58 PMC10302091    4
## 24 PMC10261190    3
## 10 PMC10241889    2
MOST_ERR_FILES = as.character(DIST_DF[1,1])
MOST_ERR_FILES
## [1] "PMC10297931"
# Number of errors per paper
NERR <- as.numeric(sapply(strsplit(ERROR_GENELISTS," "),"[[",4))
names(NERR) <- sapply(strsplit(ERROR_GENELISTS," "),"[[",1)
NERR <-tapply(NERR, names(NERR), sum)
NERR
## PMC10229423 PMC10229598 PMC10230138 PMC10233889 PMC10234209 PMC10234573 
##          22           6           2          19          65           2 
## PMC10237035 PMC10239147 PMC10241222 PMC10241889 PMC10242145 PMC10242354 
##          43           1           1         256         107           4 
## PMC10243159 PMC10245500 PMC10249681 PMC10250199 PMC10250210 PMC10252000 
##          13           5          27           1          10           1 
## PMC10252847 PMC10253099 PMC10253224 PMC10253790 PMC10258790 PMC10261190 
##           6           9         179          66          79          22 
## PMC10261899 PMC10263356 PMC10265161 PMC10265687 PMC10266224 PMC10266228 
##           2          19           1         130           5           2 
## PMC10267701 PMC10273709 PMC10275808 PMC10277924 PMC10279740 PMC10281998 
##           8           1           1          22          16          82 
## PMC10282049 PMC10284603 PMC10285042 PMC10286233 PMC10287762 PMC10287941 
##          38          73           1          11           2          27 
## PMC10288324 PMC10289466 PMC10290958 PMC10291128 PMC10292886 PMC10294383 
##          95           1           1           3          98          19 
## PMC10294482 PMC10297614 PMC10297722 PMC10297900 PMC10297931 PMC10299661 
##           8           5          17           7          97          10 
## PMC10300174 PMC10300496 PMC10300554 PMC10302091 PMC10306090 PMC10307879 
##          62          21         756         107          27           4
hist(NERR,main="Number of errors per PMC article")

NERR_DF <- as.data.frame(NERR)
NERR_DF <- NERR_DF[order(-NERR_DF$NERR),,drop=FALSE]
head(NERR_DF,20)
##             NERR
## PMC10300554  756
## PMC10241889  256
## PMC10253224  179
## PMC10265687  130
## PMC10242145  107
## PMC10302091  107
## PMC10292886   98
## PMC10297931   97
## PMC10288324   95
## PMC10281998   82
## PMC10258790   79
## PMC10284603   73
## PMC10253790   66
## PMC10234209   65
## PMC10300174   62
## PMC10237035   43
## PMC10282049   38
## PMC10249681   27
## PMC10287941   27
## PMC10306090   27
MOST_ERR = rownames(NERR_DF)[1]
MOST_ERR
## [1] "PMC10300554"

Journals affected

GENELIST_ERROR_ARTICLES <- gsub("PMC","",GENELIST_ERROR_ARTICLES)

### JSON PARSING is more reliable than XML
ARTICLES <- esummary( GENELIST_ERROR_ARTICLES , db="pmc" , retmode = "json"  )
ARTICLE_DATA <- reutils::content(ARTICLES,as= "parsed")
ARTICLE_DATA <- ARTICLE_DATA$result
ARTICLE_DATA <- ARTICLE_DATA[2:length(ARTICLE_DATA)]
JOURNALS <- unlist(lapply(ARTICLE_DATA,function(x) {x$fulljournalname} ))
JOURNALS_TABLE <- table(JOURNALS)
JOURNALS_TABLE <- JOURNALS_TABLE[order(-JOURNALS_TABLE)]

length(JOURNALS_TABLE)
## [1] 44
NUM_JOURNALS=length(JOURNALS_TABLE)

par(mar=c(5,25,4,2))
barplot(head(JOURNALS_TABLE,10), horiz=TRUE, las=1, 
  xlab="Articles with gene name errors in supp files",
  main="Top journals this month")
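
The journal tally above can be tried without querying NCBI. The sketch below uses a mocked list that only imitates the shape of the parsed esummary result (one element per article with a `fulljournalname` field). The journal names are toy values and the lower-case variable names are illustrative, not taken from the script.

```r
# Mock of the parsed esummary result: one list element per article,
# each carrying a fulljournalname field (toy values, not real data)
article_data <- list(
  "1" = list(uid = "1", fulljournalname = "Journal A"),
  "2" = list(uid = "2", fulljournalname = "Journal B"),
  "3" = list(uid = "3", fulljournalname = "Journal A")
)

# vapply() insists on exactly one string per article
journals <- vapply(article_data, function(x) x$fulljournalname, character(1))

# Count articles per journal, most frequent first
journals_table <- sort(table(journals), decreasing = TRUE)

journals_table
length(journals_table)      # number of distinct journals
names(journals_table)[1]    # most frequent journal
```

Using `vapply()` here is a design choice: it stops with an error if a record has no journal name, whereas `unlist(lapply(...))` would silently drop that record from the count.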

Journal of the month winner

Congrats to our Journal of the Month winner!

JOURNAL_WINNER <- names(head(JOURNALS_TABLE,1))
JOURNAL_WINNER
## [1] "International Journal of Molecular Sciences"

Paper of the month winners

There are two categories:

  • Paper with the most supplementary files affected by gene name errors (MOST_ERR_FILES)

  • Paper with the most gene names converted to dates (MOST_ERR)

Sometimes, one paper can win both categories. Congrats to our winners.
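
To make the difference between the two categories concrete, here is a small sketch using toy records laid out like the entries of ERROR_GENELISTS (PMC ID, file name, species, error count, then the converted values). The records and the lower-case variable names are made up for illustration.

```r
# Toy records in the same space-delimited layout as ERROR_GENELISTS
toy <- c("PMC1 a.xlsx Hsapiens 2 43164 43352",
         "PMC1 b.xlsx Hsapiens 1 43344",
         "PMC2 c.xlsx Mmusculus 5 1 2 3 4 5")

fields <- strsplit(toy, " ")
pmc  <- sapply(fields, "[[", 1)              # field 1: PMC ID
nerr <- as.numeric(sapply(fields, "[[", 4))  # field 4: errors in that file

# Category 1: number of affected files per paper
files_per_paper <- table(pmc)
# Category 2: total number of converted gene names per paper
errors_per_paper <- tapply(nerr, pmc, sum)

names(which.max(files_per_paper))   # "PMC1" (two affected files)
names(which.max(errors_per_paper))  # "PMC2" (five converted names)
```

In this toy case the two categories have different winners. Note that `which.max()` returns the first maximum, so ties go to whichever paper comes first, and that splitting on spaces assumes file names contain no spaces.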

Paper with most files affected

MOST_ERR_FILES <- gsub("PMC","",MOST_ERR_FILES)
ARTICLES <- esummary( MOST_ERR_FILES , db="pmc" , retmode = "json"  )
ARTICLE_DATA <- reutils::content(ARTICLES,as= "parsed")
ARTICLE_DATA <- ARTICLE_DATA[2]
ARTICLE_DATA
## $result
## $result$uids
## [1] "10297931"
## 
## $result$`10297931`
## $result$`10297931`$uid
## [1] "10297931"
## 
## $result$`10297931`$pubdate
## [1] "2023 Jun 6"
## 
## $result$`10297931`$epubdate
## [1] "2023 Jun 6"
## 
## $result$`10297931`$printpubdate
## [1] ""
## 
## $result$`10297931`$source
## [1] "Int J Mol Sci"
## 
## $result$`10297931`$authors
##         name authtype
## 1    Zhang L   Author
## 2   Fritah S   Author
## 3 Nazarov PV   Author
## 4    Kaoma T   Author
## 5 Van Dyck E   Author
## 
## $result$`10297931`$title
## [1] "Impact of IDH Mutations, the 1p/19q Co-Deletion and the G-CIMP Status on Alternative Splicing in Diffuse Gliomas"
## 
## $result$`10297931`$volume
## [1] "24"
## 
## $result$`10297931`$issue
## [1] "12"
## 
## $result$`10297931`$pages
## [1] "9825"
## 
## $result$`10297931`$articleids
##   idtype                value
## 1   pmid             37372972
## 2    doi 10.3390/ijms24129825
## 3  pmcid          PMC10297931
## 
## $result$`10297931`$fulljournalname
## [1] "International Journal of Molecular Sciences"
## 
## $result$`10297931`$sortdate
## [1] "2023/06/06 00:00"
## 
## $result$`10297931`$pmclivedate
## [1] "2023/06/28"

Paper with most date conversions

MOST_ERR <- gsub("PMC","",MOST_ERR)
ARTICLE_DATA <- esummary(MOST_ERR,db = "pmc" , retmode = "json" )
ARTICLE_DATA <- reutils::content(ARTICLE_DATA,as= "parsed")
ARTICLE_DATA
## $header
## $header$type
## [1] "esummary"
## 
## $header$version
## [1] "0.3"
## 
## 
## $result
## $result$uids
## [1] "10300554"
## 
## $result$`10300554`
## $result$`10300554`$uid
## [1] "10300554"
## 
## $result$`10300554`$pubdate
## [1] "2023 Apr 24"
## 
## $result$`10300554`$epubdate
## [1] "2023 Apr 24"
## 
## $result$`10300554`$printpubdate
## [1] ""
## 
## $result$`10300554`$source
## [1] "Cell Genom"
## 
## $result$`10300554`$authors
##             name authtype
## 1       Brown AC   Author
## 2       Cohen CJ   Author
## 3   Mielczarek O   Author
## 4   Migliorini G   Author
## 5   Costantino F   Author
## 6      Allcock A   Author
## 7     Davidson C   Author
## 8     Elliott KS   Author
## 9         Fang H   Author
## 10 Lledó Lara A   Author
## 11     Martin AC   Author
## 12     Osgood JA   Author
## 13     Sanniti A   Author
## 14  Scozzafava G   Author
## 15    Vecellio M   Author
## 16       Zhang P   Author
## 17      Black MH   Author
## 18          Li S   Author
## 19      Truong D   Author
## 20   Molineros J   Author
## 21        Howe T   Author
## 22 Wordsworth BP   Author
## 23     Bowness P   Author
## 24     Knight JC   Author
## 
## $result$`10300554`$title
## [1] "Comprehensive epigenomic profiling reveals the extent of disease-specific chromatin states and informs target discovery in ankylosing spondylitis"
## 
## $result$`10300554`$volume
## [1] "3"
## 
## $result$`10300554`$issue
## [1] "6"
## 
## $result$`10300554`$pages
## [1] "100306"
## 
## $result$`10300554`$articleids
##   idtype                      value
## 1   pmid                   37388915
## 2    doi 10.1016/j.xgen.2023.100306
## 3  pmcid                PMC10300554
## 
## $result$`10300554`$fulljournalname
## [1] "Cell Genomics"
## 
## $result$`10300554`$sortdate
## [1] "2023/04/24 00:00"
## 
## $result$`10300554`$pmclivedate
## [1] "2023/06/29"
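
Printing the whole parsed record is verbose. As a hedged sketch, the fields of interest can be pulled into a short named vector. The record below is a hand-built mock that copies values from the output above, and `summarise_record` is a hypothetical helper that is not part of the original script.

```r
# Mock of one parsed esummary record, using values from the output above
rec <- list(
  title = "Comprehensive epigenomic profiling reveals the extent of disease-specific chromatin states and informs target discovery in ankylosing spondylitis",
  fulljournalname = "Cell Genomics",
  pubdate = "2023 Apr 24",
  articleids = data.frame(
    idtype = c("pmid", "doi", "pmcid"),
    value = c("37388915", "10.1016/j.xgen.2023.100306", "PMC10300554"),
    stringsAsFactors = FALSE
  )
)

# Hypothetical helper: keep only the title, journal, date and identifiers
summarise_record <- function(rec) {
  # Turn the idtype/value table into a lookup vector keyed by idtype
  ids <- setNames(rec$articleids$value, rec$articleids$idtype)
  c(title   = rec$title,
    journal = rec$fulljournalname,
    pubdate = rec$pubdate,
    doi     = unname(ids["doi"]),
    pmcid   = unname(ids["pmcid"]))
}

summarise_record(rec)[c("journal", "doi", "pmcid")]
```

In the report itself, the same helper could presumably be applied to `ARTICLE_DATA$result[[MOST_ERR]]`, assuming the live response has the structure printed above.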

Trend info

Here I plot the trend over the past 6-12 months, using the previously published reports.

url <- "https://ziemann-lab.net/public/gene_name_errors/"
doc <- htmlParse(url)
links <- xpathSApply(doc, "//a/@href")
links <- links[grep("html",links)]

listing <- htmlParse( getURL(url, ftp.use.epsv = FALSE, dirlistonly = TRUE) )
listing <- xpathSApply(listing, "//a/@href")
listing <- listing[grep("html",listing)]

unlink("online_files/",recursive=TRUE)

dir.create("online_files")

sapply(listing, function(mylink) { 
  download.file(paste(url,mylink,sep=""),destfile=paste("online_files/",mylink,sep=""))  
} )
## href href href href href href href href href href href href href href href href 
##    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0 
## href href href href href href href href href href href href 
##    0    0    0    0    0    0    0    0    0    0    0    0
myfilelist <- list.files("online_files/",full.names=TRUE)

trends <- sapply(myfilelist,  function(myfilename) {

  x <- readLines(myfilename)

  # Num XL gene list articles
  NUM_GENELIST_ARTICLES <- x[grep("NUM_GENELIST_ARTICLES",x)[3]+1]
  NUM_GENELIST_ARTICLES <- sapply(strsplit(NUM_GENELIST_ARTICLES," "),"[[",3)
  NUM_GENELIST_ARTICLES <- sapply(strsplit(NUM_GENELIST_ARTICLES,"<"),"[[",1)
  NUM_GENELIST_ARTICLES <- as.numeric(NUM_GENELIST_ARTICLES)

  # number of affected articles
  NUM_ERROR_GENELIST_ARTICLES <- x[grep("NUM_ERROR_GENELIST_ARTICLES",x)[3]+1]
  NUM_ERROR_GENELIST_ARTICLES <- sapply(strsplit(NUM_ERROR_GENELIST_ARTICLES," "),"[[",3)
  NUM_ERROR_GENELIST_ARTICLES <- sapply(strsplit(NUM_ERROR_GENELIST_ARTICLES,"<"),"[[",1)
  NUM_ERROR_GENELIST_ARTICLES <- as.numeric(NUM_ERROR_GENELIST_ARTICLES)

  # Error proportion
  ERROR_PROPORTION <- x[grep("ERROR_PROPORTION",x)[3]+1]
  ERROR_PROPORTION <- sapply(strsplit(ERROR_PROPORTION," "),"[[",3)
  ERROR_PROPORTION <- sapply(strsplit(ERROR_PROPORTION,"<"),"[[",1)
  ERROR_PROPORTION <- as.numeric(ERROR_PROPORTION)

  # number of journals
  NUM_JOURNALS <- x[grep('JOURNALS_TABLE',x)[3]+1]
  NUM_JOURNALS <- sapply(strsplit(NUM_JOURNALS," "),"[[",3)
  NUM_JOURNALS <- sapply(strsplit(NUM_JOURNALS,"<"),"[[",1)
  NUM_JOURNALS <- as.numeric(NUM_JOURNALS)

  res <- c(NUM_GENELIST_ARTICLES,NUM_ERROR_GENELIST_ARTICLES,ERROR_PROPORTION,NUM_JOURNALS)

  return(res)
})

colnames(trends) <- sapply(strsplit(colnames(trends),"_"),"[[",3)
colnames(trends) <- gsub(".html","",colnames(trends),fixed=TRUE)
trends <- as.data.frame(trends)
rownames(trends) <- c("NUM_GENELIST_ARTICLES","NUM_ERROR_GENELIST_ARTICLES","ERROR_PROPORTION","NUM_JOURNALS")
trends <- t(trends)
trends <- as.data.frame(trends)

CURRENT_RES <- c(NUM_GENELIST_ARTICLES,NUM_ERROR_GENELIST_ARTICLES,ERROR_PROPORTION,NUM_JOURNALS)

trends <- rbind(trends,CURRENT_RES)
paste(CURRENT_YEAR,CURRENT_MONTH,sep="-")
## [1] "2023-07"
rownames(trends)[nrow(trends)] <- paste(CURRENT_YEAR,CURRENT_MONTH,sep="-")

plot(trends$NUM_GENELIST_ARTICLES, xaxt = "n" , type="b" , main="Number of articles with Excel gene lists per month",
 ylab="number of articles", xlab="month")
axis(1, at=1:nrow(trends), labels=rownames(trends))

plot(trends$NUM_ERROR_GENELIST_ARTICLES, xaxt = "n" , type="b" , main="Number of articles with gene name errors per month",
 ylab="number of articles", xlab="month")
axis(1, at=1:nrow(trends), labels=rownames(trends))

plot(trends$ERROR_PROPORTION, xaxt = "n" , type="b" , main="Proportion of articles with Excel gene list affected by errors",
 ylab="proportion", xlab="month")
axis(1, at=1:nrow(trends), labels=rownames(trends))

plot(trends$NUM_JOURNALS, xaxt = "n" , type="b" , main="Number of journals with affected articles",
 ylab="number of journals", xlab="month")
axis(1, at=1:nrow(trends), labels=rownames(trends))

unlink("online_files/",recursive=TRUE)
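
The scraping step above recovers each statistic by splitting a rendered output line on spaces and then on "<". A slightly more defensive variant is sketched below. It assumes the rendered line looks like `<pre><code>## [1] 44</code></pre>`; the exact markup of the published reports is not shown here, so that layout is an assumption, and `extract_value` is a hypothetical helper.

```r
# Hypothetical helper: pull the first value from a rendered R output line
# assumed to look like '<pre><code>## [1] 44</code></pre>'
extract_value <- function(line) {
  # Find the '[1] <number>' part of the line, if present
  m <- regmatches(line, regexpr("\\[1\\] *[0-9.]+", line))
  if (length(m) == 0) return(NA_real_)
  # Drop the '[1] ' index prefix and convert what is left
  as.numeric(sub("\\[1\\] *", "", m))
}

extract_value("<pre><code>## [1] 44</code></pre>")      # 44
extract_value("<pre><code>## [1] 0.2857</code></pre>")  # 0.2857
extract_value("<p>no value here</p>")                    # NA
```

Returning `NA` for a line without a value means a report with an unexpected layout would show up as a gap in the trend plots instead of stopping the run.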

References

  1. Zeeberg, B.R., Riss, J., Kane, D.W. et al. Mistaken Identifiers: Gene name errors can be introduced inadvertently when using Excel in bioinformatics. BMC Bioinformatics 5, 80 (2004). https://doi.org/10.1186/1471-2105-5-80

  2. Ziemann, M., Eren, Y. & El-Osta, A. Gene name errors are widespread in the scientific literature. Genome Biol 17, 177 (2016). https://doi.org/10.1186/s13059-016-1044-7

SessionInfo

sessionInfo()
## R version 4.3.0 (2023-04-21)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.2 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8    
##  [5] LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8   
##  [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Australia/Melbourne
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] RCurl_1.98-1.12 readxl_1.4.2    reutils_0.2.3   xml2_1.3.4     
## [5] jsonlite_1.8.4  XML_3.99-0.14  
## 
## loaded via a namespace (and not attached):
##  [1] assertthat_0.2.1 digest_0.6.31    R6_2.5.1         fastmap_1.1.1   
##  [5] cellranger_1.1.0 xfun_0.39        cachem_1.0.7     knitr_1.42      
##  [9] htmltools_0.5.5  rmarkdown_2.21   bitops_1.0-7     cli_3.6.1       
## [13] sass_0.4.5       jquerylib_0.1.4  compiler_4.3.0   highr_0.10      
## [17] tools_4.3.0      evaluate_0.20    bslib_0.4.2      yaml_2.3.7      
## [21] rlang_1.1.1