Source: https://github.com/markziemann/GeneNameErrors2020
View the reports: http://ziemann-lab.net/public/gene_name_errors/
Gene name errors result when data are imported improperly into MS Excel and other spreadsheet programs (Zeeberg et al, 2004). Certain gene names like MARCH3, SEPT2 and DEC1 are converted into date format. These errors are surprisingly common in supplementary data files in the field of genomics (Ziemann et al, 2016). This could be considered a small error because it only affects a small number of genes; however, it is symptomatic of poor data processing methods. The purpose of this script is to identify gene name errors present in supplementary files of PubMed Central articles from the previous month.
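To make the risk concrete, here is a minimal sketch of how one might flag gene symbols that a spreadsheet is likely to read as dates. The function name is_date_like_symbol and the month-prefix pattern are assumptions for illustration. They are not taken from this script, and the pattern is not an exhaustive list of affected symbols.

```r
# Hypothetical helper (not part of this report's pipeline): flag symbols made of
# a month name or abbreviation followed by one or two digits.
is_date_like_symbol <- function(symbols) {
  pattern <- "^(JAN|FEB|MAR|MARCH|APR|MAY|JUN|JUL|AUG|SEP|SEPT|OCT|NOV|DEC)[0-9]{1,2}$"
  grepl(pattern, toupper(symbols))
}

symbols <- c("MARCH3", "SEPT2", "DEC1", "TP53", "BRCA1")
data.frame(symbol = symbols, at_risk = is_date_like_symbol(symbols))
```

Symbols flagged this way are candidates for import as text rather than with automatic type detection.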
library("XML")
library("jsonlite")
library("xml2")
library("reutils")
library("readxl")
library("RCurl")
Here I will be getting PubMed Central IDs for the previous month.
Start with figuring out the date to search PubMed Central.
CURRENT_MONTH=format(Sys.time(), "%m")
CURRENT_YEAR=format(Sys.time(), "%Y")
if (CURRENT_MONTH == "01") {
PREV_YEAR=as.character(as.numeric(format(Sys.time(), "%Y"))-1)
PREV_MONTH="12"
} else {
PREV_YEAR=CURRENT_YEAR
PREV_MONTH=as.character(as.numeric(format(Sys.time(), "%m"))-1)
}
DATE=paste(PREV_YEAR,"/",PREV_MONTH,sep="")
DATE
## [1] "2022/9"
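The month arithmetic above yields an unpadded month ("2022/9"), and the search that follows uses day 31 as the end of every month. A possible alternative, sketched here with base R Date arithmetic, derives the first and last day of the previous month directly. The function name prev_month_window is an assumption for illustration, and whether the padded dates change the search results has not been checked.

```r
# Hypothetical helper: first and last day of the month before `today`,
# formatted as YYYY/MM/DD.
prev_month_window <- function(today = Sys.Date()) {
  first_this <- as.Date(format(today, "%Y-%m-01"))              # first day of the current month
  first_prev <- seq(first_this, by = "-1 month", length.out = 2)[2]  # step back one month
  last_prev  <- first_this - 1                                  # day before the current month starts
  c(mindate = format(first_prev, "%Y/%m/%d"),
    maxdate = format(last_prev, "%Y/%m/%d"))
}

prev_month_window(as.Date("2022-10-15"))
prev_month_window(as.Date("2023-01-05"))  # January rolls back to December of the previous year
```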
Let’s see how many PMC articles from the previous month match the search.
QUERY ='((genom*[Abstract]))'
ESEARCH_RES <- esearch(term=QUERY, db = "pmc", rettype = "uilist", retmode = "xml", retstart = 0,
retmax = 5000000, usehistory = TRUE, webenv = NULL, querykey = NULL, sort = NULL, field = NULL,
datetype = NULL, reldate = NULL,
mindate = paste(DATE,"/1",sep="") , maxdate = paste(DATE,"/31",sep=""))
pmc <- efetch(ESEARCH_RES,retmode="text",rettype="uilist",outfile="pmcids.txt")
## Retrieving UIDs 1 to 500
## Retrieving UIDs 501 to 1000
## Retrieving UIDs 1001 to 1500
## Retrieving UIDs 1501 to 2000
## Retrieving UIDs 2001 to 2500
## Retrieving UIDs 2501 to 3000
## Retrieving UIDs 3001 to 3500
pmc <- read.table(pmc)
pmc <- paste("PMC",pmc$V1,sep="")
NUM_ARTICLES=length(pmc)
NUM_ARTICLES
## [1] 3445
writeLines(pmc,con="pmc.txt")
Now run the bash script (gene_names.sh), which takes pmc.txt as input; its output is read from results.txt. Note that false positives can occur (~1.5%) and that these results have not been verified by a human.
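As a rough illustration of what such a screen looks for, the sketch below flags values that resemble the three forms seen in the results further down: day-month strings such as 12-Sep, slash-separated dates such as 2022/03/01, and five-digit serial numbers. The function looks_converted and its patterns are assumptions for illustration and are not the logic of the bash script. Any five-digit number matches the last pattern, which is one way false positives could arise.

```r
# Hypothetical screen for values that look like spreadsheet-converted gene names.
looks_converted <- function(x) {
  day_month <- "^[0-9]{1,2}-(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)$"
  ymd_slash <- "^[0-9]{4}/[0-9]{2}/[0-9]{2}$"
  serial    <- "^[0-9]{5}$"   # Excel serial date; also matches any other five-digit number
  grepl(day_month, x) | grepl(ymd_slash, x) | grepl(serial, x)
}

column <- c("TP53", "12-Sep", "2022/03/01", "44805", "BRCA1", "1234")
column[looks_converted(column)]
```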
Here are some definitions:
NUM_XLS = Number of supplementary Excel files in this set of PMC articles.
NUM_XLS_ARTICLES = Number of articles matching the PubMed Central search which have supplementary Excel files.
GENELISTS = The gene lists found in the Excel files. Each Excel file is counted once even if it has multiple gene lists.
NUM_GENELISTS = The number of Excel files with gene lists.
NUM_GENELIST_ARTICLES = The number of PMC articles with supplementary Excel gene lists.
ERROR_GENELISTS = Files suspected to contain gene name errors. The dates and five-digit numbers indicate transmogrified gene names.
NUM_ERROR_GENELISTS = Number of Excel gene lists with errors.
NUM_ERROR_GENELIST_ARTICLES = Number of articles with supplementary Excel gene name errors.
ERROR_PROPORTION = This is the proportion of articles with Excel gene lists that have errors.
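The definitions above can be illustrated on a few made-up lines. This sketch assumes, based on the parsing code that follows, that each line of results.txt is space-separated with the PMC ID first, the file path second, then a species label, an error count and the suspect values. The toy lines and PMC IDs are invented for illustration.

```r
# Invented example lines in the assumed results.txt layout.
toy <- c(
  "PMC0000001 PMC_DL/PMC0000001/supplementaryfiles/a.xlsx",
  "PMC0000001 PMC_DL/PMC0000001/supplementaryfiles/b.xlsx Hsapiens",
  "PMC0000002 PMC_DL/PMC0000002/supplementaryfiles/c.xlsx Hsapiens 2 44621 44805",
  "PMC0000002 PMC_DL/PMC0000002/supplementaryfiles/c.xlsx Mmusculus 1 01-Mar"
)

fields   <- strsplit(toy, " ")
n_fields <- lengths(fields)
pmcid    <- vapply(fields, `[[`, character(1), 1)
path     <- vapply(fields, `[[`, character(1), 2)

has_genelist <- n_fields > 2   # a species label is present
has_error    <- n_fields > 3   # an error count and suspect values are present

counts <- c(
  NUM_XLS                     = length(toy),
  NUM_XLS_ARTICLES            = length(unique(pmcid)),
  NUM_GENELISTS               = length(unique(path[has_genelist])),
  NUM_GENELIST_ARTICLES       = length(unique(pmcid[has_genelist])),
  NUM_ERROR_GENELISTS         = sum(has_error),
  NUM_ERROR_GENELIST_ARTICLES = length(unique(pmcid[has_error]))
)
counts

# ERROR_PROPORTION for the toy data
counts[["NUM_ERROR_GENELIST_ARTICLES"]] / counts[["NUM_GENELIST_ARTICLES"]]
```

Splitting on single spaces only works if file paths contain no spaces, which is a further assumption of this sketch.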
#system("./gene_names.sh pmc.txt")
results <- readLines("results.txt")
XLS <- results[grep("XLS",results,ignore.case=TRUE)]
NUM_XLS = length(XLS)
NUM_XLS
## [1] 5722
NUM_XLS_ARTICLES = length(unique(sapply(strsplit(XLS," "),"[[",1)))
NUM_XLS_ARTICLES
## [1] 1009
GENELISTS <- XLS[lapply(strsplit(XLS," "),length)>2]
#GENELISTS
NUM_GENELISTS <- length(unique(sapply(strsplit(GENELISTS," "),"[[",2)))
NUM_GENELISTS
## [1] 739
NUM_GENELIST_ARTICLES <- length(unique(sapply(strsplit(GENELISTS," "),"[[",1)))
NUM_GENELIST_ARTICLES
## [1] 382
ERROR_GENELISTS <- XLS[lapply(strsplit(XLS," "),length)>3]
#ERROR_GENELISTS
NUM_ERROR_GENELISTS = length(ERROR_GENELISTS)
NUM_ERROR_GENELISTS
## [1] 281
GENELIST_ERROR_ARTICLES <- unique(sapply(strsplit(ERROR_GENELISTS," "),"[[",1))
GENELIST_ERROR_ARTICLES
## [1] "PMC9522333" "PMC9521910" "PMC9520462" "PMC9515394" "PMC9514850"
## [6] "PMC9513135" "PMC9513039" "PMC9510610" "PMC9508851" "PMC9509653"
## [11] "PMC9509553" "PMC9509351" "PMC9504918" "PMC9502060" "PMC9502050"
## [16] "PMC9501657" "PMC9498843" "PMC9493312" "PMC9492468" "PMC9421669"
## [21] "PMC9489775" "PMC9478359" "PMC9487751" "PMC9486403" "PMC9485127"
## [26] "PMC9484640" "PMC9484176" "PMC9483895" "PMC9482661" "PMC9481918"
## [31] "PMC9478571" "PMC9477841" "PMC9477516" "PMC9476630" "PMC9468827"
## [36] "PMC9465038" "PMC9465009" "PMC9461288" "PMC9458435" "PMC9449593"
## [41] "PMC9456764" "PMC9456615" "PMC9456386" "PMC9452507" "PMC9450692"
## [46] "PMC9449361" "PMC9448678" "PMC9447897" "PMC9445988" "PMC9441061"
## [51] "PMC9436054" "PMC9434886" "PMC9434320" "PMC9396287" "PMC9481139"
## [56] "PMC9451151" "PMC9519376" "PMC9518893" "PMC9518822" "PMC9515781"
## [61] "PMC9473489" "PMC9513945" "PMC9513884" "PMC9513589" "PMC9512081"
## [66] "PMC9511766" "PMC9508838" "PMC9509640" "PMC9508879" "PMC9500203"
## [71] "PMC9500044" "PMC9499627" "PMC9495734" "PMC9495559" "PMC9493120"
## [76] "PMC9490000" "PMC9485579" "PMC9485450" "PMC9481623" "PMC9478897"
## [81] "PMC9468337" "PMC9465246" "PMC9458853" "PMC9441509" "PMC9458043"
## [86] "PMC9455665" "PMC9454937" "PMC9453314" "PMC9453157" "PMC9426562"
## [91] "PMC9445361" "PMC9440308" "PMC9440133" "PMC9439955" "PMC9438737"
## [96] "PMC9433322" "PMC9429972" "PMC9429730" "PMC9428261"
NUM_ERROR_GENELIST_ARTICLES <- length(GENELIST_ERROR_ARTICLES)
NUM_ERROR_GENELIST_ARTICLES
## [1] 99
ERROR_PROPORTION = NUM_ERROR_GENELIST_ARTICLES / NUM_GENELIST_ARTICLES
ERROR_PROPORTION
## [1] 0.2591623
Here you can have a look at all the gene lists detected in the past month, as well as those with errors. The dates are obvious errors; they are commonly dates in September, March, December and October. The five-digit numbers are dates in Excel's internal format, where each number is a count of days from the start of 1900. If you were to put these numbers into Excel and format the cells as dates, most of them would also map to dates in September, March, December and October.
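As a small worked example of that mapping, the following sketch converts a few of the five-digit values from the listing below into calendar dates. It assumes the 1900 date system used by Excel on Windows, for which the commonly used origin in R is 1899-12-30. The helper name excel_serial_to_date is hypothetical.

```r
# Hypothetical helper: convert Excel serial day numbers (1900 date system) to Date.
excel_serial_to_date <- function(serial) {
  as.Date(as.numeric(serial), origin = "1899-12-30")
}

serials <- c("44621", "44805", "36951")
data.frame(serial = serials, date = format(excel_serial_to_date(serials)))
# dates: 2022-03-01, 2022-09-01, 2001-03-01
```

Under this assumption, 44621 and 36951 both fall on 1 March and 44805 on 1 September, which is what one would expect if symbols such as MARCH1 and SEPT1 had been converted.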
#GENELISTS
ERROR_GENELISTS
## [1] "PMC9522333 PMC_DL/PMC9522333/supplementaryfiles/pnas.2208496119.sd02.xlsx Athaliana 1 44781"
## [2] "PMC9521910 PMC_DL/PMC9521910/supplementaryfiles/pone.0274879.s005.xlsx Hsapiens 40 12-Sep 02-Mar 15-Sep 02-Mar 02-Sep 01-Mar 12-Sep 01-Sep 02-Sep 08-Mar 12-Sep 02-Mar 15-Sep 03-Sep 02-Sep 06-Mar 12-Sep 02-Mar 01-Mar 02-Sep 12-Sep 02-Mar 02-Sep 12-Sep 09-Sep 02-Mar 02-Mar 01-Mar 03-Sep 02-Sep 10-Sep 06-Mar 08-Mar 05-Mar 02-Mar 07-Mar 02-Sep 11-Sep 06-Mar 03-Mar"
## [3] "PMC9520462 PMC_DL/PMC9520462/supplementaryfiles/Table_1.xls Hsapiens 2 2022/03/01 2022/09/04"
## [4] "PMC9515394 PMC_DL/PMC9515394/supplementaryfiles/Table_3.xlsx Hsapiens 1 44805"
## [5] "PMC9514850 PMC_DL/PMC9514850/supplementaryfiles/elife-75521-supp1.xlsx Mmusculus 1 36951"
## [6] "PMC9514850 PMC_DL/PMC9514850/supplementaryfiles/elife-75521-supp1.xlsx Mmusculus 1 36951"
## [7] "PMC9514850 PMC_DL/PMC9514850/supplementaryfiles/elife-75521-supp1.xlsx Mmusculus 2 37681 40057"
## [8] "PMC9514850 PMC_DL/PMC9514850/supplementaryfiles/elife-75521-supp1.xlsx Hsapiens 1 38596"
## [9] "PMC9514850 PMC_DL/PMC9514850/supplementaryfiles/elife-75521-supp2.xlsx Mmusculus 1 36951"
## [10] "PMC9514850 PMC_DL/PMC9514850/supplementaryfiles/elife-75521-supp2.xlsx Mmusculus 1 36951"
## [11] "PMC9514850 PMC_DL/PMC9514850/supplementaryfiles/elife-75521-supp2.xlsx Mmusculus 1 40057"
## [12] "PMC9514850 PMC_DL/PMC9514850/supplementaryfiles/elife-75521-supp2.xlsx Mmusculus 1 37681"
## [13] "PMC9513135 PMC_DL/PMC9513135/supplementaryfiles/Table_3.xlsx Hsapiens 3 44629 44808 44629"
## [14] "PMC9513039 PMC_DL/PMC9513039/supplementaryfiles/Table_4.xlsx Hsapiens 3 44621 44896 44622"
## [15] "PMC9513039 PMC_DL/PMC9513039/supplementaryfiles/Table_4.xlsx Hsapiens 3 44621 44896 44622"
## [16] "PMC9510610 PMC_DL/PMC9510610/supplementaryfiles/Table_3.xls Hsapiens 7 2021/03/05 2021/03/02 2021/03/08 2021/03/06 2021/03/01 2021/03/07 2021/03/03"
## [17] "PMC9510610 PMC_DL/PMC9510610/supplementaryfiles/Table_4.xls Hsapiens 7 2021/03/02 2021/03/06 2021/03/01 2021/03/07 2021/03/03 2021/03/05 2021/03/08"
## [18] "PMC9510610 PMC_DL/PMC9510610/supplementaryfiles/Table_4.xls Hsapiens 7 2021/03/02 2021/03/06 2021/03/01 2021/03/07 2021/03/03 2021/03/05 2021/03/08"
## [19] "PMC9508851 zip/Supplementary_Table_S2_marker_genes_slide1.xlsx Athaliana 1 44807"
## [20] "PMC9508851 zip/Supplementary_Table_S2_marker_genes_slide1.xlsx Athaliana 1 44807"
## [21] "PMC9508851 zip/Supplementary_Table_S2_marker_genes_slide1.xlsx Athaliana 1 44807"
## [22] "PMC9508851 zip/Supplementary_Table_S3_marker_genes_slide2.xlsx Athaliana 1 44807"
## [23] "PMC9508851 zip/Supplementary_Table_S3_marker_genes_slide2.xlsx Athaliana 1 44807"
## [24] "PMC9509653 PMC_DL/PMC9509653/supplementaryfiles/12915_2022_1398_MOESM3_ESM.xlsx Drerio 1 38961"
## [25] "PMC9509553 PMC_DL/PMC9509553/supplementaryfiles/13073_2022_1112_MOESM1_ESM.xlsx Hsapiens 11 39142 36951 39326 37316 38777 38412 37135 37500 39508 38596 40787"
## [26] "PMC9509553 PMC_DL/PMC9509553/supplementaryfiles/13073_2022_1112_MOESM1_ESM.xlsx Hsapiens 5 38596 39508 38412 40787 39873"
## [27] "PMC9509553 PMC_DL/PMC9509553/supplementaryfiles/13073_2022_1112_MOESM1_ESM.xlsx Hsapiens 2 39142 38777"
## [28] "PMC9509553 PMC_DL/PMC9509553/supplementaryfiles/13073_2022_1112_MOESM1_ESM.xlsx Hsapiens 3 36951 38596 39142"
## [29] "PMC9509553 PMC_DL/PMC9509553/supplementaryfiles/13073_2022_1112_MOESM1_ESM.xlsx Hsapiens 1 36951"
## [30] "PMC9509351 PMC_DL/PMC9509351/supplementaryfiles/41597_2022_1681_MOESM6_ESM.xlsx Hsapiens 1 44806"
## [31] "PMC9504918 zip/Supplementary_Table_5.xlsx Hsapiens 19 43525 43526 43527 43529 43530 43531 43532 43533 43723 43718 43719 43709 43710 43711 43712 43714 43715 43716 43717"
## [32] "PMC9504918 zip/Supplementary_Table_5.xlsx Hsapiens 2 43709 43711"
## [33] "PMC9504918 zip/Supplementary_Table_5.xlsx Hsapiens 2 43709 43711"
## [34] "PMC9504918 zip/Supplementary_Table_5.xlsx Hsapiens 2 43711 43709"
## [35] "PMC9504918 zip/Supplementary_Table_1.xlsx Hsapiens 19 43525 43526 43527 43529 43530 43531 43532 43533 43723 43718 43719 43709 43710 43711 43712 43714 43715 43716 43717"
## [36] "PMC9504918 zip/Supplementary_Table_1.xlsx Hsapiens 2 43709 43711"
## [37] "PMC9504918 zip/Supplementary_Table_1.xlsx Hsapiens 2 43709 43711"
## [38] "PMC9504918 zip/Supplementary_Table_1.xlsx Hsapiens 2 43709 43711"
## [39] "PMC9504918 zip/Supplementary_Table_3.xlsx Hsapiens 495 43529 43525 43716 43717 43535 43719 43525 43711 43534 43714 43527 43528 43717 43719 43717 43714 43712 43709 43719 43531 43719 43712 43717 43717 43528 43717 43715 43716 43716 43717 43534 43717 43528 43525 43535 43712 43527 43719 43530 43717 43717 43718 43531 43719 43531 43716 43528 43717 43532 43717 43720 43526 43719 43717 43717 43525 43528 43717 43717 43717 43717 43528 43713 43535 43717 43532 43717 43535 43525 43534 43717 43717 43715 43718 43717 43711 43527 43716 43713 43531 43717 43715 43713 43717 43714 43719 43528 43800 43713 43717 43717 43533 43717 43719 43525 43717 43532 43721 43534 43717 43531 43532 43528 43721 43714 43711 43525 43531 43717 43532 43717 43717 43719 43525 43533 43717 43535 43717 43717 43527 43716 43528 43528 43717 43525 43717 43715 43527 43721 43716 43534 43714 43715 43527 43528 43531 43722 43714 43717 43716 43526 43714 43717 43717 43530 43714 43527 43534 43528 43716 43722 43717 43720 43525 43717 43721 43709 43528 43714 43535 43527 43721 43719 43528 43717 43527 43717 43716 43714 43534 43717 43719 43717 43713 43713 43525 43525 43525 43712 43717 43712 43717 43717 43526 43525 43713 43717 43527 43533 43717 43526 43719 43719 43721 43527 43717 43535 43717 43721 43535 43712 43717 43717 43717 43717 43717 43530 43535 43715 43717 43717 43527 43717 43533 43717 43712 43525 43534 43534 43712 43716 43717 43711 43713 43713 43528 43717 43532 43532 43526 43532 43712 43800 43532 43526 43719 43534 43800 43717 43532 43711 43528 43716 43531 43717 43717 43715 43800 43717 43534 43800 43719 43717 43530 43532 43717 43527 43535 43717 43534 43534 43717 43717 43717 43717 43712 43525 43527 43525 43717 43721 43717 43714 43712 43716 43534 43528 43535 43712 43535 43717 43715 43717 43717 43717 43719 43711 43525 43709 43533 43721 43534 43528 43534 43800 43717 43717 43717 43719 43534 43714 43722 43716 43535 43534 43717 43530 43526 43709 43715 43717 43525 43714 43526 43535 43527 43715 43717 43535 43717 43719 43717 
43527 43712 43709 43532 43532 43534 43716 43717 43717 43717 43717 43721 43528 43535 43716 43527 43712 43532 43532 43527 43527 43525 43530 43717 43710 43528 43532 43532 43525 43717 43532 43526 43534 43717 43531 43528 43716 43530 43717 43530 43721 43526 43717 43709 43712 43534 43713 43534 43713 43717 43711 43717 43721 43711 43534 43530 43711 43528 43717 43717 43717 43717 43534 43717 43717 43714 43534 43717 43715 43525 43717 43717 43525 43717 43715 43534 43526 43714 43719 43717 43531 43717 43717 43715 43717 43535 43711 43527 43530 43526 43534 43716 43528 43715 43712 43531 43534 43525 43526 43530 43534 43717 43535 43716 43717 43527 43717 43719 43535 43525 43721 43717 43534 43717 43721 43719 43717 43526 43717 43532 43713 43719 43527 43527 43531 43713 43525 43534 43719 43527 43535 43528 43721 43716 43525 43528 43717 43716 43526 43534 43714 43527 43717 43711 43717 43527 43712 43717 43717 43717 43716 43530 43717 43717 43721 43532 43715 43532 43717 43720 43532 43534 43709 43717 43527 43719 43716 43719 43531 43528 43719 43717 43529"
## [40] "PMC9504918 zip/Supplementary_Table_3.xlsx Hsapiens 9 43525 43525 43525 43525 43525 43525 43525 43525 43525"
## [41] "PMC9502060 PMC_DL/PMC9502060/supplementaryfiles/mmc2.xlsx Hsapiens 89 37865 37865 41153 41153 41153 41153 40422 41153 41153 41153 41153 41153 41153 40422 41153 41153 40422 42248 41153 37500 37500 40238 37500 37500 40238 37681 37500 37316 42248 42248 42248 37500 37316 41153 42248 37500 37316 37316 40787 37316 37500 42248 42248 36951 42248 40787 37316 37500 42248 37500 36951 40057 37500 37316 37500 37316 37316 37316 37135 36951 37316 38047 40057 37316 37316 37316 40057 37316 40422 38777 37316 38412 40787 40787 37316 40057 37316 37865 39508 39508 39508 39508 40057 40603 37316 39508 39508 36951 39508"
## [42] "PMC9502060 PMC_DL/PMC9502060/supplementaryfiles/mmc2.xlsx Hsapiens 2 40422 37316"
## [43] "PMC9502060 PMC_DL/PMC9502060/supplementaryfiles/mmc2.xlsx Hsapiens 87 36951 38047 42248 42248 42248 42248 39508 42248 42248 41153 42248 40422 42248 40057 37500 41153 40787 40057 42248 37500 37865 40238 39508 39508 41153 40057 41153 40057 41153 41153 41153 37316 41153 38777 41153 37316 41153 37316 40057 37500 37316 37316 37316 40787 41153 41153 41153 36951 37316 38412 37316 40787 36951 39508 37135 37316 37500 40422 37500 40422 37865 37500 37316 37316 36951 40422 37316 40238 40787 39508 37316 37316 39508 37316 37500 37316 37500 37681 37500 37316 37500 37316 37500 39508 37865 37500 40603"
## [44] "PMC9502060 PMC_DL/PMC9502060/supplementaryfiles/mmc2.xlsx Hsapiens 2 37316 40422"
## [45] "PMC9502050 PMC_DL/PMC9502050/supplementaryfiles/mmc2.xlsx Hsapiens 4 36951 39873 36951 39508"
## [46] "PMC9501657 zip/Table_S1.xlsx Drerio 3 44627 44629 44807"
## [47] "PMC9501657 zip/Table_S1.xlsx Drerio 1 44627"
## [48] "PMC9501657 zip/Table_S1.xlsx Drerio 2 44629 44807"
## [49] "PMC9501657 zip/Table_S8.xlsx Drerio 1 44810"
## [50] "PMC9498843 zip/genes-1865471-supplementary.xlsx Hsapiens 7 36951 38412 39142 38412 39142 39326 40238"
## [51] "PMC9498843 zip/genes-1865471-supplementary.xlsx Hsapiens 3 38231 37316 37316"
## [52] "PMC9498843 zip/genes-1865471-supplementary.xlsx Hsapiens 2 41883 39326"
## [53] "PMC9493312 PMC_DL/PMC9493312/supplementaryfiles/Table5.xlsx Drerio 44 36951 37500 38961 41153 38961 41153 38412 42248 37530 40422 38961 36951 42248 39508 37865 37712 37834 37135 37500 37347 37165 37500 38930 37469 36982 37712 37834 37104 38200 39295 38565 39661 37347 38626 37895 38261 37135 37135 37500 37865 38991 37530 39356 37135"
## [54] "PMC9492468 PMC_DL/PMC9492468/supplementaryfiles/12943_2022_1648_MOESM7_ESM.xlsx Hsapiens 2 44810 44810"
## [55] "PMC9492468 PMC_DL/PMC9492468/supplementaryfiles/12943_2022_1648_MOESM9_ESM.xlsx Hsapiens 1 44256"
## [56] "PMC9421669 PMC_DL/PMC9421669/supplementaryfiles/ofa-0015-0590-s05.xlsx Hsapiens 1 42795"
## [57] "PMC9421669 PMC_DL/PMC9421669/supplementaryfiles/ofa-0015-0590-s01.xlsx Hsapiens 1 42795"
## [58] "PMC9489775 zip/Supplementary_Data_12-Adar_KO_7dpf_editing_stats.xlsx Drerio 11 44807 44806 44819 44626 44631 44627 44624 44814 44625 44810 44622"
## [59] "PMC9478359 PMC_DL/PMC9478359/supplementaryfiles/mmc9.xlsx Hsapiens 29 44454 44257 44256 44449 44262 44259 44441 44450 44256 44261 44266 44258 44447 44446 44453 44531 44263 44260 44264 44451 44440 44440 44443 44265 44448 44257 44444 44442 44445"
## [60] "PMC9487751 zip/supplementary/Supplementary_Table_1.xlsx Hsapiens 3 44085 44081 44083"
## [61] "PMC9487751 zip/supplementary/Supplementary_Table_2.xlsx Hsapiens 1 44078"
## [62] "PMC9486403 PMC_DL/PMC9486403/supplementaryfiles/Table1.XLSX Mmusculus 1 44808"
## [63] "PMC9486403 PMC_DL/PMC9486403/supplementaryfiles/Table3.xlsx Mmusculus 4 44808 44808 44808 44808"
## [64] "PMC9485127 PMC_DL/PMC9485127/supplementaryfiles/41467_2022_32978_MOESM11_ESM.xlsx Mmusculus 2 43723 43717"
## [65] "PMC9485127 PMC_DL/PMC9485127/supplementaryfiles/41467_2022_32978_MOESM7_ESM.xlsx Mmusculus 5 37316 38961 39692 40422 38231"
## [66] "PMC9485127 PMC_DL/PMC9485127/supplementaryfiles/41467_2022_32978_MOESM12_ESM.xlsx Mmusculus 1 43723"
## [67] "PMC9484640 PMC_DL/PMC9484640/supplementaryfiles/pcbi.1010430.s009.xlsx Hsapiens 1 44166"
## [68] "PMC9484176 PMC_DL/PMC9484176/supplementaryfiles/12920_2022_1340_MOESM7_ESM.xlsx Hsapiens 1 44259"
## [69] "PMC9484176 PMC_DL/PMC9484176/supplementaryfiles/12920_2022_1340_MOESM2_ESM.xlsx Hsapiens 1 44445"
## [70] "PMC9483895 PMC_DL/PMC9483895/supplementaryfiles/438_2022_1954_MOESM9_ESM.xlsx Hsapiens 1 44815"
## [71] "PMC9483895 PMC_DL/PMC9483895/supplementaryfiles/438_2022_1954_MOESM9_ESM.xlsx Hsapiens 2 44631 44815"
## [72] "PMC9482661 PMC_DL/PMC9482661/supplementaryfiles/41598_2022_19654_MOESM8_ESM.xlsx Hsapiens 21 43891 44077 43896 44085 44081 43899 44080 43898 43895 43892 43894 43891 43897 44075 44082 44079 44076 44084 44083 43893 44078"
## [73] "PMC9482661 PMC_DL/PMC9482661/supplementaryfiles/41598_2022_19654_MOESM8_ESM.xlsx Hsapiens 21 43891 43892 43891 43892 43893 43894 43895 43896 43897 43898 43899 44084 44085 44076 44077 44078 44079 44080 44081 44082 44083"
## [74] "PMC9482661 PMC_DL/PMC9482661/supplementaryfiles/41598_2022_19654_MOESM1_ESM.xlsx Hsapiens 40 43894 43894 43896 43892 43891 43891 43898 43898 43895 43900 43900 43897 43897 43894 43894 43894 43894 43896 43893 43893 43900 43891 43892 43891 43893 43891 43898 43895 43900 43892 43896 43896 43901 43900 43894 43891 43891 43891 43900 43894"
## [75] "PMC9482661 PMC_DL/PMC9482661/supplementaryfiles/41598_2022_19654_MOESM2_ESM.xlsx Hsapiens 25 42625 42438 42616 42614 42618 42430 42619 42434 42433 42437 42440 42624 42435 42437 42621 42620 42627 42622 42623 42430 42439 42617 42432 42705 42431"
## [76] "PMC9481918 PMC_DL/PMC9481918/supplementaryfiles/mmc2.xlsx Hsapiens 10 44623 44625 44622 44627 44629 44621 44628 44622 44626 44624"
## [77] "PMC9481918 PMC_DL/PMC9481918/supplementaryfiles/mmc2.xlsx Hsapiens 10 44629 44627 44625 44623 44622 44621 44622 44628 44624 44626"
## [78] "PMC9481918 PMC_DL/PMC9481918/supplementaryfiles/mmc2.xlsx Hsapiens 10 44622 44625 44627 44623 44629 44622 44628 44626 44624 44621"
## [79] "PMC9481918 PMC_DL/PMC9481918/supplementaryfiles/mmc2.xlsx Hsapiens 9 44623 44625 44622 44627 44626 44628 44629 44621 44621"
## [80] "PMC9481918 PMC_DL/PMC9481918/supplementaryfiles/mmc2.xlsx Hsapiens 9 44623 44621 44621 44628 44629 44622 44625 44626 44627"
## [81] "PMC9481918 PMC_DL/PMC9481918/supplementaryfiles/mmc2.xlsx Hsapiens 14 44628 44625 44627 44629 44626 44622 44630 44896 44621 44631 44622 44621 44623 44624"
## [82] "PMC9481918 PMC_DL/PMC9481918/supplementaryfiles/mmc2.xlsx Hsapiens 9 44263 44260 44262 44257 44264 44261 44258 44256 44259"
## [83] "PMC9481918 PMC_DL/PMC9481918/supplementaryfiles/mmc2.xlsx Hsapiens 1 44261"
## [84] "PMC9481918 PMC_DL/PMC9481918/supplementaryfiles/mmc2.xlsx Hsapiens 5 44257 44261 44260 44262 44264"
## [85] "PMC9481918 PMC_DL/PMC9481918/supplementaryfiles/mmc2.xlsx Hsapiens 1 44261"
## [86] "PMC9478571 PMC_DL/PMC9478571/supplementaryfiles/Table2.XLSX Hsapiens 6 44258 44262 44263 44257 44260 44261"
## [87] "PMC9478571 PMC_DL/PMC9478571/supplementaryfiles/Table2.XLSX Hsapiens 2 44257 44256"
## [88] "PMC9478571 PMC_DL/PMC9478571/supplementaryfiles/Table2.XLSX Hsapiens 2 44263 44531"
## [89] "PMC9478571 PMC_DL/PMC9478571/supplementaryfiles/Table1.XLSX Hsapiens 10 44262 44261 44258 44263 44260 44257 44256 44257 44531 44263"
## [90] "PMC9477841 PMC_DL/PMC9477841/supplementaryfiles/41598_2022_19782_MOESM4_ESM.xlsx Ggallus 33 43160 43345 43349 43353 43344 43434 43434 43434 43434 43351 43351 43351 43162 43162 43162 43159 43159 43159 43356 43169 43352 43346 43343 43167 43344 43351 43351 43354 43169 43165 43434 43350 43168"
## [91] "PMC9477841 PMC_DL/PMC9477841/supplementaryfiles/41598_2022_19782_MOESM4_ESM.xlsx Hsapiens 33 43343 43346 43159 43162 43356 43162 43162 43352 43159 43169 43159 43434 43434 43344 43345 43353 43434 43434 43351 43351 43351 43160 43349 43169 43354 43344 43167 43351 43351 43350 43168 43434 43165"
## [92] "PMC9477841 PMC_DL/PMC9477841/supplementaryfiles/41598_2022_19782_MOESM2_ESM.xlsx Athaliana 9 43378 43373 43312 43190 43315 43191 43343 43344 43192"
## [93] "PMC9477841 PMC_DL/PMC9477841/supplementaryfiles/41598_2022_19782_MOESM2_ESM.xlsx Athaliana 9 43315 43191 43343 43344 43192 43378 43373 43312 43190"
## [94] "PMC9477841 PMC_DL/PMC9477841/supplementaryfiles/41598_2022_19782_MOESM2_ESM.xlsx Celegans 4 43160 43190 43162 43343"
## [95] "PMC9477841 PMC_DL/PMC9477841/supplementaryfiles/41598_2022_19782_MOESM2_ESM.xlsx Celegans 4 43160 43343 43190 43162"
## [96] "PMC9477841 PMC_DL/PMC9477841/supplementaryfiles/41598_2022_19782_MOESM2_ESM.xlsx Drerio 3 43166 43352 43159"
## [97] "PMC9477841 PMC_DL/PMC9477841/supplementaryfiles/41598_2022_19782_MOESM2_ESM.xlsx Drerio 3 43166 43352 43159"
## [98] "PMC9477841 PMC_DL/PMC9477841/supplementaryfiles/41598_2022_19782_MOESM1_ESM.xlsx Athaliana 11 43191 43343 43378 43373 43312 43190 43315 43191 43343 43344 43192"
## [99] "PMC9477841 PMC_DL/PMC9477841/supplementaryfiles/41598_2022_19782_MOESM1_ESM.xlsx Celegans 6 43251 43163 43160 43190 43162 43343"
## [100] "PMC9477841 PMC_DL/PMC9477841/supplementaryfiles/41598_2022_19782_MOESM1_ESM.xlsx Celegans 6 43160 43163 43251 43343 43190 43162"
## [101] "PMC9477841 PMC_DL/PMC9477841/supplementaryfiles/41598_2022_19782_MOESM1_ESM.xlsx Dmelanogaster 2 43344 43434"
## [102] "PMC9477841 PMC_DL/PMC9477841/supplementaryfiles/41598_2022_19782_MOESM1_ESM.xlsx Dmelanogaster 2 43344 43434"
## [103] "PMC9477841 PMC_DL/PMC9477841/supplementaryfiles/41598_2022_19782_MOESM1_ESM.xlsx Drerio 6 43165 43165 43166 43354 43352 43159"
## [104] "PMC9477841 PMC_DL/PMC9477841/supplementaryfiles/41598_2022_19782_MOESM1_ESM.xlsx Drerio 6 43165 43165 43166 43354 43352 43159"
## [105] "PMC9477516 zip/btac542_Supplementary_Data/Supplementary_Table_1.xlsx Hsapiens 1 60525"
## [106] "PMC9476630 PMC_DL/PMC9476630/supplementaryfiles/supplementary_tables_ddac105.xlsx Mmusculus 3 38596 39326 42248"
## [107] "PMC9476630 PMC_DL/PMC9476630/supplementaryfiles/supplementary_tables_ddac105.xlsx Mmusculus 3 38596 39326 42248"
## [108] "PMC9476630 PMC_DL/PMC9476630/supplementaryfiles/supplementary_tables_ddac105.xlsx Mmusculus 3 39326 38596 42248"
## [109] "PMC9476630 PMC_DL/PMC9476630/supplementaryfiles/supplementary_tables_ddac105.xlsx Mmusculus 3 38596 42248 39326"
## [110] "PMC9476630 PMC_DL/PMC9476630/supplementaryfiles/supplementary_tables_ddac105.xlsx Mmusculus 3 42248 39326 38596"
## [111] "PMC9476630 PMC_DL/PMC9476630/supplementaryfiles/supplementary_tables_ddac105.xlsx Mmusculus 3 38596 39326 42248"
## [112] "PMC9476630 PMC_DL/PMC9476630/supplementaryfiles/supplementary_tables_ddac105.xlsx Mmusculus 3 38596 39326 42248"
## [113] "PMC9468827 PMC_DL/PMC9468827/supplementaryfiles/Table1.XLSX Hsapiens 1 44819"
## [114] "PMC9468827 PMC_DL/PMC9468827/supplementaryfiles/Table1.XLSX Hsapiens 1 44623"
## [115] "PMC9465038 PMC_DL/PMC9465038/supplementaryfiles/Table_2.xlsx Hsapiens 3 44256 44257 44260"
## [116] "PMC9465009 PMC_DL/PMC9465009/supplementaryfiles/Table1.XLSX Hsapiens 1 44806"
## [117] "PMC9461288 PMC_DL/PMC9461288/supplementaryfiles/13287_2022_3160_MOESM1_ESM.xlsx Hsapiens 3 44819 44621 44814"
## [118] "PMC9461288 PMC_DL/PMC9461288/supplementaryfiles/13287_2022_3160_MOESM1_ESM.xlsx Hsapiens 7 44808 44624 44809 44810 44807 44626 44806"
## [119] "PMC9461288 PMC_DL/PMC9461288/supplementaryfiles/13287_2022_3160_MOESM1_ESM.xlsx Hsapiens 2 44257 44445"
## [120] "PMC9458435 zip/Supplementary_Table_1.xlsx Mmusculus 4 44815 44814 44810 44629"
## [121] "PMC9449593 PMC_DL/PMC9449593/supplementaryfiles/EMMM-14-e15855-s001.xlsx Hsapiens 1 37226"
## [122] "PMC9456764 PMC_DL/PMC9456764/supplementaryfiles/pnas.2122170119.sd03.xlsx Mmusculus 1 44259"
## [123] "PMC9456615 zip/suplementary_Table_2_IJMS_02082022.xlsx Hsapiens 1 44081"
## [124] "PMC9456615 zip/suplementary_Table_2_IJMS_02082022.xlsx Hsapiens 2 44085 44083"
## [125] "PMC9456386 zip/Table_S1.xlsx Mmusculus 3 44440 44447 44440"
## [126] "PMC9452507 PMC_DL/PMC9452507/supplementaryfiles/41598_2022_18506_MOESM1_ESM.xlsx Hsapiens 21 44454 44441 44262 44257 44256 44449 44447 44440 44264 44443 44257 44256 44261 44260 44448 44450 44446 44445 44444 44263 44258"
## [127] "PMC9450692 PMC_DL/PMC9450692/supplementaryfiles/Table_1.xlsx Hsapiens 25 44257 44256 44442 44441 44445 44261 44531 44265 44263 44447 44451 44448 44440 44258 44450 44446 44266 44264 44444 44262 44443 44260 44453 44449 44259"
## [128] "PMC9449361 PMC_DL/PMC9449361/supplementaryfiles/Table4.XLSX Hsapiens 24 44449 44259 44257 44256 44443 44262 44448 44266 44261 44441 44440 44260 44263 44445 44444 44446 44264 44442 44256 44450 44258 44447 44257 44265"
## [129] "PMC9449361 PMC_DL/PMC9449361/supplementaryfiles/Table4.XLSX Hsapiens 2 43528 43527"
## [130] "PMC9448678 PMC_DL/PMC9448678/supplementaryfiles/41593_2022_1131_MOESM4_ESM.xlsx Hsapiens 7 44256 44256 44450 44256 44450 44256 44445"
## [131] "PMC9448678 PMC_DL/PMC9448678/supplementaryfiles/41593_2022_1131_MOESM4_ESM.xlsx Hsapiens 603 44440 44257 44448 44442 44258 44264 44256 44449 44260 44441 44262 44450 44445 44261 44446 44263 44447 44448 44446 44450 44257 44449 44447 44260 44445 44256 44264 44262 44441 44440 44261 44263 44446 44449 44440 44445 44256 44441 44447 44261 44260 44448 44257 44264 44263 44450 44262 44449 44445 44450 44263 44264 44441 44261 44262 44448 44440 44447 44446 44257 44256 44260 44256 44446 44449 44441 44442 44440 44448 44260 44262 44445 44450 44261 44264 44263 44257 44447 44450 44263 44445 44447 44443 44264 44262 44449 44446 44256 44440 44257 44260 44261 44448 44441 44448 44256 44264 44447 44446 44443 44261 44449 44263 44450 44445 44260 44440 44262 44257 44441 44256 44263 44446 44443 44262 44261 44448 44257 44440 44449 44441 44445 44450 44260 44264 44447 44256 44450 44261 44443 44447 44440 44446 44264 44262 44448 44449 44445 44441 44257 44260 44263 44450 44447 44445 44448 44446 44440 44443 44264 44256 44261 44262 44449 44257 44441 44263 44260 44263 44449 44261 44257 44256 44445 44450 44440 44264 44447 44446 44260 44441 44448 44262 44441 44448 44449 44263 44447 44450 44264 44440 44445 44262 44260 44261 44257 44446 44256 44445 44264 44262 44446 44260 44448 44441 44449 44443 44450 44447 44256 44440 44263 44261 44257 44257 44449 44447 44448 44261 44263 44256 44260 44262 44445 44264 44441 44450 44446 44440 44264 44257 44449 44446 44447 44448 44445 44263 44256 44441 44260 44450 44262 44440 44261 44448 44446 44257 44445 44261 44256 44449 44440 44263 44450 44260 44441 44447 44262 44264 44440 44264 44262 44449 44258 44446 44450 44261 44260 44441 44447 44445 44256 44257 44448 44263 44443 44256 44446 44449 44440 44257 44261 44443 44445 44441 44263 44448 44450 44264 44260 44262 44447 44450 44449 44446 44256 44445 44448 44263 44264 44441 44262 44447 44257 44440 44261 44260 44258 44450 44446 44256 44445 44441 44449 44260 44447 44257 44264 44263 44440 44262 44448 44261 44445 44450 44447 
44446 44448 44441 44264 44262 44443 44440 44257 44256 44260 44263 44261 44449 44446 44445 44261 44447 44440 44448 44262 44264 44263 44449 44256 44450 44257 44260 44441 44256 44446 44440 44441 44263 44261 44262 44450 44257 44449 44260 44447 44445 44264 44448 44445 44449 44262 44446 44260 44256 44450 44448 44264 44263 44257 44441 44440 44261 44447 44263 44446 44441 44450 44440 44260 44448 44257 44256 44264 44262 44261 44449 44445 44447 44450 44262 44441 44447 44445 44448 44264 44256 44263 44449 44260 44261 44440 44257 44446 44445 44448 44256 44450 44440 44264 44449 44262 44447 44263 44446 44443 44257 44441 44261 44260 44263 44262 44445 44257 44440 44264 44447 44448 44446 44450 44261 44441 44256 44449 44260 44257 44445 44449 44264 44263 44446 44441 44450 44262 44448 44256 44440 44447 44260 44261 44447 44263 44441 44264 44448 44260 44256 44445 44261 44446 44440 44450 44449 44257 44262 44256 44441 44261 44446 44257 44450 44447 44449 44263 44264 44260 44448 44445 44262 44440 44443 44257 44450 44441 44446 44448 44445 44262 44447 44260 44263 44264 44256 44440 44449 44261 44447 44264 44440 44256 44261 44449 44446 44448 44445 44441 44260 44263 44257 44262 44450 44256 44446 44450 44447 44440 44262 44448 44261 44441 44449 44260 44257 44264 44263 44445 44445 44443 44256 44440 44446 44264 44262 44441 44260 44263 44449 44447 44261 44448 44450 44257 44256 44445 44450 44443 44448 44440 44262 44447 44446 44261 44260 44257 44264 44263 44449 44441 44445 44440 44449 44257 44448 44260 44262 44261 44264 44256 44441 44263 44447 44446 44450 44264 44445 44263 44260 44450 44449 44446 44440 44448 44441 44447 44257 44261 44256 44262 44445 44440 44446 44449 44447 44260 44264 44261 44257 44262 44450 44263 44448 44441 44256"
## [132] "PMC9447897 PMC_DL/PMC9447897/supplementaryfiles/pone.0269126.s004.xlsx Hsapiens 6 37865 38231 37316 41153 41883 40603"
## [133] "PMC9445988 PMC_DL/PMC9445988/supplementaryfiles/Table_2.XLSX Hsapiens 1 44258"
## [134] "PMC9445988 PMC_DL/PMC9445988/supplementaryfiles/Table_1.XLSX Hsapiens 39 44624 44626 44628 44623 44631 44819 44628 44626 44627 44621 44626 44625 44625 44809 44814 44628 44627 44626 44625 44627 44631 44623 44819 44625 44819 44621 44622 44623 44814 44627 44896 44626 44625 44627 44896 44626 44622 44625 44896"
## [135] "PMC9445988 PMC_DL/PMC9445988/supplementaryfiles/Table_6.XLSX Hsapiens 1 44261"
## [136] "PMC9441061 PMC_DL/PMC9441061/supplementaryfiles/12967_2022_3600_MOESM2_ESM.xls Hsapiens 3 44621 44896 44622"
## [137] "PMC9436054 PMC_DL/PMC9436054/supplementaryfiles/pgen.1010294.s008.xlsx Hsapiens 1 44084"
## [138] "PMC9436054 PMC_DL/PMC9436054/supplementaryfiles/pgen.1010294.s008.xlsx Hsapiens 2 44084 44085"
## [139] "PMC9434886 zip/HRPCa_SV_TableS5.xlsx Hsapiens 3 37226 37226 37226"
## [140] "PMC9434886 zip/HRPCa_SV_TableS5.xlsx Hsapiens 3 37226 37226 37226"
## [141] "PMC9434886 zip/HRPCa_SV_TableS5.xlsx Hsapiens 7 37226 37226 37226 37226 37226 37226 37226"
## [142] "PMC9434886 zip/HRPCa_SV_TableS5.xlsx Hsapiens 2 37226 37226"
## [143] "PMC9434886 zip/HRPCa_SV_TableS4.xlsx Hsapiens 3 37226 37226 37226"
## [144] "PMC9434886 zip/HRPCa_SV_TableS4.xlsx Hsapiens 3 37226 37226 37226"
## [145] "PMC9434320 PMC_DL/PMC9434320/supplementaryfiles/jkac167_table_s5.xlsx Athaliana 3 43897 44050 44044"
## [146] "PMC9434320 PMC_DL/PMC9434320/supplementaryfiles/jkac167_table_s5.xlsx Athaliana 3 43897 44050 44044"
## [147] "PMC9396287 PMC_DL/PMC9396287/supplementaryfiles/MSB-18-e10473-s002.xlsx Mmusculus 144 44256 44264 44266 44450 44261 44449 44448 44257 44260 44257 44265 44441 44446 44444 44263 44442 44262 44259 44447 44440 44258 44445 44443 44256 44266 44448 44263 44256 44445 44442 44450 44449 44257 44441 44260 44446 44259 44262 44258 44444 44443 44447 44257 44264 44261 44440 44256 44265 44266 44262 44444 44263 44259 44260 44449 44440 44264 44261 44257 44257 44442 44445 44447 44441 44446 44448 44265 44443 44258 44256 44450 44256 44266 44262 44261 44446 44263 44448 44260 44450 44444 44259 44441 44449 44256 44257 44445 44447 44443 44258 44442 44264 44440 44257 44256 44265 44261 44257 44450 44448 44264 44449 44266 44446 44444 44441 44445 44447 44260 44263 44257 44443 44256 44259 44442 44262 44258 44440 44256 44265 44266 44257 44263 44448 44445 44442 44449 44259 44450 44260 44446 44444 44264 44441 44262 44261 44256 44447 44257 44440 44258 44443 44256 44265"
## [148] "PMC9481139 zip/sciadv.abp9005_data_file_s3.xlsx Hsapiens 8 44809 44626 44810 44815 44623 44812 44811 44806"
## [149] "PMC9481139 zip/sciadv.abp9005_data_file_s6.xlsx Hsapiens 8 44808 44815 44812 44809 44623 44813 44625 44819"
## [150] "PMC9481139 zip/sciadv.abp9005_data_file_s5.xlsx Hsapiens 21 44443 44444 44440 44442 44445 44261 44447 44448 44454 44260 44264 44262 44441 44450 44446 44256 44263 44258 44263 44257 44256"
## [151] "PMC9451151 zip/sciadv.abh2868_table_s6.xlsx Mmusculus 10 43900 43893 43895 43896 43899 44084 44085 44076 44081 44082"
## [152] "PMC9451151 zip/sciadv.abh2868_table_s6.xlsx Mmusculus 3 43893 44076 44081"
## [153] "PMC9451151 zip/sciadv.abh2868_table_s6.xlsx Mmusculus 2 43899 44076"
## [154] "PMC9451151 zip/sciadv.abh2868_table_s5.xlsx Mmusculus 6 43893 43895 43896 44084 44076 44081"
## [155] "PMC9451151 zip/sciadv.abh2868_table_s5.xlsx Mmusculus 16 43891 43900 43901 43892 43893 43894 43897 43899 44084 44076 44077 44078 44079 44081 44082 44083"
## [156] "PMC9451151 zip/sciadv.abh2868_table_s5.xlsx Mmusculus 3 43899 44076 44081"
## [157] "PMC9451151 zip/sciadv.abh2868_table_s5.xlsx Mmusculus 9 43891 43900 43894 43895 43897 43899 44076 44081 44083"
## [158] "PMC9451151 zip/sciadv.abh2868_table_s5.xlsx Mmusculus 14 43900 43901 43893 43895 43896 43899 44084 44085 44076 44077 44079 44081 44082 44083"
## [159] "PMC9451151 zip/sciadv.abh2868_table_s5.xlsx Mmusculus 11 43900 43893 43895 43896 43897 43899 44084 44085 44076 44081 44082"
## [160] "PMC9451151 zip/sciadv.abh2868_table_s4.xlsx Mmusculus 1 43896"
## [161] "PMC9451151 zip/sciadv.abh2868_table_s4.xlsx Mmusculus 4 43893 43896 44085 43900"
## [162] "PMC9451151 zip/sciadv.abh2868_table_s4.xlsx Mmusculus 1 43896"
## [163] "PMC9451151 zip/sciadv.abh2868_table_s4.xlsx Mmusculus 8 44081 43893 43893 43894 44083 44083 44083 44083"
## [164] "PMC9451151 zip/sciadv.abh2868_table_s4.xlsx Mmusculus 16 44081 44081 44081 44081 43893 43893 43893 43893 43893 44089 43896 43894 43894 44083 44083 44083"
## [165] "PMC9451151 zip/sciadv.abh2868_table_s4.xlsx Mmusculus 3 43893 43896 43900"
## [166] "PMC9451151 zip/sciadv.abh2868_table_s4.xlsx Mmusculus 8 44081 43897 43893 43896 44085 44079 43900 43900"
## [167] "PMC9451151 zip/sciadv.abh2868_table_s4.xlsx Mmusculus 19 43893 43893 43893 43896 43894 43894 43894 43894 43899 44085 44083 44083 44083 44083 44083 44083 44083 44083 44079"
## [168] "PMC9451151 zip/sciadv.abh2868_table_s4.xlsx Mmusculus 26 44081 43901 43898 43893 43893 43893 43893 43893 43896 43894 43894 43894 43894 43899 44085 44085 44083 44083 44083 44083 44083 44083 44083 44083 44083 44079"
## [169] "PMC9451151 zip/sciadv.abh2868_table_s4.xlsx Mmusculus 12 44081 43893 43891 43896 43899 44085 44083 44083 44079 44079 43900 43900"
## [170] "PMC9451151 zip/sciadv.abh2868_table_s1.xlsx Mmusculus 59 42981 42983 42798 42989 42799 42993 42803 42800 42796 42801 42796 42988 42988 42804 42987 42987 42987 42980 42980 42980 42799 42799 42795 42795 42984 42984 42985 42986 42986 42984 42982 42982 42982 42795 42795 42802 42802 42802 42989 42989 42805 42985 42980 42982 42987 42979 42984 42990 42802 42992 42990 42986 42796 42804 42795 42797 42805 42796 42796"
## [171] "PMC9519376 PMC_DL/PMC9519376/supplementaryfiles/mmc2.xlsx Hsapiens 17 43528 43715 43718 43531 43530 43719 43529 43532 43710 43533 43717 43526 43714 43527 43525 43526 43716"
## [172] "PMC9518893 PMC_DL/PMC9518893/supplementaryfiles/pgen.1009923.s010.xlsx Hsapiens 31 37316 37316 40057 40057 37316 37316 37316 37316 37316 37500 37500 39326 41883 41883 41883 41883 41883 38231 38231 40057 40057 40057 40057 40057 40057 38596 38596 37865 37865 37865 40057"
## [173] "PMC9518822 zip/Supplementary_File_S1_RAW.xlsx Hsapiens 9 37500 39326 40057 40787 39692 38961 38412 36951 38777"
## [174] "PMC9518822 zip/Supplementary_File_S1_RAW.xlsx Hsapiens 8 37500 39326 40057 39692 40787 38961 38412 38777"
## [175] "PMC9518822 zip/Supplementary_File_S2_DEPS.xlsx Hsapiens 1 40057"
## [176] "PMC9515781 PMC_DL/PMC9515781/supplementaryfiles/Table8.XLSX Mmusculus 2 5-Mar 5-Sep"
## [177] "PMC9515781 PMC_DL/PMC9515781/supplementaryfiles/Table9.XLSX Mmusculus 2 5-Sep 5-Mar"
## [178] "PMC9515781 PMC_DL/PMC9515781/supplementaryfiles/Table10.XLSX Mmusculus 2 5-Sep 5-Mar"
## [179] "PMC9515781 PMC_DL/PMC9515781/supplementaryfiles/Table7.XLSX Mmusculus 2 5-Mar 5-Sep"
## [180] "PMC9473489 PMC_DL/PMC9473489/supplementaryfiles/CTM2-12-e990-s010.xlsx Hsapiens 5 44814 44813 44809 44630 44808"
## [181] "PMC9473489 PMC_DL/PMC9473489/supplementaryfiles/CTM2-12-e990-s010.xlsx Hsapiens 2 44624 42803"
## [182] "PMC9473489 PMC_DL/PMC9473489/supplementaryfiles/CTM2-12-e990-s010.xlsx Hsapiens 1 44808"
## [183] "PMC9473489 PMC_DL/PMC9473489/supplementaryfiles/CTM2-12-e990-s010.xlsx Hsapiens 3 44819 44811 44621"
## [184] "PMC9473489 PMC_DL/PMC9473489/supplementaryfiles/CTM2-12-e990-s016.xlsx Hsapiens 3 44819 44811 44621"
## [185] "PMC9473489 PMC_DL/PMC9473489/supplementaryfiles/CTM2-12-e990-s016.xlsx Hsapiens 1 44808"
## [186] "PMC9513945 PMC_DL/PMC9513945/supplementaryfiles/12935_2022_2698_MOESM2_ESM.xlsx Hsapiens 9 44448 44265 44441 44261 44256 44447 44260 44442 44263"
## [187] "PMC9513945 PMC_DL/PMC9513945/supplementaryfiles/12935_2022_2698_MOESM2_ESM.xlsx Hsapiens 1 44440"
## [188] "PMC9513945 PMC_DL/PMC9513945/supplementaryfiles/12935_2022_2698_MOESM2_ESM.xlsx Hsapiens 1 44440"
## [189] "PMC9513884 PMC_DL/PMC9513884/supplementaryfiles/12920_2022_1339_MOESM5_ESM.xlsx Hsapiens 9 44621 44624 44627 44629 44626 44628 44623 44622 44625"
## [190] "PMC9513589 PMC_DL/PMC9513589/supplementaryfiles/Data_Sheet_2.xlsx Athaliana 1 44476"
## [191] "PMC9512081 PMC_DL/PMC9512081/supplementaryfiles/NIHMS1835097-supplement-supplemental_table_2.xlsx Mmusculus 1 44442"
## [192] "PMC9511766 PMC_DL/PMC9511766/supplementaryfiles/13059_2022_2772_MOESM2_ESM.xlsx Mmusculus 4 44260 44454 44445 44448"
## [193] "PMC9511766 PMC_DL/PMC9511766/supplementaryfiles/13059_2022_2772_MOESM2_ESM.xlsx Mmusculus 4 44449 44443 44260 44448"
## [194] "PMC9511766 PMC_DL/PMC9511766/supplementaryfiles/13059_2022_2772_MOESM2_ESM.xlsx Mmusculus 5 44449 44451 44445 44448 44444"
## [195] "PMC9511766 PMC_DL/PMC9511766/supplementaryfiles/13059_2022_2772_MOESM2_ESM.xlsx Mmusculus 1 44448"
## [196] "PMC9508838 zip/SupplementaryTableS2_Vuilleumier.xlsx Dmelanogaster 1 37500"
## [197] "PMC9508838 zip/SupplementaryTableS2_Vuilleumier.xlsx Dmelanogaster 4 37135 37500 38596 38231"
## [198] "PMC9509640 PMC_DL/PMC9509640/supplementaryfiles/13058_2022_1540_MOESM7_ESM.xlsx Hsapiens 2 40787 38777"
## [199] "PMC9508879 zip/raw_data_for_Figure_2-6/Figure4-DEG_RASG_overlap.xlsx Hsapiens 1 44813"
## [200] "PMC9508879 zip/raw_data_for_Figure_2-6/Figure4-siDDX17_vs_siCtrl_NIR_RAS_p0.05.xlsx Hsapiens 7 44622 44813 44813 44813 44813 44813 44813"
## [201] "PMC9508879 zip/raw_data_for_Figure_2-6/Figure6-NIR_RASE_olp_anno.xlsx Hsapiens 1 44813"
## [202] "PMC9508879 zip/raw_data_for_Figure_2-6/Figure2-siDDX17_vs_siCtrl_Sig_DEG.xlsx Hsapiens 1 44813"
## [203] "PMC9508879 zip/raw_data_for_Figure_2-6/Figure4-siDDX17_vs_siCtrl_IR_RAS_p0.05.xlsx Hsapiens 1 44622"
## [204] "PMC9500203 PMC_DL/PMC9500203/supplementaryfiles/Table3.xlsx Hsapiens 3 44622 44626 44630"
## [205] "PMC9500203 PMC_DL/PMC9500203/supplementaryfiles/Table5.xlsx Hsapiens 1 44630"
## [206] "PMC9500203 PMC_DL/PMC9500203/supplementaryfiles/Table4.xlsx Hsapiens 1 44630"
## [207] "PMC9500044 PMC_DL/PMC9500044/supplementaryfiles/41467_2022_33333_MOESM8_ESM.xlsx Mmusculus 11 44813 44813 44813 44810 44621 44630 44813 44630 44819 44627 44626"
## [208] "PMC9500044 PMC_DL/PMC9500044/supplementaryfiles/41467_2022_33333_MOESM8_ESM.xlsx Mmusculus 14 44813 44813 44813 44810 44621 44630 44622 44621 44813 44819 44819 44627 44813 44626"
## [209] "PMC9500044 PMC_DL/PMC9500044/supplementaryfiles/41467_2022_33333_MOESM7_ESM.xlsx Mmusculus 18 44813 44815 44813 44810 44813 44809 44622 44811 44621 44808 44813 44819 44628 44626 44808 44626 44819 44813"
## [210] "PMC9500044 PMC_DL/PMC9500044/supplementaryfiles/41467_2022_33333_MOESM7_ESM.xlsx Mmusculus 9 44813 44813 44815 44810 44810 44813 44621 44630 44819"
## [211] "PMC9500044 PMC_DL/PMC9500044/supplementaryfiles/41467_2022_33333_MOESM7_ESM.xlsx Mmusculus 12 44813 44813 44815 44810 44813 44621 44813 44808 44811 44622 44819 44626"
## [212] "PMC9499627 zip/Table_S9.xlsx Athaliana 1 44840"
## [213] "PMC9495734 zip/TableS1-TableS14.xlsx Ggallus 1 44630"
## [214] "PMC9495734 zip/TableS1-TableS14.xlsx Hsapiens 3 44628 44630 44629"
## [215] "PMC9495559 zip/Supplementary_Data.xlsx Ggallus 1 44628"
## [216] "PMC9493120 PMC_DL/PMC9493120/supplementaryfiles/Data_Sheet_1.xlsx Hsapiens 11 43898 44077 44082 44084 44084 43898 43891 43898 44077 44077 43895"
## [217] "PMC9490000 zip/Additional_Files/Table_S4._Detailed_information_of_the_detected_CNVRs.xlsx Hsapiens 1 44263"
## [218] "PMC9490000 zip/Additional_Files/Table_S4._Detailed_information_of_the_detected_CNVRs.xlsx Hsapiens 1 44263"
## [219] "PMC9485579 PMC_DL/PMC9485579/supplementaryfiles/Table_3.XLSX Hsapiens 19 44896 44621 44622 44621 44622 44623 44625 44626 44627 44628 44819 44814 44815 44806 44808 44810 44811 44812 44813"
## [220] "PMC9485579 PMC_DL/PMC9485579/supplementaryfiles/Table_2.XLSX Hsapiens 26 44896 44621 44622 44621 44630 44631 44622 44623 44624 44625 44626 44627 44628 44629 44819 44805 44814 44815 44816 44806 44807 44808 44810 44811 44812 44813"
## [221] "PMC9485579 PMC_DL/PMC9485579/supplementaryfiles/Table_1.XLSX Hsapiens 27 44896 44621 44622 44621 44630 44631 44622 44623 44624 44625 44626 44627 44628 44629 44819 44805 44814 44815 44816 44818 44806 44807 44808 44810 44811 44812 44813"
## [222] "PMC9485579 PMC_DL/PMC9485579/supplementaryfiles/Table_4.XLSX Hsapiens 26 44896 44621 44622 44621 44630 44631 44622 44623 44624 44625 44626 44627 44628 44629 44819 44805 44814 44815 44816 44806 44807 44808 44810 44811 44812 44813"
## [223] "PMC9485579 PMC_DL/PMC9485579/supplementaryfiles/Table_5.XLSX Hsapiens 26 44896 44621 44622 44621 44630 44631 44622 44623 44624 44625 44626 44627 44628 44629 44819 44805 44814 44815 44816 44806 44807 44808 44810 44811 44812 44813"
## [224] "PMC9485579 PMC_DL/PMC9485579/supplementaryfiles/Table_5.XLSX Hsapiens 26 44815 44631 44623 44627 44622 44628 44624 44806 44621 44805 44808 44896 44625 44814 44810 44622 44816 44621 44629 44630 44812 44811 44626 44813 44807 44819"
## [225] "PMC9485450 PMC_DL/PMC9485450/supplementaryfiles/Table1.xlsx Hsapiens 1 44623"
## [226] "PMC9481623 PMC_DL/PMC9481623/supplementaryfiles/41398_2022_2144_MOESM7_ESM.xlsx Mmusculus 1 44257"
## [227] "PMC9478897 PMC_DL/PMC9478897/supplementaryfiles/Table_2.XLS Hsapiens 3 7-Sep 8-Mar 8-Sep"
## [228] "PMC9468337 PMC_DL/PMC9468337/supplementaryfiles/41467_2022_33025_MOESM6_ESM.xlsx Hsapiens 7 44446 44441 44445 44448 44450 44447 44449"
## [229] "PMC9468337 PMC_DL/PMC9468337/supplementaryfiles/41467_2022_33025_MOESM6_ESM.xlsx Hsapiens 6 44446 44441 44447 44448 44445 44450"
## [230] "PMC9468337 PMC_DL/PMC9468337/supplementaryfiles/41467_2022_33025_MOESM6_ESM.xlsx Hsapiens 5 44441 44446 44448 44450 44445"
## [231] "PMC9468337 PMC_DL/PMC9468337/supplementaryfiles/41467_2022_33025_MOESM6_ESM.xlsx Hsapiens 4 44446 44441 44448 44450"
## [232] "PMC9468337 PMC_DL/PMC9468337/supplementaryfiles/41467_2022_33025_MOESM6_ESM.xlsx Hsapiens 5 44441 44446 44448 44445 44450"
## [233] "PMC9468337 PMC_DL/PMC9468337/supplementaryfiles/41467_2022_33025_MOESM6_ESM.xlsx Hsapiens 1 44441"
## [234] "PMC9468337 PMC_DL/PMC9468337/supplementaryfiles/41467_2022_33025_MOESM6_ESM.xlsx Hsapiens 7 44446 44441 44448 44445 44447 44450 44449"
## [235] "PMC9468337 PMC_DL/PMC9468337/supplementaryfiles/41467_2022_33025_MOESM6_ESM.xlsx Hsapiens 7 44450 44445 44449 44446 44447 44441 44448"
## [236] "PMC9468337 PMC_DL/PMC9468337/supplementaryfiles/41467_2022_33025_MOESM6_ESM.xlsx Hsapiens 7 44446 44441 44448 44445 44447 44450 44449"
## [237] "PMC9468337 PMC_DL/PMC9468337/supplementaryfiles/41467_2022_33025_MOESM6_ESM.xlsx Hsapiens 7 44447 44449 44446 44445 44450 44441 44448"
## [238] "PMC9468337 PMC_DL/PMC9468337/supplementaryfiles/41467_2022_33025_MOESM6_ESM.xlsx Hsapiens 7 44441 44446 44448 44450 44445 44447 44449"
## [239] "PMC9468337 PMC_DL/PMC9468337/supplementaryfiles/41467_2022_33025_MOESM6_ESM.xlsx Hsapiens 6 44446 44450 44445 44447 44441 44448"
## [240] "PMC9465246 PMC_DL/PMC9465246/supplementaryfiles/Table3.xlsx Hsapiens 3 44086 44088 43901"
## [241] "PMC9465246 PMC_DL/PMC9465246/supplementaryfiles/Table3.xlsx Hsapiens 4 44166 44086 43901 44088"
## [242] "PMC9458853 PMC_DL/PMC9458853/supplementaryfiles/Data_Sheet_1.xlsx Hsapiens 1 44816"
## [243] "PMC9441509 PMC_DL/PMC9441509/supplementaryfiles/ceo-2022-00206-suppl5.xlsx Hsapiens 1 44261"
## [244] "PMC9458043 zip/molecules-1887054-supplementary/Supplementary_files/Table_S1.xls Mmusculus 1 2017/09/04"
## [245] "PMC9458043 zip/molecules-1887054-supplementary/Supplementary_files/Table_S1.xls Mmusculus 1 2017/09/04"
## [246] "PMC9458043 zip/molecules-1887054-supplementary/Supplementary_files/Table_S1.xls Mmusculus 1 2017/09/04"
## [247] "PMC9458043 zip/molecules-1887054-supplementary/Supplementary_files/Table_S1.xls Mmusculus 1 2017/09/04"
## [248] "PMC9458043 zip/molecules-1887054-supplementary/Supplementary_files/Table_S5.xls Mmusculus 24 2017/03/02 2017/09/09 2017/09/03 2017/03/07 2017/09/02 2017/09/08 2017/03/08 2017/03/05 2017/09/11 2017/03/01 2017/09/14 2017/03/06 2017/03/03 2017/09/06 2017/09/10 2017/03/09 2017/03/10 2017/03/04 2017/09/04 2017/09/15 2017/09/01 2017/09/07 2017/03/11 2017/09/12"
## [249] "PMC9458043 zip/molecules-1887054-supplementary/Supplementary_files/Table_S3.xls Mmusculus 1 2017/09/04"
## [250] "PMC9458043 zip/molecules-1887054-supplementary/Supplementary_files/Table_S3.xls Mmusculus 1 2017/09/04"
## [251] "PMC9458043 zip/molecules-1887054-supplementary/Supplementary_files/Table_S6.xls Mmusculus 1 2017/09/04"
## [252] "PMC9458043 zip/molecules-1887054-supplementary/Supplementary_files/Table_S6.xls Mmusculus 1 2017/09/04"
## [253] "PMC9455665 zip/ijms-1824056-Supplementary_Table_S1.xlsx Hsapiens 22 44814 44811 44808 44815 44621 44806 44628 44627 44622 44621 44896 44812 44818 44805 44624 44807 44629 44816 44622 44810 44809 44630"
## [254] "PMC9454937 zip/cancers-1855002-supplementary-for_xml/Table_S3.xlsx Hsapiens 12 44449 44259 44265 44448 44443 44262 44265 44444 44260 44262 44257 44264"
## [255] "PMC9453314 PMC_DL/PMC9453314/supplementaryfiles/Table_2.xlsx Hsapiens 1 44805"
## [256] "PMC9453157 PMC_DL/PMC9453157/supplementaryfiles/Table3.XLSX Hsapiens 1 44625"
## [257] "PMC9426562 PMC_DL/PMC9426562/supplementaryfiles/mmc4.xlsx Mmusculus 24 41883 40787 40057 39692 39508 38596 37135 40787 40057 37865 37865 40057 40422 40238 40057 36951 37316 40057 40238 37681 40603 38596 40057 40238"
## [258] "PMC9426562 PMC_DL/PMC9426562/supplementaryfiles/mmc4.xlsx Mmusculus 1 40057"
## [259] "PMC9426562 PMC_DL/PMC9426562/supplementaryfiles/mmc7.xlsx Mmusculus 45 37316 38777 38047 37316 38231 40238 37316 38961 38596 40238 37865 39692 38231 38777 38961 37316 39326 37316 40057 40787 40238 37681 38961 40057 38596 40057 40057 40057 36951 36951 39873 40422 40238 40057 40057 40057 39508 37135 36951 37316 40057 40057 38777 36951 37681"
## [260] "PMC9426562 PMC_DL/PMC9426562/supplementaryfiles/mmc7.xlsx Mmusculus 81 41153 38596 36951 37681 40787 39142 36951 37316 38777 40057 38777 39142 36951 38777 38777 40787 39326 39326 37316 37316 36951 40238 38596 37681 40787 40422 37316 40422 40057 40057 40422 38596 37135 37316 40057 37316 37316 40057 37316 37681 36951 40238 40057 40057 37865 38596 38596 39873 39873 38047 40238 36951 40238 40238 40238 40603 40787 40057 40057 40603 40057 38596 39508 40238 38596 39508 38231 36951 40422 40057 38596 38596 37681 37681 37681 39142 39142 40238 37865 39508 40057"
## [261] "PMC9445361 PMC_DL/PMC9445361/supplementaryfiles/Table_6.xlsx Hsapiens 1 44818"
## [262] "PMC9445361 PMC_DL/PMC9445361/supplementaryfiles/Table_6.xlsx Hsapiens 2 44818 44631"
## [263] "PMC9440308 PMC_DL/PMC9440308/supplementaryfiles/mmc2.xlsx Hsapiens 1 44622"
## [264] "PMC9440133 PMC_DL/PMC9440133/supplementaryfiles/41598_2022_19106_MOESM2_ESM.xlsx Hsapiens 25 38596 41153 38231 37316 38047 37681 39692 37865 39326 37226 40787 39508 39142 40238 40603 40057 37135 36951 39873 37316 40422 36951 37500 38412 38777"
## [265] "PMC9440133 PMC_DL/PMC9440133/supplementaryfiles/41598_2022_19106_MOESM2_ESM.xlsx Hsapiens 28 41883 37865 38231 40057 37226 40603 38596 39142 41153 38047 37681 40238 36951 37135 40787 39873 39326 39692 37316 37135 37316 39508 38777 38412 42248 36951 37500 40422"
## [266] "PMC9439955 PMC_DL/PMC9439955/supplementaryfiles/41564_2021_1049_MOESM4_ESM.xlsx Hsapiens 1 42063"
## [267] "PMC9439955 PMC_DL/PMC9439955/supplementaryfiles/41564_2021_1049_MOESM4_ESM.xlsx Hsapiens 1 42255"
## [268] "PMC9438737 PMC_DL/PMC9438737/supplementaryfiles/NIHMS1827843-supplement-Tables_A4_A11.xlsx Hsapiens 4 3-deoxy-2-octulosonic acid(2)-lipid A HG-10-11-01"
## [269] "PMC9433322 PMC_DL/PMC9433322/supplementaryfiles/41586_2022_5126_MOESM4_ESM.xlsx Hsapiens 52 9-Mar 15-Sep 10-Sep 2-Sep 1-Sep 3-Sep 1-Dec 12-Sep 2-Mar 1-Mar 4-Sep 10-Mar 4-Mar 3-Mar 11-Sep 8-Mar 14-Sep 5-Sep 7-Sep 6-Mar 6-Sep 7-Mar 9-Sep 8-Sep 5-Mar 11-Mar 4-Sep 2-Sep 8-Sep 1-Sep 1-Mar 3-Sep 3-Mar 5-Sep 6-Mar 9-Mar 8-Mar 10-Sep 12-Sep 9-Sep 2-Mar 15-Sep 14-Sep 7-Mar 10-Mar 4-Mar 11-Mar 5-Mar 6-Sep 11-Sep 7-Sep 1-Dec"
## [270] "PMC9433322 PMC_DL/PMC9433322/supplementaryfiles/41586_2022_5126_MOESM4_ESM.xlsx Hsapiens 52 9-Mar 15-Sep 10-Sep 2-Sep 1-Sep 3-Sep 1-Dec 12-Sep 2-Mar 1-Mar 4-Sep 10-Mar 4-Mar 3-Mar 11-Sep 8-Mar 14-Sep 5-Sep 7-Sep 6-Mar 6-Sep 7-Mar 9-Sep 8-Sep 5-Mar 11-Mar 4-Sep 2-Sep 8-Sep 1-Sep 1-Mar 3-Sep 3-Mar 5-Sep 6-Mar 9-Mar 8-Mar 10-Sep 12-Sep 9-Sep 2-Mar 15-Sep 14-Sep 7-Mar 10-Mar 4-Mar 11-Mar 5-Mar 6-Sep 11-Sep 7-Sep 1-Dec"
## [271] "PMC9433322 PMC_DL/PMC9433322/supplementaryfiles/41586_2022_5126_MOESM4_ESM.xlsx Hsapiens 880 8-Mar 12-Sep 1-Dec 11-Mar 1-Mar 6-Mar 12-Sep 8-Sep 8-Mar 1-Dec 6-Sep 2-Mar 14-Sep 9-Sep 6-Sep 6-Mar 11-Sep 2-Mar 12-Sep 3-Sep 2-Mar 5-Mar 9-Mar 11-Sep 15-Sep 14-Sep 2-Sep 12-Sep 15-Sep 11-Sep 6-Sep 10-Sep 3-Mar 12-Sep 15-Sep 2-Mar 4-Sep 8-Sep 2-Mar 6-Sep 5-Sep 4-Sep 9-Mar 10-Sep 3-Mar 11-Mar 4-Mar 9-Sep 3-Mar 15-Sep 10-Mar 10-Mar 1-Mar 2-Mar 1-Mar 7-Mar 1-Mar 4-Sep 12-Sep 6-Mar 11-Sep 1-Dec 1-Mar 8-Mar 3-Sep 4-Sep 1-Mar 7-Mar 3-Mar 1-Dec 2-Mar 3-Sep 3-Sep 11-Mar 1-Dec 6-Sep 11-Mar 7-Sep 12-Sep 11-Mar 2-Sep 14-Sep 9-Mar 10-Sep 12-Sep 8-Sep 5-Sep 4-Sep 8-Sep 7-Sep 8-Mar 5-Mar 2-Mar 5-Mar 8-Mar 1-Sep 4-Sep 7-Mar 5-Sep 3-Sep 3-Sep 10-Mar 2-Mar 1-Dec 9-Sep 7-Mar 6-Mar 12-Sep 1-Dec 10-Mar 8-Sep 8-Mar 6-Sep 11-Sep 14-Sep 1-Mar 2-Sep 8-Sep 11-Mar 2-Mar 6-Sep 1-Sep 10-Sep 1-Sep 11-Sep 8-Sep 10-Sep 1-Sep 8-Mar 11-Sep 10-Sep 5-Mar 1-Dec 7-Mar 1-Mar 1-Mar 6-Sep 6-Mar 8-Mar 4-Sep 3-Sep 5-Mar 1-Mar 8-Mar 15-Sep 5-Mar 8-Sep 14-Sep 2-Mar 15-Sep 15-Sep 9-Mar 9-Mar 4-Mar 2-Sep 2-Sep 2-Mar 1-Mar 10-Mar 10-Mar 14-Sep 1-Mar 1-Mar 6-Sep 5-Sep 2-Sep 6-Mar 14-Sep 2-Mar 15-Sep 4-Sep 11-Mar 3-Sep 1-Dec 6-Sep 1-Sep 9-Sep 10-Mar 4-Sep 4-Mar 6-Sep 11-Mar 7-Mar 10-Mar 4-Mar 11-Sep 1-Mar 1-Mar 7-Sep 15-Sep 2-Sep 8-Sep 1-Mar 2-Mar 9-Mar 2-Mar 9-Mar 10-Sep 15-Sep 7-Mar 3-Mar 10-Sep 9-Mar 1-Mar 1-Mar 3-Mar 7-Mar 1-Mar 6-Mar 2-Sep 5-Sep 7-Sep 11-Mar 7-Sep 10-Mar 9-Mar 4-Sep 2-Sep 2-Mar 9-Sep 10-Sep 2-Sep 4-Sep 7-Mar 14-Sep 6-Mar 4-Mar 8-Mar 4-Mar 2-Mar 3-Mar 2-Sep 14-Sep 2-Mar 12-Sep 4-Mar 3-Mar 6-Sep 8-Sep 10-Mar 3-Mar 9-Sep 5-Mar 4-Mar 3-Sep 7-Sep 1-Sep 10-Mar 7-Mar 4-Mar 9-Mar 5-Mar 2-Mar 9-Mar 9-Mar 1-Mar 6-Mar 9-Sep 8-Sep 1-Dec 12-Sep 15-Sep 9-Mar 2-Sep 1-Dec 11-Sep 8-Sep 6-Sep 3-Sep 6-Sep 5-Mar 15-Sep 4-Mar 1-Sep 5-Mar 1-Sep 1-Mar 15-Sep 2-Mar 2-Mar 4-Sep 1-Sep 15-Sep 10-Mar 9-Sep 9-Mar 9-Mar 7-Sep 2-Mar 11-Sep 3-Sep 3-Mar 8-Sep 2-Sep 11-Mar 1-Mar 8-Sep 14-Sep 11-Sep 7-Mar 2-Mar 1-Mar 7-Sep 
1-Sep 1-Sep 7-Sep 3-Sep 1-Mar 4-Mar 1-Sep 14-Sep 5-Sep 9-Sep 1-Sep 11-Sep 6-Sep 7-Sep 9-Sep 2-Sep 14-Sep 14-Sep 5-Mar 1-Sep 3-Mar 9-Sep 5-Sep 1-Dec 12-Sep 7-Sep 4-Mar 7-Sep 3-Mar 8-Mar 1-Dec 7-Mar 4-Mar 15-Sep 11-Mar 8-Mar 5-Mar 6-Mar 2-Mar 3-Mar 9-Sep 11-Sep 5-Sep 9-Sep 3-Mar 8-Mar 12-Sep 7-Sep 6-Mar 2-Mar 8-Mar 7-Mar 2-Mar 7-Mar 5-Mar 9-Sep 8-Mar 4-Mar 10-Mar 4-Mar 4-Sep 1-Sep 3-Sep 14-Sep 3-Sep 2-Mar 5-Sep 9-Sep 7-Sep 7-Sep 9-Sep 7-Sep 5-Mar 2-Sep 3-Mar 14-Sep 2-Sep 1-Mar 1-Mar 3-Sep 4-Sep 1-Dec 10-Sep 5-Sep 8-Sep 2-Mar 2-Mar 4-Mar 10-Sep 5-Mar 1-Mar 10-Sep 10-Mar 11-Sep 14-Sep 5-Mar 10-Mar 11-Mar 10-Sep 6-Sep 3-Sep 3-Mar 11-Mar 11-Sep 5-Sep 4-Sep 8-Sep 2-Mar 2-Mar 10-Sep 1-Sep 6-Mar 1-Dec 7-Mar 11-Mar 10-Sep 10-Sep 12-Sep 10-Mar 12-Sep 11-Mar 11-Mar 12-Sep 11-Sep 5-Sep 5-Sep 8-Mar 9-Mar 1-Dec 5-Sep 7-Mar 4-Sep 1-Mar 15-Sep 5-Sep 2-Mar 5-Sep 6-Mar 2-Mar 8-Mar 8-Sep 6-Sep 2-Mar 10-Sep 7-Mar 2-Sep 7-Mar 14-Sep 6-Sep 2-Sep 1-Sep 5-Mar 5-Sep 10-Mar 3-Mar 9-Mar 10-Mar 8-Sep 8-Mar 1-Dec 1-Dec 11-Sep 2-Mar 7-Mar 10-Mar 12-Sep 5-Sep 8-Sep 4-Sep 10-Sep 1-Mar 2-Mar 14-Sep 4-Sep 7-Mar 1-Sep 2-Mar 3-Mar 4-Sep 12-Sep 4-Sep 7-Mar 2-Mar 1-Mar 8-Sep 1-Sep 5-Sep 2-Mar 5-Mar 1-Dec 3-Sep 1-Mar 5-Mar 2-Mar 10-Sep 12-Sep 4-Sep 9-Sep 9-Mar 12-Sep 11-Mar 2-Mar 12-Sep 11-Mar 3-Mar 9-Sep 7-Sep 6-Mar 3-Sep 7-Mar 12-Sep 3-Sep 8-Sep 2-Mar 9-Sep 7-Sep 4-Mar 1-Dec 2-Mar 9-Mar 1-Sep 1-Mar 4-Mar 4-Sep 5-Sep 11-Mar 7-Mar 2-Mar 8-Sep 2-Mar 8-Sep 2-Mar 11-Sep 9-Sep 1-Mar 14-Sep 11-Sep 5-Sep 9-Sep 1-Dec 8-Mar 11-Sep 11-Sep 7-Sep 2-Sep 1-Sep 3-Sep 9-Sep 1-Sep 15-Sep 10-Mar 10-Sep 5-Mar 6-Sep 9-Sep 4-Mar 8-Sep 4-Sep 9-Mar 10-Sep 1-Sep 3-Mar 2-Sep 8-Sep 10-Mar 1-Mar 3-Mar 8-Sep 2-Mar 6-Sep 3-Sep 8-Sep 1-Sep 12-Sep 14-Sep 15-Sep 3-Sep 3-Mar 4-Mar 10-Sep 15-Sep 1-Mar 5-Sep 11-Sep 11-Sep 1-Mar 2-Mar 8-Mar 10-Mar 11-Sep 1-Mar 1-Dec 2-Mar 5-Sep 4-Sep 7-Sep 6-Mar 9-Mar 10-Mar 10-Sep 1-Mar 9-Mar 3-Sep 6-Sep 1-Mar 1-Dec 1-Sep 7-Mar 9-Sep 11-Mar 7-Sep 4-Sep 3-Sep 4-Mar 1-Mar 2-Mar 8-Mar 7-Mar 10-Sep 4-Mar 
1-Mar 15-Sep 1-Sep 14-Sep 10-Sep 15-Sep 5-Mar 9-Mar 4-Mar 2-Sep 3-Mar 7-Sep 14-Sep 11-Sep 3-Mar 10-Sep 8-Mar 3-Sep 2-Sep 7-Mar 1-Dec 2-Sep 2-Mar 3-Sep 3-Mar 9-Mar 5-Mar 11-Mar 9-Mar 12-Sep 15-Sep 1-Mar 2-Sep 2-Sep 4-Sep 1-Mar 1-Mar 2-Mar 10-Mar 8-Mar 9-Mar 10-Mar 2-Mar 12-Sep 6-Sep 8-Mar 7-Sep 8-Sep 5-Mar 7-Sep 1-Dec 10-Mar 8-Mar 2-Sep 9-Sep 2-Sep 12-Sep 10-Sep 5-Sep 1-Dec 10-Sep 14-Sep 8-Sep 6-Mar 11-Mar 6-Sep 8-Sep 12-Sep 6-Mar 1-Dec 1-Mar 6-Mar 10-Mar 2-Mar 1-Mar 5-Mar 15-Sep 7-Mar 2-Sep 7-Sep 14-Sep 7-Mar 12-Sep 4-Sep 9-Sep 11-Mar 1-Mar 3-Sep 3-Mar 15-Sep 2-Mar 6-Mar 7-Mar 10-Sep 1-Sep 6-Mar 8-Mar 4-Sep 11-Mar 2-Mar 2-Mar 6-Mar 5-Mar 6-Mar 1-Dec 3-Sep 7-Mar 10-Sep 6-Sep 9-Mar 11-Mar 11-Sep 2-Sep 4-Sep 5-Mar 6-Sep 12-Sep 3-Mar 2-Mar 6-Sep 7-Sep 5-Sep 1-Sep 1-Sep 6-Sep 4-Mar 6-Sep 1-Sep 4-Mar 12-Sep 9-Mar 11-Mar 5-Sep 11-Mar 4-Mar 15-Sep 1-Dec 6-Sep 8-Mar 10-Mar 11-Sep 11-Mar 4-Mar 1-Mar 5-Mar 11-Sep 11-Mar 14-Sep 2-Mar 2-Mar 15-Sep 9-Sep 15-Sep 1-Dec 1-Mar 2-Sep 9-Sep 3-Sep 9-Mar 8-Mar 12-Sep 11-Sep 7-Sep 4-Mar 14-Sep 3-Mar 6-Sep 9-Mar 2-Mar 5-Sep 1-Mar 4-Sep 14-Sep 15-Sep 7-Sep 4-Sep 4-Mar 9-Sep 7-Sep 5-Mar 1-Dec 5-Mar 8-Mar 3-Mar 11-Sep 1-Mar 6-Mar 1-Mar 15-Sep 9-Sep 8-Mar 6-Sep 4-Mar 3-Sep 3-Sep 5-Mar 2-Sep 7-Sep 10-Mar 8-Sep 4-Mar 14-Sep 9-Mar 10-Sep 5-Sep 8-Mar 14-Sep 7-Mar 11-Sep 12-Sep 6-Sep 5-Sep 5-Sep 10-Mar 2-Sep 4-Mar 3-Sep 10-Mar 11-Mar 15-Sep 4-Sep 11-Sep 7-Sep 14-Sep 1-Dec 9-Mar 2-Mar 6-Mar 10-Sep 1-Mar 5-Mar 7-Sep 1-Sep 1-Mar 3-Mar 11-Mar 5-Mar 3-Mar 5-Sep 8-Sep 2-Mar 11-Mar 9-Sep 3-Mar 7-Mar 15-Sep 2-Mar 15-Sep 10-Mar 1-Sep 9-Sep 14-Sep 8-Mar 1-Mar 5-Sep 14-Sep 1-Mar"
## [272] "PMC9433322 PMC_DL/PMC9433322/supplementaryfiles/41586_2022_5126_MOESM5_ESM.xlsx Hsapiens 13 44263 44261 44257 44454 44266 44259 44262 44531 44265 44258 44264 44260 44256"
## [273] "PMC9433322 PMC_DL/PMC9433322/supplementaryfiles/41586_2022_5126_MOESM5_ESM.xlsx Hsapiens 13 44264 44258 44262 44454 44265 44261 44259 44266 44531 44260 44257 44256 44263"
## [274] "PMC9433322 PMC_DL/PMC9433322/supplementaryfiles/41586_2022_5126_MOESM5_ESM.xlsx Hsapiens 9 44262 44259 44266 44265 44258 44264 44260 44263 44531"
## [275] "PMC9433322 PMC_DL/PMC9433322/supplementaryfiles/41586_2022_5126_MOESM5_ESM.xlsx Hsapiens 9 44259 44265 44260 44258 44266 44262 44263 44264 44531"
## [276] "PMC9429972 PMC_DL/PMC9429972/supplementaryfiles/NIHMS1798678-supplement-2.xlsx Ggallus 150 44448 44445 44445 44445 44448 44445 44448 44448 44445 44445 44448 44448 44448 44448 44440 44440 44440 44448 44448 44448 44445 44445 44446 44447 44440 44446 44446 44446 44446 44448 44445 44448 44446 44448 44448 44448 44448 44448 44448 44445 44445 44446 44448 44448 44446 44445 44445 44446 44454 44445 44448 44448 44448 44446 44445 44445 44440 44448 44448 44445 44445 44453 44446 44440 44445 44448 44440 44445 44448 44448 44448 44448 44448 44448 44446 44448 44445 44445 44448 44448 44448 44454 44445 44445 44448 44448 44440 44448 44448 44445 44446 44448 44448 44448 44448 44440 44445 44445 44440 44445 44448 44446 44440 44448 44448 44445 44446 44445 44448 44445 44446 44448 44440 44446 44440 44448 44440 44448 44448 44448 44448 44440 44448 44447 44445 44446 44446 44448 44448 44445 44446 44448 44448 44445 44445 44448 44448 44448 44448 44446 44445 44448 44446 44448 44448 44454 44448 44446 44445 44445"
## [277] "PMC9429730 PMC_DL/PMC9429730/supplementaryfiles/12916_2022_2476_MOESM2_ESM.xls Hsapiens 1 44453"
## [278] "PMC9429730 PMC_DL/PMC9429730/supplementaryfiles/12916_2022_2476_MOESM2_ESM.xls Hsapiens 2 44260 44443"
## [279] "PMC9429730 PMC_DL/PMC9429730/supplementaryfiles/12916_2022_2476_MOESM1_ESM.xls Hsapiens 1 44453"
## [280] "PMC9429730 PMC_DL/PMC9429730/supplementaryfiles/12916_2022_2476_MOESM1_ESM.xls Hsapiens 2 44260 44443"
## [281] "PMC9428261 PMC_DL/PMC9428261/supplementaryfiles/Table5.XLSX Athaliana 1 1-Sep"
Let’s investigate the errors in more detail.
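Many of the flagged values in the listing above are five-digit integers such as 44261 rather than strings such as 5-Mar. These look like Excel date serial numbers, which is how a date cell is stored internally. The following is a minimal sketch of decoding them, assuming the workbooks use Excel's default 1900 date system (for which an origin of 1899-12-30 is the usual choice in R). `serial_to_date` is a hypothetical helper and is not part of the original script.

```r
# Hypothetical helper: convert Excel serial day numbers to R Dates.
# Assumption: the workbook uses the 1900 date system (Excel's default on Windows).
serial_to_date <- function(x) {
  as.Date(as.numeric(x), origin = "1899-12-30")
}

# Three serials taken from the listing above
serials <- c("44261", "44440", "37226")
decoded <- serial_to_date(serials)
format(decoded, "%Y-%m-%d")
```

Under that assumption the three values decode to 2021-03-06, 2021-09-01 and 2001-12-01, so the serials carry the same day and month information as the 6-Mar or 1-Sep style strings seen in other records.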
# By species
SPECIES <- sapply(strsplit(ERROR_GENELISTS," "),"[[",3)
table(SPECIES)
## SPECIES
## Athaliana Celegans Dmelanogaster Drerio Ggallus
## 14 4 4 11 4
## Hsapiens Mmusculus
## 173 71
par(mar=c(5,12,4,2))
barplot(table(SPECIES),horiz=TRUE,las=1)
par(mar=c(5,5,4,2))
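The species tally above relies on each record being a space-delimited string whose third field is the species. Here is a small sketch of what `strsplit` and field extraction return, using made-up records (the `toy` vector, its PMC IDs and its file names are illustrative only, not real data):

```r
# Toy records in the same layout as ERROR_GENELISTS:
# PMC ID, file path, species, error count, then the flagged values.
toy <- c(
  "PMC0000001 zip/a.xlsx Hsapiens 2 44261 44256",
  "PMC0000001 zip/b.xlsx Hsapiens 1 44440",
  "PMC0000002 zip/c.xlsx Mmusculus 3 1-Sep 2-Sep 1-Mar"
)

# One character vector of fields per record
fields <- strsplit(toy, " ")

# vapply checks that every record yields exactly one string
species <- vapply(fields, function(f) f[[3]], character(1))
species_counts <- table(species)
species_counts
```

Because the tally is made per record, a paper with several affected gene lists contributes to its species count several times.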
# Number of affected Excel files per paper
DIST <- table(sapply(strsplit(ERROR_GENELISTS," "),"[[",1))
DIST
##
## PMC9396287 PMC9421669 PMC9426562 PMC9428261 PMC9429730 PMC9429972 PMC9433322
## 1 2 4 1 4 1 7
## PMC9434320 PMC9434886 PMC9436054 PMC9438737 PMC9439955 PMC9440133 PMC9440308
## 2 6 2 1 2 2 1
## PMC9441061 PMC9441509 PMC9445361 PMC9445988 PMC9447897 PMC9448678 PMC9449361
## 1 1 2 3 1 2 2
## PMC9449593 PMC9450692 PMC9451151 PMC9452507 PMC9453157 PMC9453314 PMC9454937
## 1 1 20 1 1 1 1
## PMC9455665 PMC9456386 PMC9456615 PMC9456764 PMC9458043 PMC9458435 PMC9458853
## 1 1 2 1 9 1 1
## PMC9461288 PMC9465009 PMC9465038 PMC9465246 PMC9468337 PMC9468827 PMC9473489
## 3 1 1 2 12 2 6
## PMC9476630 PMC9477516 PMC9477841 PMC9478359 PMC9478571 PMC9478897 PMC9481139
## 7 1 15 1 4 1 3
## PMC9481623 PMC9481918 PMC9482661 PMC9483895 PMC9484176 PMC9484640 PMC9485127
## 1 10 4 2 2 1 3
## PMC9485450 PMC9485579 PMC9486403 PMC9487751 PMC9489775 PMC9490000 PMC9492468
## 1 6 2 2 1 2 2
## PMC9493120 PMC9493312 PMC9495559 PMC9495734 PMC9498843 PMC9499627 PMC9500044
## 1 1 1 2 3 1 5
## PMC9500203 PMC9501657 PMC9502050 PMC9502060 PMC9504918 PMC9508838 PMC9508851
## 3 4 1 4 10 2 5
## PMC9508879 PMC9509351 PMC9509553 PMC9509640 PMC9509653 PMC9510610 PMC9511766
## 5 1 5 1 1 3 4
## PMC9512081 PMC9513039 PMC9513135 PMC9513589 PMC9513884 PMC9513945 PMC9514850
## 1 2 1 1 1 3 8
## PMC9515394 PMC9515781 PMC9518822 PMC9518893 PMC9519376 PMC9520462 PMC9521910
## 1 4 3 1 1 1 1
## PMC9522333
## 1
summary(as.numeric(DIST))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 1.000 2.000 2.838 3.000 20.000
hist(DIST,main="Number of affected Excel files per paper")
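Note that DIST tallies records in ERROR_GENELISTS rather than distinct files: in the listing above, PMC9451151 contributes 20 records, but they come from only four workbooks (table_s1, table_s4, table_s5 and table_s6). If the number of distinct files per paper is what is wanted, one option is to deduplicate on the first two fields before counting. This is a sketch on made-up records, and it assumes file paths contain no spaces:

```r
# Toy records (illustrative only); the first two share the same file
toy <- c(
  "PMC0000001 zip/a.xlsx Hsapiens 2 44261 44256",
  "PMC0000001 zip/a.xlsx Hsapiens 1 44440",
  "PMC0000001 zip/b.xlsx Hsapiens 1 44440",
  "PMC0000002 zip/c.xlsx Mmusculus 3 1-Sep 2-Sep 1-Mar"
)
fields <- strsplit(toy, " ")
pmc  <- vapply(fields, function(f) f[[1]], character(1))
path <- vapply(fields, function(f) f[[2]], character(1))

# Records per paper (what DIST measures)
records_per_paper <- table(pmc)

# Distinct files per paper: drop repeated (paper, file) pairs first
pairs <- unique(data.frame(pmc = pmc, path = path))
files_per_paper <- table(pairs$pmc)
files_per_paper
```

In the toy data the first paper has three records but only two distinct files, which is the kind of gap the two measures can show.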
# PMC Articles with the most errors
DIST_DF <- as.data.frame(DIST)
DIST_DF <- DIST_DF[order(-DIST_DF$Freq),,drop=FALSE]
head(DIST_DF,20)
## Var1 Freq
## 24 PMC9451151 20
## 45 PMC9477841 15
## 40 PMC9468337 12
## 51 PMC9481918 10
## 75 PMC9504918 10
## 33 PMC9458043 9
## 91 PMC9514850 8
## 7 PMC9433322 7
## 43 PMC9476630 7
## 9 PMC9434886 6
## 42 PMC9473489 6
## 58 PMC9485579 6
## 70 PMC9500044 5
## 77 PMC9508851 5
## 78 PMC9508879 5
## 80 PMC9509553 5
## 3 PMC9426562 4
## 5 PMC9429730 4
## 47 PMC9478571 4
## 52 PMC9482661 4
MOST_ERR_FILES = as.character(DIST_DF[1,1])
MOST_ERR_FILES
## [1] "PMC9451151"
# Number of errors per paper
NERR <- as.numeric(sapply(strsplit(ERROR_GENELISTS," "),"[[",4))
names(NERR) <- sapply(strsplit(ERROR_GENELISTS," "),"[[",1)
NERR <-tapply(NERR, names(NERR), sum)
NERR
## PMC9396287 PMC9421669 PMC9426562 PMC9428261 PMC9429730 PMC9429972 PMC9433322
## 144 2 151 1 6 150 1028
## PMC9434320 PMC9434886 PMC9436054 PMC9438737 PMC9439955 PMC9440133 PMC9440308
## 6 21 3 4 2 53 1
## PMC9441061 PMC9441509 PMC9445361 PMC9445988 PMC9447897 PMC9448678 PMC9449361
## 3 1 3 41 6 610 26
## PMC9449593 PMC9450692 PMC9451151 PMC9452507 PMC9453157 PMC9453314 PMC9454937
## 1 25 231 21 1 1 12
## PMC9455665 PMC9456386 PMC9456615 PMC9456764 PMC9458043 PMC9458435 PMC9458853
## 22 3 3 1 32 4 1
## PMC9461288 PMC9465009 PMC9465038 PMC9465246 PMC9468337 PMC9468827 PMC9473489
## 12 1 3 7 69 2 15
## PMC9476630 PMC9477516 PMC9477841 PMC9478359 PMC9478571 PMC9478897 PMC9481139
## 21 1 137 29 20 3 37
## PMC9481623 PMC9481918 PMC9482661 PMC9483895 PMC9484176 PMC9484640 PMC9485127
## 1 78 107 3 2 1 8
## PMC9485450 PMC9485579 PMC9486403 PMC9487751 PMC9489775 PMC9490000 PMC9492468
## 1 150 5 4 11 2 3
## PMC9493120 PMC9493312 PMC9495559 PMC9495734 PMC9498843 PMC9499627 PMC9500044
## 11 44 1 4 12 1 64
## PMC9500203 PMC9501657 PMC9502050 PMC9502060 PMC9504918 PMC9508838 PMC9508851
## 5 7 4 180 554 5 5
## PMC9508879 PMC9509351 PMC9509553 PMC9509640 PMC9509653 PMC9510610 PMC9511766
## 11 1 22 2 1 21 14
## PMC9512081 PMC9513039 PMC9513135 PMC9513589 PMC9513884 PMC9513945 PMC9514850
## 1 6 3 1 9 11 9
## PMC9515394 PMC9515781 PMC9518822 PMC9518893 PMC9519376 PMC9520462 PMC9521910
## 1 8 18 31 17 2 40
## PMC9522333
## 1
hist(NERR,main="Number of errors per PMC article")
NERR_DF <- as.data.frame(NERR)
NERR_DF <- NERR_DF[order(-NERR_DF$NERR),,drop=FALSE]
head(NERR_DF,20)
## NERR
## PMC9433322 1028
## PMC9448678 610
## PMC9504918 554
## PMC9451151 231
## PMC9502060 180
## PMC9426562 151
## PMC9429972 150
## PMC9485579 150
## PMC9396287 144
## PMC9477841 137
## PMC9482661 107
## PMC9481918 78
## PMC9468337 69
## PMC9500044 64
## PMC9440133 53
## PMC9493312 44
## PMC9445988 41
## PMC9521910 40
## PMC9481139 37
## PMC9458043 32
MOST_ERR = rownames(NERR_DF)[1]
MOST_ERR
## [1] "PMC9433322"
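The per-paper totals above come from summing the fourth field of every record that shares a PMC ID. The same aggregation can be sketched on made-up records, with `sort` used in place of the data frame round trip (the toy data and variable names are illustrative):

```r
# Toy records (illustrative only)
toy <- c(
  "PMC0000001 zip/a.xlsx Hsapiens 2 44261 44256",
  "PMC0000001 zip/b.xlsx Hsapiens 1 44440",
  "PMC0000002 zip/c.xlsx Mmusculus 5 1-Sep 2-Sep 1-Mar 3-Mar 4-Mar"
)
fields <- strsplit(toy, " ")
pmc   <- vapply(fields, function(f) f[[1]], character(1))
n_err <- as.numeric(vapply(fields, function(f) f[[4]], character(1)))

# Sum the error counts within each paper
totals <- tapply(n_err, pmc, sum)

# Rank papers by total errors; the first name is the "winner"
ranked <- sort(totals, decreasing = TRUE)
names(ranked)[1]
```

Reading the count from field 4 rather than counting the trailing tokens keeps the total independent of how the flagged values themselves are delimited.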
# Strip the "PMC" prefix so that articles are queried by numeric UID
GENELIST_ERROR_ARTICLES <- gsub("PMC","",GENELIST_ERROR_ARTICLES)
# JSON parsing is more reliable than XML
ARTICLES <- esummary(GENELIST_ERROR_ARTICLES, db="pmc", retmode="json")
ARTICLE_DATA <- reutils::content(ARTICLES, as="parsed")
ARTICLE_DATA <- ARTICLE_DATA$result
# The first element of the result is the list of UIDs; keep only the article records
ARTICLE_DATA <- ARTICLE_DATA[2:length(ARTICLE_DATA)]
# Tally articles per journal, most frequent first
JOURNALS <- unlist(lapply(ARTICLE_DATA, function(x) {x$fulljournalname}))
JOURNALS_TABLE <- table(JOURNALS)
JOURNALS_TABLE <- JOURNALS_TABLE[order(-JOURNALS_TABLE)]
length(JOURNALS_TABLE)
## [1] 62
NUM_JOURNALS=length(JOURNALS_TABLE)
par(mar=c(5,25,4,2))
barplot(head(JOURNALS_TABLE,10), horiz=TRUE, las=1,
xlab="Articles with gene name errors in supp files",
main="Top journals this month")
Congrats to our Journal of the Month winner!
JOURNAL_WINNER <- names(head(JOURNALS_TABLE,1))
JOURNAL_WINNER
## [1] "Frontiers in Genetics"
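The journal ranking depends on the shape of the parsed esummary result: a named list with a `uids` element followed by one element per article, each carrying a `fulljournalname`. The sketch below reproduces the tally on a hand-built stand-in for that list, so it runs without a network call. The UIDs are made up, and dropping `uids` by name instead of by position is a variation, not what the script itself does.

```r
# Hand-built stand-in for reutils::content(..., as = "parsed")$result;
# the UIDs are made up for illustration.
mock_result <- list(
  uids = c("1", "2", "3"),
  `1` = list(uid = "1", fulljournalname = "Frontiers in Genetics"),
  `2` = list(uid = "2", fulljournalname = "Science Advances"),
  `3` = list(uid = "3", fulljournalname = "Frontiers in Genetics")
)

# Drop the index element by name rather than by position
articles <- mock_result[setdiff(names(mock_result), "uids")]

# One journal name per article, then count and rank
journals <- vapply(articles, function(x) x$fulljournalname, character(1))
journal_counts <- sort(table(journals), decreasing = TRUE)
names(journal_counts)[1]
```

Selecting by name keeps working if the position of `uids` in the list ever changes, at the cost of hard-coding the element name.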
There are two categories:
Paper with the most supplementary files affected by gene name errors (MOST_ERR_FILES)
Paper with the most gene names converted to dates (MOST_ERR)
Sometimes, one paper can win both categories. Congrats to our winners.
MOST_ERR_FILES <- gsub("PMC","",MOST_ERR_FILES)
ARTICLES <- esummary( MOST_ERR_FILES , db="pmc" , retmode = "json" )
ARTICLE_DATA <- reutils::content(ARTICLES,as= "parsed")
# Keep only the result element of the parsed response (drops the header)
ARTICLE_DATA <- ARTICLE_DATA[2]
ARTICLE_DATA
## $result
## $result$uids
## [1] "9451151"
##
## $result$`9451151`
## $result$`9451151`$uid
## [1] "9451151"
##
## $result$`9451151`$pubdate
## [1] "2022 Sep 7"
##
## $result$`9451151`$epubdate
## [1] "2022 Sep 7"
##
## $result$`9451151`$printpubdate
## [1] ""
##
## $result$`9451151`$source
## [1] "Sci Adv"
##
## $result$`9451151`$authors
## name authtype
## 1 Langouët M Author
## 2 Jolicoeur C Author
## 3 Javed A Author
## 4 Mattar P Author
## 5 Gearhart MD Author
## 6 Daiger SP Author
## 7 Bertelsen M Author
## 8 Tranebjærg L Author
## 9 Rendtorff ND Author
## 10 Grønskov K Author
## 11 Jespersgaard C Author
## 12 Chen R Author
## 13 Sun Z Author
## 14 Li H Author
## 15 Alirezaie N Author
## 16 Majewski J Author
## 17 Bardwell VJ Author
## 18 Sui R Author
## 19 Koenekoop RK Author
## 20 Cayouette M Author
##
## $result$`9451151`$title
## [1] "Mutations in BCOR, a co-repressor of CRX/OTX2, are associated with early-onset retinal degeneration"
##
## $result$`9451151`$volume
## [1] "8"
##
## $result$`9451151`$issue
## [1] "36"
##
## $result$`9451151`$pages
## [1] "eabh2868"
##
## $result$`9451151`$articleids
## idtype value
## 1 pmid 36070393
## 2 doi 10.1126/sciadv.abh2868
## 3 pmcid PMC9451151
##
## $result$`9451151`$fulljournalname
## [1] "Science Advances"
##
## $result$`9451151`$sortdate
## [1] "2022/09/07 00:00"
##
## $result$`9451151`$pmclivedate
## [1] "2022/09/29"
MOST_ERR <- gsub("PMC","",MOST_ERR)
ARTICLE_DATA <- esummary(MOST_ERR,db = "pmc" , retmode = "json" )
ARTICLE_DATA <- reutils::content(ARTICLE_DATA,as= "parsed")
ARTICLE_DATA
## $header
## $header$type
## [1] "esummary"
##
## $header$version
## [1] "0.3"
##
##
## $result
## $result$uids
## [1] "9433322"
##
## $result$`9433322`
## $result$`9433322`$uid
## [1] "9433322"
##
## $result$`9433322`$pubdate
## [1] "2022 Aug 24"
##
## $result$`9433322`$epubdate
## [1] "2022 Aug 24"
##
## $result$`9433322`$printpubdate
## [1] "2022"
##
## $result$`9433322`$source
## [1] "Nature"
##
## $result$`9433322`$authors
## name authtype
## 1 Carnevale J Author
## 2 Shifrut E Author
## 3 Kale N Author
## 4 Nyberg WA Author
## 5 Blaeschke F Author
## 6 Chen YY Author
## 7 Li Z Author
## 8 Bapat SP Author
## 9 Diolaiti ME Author
## 10 O’Leary P Author
## 11 Vedova S Author
## 12 Belk J Author
## 13 Daniel B Author
## 14 Roth TL Author
## 15 Bachl S Author
## 16 Anido AA Author
## 17 Prinzing B Author
## 18 Ibañez-Vega J Author
## 19 Lange S Author
## 20 Haydar D Author
## 21 Luetke-Eversloh M Author
## 22 Born-Bony M Author
## 23 Hegde B Author
## 24 Kogan S Author
## 25 Feuchtinger T Author
## 26 Okada H Author
## 27 Satpathy AT Author
## 28 Shannon K Author
## 29 Gottschalk S Author
## 30 Eyquem J Author
## 31 Krenciute G Author
## 32 Ashworth A Author
## 33 Marson A Author
##
## $result$`9433322`$title
## [1] "RASA2 ablation in T cells boosts antigen sensitivity and long-term function"
##
## $result$`9433322`$volume
## [1] "609"
##
## $result$`9433322`$issue
## [1] "7925"
##
## $result$`9433322`$pages
## [1] "174-182"
##
## $result$`9433322`$articleids
## idtype value
## 1 pmid 36002574
## 2 doi 10.1038/s41586-022-05126-w
## 3 pmcid PMC9433322
##
## $result$`9433322`$fulljournalname
## [1] "Nature"
##
## $result$`9433322`$sortdate
## [1] "2022/08/24 00:00"
##
## $result$`9433322`$pmclivedate
## [1] "2022/09/02"
Next, plot the trend over the past 6 to 12 months, using the monthly reports that are already published online.
url <- "https://ziemann-lab.net/public/gene_name_errors/"
doc <- htmlParse(url)
links <- xpathSApply(doc, "//a/@href")
links <- links[grep("html",links)]
listing <- htmlParse( getURL(url, ftp.use.epsv = FALSE, dirlistonly = TRUE) )
listing <- xpathSApply(listing, "//a/@href")
listing <- listing[grep("html",listing)]
unlink("online_files/",recursive=TRUE)
dir.create("online_files")
sapply(listing, function(mylink) {
download.file(paste(url,mylink,sep=""),destfile=paste("online_files/",mylink,sep=""))
} )
## href href href href href href href href href href href href href href href href
## 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## href href href
## 0 0 0
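The link filtering step can also be illustrated without a network connection. The sketch below is a regex-based stand-in in base R, not the XPath approach used above, and the listing lines and file names (`report_2022-08.html` and so on) are hypothetical.

```r
# Made-up fragment of a directory listing page
toy_listing <- c(
  '<a href="?C=N;O=D">Name</a>',
  '<a href="report_2022-08.html">report_2022-08.html</a>',
  '<a href="report_2022-09.html">report_2022-09.html</a>',
  '<a href="style.css">style.css</a>'
)

# Capture the href attribute on each line, then strip the wrapper text
toy_hrefs <- regmatches(toy_listing, regexpr('href="[^"]*"', toy_listing))
toy_hrefs <- gsub('^href="|"$', "", toy_hrefs)

# Keep only names ending in .html; the dot is escaped so it matches literally
toy_reports <- toy_hrefs[grepl("\\.html$", toy_hrefs)]
toy_reports
```

Anchoring the pattern with `$` is slightly stricter than a plain `grep("html", ...)`, which would also keep any link that merely contains the letters "html".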
myfilelist <- list.files("online_files/",full.names=TRUE)
trends <- sapply(myfilelist, function(myfilename) {
x <- readLines(myfilename)
# Num XL gene list articles
NUM_GENELIST_ARTICLES <- x[grep("NUM_GENELIST_ARTICLES",x)[3]+1]
NUM_GENELIST_ARTICLES <- sapply(strsplit(NUM_GENELIST_ARTICLES," "),"[[",3)
NUM_GENELIST_ARTICLES <- sapply(strsplit(NUM_GENELIST_ARTICLES,"<"),"[[",1)
NUM_GENELIST_ARTICLES <- as.numeric(NUM_GENELIST_ARTICLES)
# number of affected articles
NUM_ERROR_GENELIST_ARTICLES <- x[grep("NUM_ERROR_GENELIST_ARTICLES",x)[3]+1]
NUM_ERROR_GENELIST_ARTICLES <- sapply(strsplit(NUM_ERROR_GENELIST_ARTICLES," "),"[[",3)
NUM_ERROR_GENELIST_ARTICLES <- sapply(strsplit(NUM_ERROR_GENELIST_ARTICLES,"<"),"[[",1)
NUM_ERROR_GENELIST_ARTICLES <- as.numeric(NUM_ERROR_GENELIST_ARTICLES)
# Error proportion
ERROR_PROPORTION <- x[grep("ERROR_PROPORTION",x)[3]+1]
ERROR_PROPORTION <- sapply(strsplit(ERROR_PROPORTION," "),"[[",3)
ERROR_PROPORTION <- sapply(strsplit(ERROR_PROPORTION,"<"),"[[",1)
ERROR_PROPORTION <- as.numeric(ERROR_PROPORTION)
# number of journals
NUM_JOURNALS <- x[grep('JOURNALS_TABLE',x)[3]+1]
NUM_JOURNALS <- sapply(strsplit(NUM_JOURNALS," "),"[[",3)
NUM_JOURNALS <- sapply(strsplit(NUM_JOURNALS,"<"),"[[",1)
NUM_JOURNALS <- as.numeric(NUM_JOURNALS)
NUM_JOURNALS
res <- c(NUM_GENELIST_ARTICLES,NUM_ERROR_GENELIST_ARTICLES,ERROR_PROPORTION,NUM_JOURNALS)
return(res)
})
colnames(trends) <- sapply(strsplit(colnames(trends),"_"),"[[",3)
colnames(trends) <- gsub(".html","",colnames(trends),fixed=TRUE)
trends <- as.data.frame(trends)
rownames(trends) <- c("NUM_GENELIST_ARTICLES","NUM_ERROR_GENELIST_ARTICLES","ERROR_PROPORTION","NUM_JOURNALS")
trends <- t(trends)
trends <- as.data.frame(trends)
CURRENT_RES <- c(NUM_GENELIST_ARTICLES,NUM_ERROR_GENELIST_ARTICLES,ERROR_PROPORTION,NUM_JOURNALS)
trends <- rbind(trends,CURRENT_RES)
paste(CURRENT_YEAR,CURRENT_MONTH,sep="-")
## [1] "2022-10"
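The scraping inside the `trends` function relies on the layout of the rendered reports: the value of interest is printed on the line after the code that mentions the variable. The sketch below shows that idea on two made-up HTML lines. The helper `extract_value()`, its `occurrence` argument and the file name are hypothetical, and the exact markup of the real reports is an assumption.

```r
# Hypothetical helper: find the nth line mentioning `key`, then read the
# number printed on the following line, e.g. "<pre><code>## [1] 62</code></pre>"
extract_value <- function(lines, key, occurrence = 1) {
  hit <- grep(key, lines, fixed = TRUE)[occurrence]
  out <- lines[hit + 1]
  # Third space-separated token still carries the closing tags
  token <- strsplit(out, " ", fixed = TRUE)[[1]][3]
  # Everything before the first "<" is the printed value
  value <- strsplit(token, "<", fixed = TRUE)[[1]][1]
  as.numeric(value)
}

# Made-up lines imitating a rendered code chunk and its output
toy_html <- c(
  '<pre class="r"><code>length(JOURNALS_TABLE)</code></pre>',
  '<pre><code>## [1] 62</code></pre>'
)
toy_value <- extract_value(toy_html, "JOURNALS_TABLE")

# Turning a downloaded file path into a month label (hypothetical file name)
toy_path <- "online_files//report_2022-09.html"
toy_label <- strsplit(toy_path, "_", fixed = TRUE)[[1]][3]
toy_label <- sub(".html", "", toy_label, fixed = TRUE)
```

This approach is sensitive to the report layout: if a variable name appears a different number of times in a future report, the `occurrence` index would need to change.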
rownames(trends)[nrow(trends)] <- paste(CURRENT_YEAR,CURRENT_MONTH,sep="-")
plot(trends$NUM_GENELIST_ARTICLES, xaxt = "n" , type="b" , main="Number of articles with Excel gene lists per month",
ylab="number of articles", xlab="month")
axis(1, at=1:nrow(trends), labels=rownames(trends))
plot(trends$NUM_ERROR_GENELIST_ARTICLES, xaxt = "n" , type="b" , main="Number of articles with gene name errors per month",
ylab="number of articles", xlab="month")
axis(1, at=1:nrow(trends), labels=rownames(trends))
plot(trends$ERROR_PROPORTION, xaxt = "n" , type="b" , main="Proportion of articles with Excel gene list affected by errors",
ylab="proportion", xlab="month")
axis(1, at=1:nrow(trends), labels=rownames(trends))
plot(trends$NUM_JOURNALS, xaxt = "n" , type="b" , main="Number of journals with affected articles",
ylab="number of journals", xlab="month")
axis(1, at=1:nrow(trends), labels=rownames(trends))
unlink("online_files/",recursive=TRUE)
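The four trend plots above share one pattern: draw the series without an x axis, then label each position with the month stored in the row names. A minimal sketch of that pattern as a reusable function follows. The helper name `plot_trend()` and the values in `toy_trends` are hypothetical.

```r
# Hypothetical helper wrapping the plot-then-axis pattern
plot_trend <- function(df, column, main, ylab) {
  plot(df[[column]], xaxt = "n", type = "b", main = main,
       ylab = ylab, xlab = "month")
  # Label each x position with the month stored in the row names
  axis(1, at = seq_len(nrow(df)), labels = rownames(df))
  invisible(df[[column]])
}

# Made-up values for illustration only
toy_trends <- data.frame(
  NUM_JOURNALS = c(50, 58, 62),
  row.names = c("2022-08", "2022-09", "2022-10")
)

pdf(NULL)  # null device, so the sketch also runs non-interactively
plotted <- plot_trend(toy_trends, "NUM_JOURNALS",
                      main = "Number of journals with affected articles",
                      ylab = "number of journals")
invisible(dev.off())
```

In an interactive session or an R Markdown chunk, the `pdf(NULL)` and `dev.off()` lines can be dropped so that the plot is displayed as usual.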
Zeeberg, B.R., Riss, J., Kane, D.W. et al. Mistaken Identifiers: Gene name errors can be introduced inadvertently when using Excel in bioinformatics. BMC Bioinformatics 5, 80 (2004). https://doi.org/10.1186/1471-2105-5-80
Ziemann, M., Eren, Y. & El-Osta, A. Gene name errors are widespread in the scientific literature. Genome Biol 17, 177 (2016). https://doi.org/10.1186/s13059-016-1044-7
sessionInfo()
## R version 4.2.1 (2022-06-23)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.5 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
##
## locale:
## [1] LC_CTYPE=en_AU.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_AU.UTF-8 LC_COLLATE=en_AU.UTF-8
## [5] LC_MONETARY=en_AU.UTF-8 LC_MESSAGES=en_AU.UTF-8
## [7] LC_PAPER=en_AU.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] RCurl_1.98-1.7 readxl_1.4.0 reutils_0.2.3 xml2_1.3.3 jsonlite_1.8.0
## [6] XML_3.99-0.10
##
## loaded via a namespace (and not attached):
## [1] knitr_1.39 magrittr_2.0.3 R6_2.5.1 rlang_1.0.4
## [5] fastmap_1.1.0 stringr_1.4.0 highr_0.9 tools_4.2.1
## [9] xfun_0.31 cli_3.3.0 jquerylib_0.1.4 htmltools_0.5.3
## [13] yaml_2.3.5 digest_0.6.29 assertthat_0.2.1 sass_0.4.2
## [17] bitops_1.0-7 cachem_1.0.6 evaluate_0.15 rmarkdown_2.14
## [21] stringi_1.7.8 compiler_4.2.1 bslib_0.4.0 cellranger_1.1.0