Source: https://github.com/markziemann/GeneNameErrors2020
View the reports: http://ziemann-lab.net/public/gene_name_errors/
Gene name errors result when data are imported improperly into MS Excel and other spreadsheet programs (Zeeberg et al, 2004). Certain gene names like MARCH3, SEPT2 and DEC1 are converted into date format. These errors are surprisingly common in supplementary data files in the field of genomics (Ziemann et al, 2016). This could be considered a small error because it only affects a small number of genes; however, it is symptomatic of poor data processing methods. The purpose of this script is to identify gene name errors present in supplementary files of PubMed Central articles published in the previous month.
library("jsonlite")
library("xml2")
library("reutils")
library("readxl")
Here I will be getting PubMed Central IDs for the previous month.
Start with figuring out the date to search PubMed Central.
DATE="2021/2"
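The hard-coded value above has to be edited for each monthly run. As a minimal sketch, assuming the same "YYYY/M" format is wanted, the previous month could be derived from the system date. `previous_month` and `DATE_EXAMPLE` are hypothetical names and are not part of the original script.

```r
# Hypothetical helper (not in the original script): return the month before
# `today` formatted as "YYYY/M", matching the DATE string used above.
previous_month <- function(today = Sys.Date()) {
  first_of_month <- as.Date(format(today, "%Y-%m-01"))
  last_of_previous <- first_of_month - 1  # last day of the previous month
  paste(format(last_of_previous, "%Y"),
        as.integer(format(last_of_previous, "%m")),
        sep = "/")
}

# A run in mid-March 2021 would search February 2021.
DATE_EXAMPLE <- previous_month(as.Date("2021-03-15"))
DATE_EXAMPLE
```

Stepping back one day from the first of the current month also handles a January run, which rolls back to December of the preceding year.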
Let’s see how many PMC IDs we have in the past month.
QUERY ='((genom*[Abstract]))'
ESEARCH_RES <- esearch(term=QUERY, db = "pmc", rettype = "uilist", retmode = "xml", retstart = 0,
retmax = 5000000, usehistory = TRUE, webenv = NULL, querykey = NULL, sort = NULL, field = NULL,
datetype = NULL, reldate = NULL, mindate = DATE, maxdate = DATE)
pmc <- efetch(ESEARCH_RES,retmode="text",rettype="uilist",outfile="pmcids.txt")
## Retrieving UIDs 1 to 500
## Retrieving UIDs 501 to 1000
## Retrieving UIDs 1001 to 1500
## Retrieving UIDs 1501 to 2000
## Retrieving UIDs 2001 to 2500
## Retrieving UIDs 2501 to 3000
pmc <- read.table(pmc)
pmc <- paste("PMC",pmc$V1,sep="")
NUM_ARTICLES=length(pmc)
NUM_ARTICLES
## [1] 2632
writeLines(pmc,con="pmc.txt")
Now run the bash script. Note that false positives can occur (~1.5%) and these results have not been verified by a human.
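The bash script itself is not shown in this report. As a rough illustration only, and not the logic used in gene_names.sh, the two kinds of suspect value reported below (dates such as 2-Mar and five-digit numbers) could be flagged in R with a pair of regular expressions. The pattern names and the `looks_converted` helper are assumptions made for this sketch.

```r
# Illustrative patterns only; the actual checks in gene_names.sh may differ.
date_like   <- "^[0-9]{1,2}-(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)$"
serial_like <- "^[0-9]{5}$"

# Hypothetical helper: TRUE where a value looks like a converted gene name.
looks_converted <- function(x) {
  grepl(date_like, x) | grepl(serial_like, x)
}

# A made-up gene name column mixing intact and converted entries.
example_column <- c("2-Mar", "SEPT2", "43891", "TP53", "1-Dec")
flagged <- looks_converted(example_column)
example_column[flagged]
```

A pattern this simple would also flag any legitimate five-digit identifier, which is one way false positives can arise.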
Here are some definitions:
NUM_XLS = Number of supplementary Excel files in this set of PMC articles.
NUM_XLS_ARTICLES = Number of articles matching the PubMed Central search which have supplementary Excel files.
GENELISTS = The gene lists found in the Excel files. Each Excel file is counted once even if it has multiple gene lists.
NUM_GENELISTS = The number of Excel files with gene lists.
NUM_GENELIST_ARTICLES = The number of PMC articles with supplementary Excel gene lists.
ERROR_GENELISTS = Files suspected to contain gene name errors. The dates and five-digit numbers indicate transmogrified gene names.
NUM_ERROR_GENELISTS = Number of Excel gene lists with errors.
NUM_ERROR_GENELIST_ARTICLES = Number of articles with supplementary Excel gene name errors.
ERROR_PROPORTION = This is the proportion of articles with Excel gene lists that have errors.
#system("./gene_names.sh pmc.txt")
results <- readLines("results.txt")
XLS <- results[grep("XLS",results,ignore.case=TRUE)]
NUM_XLS = length(XLS)
NUM_XLS
## [1] 3318
NUM_XLS_ARTICLES = length(unique(sapply(strsplit(XLS," "),"[[",1)))
NUM_XLS_ARTICLES
## [1] 575
GENELISTS <- XLS[lapply(strsplit(XLS," "),length)>2]
#GENELISTS
NUM_GENELISTS <- length(unique(sapply(strsplit(GENELISTS," "),"[[",2)))
NUM_GENELISTS
## [1] 470
NUM_GENELIST_ARTICLES <- length(unique(sapply(strsplit(GENELISTS," "),"[[",1)))
NUM_GENELIST_ARTICLES
## [1] 214
ERROR_GENELISTS <- XLS[lapply(strsplit(XLS," "),length)>3]
#ERROR_GENELISTS
NUM_ERROR_GENELISTS = length(ERROR_GENELISTS)
NUM_ERROR_GENELISTS
## [1] 219
GENELIST_ERROR_ARTICLES <- unique(sapply(strsplit(ERROR_GENELISTS," "),"[[",1))
GENELIST_ERROR_ARTICLES
## [1] "PMC7908713" "PMC7903802" "PMC7893923" "PMC7890893" "PMC7887196"
## [6] "PMC7884730" "PMC7884410" "PMC7881617" "PMC7881115" "PMC7881037"
## [11] "PMC7880998" "PMC7878750" "PMC7896317" "PMC7885916" "PMC7876146"
## [16] "PMC7876141" "PMC7874467" "PMC7871411" "PMC7896349" "PMC7870932"
## [21] "PMC7870011" "PMC7894049" "PMC7893110" "PMC7865055" "PMC7865025"
## [26] "PMC7864951" "PMC7863008" "PMC7888619" "PMC7863452" "PMC7862275"
## [31] "PMC7861379" "PMC7861375" "PMC7866887" "PMC7859232" "PMC7887632"
## [36] "PMC7884756" "PMC7884045" "PMC7880683" "PMC7854732" "PMC7882740"
## [41] "PMC7851759" "PMC7116828" "PMC7851345" "PMC7851772" "PMC7875399"
## [46] "PMC7846840" "PMC7845134" "PMC7876278" "PMC7844411" "PMC7844020"
## [51] "PMC7873973" "PMC7846933" "PMC7868554" "PMC7862794" "PMC7862768"
## [56] "PMC7859520" "PMC7859435" "PMC7880367" "PMC7880322" "PMC7845975"
## [61] "PMC7845644" "PMC7848703" "PMC7848201" "PMC7808690" "PMC7877913"
## [66] "PMC7874222" "PMC7873862" "PMC7868925" "PMC7834956" "PMC7880379"
## [71] "PMC7876704" "PMC7834090" "PMC7889151" "PMC7671374" "PMC7854777"
## [76] "PMC7850965"
NUM_ERROR_GENELIST_ARTICLES <- length(GENELIST_ERROR_ARTICLES)
NUM_ERROR_GENELIST_ARTICLES
## [1] 76
ERROR_PROPORTION = NUM_ERROR_GENELIST_ARTICLES / NUM_GENELIST_ARTICLES
ERROR_PROPORTION
## [1] 0.3551402
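The counts above depend on how many space-separated fields each line of results.txt has. The sketch below replays that logic on three made-up lines. The article IDs and file names are invented, and the assumption that a clean gene list line ends at the species field is inferred from the thresholds used above rather than confirmed from the bash script.

```r
# Made-up lines in the layout seen in the report output:
# article ID, file path, then (for gene lists) species, count and suspect values.
toy <- c(
  "PMC0000001 /pmc/articles/PMC0000001/bin/table1.xlsx",
  "PMC0000001 /pmc/articles/PMC0000001/bin/table2.xlsx Hsapiens",
  "PMC0000002 /pmc/articles/PMC0000002/bin/table3.xlsx Hsapiens 2 43891 2-Mar"
)

# Number of space-separated fields per line.
n_fields <- lengths(strsplit(toy, " "))

toy_genelists <- toy[n_fields > 2]  # more than ID + path: a gene list was found
toy_errors    <- toy[n_fields > 3]  # more than ID + path + species: suspect values

# Article ID is the first field of each line.
toy_genelist_articles <- unique(sapply(strsplit(toy_genelists, " "), "[[", 1))
toy_error_articles    <- unique(sapply(strsplit(toy_errors, " "), "[[", 1))

toy_error_proportion <- length(toy_error_articles) / length(toy_genelist_articles)
toy_error_proportion
```

Here two articles have gene lists and one of them has suspect values, so the proportion is 0.5. The same division gives 76 / 214 for the real data.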
Here you can have a look at all the gene lists detected in the past month, as well as those with errors. The dates are obvious errors; they are commonly dates in September, March, December and October. The five-digit numbers represent dates as they are encoded in Excel's internal format, which counts the number of days since the start of 1900. If you were to put these numbers into Excel and format the cells as dates, they would also mostly map to dates in September, March, December and October.
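The five-digit values can also be converted back into calendar dates in R without opening Excel. This is a minimal sketch, assuming the files use Excel's default 1900 date system; the three serial numbers are taken from the output in this report, and the variable names are illustrative.

```r
# Three serial numbers that appear in the error lists of this report.
serials <- c(40238, 43891, 44166)

# The origin 1899-12-30 matches Excel's 1900 date system for serials after
# February 1900, because Excel treats 1900 as a leap year.
converted <- as.Date(serials, origin = "1899-12-30")
format(converted)
```

These three values fall on the first day of March 2010, March 2020 and December 2020, which fits the pattern of March and December dates described above.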
#GENELISTS
ERROR_GENELISTS
## [1] "PMC7908713 /pmc/articles/PMC7908713/bin/13073_2021_852_MOESM2_ESM.xlsx Hsapiens 20 40238 39142 36951 37500 37865 38047 41883 39508 37316 38231 40787 38777 40603 38596 40422 37316 39692 37135 40057 38412"
## [2] "PMC7908713 /pmc/articles/PMC7908713/bin/13073_2021_852_MOESM3_ESM.xlsx Hsapiens 28 37865 40238 40603 36951 39142 40057 41883 39508 40422 38596 39692 37316 36951 38231 40787 41153 37316 39326 37500 37681 37135 38961 39873 38412 38047 38777 42248 37226"
## [3] "PMC7908713 /pmc/articles/PMC7908713/bin/13073_2021_852_MOESM3_ESM.xlsx Hsapiens 28 37316 39142 37865 40238 40603 41883 38777 36951 38596 40422 39508 38047 40057 36951 38231 41153 39692 39326 40787 37681 38961 37500 37135 37316 38412 39873 42248 37226"
## [4] "PMC7908713 zip/Supplementary_table_S1.xlsx Hsapiens 26 40057 37226 37316 40422 37681 39142 37500 37865 41883 36951 40238 40603 38047 38412 38777 39508 39873 42248 37135 40787 41153 38231 38596 38961 39326 39692"
## [5] "PMC7908713 zip/Supplementary_table_S1.xlsx Hsapiens 26 38047 37316 37865 40238 39142 37226 36951 40603 37681 38412 38777 39508 39873 42248 37135 40422 40787 41153 41883 37500 38231 38596 38961 39326 39692 40057"
## [6] "PMC7903802 /pmc/articles/PMC7903802/bin/12864_2021_7438_MOESM2_ESM.xlsx Scerevisiae 1 37165"
## [7] "PMC7893923 /pmc/articles/PMC7893923/bin/13058_2021_1402_MOESM1_ESM.xlsx Hsapiens 20 43526 43526 43526 43527 43527 43529 43529 43530 43531 43723 43719 43710 43710 43711 43711 43713 43714 43715 43716 43717"
## [8] "PMC7893923 /pmc/articles/PMC7893923/bin/13058_2021_1402_MOESM1_ESM.xlsx Hsapiens 19 43526 43526 43527 43527 43529 43529 43530 43531 43533 43723 43719 43710 43710 43710 43711 43711 43713 43715 43717"
## [9] "PMC7893923 /pmc/articles/PMC7893923/bin/13058_2021_1402_MOESM1_ESM.xlsx Hsapiens 19 43526 43526 43527 43527 43529 43529 43530 43531 43532 43533 43723 43719 43710 43710 43711 43713 43714 43715 43717"
## [10] "PMC7890893 /pmc/articles/PMC7890893/bin/13046_2021_1865_MOESM5_ESM.xlsx Hsapiens 2 43891 43892"
## [11] "PMC7887196 /pmc/articles/PMC7887196/bin/41598_2021_82877_MOESM2_ESM.xlsx Ggallus 27 42248 37316 36951 40422 39142 38047 37500 40787 36951 38777 40603 37681 39692 39326 41883 37226 39508 38412 41153 37135 37135 38231 40238 40057 37316 38596 37865"
## [12] "PMC7887196 /pmc/articles/PMC7887196/bin/41598_2021_82877_MOESM2_ESM.xlsx Ggallus 28 42248 37316 36951 40422 39142 38047 37500 40787 36951 38777 40603 37681 39692 39326 41883 37226 39508 38412 41153 37135 37135 38231 40238 40057 37316 38596 37865 37865"
## [13] "PMC7887196 /pmc/articles/PMC7887196/bin/41598_2021_82877_MOESM2_ESM.xlsx Ggallus 28 42248 37316 36951 40422 39142 38047 37500 40787 36951 38777 40603 37681 39692 39326 41883 37226 39508 38412 41153 37135 37135 38231 40238 40057 37316 38596 37865 37865"
## [14] "PMC7884730 /pmc/articles/PMC7884730/bin/41525_2021_177_MOESM6_ESM.xlsx Hsapiens 27 2-Mar 3-Sep 4-Sep 2-Mar 7-Sep 6-Sep 7-Mar 11-Sep 9-Mar 12-Sep 4-Mar 1-Mar 6-Mar 14-Sep 8-Sep 8-Mar 2-Sep 1-Dec 10-Mar 3-Mar 1-Sep 11-Mar 9-Sep 5-Sep 1-Mar 10-Sep 5-Mar"
## [15] "PMC7884410 /pmc/articles/PMC7884410/bin/41398_2021_1248_MOESM3_ESM.xlsx Hsapiens 27 44076 44077 44082 44086 44084 44081 43891 44089 43893 44083 43901 43897 43894 43892 44088 43891 43900 44078 44085 43896 44166 44075 43892 44075 43898 43895 43899"
## [16] "PMC7881617 /pmc/articles/PMC7881617/bin/12967_2021_2733_MOESM1_ESM.xlsx Hsapiens 3 44089 43891 43892"
## [17] "PMC7881115 /pmc/articles/PMC7881115/bin/41467_2021_21109_MOESM11_ESM.xlsx Mmusculus 13 40057 37135 38412 37500 38596 37316 39326 39873 39142 38777 40787 37316 38961"
## [18] "PMC7881115 /pmc/articles/PMC7881115/bin/41467_2021_21109_MOESM11_ESM.xlsx Mmusculus 15 40787 38961 40057 39142 38596 39873 37316 38412 37500 37681 38231 37135 38777 37316 39326"
## [19] "PMC7881115 /pmc/articles/PMC7881115/bin/41467_2021_21109_MOESM11_ESM.xlsx Mmusculus 16 40422 37135 39326 40787 38596 39692 40057 37316 39873 37500 38961 37316 38777 39142 38412 38231"
## [20] "PMC7881115 /pmc/articles/PMC7881115/bin/41467_2021_21109_MOESM3_ESM.xlsx Mmusculus 97 40238 38961 37681 37681 40057 39142 37865 38777 37681 36951 38961 40057 38961 40057 38961 40787 37681 37681 38047 39326 38961 38961 40787 40057 39326 36951 37135 39692 38231 37681 40057 36951 40422 40057 40057 40057 40057 36951 40787 40057 37500 36951 37681 40057 40057 40057 40057 40057 40787 40787 37135 40057 38596 37316 37316 40057 40787 37681 40787 40057 40787 37135 40057 37316 37135 38596 40787 37316 38961 40057 40057 37316 40057 39873 40057 36951 40057 40057 37500 40057 40057 36951 40057 40787 40057 39873 40057 40057 37681 37681 39326 40057 40057 39873 38777 39873 40057"
## [21] "PMC7881115 /pmc/articles/PMC7881115/bin/41467_2021_21109_MOESM3_ESM.xlsx Mmusculus 29 38961 38961 40238 40238 39142 38777 38596 38047 38596 38231 38777 39326 41883 37316 38596 38777 37316 38777 36951 37316 37500 37681 37681 39692 38596 37316 41883 38047 39873"
## [22] "PMC7881115 /pmc/articles/PMC7881115/bin/41467_2021_21109_MOESM3_ESM.xlsx Mmusculus 42 40238 39142 37681 36951 37681 36951 41153 37681 37681 37865 39873 37135 37681 39142 38777 37316 37316 37681 40057 39873 40422 37681 39142 39142 37681 36951 37681 40603 41883 39142 37681 37681 38047 37500 40603 38777 37681 39326 36951 37316 38047 37681"
## [23] "PMC7881115 /pmc/articles/PMC7881115/bin/41467_2021_21109_MOESM3_ESM.xlsx Mmusculus 121 40238 38961 37316 40238 40238 38961 38961 38777 37681 37681 40057 37681 37681 39142 38961 37681 39326 40238 40238 38777 39326 38047 39692 38047 40057 38961 37681 37681 38961 38231 40057 37681 40787 40057 40057 37681 40057 38961 38231 40057 39326 40787 40787 39326 40787 40787 40057 39873 36951 40057 40057 38961 40057 37135 40057 40057 38777 40057 40057 37500 40057 40057 40787 40057 40057 40057 40057 39873 38777 40057 36951 40057 37316 37316 40057 40057 40057 39873 40057 40057 40057 37865 37681 37316 40057 40057 38596 39873 37500 40787 37135 36951 40057 39873 39326 40057 38596 38777 40057 40787 37135 37316 40787 40057 40057 40057 38961 40057 39142 37316 40057 40787 40787 40057 40057 36951 37135 39692 40787 40422 37681"
## [24] "PMC7881115 /pmc/articles/PMC7881115/bin/41467_2021_21109_MOESM3_ESM.xlsx Mmusculus 36 36951 38777 38961 38596 38047 40057 40238 40238 40238 38777 38231 39326 37316 39692 40238 40057 38777 37681 38777 38777 38777 41883 36951 38596 37316 36951 40057 38596 38596 37500 40057 39873 40057 41883 38596 38047"
## [25] "PMC7881115 /pmc/articles/PMC7881115/bin/41467_2021_21109_MOESM3_ESM.xlsx Mmusculus 58 39692 37316 39142 39326 37681 37681 40238 36951 40603 40057 37681 40057 38047 40603 37681 37681 37681 40238 40603 40603 37865 39873 38777 39142 37865 37681 40422 37135 37681 37681 37135 37681 41153 39142 36951 36951 37316 37681 37500 38047 37681 37316 39326 36951 40603 37681 37681 37316 41883 37681 39142 38047 40057 40603 37681 37681 37500 38777"
## [26] "PMC7881115 /pmc/articles/PMC7881115/bin/41467_2021_21109_MOESM3_ESM.xlsx Mmusculus 100 38231 39142 36951 40238 40057 37681 37681 40057 40238 37316 37681 39692 40057 40057 37316 38047 40057 40057 37681 38047 40057 39692 40057 40057 37681 38231 40057 40057 40057 36951 40057 40057 37135 40057 40057 40057 37681 40057 40057 40057 38777 36951 36951 40238 40057 40787 38961 37316 40057 40057 40057 39692 40057 40057 38961 40057 39873 38961 37316 40787 40787 38777 40238 40787 37316 37135 37316 40057 40057 38596 40057 40787 38596 40057 40057 38777 40057 37135 40057 37500 39326 38961 40057 40057 37681 37681 38777 40787 40057 37135 40057 40787 37681 40787 37500 39873 40057 39873 38961 40787"
## [27] "PMC7881115 /pmc/articles/PMC7881115/bin/41467_2021_21109_MOESM3_ESM.xlsx Mmusculus 28 40057 36951 38777 38596 38777 39692 38777 38961 37316 37316 38777 37681 36951 38777 37500 40238 39873 41883 38596 38047 39326 38231 38596 39142 40238 40057 40238 41883"
## [28] "PMC7881115 /pmc/articles/PMC7881115/bin/41467_2021_21109_MOESM3_ESM.xlsx Mmusculus 52 39326 40057 39326 36951 40603 40422 36951 38047 37316 40603 37681 38047 37681 40057 37681 38047 40238 37316 39142 40057 37681 37681 37681 36951 37135 37681 37681 37681 40238 41883 40238 38777 37316 39142 37681 37500 41153 37865 39142 36951 37681 37500 39142 37135 37865 39692 37681 39142 37681 40603 37681 39873"
## [29] "PMC7881115 /pmc/articles/PMC7881115/bin/41467_2021_21109_MOESM3_ESM.xlsx Mmusculus 114 37865 40238 36951 40057 36951 39142 37681 37681 40057 37681 40057 36951 37316 36951 40057 38961 40057 40057 39326 40057 40057 37135 40057 40057 40057 38231 40057 40787 38047 40057 40057 40057 37681 40057 40057 40057 39692 37681 40422 37316 40787 36951 39692 40057 40057 40787 40057 40057 37865 40057 38961 38961 40057 38961 37135 38777 40787 37316 40057 40057 37135 40057 40057 37681 37681 38961 38777 39326 39326 39873 37316 40057 40057 38961 39873 40787 39142 40787 37681 40057 40787 39326 37681 40787 39873 38777 40787 38961 38596 40787 40422 37681 40057 40057 38961 38961 37500 40787 40057 36951 37135 40057 38777 37316 37681 40787 38961 37681 37500 40057 38596 37681 39326 39873"
## [30] "PMC7881115 /pmc/articles/PMC7881115/bin/41467_2021_21109_MOESM3_ESM.xlsx Mmusculus 38 40238 40238 40057 40238 36951 38777 40057 38596 38596 40238 38596 36951 38777 38596 38596 41883 38047 40057 37316 39873 38961 39326 38047 38961 38777 40057 38777 37500 37316 38777 37681 38231 38596 37316 41883 39692 37681 37316"
## [31] "PMC7881115 /pmc/articles/PMC7881115/bin/41467_2021_21109_MOESM3_ESM.xlsx Mmusculus 55 39326 40238 40057 40057 37316 40422 39142 38047 36951 39142 37316 38047 41153 36951 37316 37681 41883 37681 37681 37681 37681 36951 37681 37681 36951 39142 37681 41153 37681 40603 38777 38047 39142 39142 37865 38047 40603 39142 37681 37681 37681 37681 39873 38777 37135 37681 37681 40603 37316 37865 37681 40603 39873 37500 40603"
## [32] "PMC7881115 /pmc/articles/PMC7881115/bin/41467_2021_21109_MOESM6_ESM.xlsx Mmusculus 15 40787 38961 39326 37135 38777 37681 39142 40057 38596 37316 37316 37500 38412 39873 38231"
## [33] "PMC7881115 /pmc/articles/PMC7881115/bin/41467_2021_21109_MOESM7_ESM.xlsx Mmusculus 1 39142"
## [34] "PMC7881115 /pmc/articles/PMC7881115/bin/41467_2021_21109_MOESM8_ESM.xlsx Mmusculus 117 40057 37681 37865 40787 40787 38961 40787 37681 38777 40057 37316 39326 38961 37681 39692 36951 39692 40057 39142 37681 40787 38961 40422 40787 40057 37681 40057 40787 38961 40057 36951 39326 40057 40057 39142 37135 37135 37681 40057 40057 40057 38596 40057 40057 38961 40057 39873 37316 40057 39873 39326 37316 40057 37500 37681 40057 38961 40057 40057 37681 37681 37135 38961 38777 40787 38777 40057 40057 40057 38047 37316 37681 38961 40057 40057 37316 39142 40057 40057 40787 37681 40057 38596 39142 40057 38961 39326 40787 40057 36951 39873 40238 37865 38777 36951 40787 37135 40787 40057 40057 40057 39873 38961 40787 39326 37500 39873 38231 40057 40787 40057 37681 38961 40057 37316 40787 38961"
## [35] "PMC7881115 /pmc/articles/PMC7881115/bin/41467_2021_21109_MOESM8_ESM.xlsx Mmusculus 37 40238 40057 38596 40238 40057 41883 38596 40057 38231 38596 38777 38777 38047 39692 38596 38596 38777 40057 41883 37316 40057 36951 38961 39326 38777 38777 38777 37500 37681 36951 38231 39873 37316 38047 37316 40238 40238"
## [36] "PMC7881115 /pmc/articles/PMC7881115/bin/41467_2021_21109_MOESM8_ESM.xlsx Mmusculus 56 39142 39142 40057 37681 38777 38047 40238 37865 39326 37135 36951 39142 39142 40422 37681 41883 37681 39142 36951 37316 39142 37681 37681 37681 40057 37681 39873 37681 38047 41153 37681 37681 38047 40603 40603 40603 37681 40057 37681 37865 38777 40603 37316 37316 37316 40603 37681 37135 37681 41153 37681 37681 37500 36951 37681 37681"
## [37] "PMC7881115 /pmc/articles/PMC7881115/bin/41467_2021_21109_MOESM8_ESM.xlsx Mmusculus 104 36951 40057 37681 37681 40057 40057 40057 37316 36951 36951 39692 40787 37681 38961 38777 40057 40057 40057 37135 38961 38231 36951 40057 38777 40787 39326 40057 40787 40057 40057 40422 37135 40057 40057 38777 40057 40057 40057 40057 40057 37681 40057 40057 37316 40787 38961 40057 37316 37135 40057 40057 40057 40057 40057 36951 39873 40057 37865 40057 40057 40057 40787 38777 36951 40057 40787 37316 40787 40057 36951 38961 39873 39142 40057 36951 39142 39326 40787 40057 38961 37135 36951 40787 37681 38596 40787 37681 38047 37681 39873 40057 40057 37681 37500 39326 37316 39326 38961 38961 39873 38777 38961 37500 40057"
## [38] "PMC7881115 /pmc/articles/PMC7881115/bin/41467_2021_21109_MOESM8_ESM.xlsx Mmusculus 32 38961 38596 40057 38596 38777 41883 40238 38596 40057 38596 40057 37316 37316 39326 38961 38596 38777 37316 38047 41883 37316 37681 38231 38777 39873 40238 37500 38777 38047 37681 36951 39692"
## [39] "PMC7881115 /pmc/articles/PMC7881115/bin/41467_2021_21109_MOESM8_ESM.xlsx Mmusculus 47 40057 36951 37316 37865 41153 40238 37681 40422 37500 36951 39142 39142 38777 39873 39326 37681 36951 37681 39142 38047 37316 39142 41153 37681 37681 38047 37681 37681 37316 36951 37681 39142 40603 38777 37681 40603 39873 37681 41883 37681 37681 37135 40603 37681 37681 37500 39142"
## [40] "PMC7881037 /pmc/articles/PMC7881037/bin/42003_2021_1722_MOESM4_ESM.xlsx Hsapiens 81 44089 44089 43892 43892 43891 44084 44084 43897 43897 43897 43894 43894 44076 44076 44085 44085 44085 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43896 43896 43896 43901 43901 43893 43893 43893 44082 44081 44081 44081 44088 44088 44166 44166 44166 44166 44166 44166 44166 43898 43898 43898 43898 43895 43895 43899 43899 43899 44086 44075 43900 43900 43900 43900 44083 44083 44083 44083 44083 44083 43892 44079 44077 44080 44080 44080"
## [41] "PMC7881037 /pmc/articles/PMC7881037/bin/42003_2021_1722_MOESM4_ESM.xlsx Hsapiens 81 44089 44089 43892 43892 43891 44084 44084 43897 43897 43897 43894 43894 44076 44076 44085 44085 44085 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43896 43896 43896 43901 43901 43893 43893 43893 44082 44081 44081 44081 44088 44088 44166 44166 44166 44166 44166 44166 44166 43898 43898 43898 43898 43895 43895 43899 43899 43899 44086 44075 43900 43900 43900 43900 44083 44083 44083 44083 44083 44083 43892 44079 44077 44080 44080 44080"
## [42] "PMC7881037 /pmc/articles/PMC7881037/bin/42003_2021_1722_MOESM4_ESM.xlsx Hsapiens 81 44089 44089 43892 43892 43891 44084 44084 43897 43897 43897 43894 43894 44076 44076 44085 44085 44085 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43896 43896 43896 43901 43901 43893 43893 43893 44082 44081 44081 44081 44088 44088 44166 44166 44166 44166 44166 44166 44166 43898 43898 43898 43898 43895 43895 43899 43899 43899 44086 44075 43900 43900 43900 43900 44083 44083 44083 44083 44083 44083 43892 44079 44077 44080 44080 44080"
## [43] "PMC7881037 /pmc/articles/PMC7881037/bin/42003_2021_1722_MOESM4_ESM.xlsx Hsapiens 81 44089 44089 43892 43892 43891 44084 44084 43897 43897 43897 43894 43894 44076 44076 44085 44085 44085 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43891 43896 43896 43896 43901 43901 43893 43893