Abstract Tumor immunosurveillance plays a major role in melanoma, prompting the development of immunotherapy strategies. The gut microbiota composition, influencing peripheral and tumoral immune tonus, earned its credentials among predictors of survival in melanoma. The MIND-DC phase III trial ([72]NCT02993315) randomized (2:1 ratio) 148 patients with stage IIIB/C melanoma to adjuvant treatment with autologous natural dendritic cell (nDC) or placebo (PL). Overall, 144 patients collected serum and stool samples before and after 2 bimonthly injections to perform metabolomics (MB) and metagenomics (MG) as prespecified exploratory analysis. Clinical outcomes are reported separately. Here we show that different microbes were associated with prognosis, with the health-related Faecalibacterium prausnitzii standing out as the main beneficial taxon for no recurrence at 2 years (p = 0.008 at baseline, nDC arm). Therapy coincided with major MB perturbations (acylcarnitines, carboxylic and fatty acids). Despite randomization, nDC arm exhibited MG and MB bias at baseline: relative under-representation of F. prausnitzii, and perturbations of primary biliary acids (BA). F. prausnitzii anticorrelated with BA, medium- and long-chain acylcarnitines. Combined, these MG and MB biomarkers markedly determined prognosis. Altogether, the host-microbial interaction may play a role in localized melanoma. We value systematic MG and MB profiling in randomized trials to avoid baseline differences attributed to host-microbe interactions. Subject terms: Cancer metabolism, Metagenomics, Cancer metabolism __________________________________________________________________ MIND-DC was a randomized, placebo-controlled, phase 3 trial of adjuvant blood-derived natural dendritic cell (nDC)-based therapy in patients with stage III melanoma, showing that nDC-induced immune responses did not translate into survival benefit. Here the authors report that, despite randomization, baseline differences in fecal metagenomics and serum metabolomics profiles between treatment arms might have influenced the clinical outcome of the trial. Introduction The rise in incidence of melanoma worldwide has led to an increasing number of patients with regional positive lymph nodes (stage III) being diagnosed each year, especially in the western world^[73]1. Over the last decade, there has been tremendous progress in the clinical management of stage III melanoma with the advent of adjuvant and neoadjuvant immune checkpoint inhibitors (ICI). Before the era of (neo)adjuvant ICI in localized melanoma, patients with operable clinically positive nodes systematically underwent full lymphadenectomy of the involved sites^[74]2,[75]3. Surgery alone is insufficient to achieve a cure in most patients with high-risk stage III melanoma. Thus, systemic adjuvant therapy has been investigated over the last decades in patients with high-risk melanoma. The development of effective adjuvant therapies for patients with high-risk melanoma has included ipilimumab (an anti-CTLA-4 antibody), pembrolizumab and nivolumab (both monoclonal antibodies against programmed death 1 [PD-1]), and combination of BRAF and MEK inhibitors for patients whose tumors harbor a BRAF mutation^[76]1,[77]4–[78]12, leading to US Food and Drug Administration approvals. More recently, pembrolizumab and the combination of anti-CTLA-4 and PD-1 antibodies showed benefit in the neoadjuvant setting in patients with high-risk node-positive melanoma in early phase clinical trials^[79]13–[80]16. Prior to the modern era of ICI, vaccination involving dendritic cells (DC) has been developed owing to the special properties of these cells in coordinating innate and adaptive immune responses. The aim of DC immunization was to induce tumor-specific effector T cells that can exhibit a tumoricidal activity in a tumor antigen-specific manner through induction of a protective immunological memory to cancer antigens. Across the world, many investigators showed that DC-based vaccines were safe and induced the expansion of circulating CD4^+ T cells and CD8^+ T cells that were tumor antigen-specific. Objective clinical responses have been observed in 5–8% of patients^[81]17–[82]21. Long term follow-up of DC-vaccinated patients with metastatic melanoma (MM) reported up to 19% survival rates at 11 years, comparable to ipilimumab-treated patients. Survival significantly correlated with intense reactivities at the dermal injection site, and with eosinophilia^[83]22. In the past years, DC-based immunotherapy was performed in patients with stage IV HLA-A2.1 positive melanoma using intravenous, intradermal, and intranodal routes of administration of mature DC loaded with tumor-associated antigens (TAA) such as tyrosinase and gp100, and keyhole limpet hemocyanin (KLH) as a control antigen. All vaccinated patients showed a pronounced proliferative T cell or humoral response against KLH. TAA-specific T cell reactivities were monitored in post-treatment delayed-type hypersensitivity (DTH) skin biopsies by tetramer staining and functional analysis. Patients harboring peptide-specific T cell immunity exhibited the best clinical response, with eventually complete responses in a minority of patients with MM while rIL-2 or modified TAA did not further increase vaccine efficacy^[84]23,[85]24. A direct correlation between DC-induced tumor-specific T cells detected in DTH skin biopsies and a favorable clinical outcome was observed in patients with MM^[86]25,[87]26. The “Melanoma Patients Immunized With Natural DenDritic Cells (MIND-DC)” trial was a randomized phase III clinical trial testing adjuvant natural DC (nDC) therapy in high-risk stage III melanoma. The trial showed that adjuvant nDC treatment generated specific immune responses but did not translate into survival benefit^[88]27. Here, we investigated whether nDC loaded with TAA ex vivo could modulate fecal metagenomics (MG) and serum metabolomics (MB) profiles that might in turn, influence clinical outcomes. Indeed, shotgun MG-based taxonomic composition of feces at baseline was associated with objective response rates in patients with stage IV melanoma treated with anti-PD-1 alone or combined to anti-CTLA-4 in several independent cohorts^[89]28–[90]31. In a meta-analysis incorporating new cohorts, McCulloch et al. confirmed that baseline microbiota composition was associated with 1-year progression-free survival. Bacteria associated with favorable response during ICI belonged to Lachnospiraceae and Bifidobacteriaceae families including Ruminococcus spp, Mediterraneibacter spp., and Blautia spp.^[91]32. Here, we show that a relative deficiency in the primary biliary acid (BA) cholic acid or F. prausnitzii, or high levels of fatty acids (FA) and acylcarnitines are associated with reduced recurrence-free survival (RFS), especially in the nDC treatment arm. Moreover, the relative abundance of beneficial F. prausnitzii in stool anti-correlate with serum BA and FA. Therefore, we hypothesize that the pharmacodynamic effects of the nDC might have been influenced by the host-microbiome dialogue in patients harboring a deviated lipid metabolism (including carboxylic acids, BA and acylcarnitines) and gut microbiota. Results Metagenomics-based profiles at baseline (T1) and at 4 weeks (T2) are associated with 2 year-RFS (2Y-RFS) The MIND-DC trial ([92]NCT02993315) randomized 148 eligible patients with resected stage IIIB and IIIC melanoma between 2018 and 2021, of which 144 patients were included in this translational cohort (n = 95 in the nDC arm, versus n = 49 in the placebo (PL) arm) (Supplementary Table [93]1). Five patients (all in the nDC arm) never received any injection, three being related to tumor recurrence. One was censured (because of a follow up inferior to 1 month). Primary and secondary clinical outcomes are reported separately^[94]27. Shotgun MG sequencing of stools at baseline (pre-therapy, T1) and after 4 weeks of therapy start (at the day of the first and the third biweekly intranodal injections of nDC or PL, T2) was undertaken, and the presence and abundance of species-level genome bins (SGBs) was estimated^[95]33. Here, we describe a longitudinal follow up of the shotgun MG-based stool composition in 88 and 85 patients with stage IIIB/C melanoma at T1 and at T2, respectively (Supplementary Table [96]2). Firstly, we correlated the microbiota composition of the T1 (Fig. [97]1A) and T2 (Fig. [98]1B) feces samples with 2Y-RFS. While the alpha-diversity was not associated with 2Y-RFS (p = 0.2 at T1 and p = 0.1 at T2, Shannon index), the beta-diversity significantly differed between those patients prone to stay cancer-free (no recurrence at 2 years, 2Y-noR) versus those who will experience a 2-year recurrence (2Y-R) at both time points (p < 0.01 at T1 and T2, Bray-Curtis dissimilarity). We focused on differentially abundant SGBs according to 2Y-RFS in the whole population and found a relative overrepresentation of Faecalibacterium prausnitzii (whether considering [99]SGB15318 or [100]SGB15322) in patients with favorable prognosis at T1 (Fig. [101]1A, Supplementary Data [102]1) and T2 (Fig. [103]1B, Supplementary Data [104]1). Fig. 1. Metagenomics-based profiles show taxonomic signatures associated with recurrence at 2 years (2Y-R) in two different time points. [105]Fig. 1 [106]Open in a new tab A, B Linear model coefficients for microbial SGBs associated either with 2y-R or 2y-noR, corrected for age, gender and treatment arm at baseline (A, T1 n = 86) or after 4 weeks of therapy start (B, T2 n = 83). Positive values indicate species-level genome bin (SGB) association with 2Y-R (orange), while negative values indicate a positive association for the corresponding SGB with 2Y-noR (blue). Only associations with p ≤ 0.05 are reported since no association has Benjamini-Hochberg Q < 0.2. Refer to [107]Supplementary Data 1 for linear model coefficients (MaAsLin2, coefficient) for microbial SGBs after arcsine square root (arcsin-sqrt) transformation (AST). Source data are provided as a Source Data file. By splitting the whole cohort into 4 groups according to treatment arm and outcomes, we obtained small sample sizes (Supplementary Table [108]2). This prevented any significant associations to pass the FDR correction for the MG analysis (all Q-values ≥ 0.2). Linear model coefficients (MaAsLin2, coefficient) for microbial SGBs that were found associated either after arcsine square root (arcsin-sqrt) transformation (AST) or centered-log-ratio (CLR) transformation with 2Y-R with p < 0.05 at T1 and T2 are detailed in Supplementary Fig. [109]1A–D and Supplementary Data [110]2. In brief, F. prausnitzii [111]SGB15318 was relatively over-represented in 2Y-noR in the nDC arm at T1 and T2 (Supplementary Fig. [112]1B–D, p < 0.05). Of note, Ruminococcaceae [113]SGB14899, Streptococcus salivarius SGB8007, and Streptococcus parasanguinis SGB8071 followed the same behavior only in the nDC arm (Supplementary Fig. [114]1B,D, p < 0.05). In addition, the prevalence and relative abundances of distinct SGBs of F. prausnitzii tended to be reduced in the nDC arm compared with the PL arm (Supplementary Fig. [115]2A, B). The MG biomarker evolution between the two timepoints was assessed using two methods: paired-Wilcoxon test (Wilcoxon signed-rank test) (Supplementary Fig. [116]3) and linear regression adjusted for the clinical features treatment arm, age, gender, tumor stage (IIIB vs IIIC), Eastern Cooperative Oncology Group performance status (ECOG-PS), and body mass index (BMI) (Supplementary Fig. [117]4). Of note, the nDC treatment barely impacted on the shift of the MG taxonomic composition, except for a few taxa (Blautia sp MSK 20 85 SGB4828, p = 0.021, and Ruminococcus torques SGB4608, p = 0.024) (Supplementary Fig. [118]4). Of note, these dynamics did not pass the FDR correction. We may impute these weak associations to the small sample size (limiting the power of the analysis) and to the lack of clear effect of the treatment arm in this negative trial. We concluded that gut microbiota composition was associated, albeit weakly, with the prognosis of stage III melanoma, with F. prausnitzii ([119]SGB15318 and [120]SGB15322), referenced for its homeostatic^[121]34,[122]35 and antitumor properties in MM^[123]30,[124]36, as the most prominent MG species (MGS) found at T1 and T2 in this cohort. Serum metabolic changes following the first cycle of immunization In parallel to MG, the mass spectrometry-based serum MB was serially assessed at T1 and T2 in 95 patients in the nDC arm and 49 patients in the PL arm. The Volcano plot and principal component analysis (PCA) revealed differences between the two time points overruling those observed between the treatment groups (Fig. [125]2A, B, Supplementary Fig. [126]5A, Supplementary Data [127]4). In particular, the lipid metabolism became perturbed at T2, with the accumulation of very long and long chain saturated FA (such as the carboxylic acids docosanoid acid (also called “behenic” acid), nonadecanoic acid, tetradecanedioic acid, arachidonic acid, hexacosanoic acid, palmitic acid), and mono- or poly-unsaturated FA (such as 10-nonadecenoic acid, oleic acid, linoleic acid, linolenic acid, stearidonic acid, docosenoic acid) as well as acylcarnitines (glutarylcarnitine, carnitine C4:0, C4:0 (OH), C6:0, C8:1, C10:1, C12:0, C14:2) (Fig. [128]2B, p < 0.05, Mann–Whitney). Besides the shift in circulating FA and in the acylcarnitine shuttle found at T2, we observed perturbations of polyamine biosynthesis and acetylation between T1 and T2 (N-acetylputrescine, spermidine, N1-acetylspermine) (Fig. [129]2B, p < 0.05, Mann–Whitney). In addition, branched-chain amino acids (BCAAs) (valine/leucine/isoleucine), and γ-glutamyl dipeptides involved in inflammation, oxidative stress, and glucose regulation were significantly increased at T2 (Fig. [130]2B, p < 0.05, Mann–Whitney)^[131]37. These metabolic shifts could not be explained by clinical events between the two time points since few patients experienced flu-like symptoms (19%) or started new medications (12%) between T1 and T2 (Supplementary Table [132]3). Fig. 2. Longitudinal metabolic patterns showing a shift in lipid metabolism associated with recurrence at 2 years (2Y-R). [133]Fig. 2 [134]Open in a new tab A Principal component analysis (PCA) plot representing the distribution of serum metabolomics (MB) overtime and according to 2y-R (orange, n = 83) versus no recurrence at 2 years (2y-noR, blue, n = 56). Circle: 2y-noR at baseline (T1); triangle: 2y-noR after 4 weeks of therapy start (T2); Square: 2y-R at T1; Crosses: 2y-R at T2. B Volcano-plots based on MB showing differences (p <  0.05, two-sided Mann–Whitney U-test with no adjustment) between T1 (green dots, n = 143) and T2 (yellow dots, n = 143) with a cut-off in the T2/T1 fold change (FC) ≥ 0.3. Metabolites with T2/T1 FC ≥ 0.3 with a p <  0.05 were colored green. X-axis: log2 fold change of metabolites; Y-axis: fold change of –log10. C Hierarchical clustering of MB according to 2y-noR (n = 56) versus 2y-R (n = 83) at T1 and T2 and treatment arm. Dark gray: 2y-R at T1; Light orange: 2y-R at T2; Light gray: 2y-noR at T1; Light blue: 2y-noR at T2. Dark blue: placebo (PL) arm (n = 47); Dark orange: natural Dendritic Cell (nDC) arm (n = 92). Targeted MB computed as normalized areas of identified metabolites. Heatmap illustrating the changes in metabolite abundances according to the median of each metabolite in the two subgroups of opposite prognosis, highlighting the fatty acids (FA). Rows are samples, columns are metabolites. Heatmap data are log2 normalized and centered around the average abundance computed from all the samples for each metabolite. Red/blue colors are ion signal higher/lower than average and gray are missing values. Samples are sorted following biological conditions and metabolites clustered following the ward.D2 algorithm, with euclidean distance. D Relative abundance of Carnitine C14:2 (left panel) and Linolenic acid (right panel) in 2y-noR (blue: n = 56) and 2y-R (orange: n = 83 at T1). Boxplots indicates the interquartile range Q1 to Q3 with Q2 (median) in the center. The range of outliers is depicted by whiskers. The p value are related to the group comparison using the two-sided Mann–Whitney U-test with no adjustment. E Recurrence-free survival (RFS) analysis using the Kaplan–Meier estimator (Log-Rank (Mantel Cox) test) to assess low FA versus high FA (calculated based on the sum of relative abundances of 13 most significant FA or carboxylic acids) at T1 (left panel) and at T2 (right panel). To assess the clinical relevance of these early changes, we investigated which metabolic profiles were associated with 2Y-RFS using an unsupervised hierarchical clustering across all metabolites (Fig. [135]2C). We concluded that the above-mentioned medium and long chain-acylcarnitines and FA linolenic acid (Fig. [136]2D, p < 0.05 Mann-Whitney), acetylated polyamines (Supplementary Fig. [137]5B, p < 0.05 Mann–Whitney), as well as BCAAs and γ-glutamyl dipeptides (Supplementary Fig. [138]5C, p < 0.05 Mann–Whitney) were negatively associated with 2Y-RFS at T1 in univariate analysis. In contrast, serum levels of ornithine, a precursor of spermidine, were higher in patients with favorable prognosis (Supplementary Fig. 5[139]D, p < 0.05 Mann–Whitney) in univariate analysis. The significance of FA in the temporal metabolic shift and clinical recurrence prompted us to evaluate the RFS according to the median of the sum of the 13 most relevant FA abundances. Levels above the FA median at T2 (p = 0.025, right) and to a lesser extent at T1 (p = 0.055, left) were associated with shorter RFS (Fig. [140]2E). Attempting to ascribe these lipid profiles to the gut microbiome, we used the MG data, and annotated the organism-specific gene hits according to the UniRef90 (UR90)^[141]38. Based on these annotations, metabolic pathways for each sample were then defined using the MetaCyc hierarchy of pathway classifications^[142]39. The Linear model-based analysis contrasting hits separating patients who recurred of their melanoma from the ones who did not recur showed that the FA beta-oxidation and biosynthesis pathways at T1 were associated with overall recurrence (Supplementary Fig. [143]6A). Hence, these findings suggest that time and/or therapy have perturbed a delicate and preexisting disbalance of lipid synthesis or degradation, at the level of the mono- and poly-unsaturated and saturated fatty and carboxylic acid metabolism, coinciding with melanoma recurrence. Taxonomic and metabolomic differences between treatment arms at randomization Our findings indicate that the gut taxonomic composition at T1 and the serum metabolic profile shift after only 2 injections of nDC treatment were associated with the 2Y-RFS. Although we did not anticipate any significant difference in clinical characteristics between the two treatment arms at baseline given the process of randomization (Supplementary Table [144]1), we took a deeper dive into potential pre-existing differences in the fecal microbial ecosystem defined by shotgun MG-based sequencing. Strikingly, while the stool compositional diversity did not differ between the two arms (p = 0.43, Shannon index), the beta-diversity^[145]40 of the taxa present in feces from stage III melanoma randomized to PL was significantly different from that of individuals about to receive nDC (p = 0.02, Bray-Curtis dissimilarity). Patients randomized in the PL arm exhibited a relative over-representativity of health-related and immunogenic MGS F. prausnitzii [146]SGB15322, Blautia massiliensis SGB4826, and Dorea formicigenerans SGB4575 compared with the nDC arm^[147]32,[148]41 (Fig. [149]3A, Supplementary Data [150]3). Indeed, the nDC treatment arm tended to harbor higher proportions of individuals lacking F. prausnitzii spp. (Fig. [151]3B) and more specifically distinct SGBs of F. prausnitzii or bearing lower relative abundance of these SGBs compared to the PL arm (Supplementary Fig. [152]2B). The same observation could be made comparing stage IV melanoma from publicly available databases with healthy volunteers (HV)^[153]42 (Supplementary Fig. [154]2A). Fig. 3. Significant differences in the microbiota taxonomic profiles at randomization. [155]Fig. 3 [156]Open in a new tab A Linear model coefficients for microbial SGBs differentially abundant either in the natural dendritic cell (nDC, n = 56) or placebo (PL, n = 31) arms, corrected for age and gender. Positive values indicate species-level genome bin (SGB) association with nDC (orange), while negative values indicate a positive association for the corresponding SGB with PL (blue). Only associations with p ≤ 0.05 are reported since no association has Benjamini-Hochberg Q < 0.2. Refer to [157]Supplementary Data 3 for linear model coefficients (MaAsLin2, coefficient) for microbial SGBs after arcsine square root (arcsin-sqrt) transformation (AST). Source data are provided as a [158]Source Data file. B Prevalence of Faecalibacterium prausnitzii, i.e., proportion of individuals with its absence between healthy volunteers (HV, n = 5345), patients into PL arm (n = 31), all patients with melanoma into MIND-DC trial (MEL, n = 88) and nDC arm (n = 57). The p values are related to the group comparison using the Chi-square test (p < 0.0001). Recurrence-free survival (RFS) analysis using the Kaplan–Meier estimator (Log-Rank (Mantel Cox) test) to assess the predictive value of Gemmiger formicilis (C, left panel) and Lachnospira pectinoschiza (C, right panel) and F. prausnitzii (D) using relative abundances at T1. The two groups of patients were defined by reference values of relative abundances (MetaPhlAn 4) from publicly available HV cohort: high if ≥ median and low if MT1 :MATH] , and T2, [MATH: MT2 :MATH] , was performed using the following linear regression adjusted for the clinical covariables [MATH: X :MATH] (age, sex, BMI, stage, and ECOG-PS), considering [MATH: MT2 :MATH] as response and [MATH: MT1 :MATH] as offset: [MATH: MT2=MT1< /msub>+β0+βDC+βX :MATH] . The intercept [MATH: β0 :MATH] represents evolution of [MATH: M :MATH] ( [MATH: MT2MT1< /msub> :MATH] ) in the PL arm, and [MATH: βDC :MATH] represents the impact of the nDC (i.e., the difference of the evolution between the two arms). The Wald test p values of [MATH: β0 :MATH] and [MATH: βDC :MATH] were provided and considered as statistically significant when p < 0.05. The evolution in the nDC arm was estimated by [MATH: β0 :MATH] reversing the arm reference (i.e., from the model [MATH: MT2=MT1< /msub>+β0+βPL+βX :MATH] ). Machine Learning. All ML models were developed using R. Feature selection and prediction were based alternatively on two different outcomes: the 2Y-R and treatment arm (nDC versus PL arms). For each type of outcome, three datasets per omic (clinical features only, MB, MG, or MB and MG) were defined from the timepoint that was used: T1, T2, and T2-T1/T1 (pre treatment + evolution until T2 i.e., T2-T1) and a third dataset constructed by joining T2-T1 and T1 values. For each model, clinical features were always included in both feature selection and prediction phases. The ML pipeline is based on a first step of feature selection using Boruta feature selection algorithm based on eXtreme Gradient Boosting (XGboost) algorithm^[266]82 to identify most relevant features among clinical and biological markers (both in MB and/or MG). A second step consisted in re-training the model from the subset of selected features re-including clinical variables in order to control potential confounding bias. A last fit of the model using selected biomarkers was applied on our training data, enabling the model to prepare for future label predictions on new data. For the multi-omics model, we performed a second step of feature selection from the metabolites and MGS selected in each omics in the first step (+clinical variables) to obtain our final model. Missing values of metabolites were imputed using the multiple imputation by chained equations (MICE) method using the mice R package^[267]83. We performed 50 imputations with 50 iterations to capture the uncertainty of the imputation procedure. The feature selection procedure described above was repeated using the 50 imputed datasets, and the metabolites selected in more than 75% of these 50 iterations were retained for the final step (assuming that they are robust to the randomness of the imputation). A single last imputation was performed to retrain the model in the final step. The model explainer of the different final model was based on the Shapley Additive exPlanations (SHAP) analysis, which is a visualization tool based on the following construction: SHAP values are weights associated to features for each patient, positive when the value of a marker for this patient tends to increase the prediction as class 1 (2Y-R), negative otherwise (2Y-noR). The more the absolute value of SHAP value increases, the more the feature is likely to impact the prediction (as class 1 if SHAP_value > 0, class 0 otherwise). SHAP values are thus positive or negative continuous values. Correlations between biomarkers were analyzed via the mixOmics package in R^[268]84, using the DIABLO multiblock sPLS-DA method to display explanatory relationship between pathways and then displayed into a circosplot. The prediction performance of the whole model pipeline (feature selection to model fitting) was evaluated using the bootstrap optimism corrected area under the receiver operating characteristic (AUROC) curve. Due to the lack of external validation cohort, the optimism of the AUROC was corrected using the 0.632+ bootstrap method^[269]85. The confidence interval of these optimism corrected AUROC was obtained using two-stage bootstrap methods proposed by ref. ^[270]86 (50 internal samples, 500 external samples). The 0.632+ estimators are displayed on each figure. Reporting summary Further information on research design is available in the [271]Nature Portfolio Reporting Summary linked to this article. Supplementary information [272]Supplementary Information^ (3.6MB, pdf) [273]Peer Review File^ (2.4MB, pdf) [274]41467_2024_45357_MOESM3_ESM.pdf^ (82.4KB, pdf) Description of Additional Supplementary Files [275]Supplementary Data 1^ (18.9KB, xlsx) [276]Supplementary Data 2^ (33.5KB, xlsx) [277]Supplementary Data 3^ (18.1KB, xlsx) [278]Supplementary Data 4^ (18.2KB, xlsx) [279]Supplementary Data 5^ (16.8KB, xlsx) [280]Reporting Summary^ (1,002.4KB, pdf) Source data [281]Source Data^ (7MB, xlsx) Acknowledgements