Abstract Background Human papillomavirus (HPV)-associated oropharyngeal cancer (OPC) is increasing in prevalence, but the drivers of metastasis remain poorly understood, which impacts the ability to personalise management decisions. Much of the genomic research to date focuses on the HPV-negative population. Here, we utilise single-cell and spatial single-cell techniques to understand the drivers of metastasis. Methods Patients with HPV-positive OPC and cervical lymph node metastases treated with curative surgery had matched samples from the primary and lymph nodes collected for research. Single-cell RNA sequencing, single-cell spatial sequencing (Visium) and in-situ spatial platforms were performed. Cancer clones were delineated using inferred copy number variation. Expression phenotypes and interactions with the tumour microenvironment were compared between the metastasising and non-metastasising cancer clones. Results Individual cancer clones have varied abilities to metastasise and undergo clonal expansion in the lymph node, with only a subset of clones present in the primary expanding in the lymph node. Four mechanisms were identified as defining the metastatic phenotype, including protein translation adaptation, immunoproteasome dysfunction and immune evasion, suppression of the IFN immune response and cap-independent protein translation. Conclusions This research elucidates multiple mechanisms driving the expansion of cancer clones in HPV-positive oropharyngeal cancer. By detailing the roles of translational adaptation, immunoproteasome dysfunction, suppression of the interferon immune response and cap-independent protein translation, we provide insights into how these processes contribute to immune evasion and tumour survival. Supplementary Information The online version contains supplementary material available at 10.1186/s12916-025-04392-5. Keywords: HPV-positive oropharyngeal cancer, Single-cell transcriptomics, Spatial transcriptomics, Cancer clone, Metastasis Background Cancers of the head and neck comprise a diverse range of cancers, which represent the 7th most common malignancy worldwide, with approximately 900,000 new cases and 450,000 deaths per year [[54]1]. Historically, cancers in the oropharynx region were commonly associated with smoking and alcohol use. However, infection with human papillomavirus (HPV) is becoming an increasingly common cause and is now responsible for up to 80% of oropharyngeal cancers [[55]2]. The HPV-positive oropharyngeal cancer (HPV +) patient population differs from HPV-negative (HPV −), with earlier age of onset, smaller primary tumours and larger lymph node burden but improved survival outcomes [[56]3–[57]5]. The cellular mechanisms underlying these clinical differences in HPV + and HPV − patients are poorly understood [[58]6, [59]7], resulting in missed opportunities to understand the defining differences in cancer cell behaviour, metastasis and treatment response. Despite aggressive treatments, up to 25% of patients with potentially curable HPV + oropharyngeal cancer experience disease relapse, after which most patients die within 2 years [[60]8]. To reduce the risk of relapse, adjuvant radiation given with or without chemotherapy can be offered. However, identifying patients who will benefit the most from these toxic therapies with high-risk disease remains difficult without accurate prognostic biomarkers [[61]9]. Cancer cells are genetically heterogeneous, leading to cancer phenotypes that are transcriptionally and functionally distinct from one another [[62]10]. Primary tumours typically exhibit greater clonal diversity than metastases [[63]11]. This observation is an emergent property, because metastatic processes are partly driven by cells acquiring genetic mutations that impact their phenotypes through clonal selection within their microenvironmental context [[64]12–[65]15]. Identifying the phenotypic characteristics and molecular mechanisms of clones prone to metastasis not only offers a chance to identify biomarkers indicating a poor prognosis but also holds the potential for uncovering novel drug targets. In HPV − oropharyngeal cancer, the epithelial-to-mesenchymal transition (EMT) differentiates between primary and metastatic cells [[66]16]. Moreover, AXL, known for its role in cell migration, EMT, invasion and proliferation [[67]17], and AURKB, a pivotal regulator of mitosis [[68]18], have both been recognised as significant contributors to the development of invasive metastatic cell phenotypes in HPV − oropharyngeal cancers. In HPV + oropharyngeal cancers, there is limited understanding of the cellular pathways underlying metastasis, which are expected to differ from those in HPV − cancers given the unique clinical and aetiological factors. The inability to clear HPV infections stems from complex mechanisms involving the suppression of the host cell’s antiviral response (involving NFKB, STING and IFN), the sustained viral replication facilitated by the retinoblastoma tumour suppressor (RB), and evasion of immune surveillance by inhibiting apoptosis via TP53 [[69]19]. To sustain viral replication, HPV employs various mechanisms to manipulate the host cell. Specifically, viral proteins E6 and E7 incapacitate key tumour suppressor regulators like TP53 [[70]20] and RB [[71]21], disrupting apoptosis and cell cycle regulation. Furthermore, E6 directly interferes with deoxyribonucleic acid (DNA) damage repair processes [[72]22]. While these strategies effectively support viral replication, they render the host cell incapable of managing defective DNA damage repair, leading to a gradual increase in genomic instability over time [[73]22]. Ultimately, this increasing genomic instability culminates in malignant transformation [[74]23]. Simultaneously, the virus employs multiple mechanisms to evade the host’s immune responses. This immune evasion enables infected cells to remain undetected and allows for the tolerance of emerging pre-malignant and malignant lesions. Therefore, many mechanisms that maintain viral infection may also contribute to cancer cell evasion of anti-tumour responses, an area that has not been comprehensively explored. Here, we present a comprehensive analysis of primary and regionally metastatic cancers in patients with HPV + oropharyngeal cancer who have undergone curative surgical resection. Using single-cell techniques, we use copy number variation (CNV) to define cancer clones and study clonal population phenotypes regarding metastasis and microenvironmental dynamics. We identify virally mediated protein translation relief and dysregulation of immunoproteasome and IFN pathways, accompanied by a shift towards cap-independent translation underlying metastatic clonal selection. Furthermore, our study validates immune cell evasion by sequencing matched tumour samples in situ. Methods Participants and specimen collection Patient cohort Patients were an average age of 55 and compromised 75% men (n = 3) and 25% women (n = 1) (Patient characteristics in: Additional file 1: Table S1). Newly diagnosed, previously untreated patients with node positive, HPV + oropharyngeal squamous cell carcinoma were approached for participation if surgery was their planned treatment. Consent was obtained before surgery. Tissue samples from the resected primary and affected lymph nodes were taken. Tissue sample processing Within 30 min of surgical resection, tumour samples from the primary and lymph node metastases were tumour banked. For spatial experiments, half of the tumour sample was sectioned and snap frozen in O.C.T (Tissue-Tek, CAT# 4583) on dry ice. After freezing, samples were moved to − 80 °C for long-term storage. The remaining sample was cut into 2 mm × 2 mm chunks with a scalpel blade. For single cell analysis, 2–3 tumour chunks were placed into a cryovial with 90% fetal calf serum (FCS) + 10% dimethyl sulfoxide (DMSO). Cryovials were placed into a CoolCell in a − 80 °C freezer overnight, then transferred to Vapour Phase for long-term storage. Blood processing Blood was collected in 10 mL BD Vacutainer ethylene-diamine-tetra-acetic acid (EDTA) tubes (Becton Dickinson, CAT# 367,525). The sample was processed for plasma and buffy coat as previously described [[75]24]. Samples were stored at − 80 °C. SNP genotyping DNA was extracted from Buffy Coat samples using the QIAamp DNA Blood Mini Kit (Qiagen, CAT# 51,106). Genotyping was performed using the UK Biobank Axiom array (ThermoFisher Scientific, CAT# 902,502) at the Ramaciotti Centre for Genomics. Imputation was performed using the Michigan Imputation Server, with Minimac4 and the Haplotype Reference Consortium panel. Preparation of single cell suspensions The presence of cancer tissue was confirmed on haematoxylin and eosin staining before sequencing. Tissue samples were defrosted in a 37 °C water bath. Contents were transferred to a 6 cm culture dish and washed with RMPI-1640 (Gibco, CAT# 11,875,093) + 10% foetal calf serum (FCS). Samples were moved to a 2 mL Lo-Bind Eppendorf containing 100 µL of RPMI-1640 and mechanically dissociated using scissors and snips. Samples were chemically dissociated with the Miltenyi Tumour Dissociation Kit (Human) (Miltenyi Biotech, CAT# 130–095-929) according to the manufacturer instructions. Samples were incubated for 20–30 min on a shaking incubator at 37 °C until all tumour chunks were dissociated. Sample solution was passed through a 100 µm cell strainer into a fluorescence-activated cell sorting (FACS) tube with RPMI-1640 + 10% FCS. Cells were counted, and viability was assessed using Trypan Blue. If samples had < 80% cell viability, then dead cells were removed with the EasySep Dead Cell Removal (Annexin V) Kit (StemCell Technologies, CAT# 17,899) according to manufacturer instructions. Samples were pooled equally (primaries pooled together and lymph nodes pooled together) to a final concentration of 1500 cells per µL in PBS + 10% FCS. Samples were passed through a 70 µM cell strainer. Single-cell capture and cDNA library preparation The 10 × Genomics Chromium instrument (10 × Genomics) was used to partition viable cells with barcoded beads, and cDNA from each cell was prepared using the 10 × Genomics Single Cell 3′ Library, Gel Bead and Multiplex Kit (v3) (CAT# 10X-1000075) as per the manufacturer’s instructions. Patient samples were pooled and multiplexed. Cell numbers were optimised to capture approximately 5000 cells per individual sample and 20,000 cells in the total pool. Samples were loaded onto a 10 × Genomics Chromium Single Cell Chip Kit B (CAT# 10X-1000153). Libraries were generated with the 10 × Genomics Chromium Single Cell 3’ Library construction kit (v3) (CAT# 10X-1000078). scRNA-seq data processing Library sequencing The libraries were sequenced at the Ramaciotti Centre for Genomics on an Illumina NovaSeq 6000 (NovaSeq Control Software v 1.7.0/Real-Time Analysis v3.4.4) using a NovaSeq S4 200 cycles kit (Illumina, 20,482,067 as follows: 28 bp (Read 1), 91 bp (Read 2) and 8 bp (Index). A median sequencing depth of 30,000 reads/cell was targeted for each sample. The sequencer generated raw data files in binary base call (BCL) format. The BCL files were demultiplexed and converted to FASTQ formats using Illumina Conversion Software (bcl2fastq v2.19.0.316). Bioinformatics pipeline The cell ranger—v (3.1.0) count pipeline was used for alignment, filtering, barcode counting and UMI counting from FASTQ files. The pipeline was executed on a high-performance cluster with a 3.10.0–1127.el7.x86_64 operating system. Using the STAR aligner, ribonucleic acid (RNA) sequencing reads were mapped to the human GRCh38 transcriptome and HPV 16 and 18 transcriptome [[76]25–[77]27]. DropletQC In order to distinguish between true cells and cell-free ambient RNA, DropletQC. Empty drops were excluded from the analysis [[78]28]. Demultiplexing Single-nucleotide polymorphism (SNP) data was generated for each individual as described above. Multiplexed patient pools were demultiplexed using Demuxafy [[79]29]. Doublets were identified and removed. Single cell data The aggregated single cell gene expression data generated by CellRanger was used as the input for the Seurat analysis software 4.0 [[80]30]. Expression levels for each transcript was determined using the number of unique molecular identifiers assigned to the transcript. Quality control and filtering steps were performed to remove outlier genes and cells. Cells expressing greater than 25% mitochrondrial genes were removed. Cell–cell normalisation was performed using a regularised negative binomial regression (sctransform) [[81]31]. Principal component analysis was performed on the filtered and normalised gene expression matrix and the first 15 principal components that explained the majority of the variance in the data were retained. To visualise the patterns of gene expression in each cell, uniform manifold approximation and projection for dimension reduction (UMAP) [[82]32] plots were generated. Integration of data Seurat was used to identify common anchors between datasets to integrate primary and lymph node data for each individual into the same object [[83]33]. CNV estimation and identification of malignant cells Clonal architecture was inferred using the Numbat package [[84]34]. Cancer cells from each individual were examined separately, with cells from both the primary and lymph node metastasis grouped together, using the individual’s immune cells as a normal reference. The default running parameters were used for all samples except patient 1. In this sample, the clones did not become stable until the log-likelihood ratio (LLR) reached 400. This was thought to be due to the high number of malignant cells in this sample. Cancer cells with similar CNV profiles were grouped together in a clonal group. Immune cell classification Cancer cells were identified using EPCAM expression. The immune cells were subtyped using automated methods using a reference dataset for immune cells [[85]35]. This reference dataset does not include exhausted CD8^+ t cells or macrophages, so these were added manually. CD8^+ T-central memory cells (CD8 TCM) were identified using the reference dataset. Markers of cytotoxicity (GNLY, GZMK, TNF, FASLG, PRF1, IL2, IFNG) and exhaustion (LAG3, BTLA, CTLA4, PDCD1, HAVCR2) were used to classify the CD8 TCMs further. One sub-population of CD8 TCMs expressed high cytotoxicity and exhaustion markers and were labeled as CD8TCM-exhausted cells. For macrophages, the monocyte population was identified using the reference dataset. Expression counts within the monocytes of FCGR2A was calculated. Any cells expressing FCGR2A > 2 × the mean of expression was assigned as a macrophage. After cell filtering, normalisation and cell classification, there was 14,157 cancer cells, 36,324 immune cells and 308 stromal/endothelial cells for analysis. Differential gene expression analysis To examine the phenotype of the expanding clones, the cancer clones were grouped into transcriptionally similar clusters with the assigned clones overlayed. The non-expanding clones in the primary were compared with the expanding clones in the lymph node for differential gene expression analysis. Differential gene expression analysis was performed with the “FindAllMarkers” function in Seurat [[86]30]. Calculations were corrected for patient pool. Where there were transcriptionally and clonally distinct clusters, multiple comparisons were made per patient. Patient 1 and patient 3 had two comparisons done each. Patient 1 had two transcriptionally distinct clusters in the primary—“P-cluster A”, made up of clone 4, and “P-cluster B”, made up of clone 6. These were each separately compared to the lymph node (LN) cluster, denoted 1 A and 1B. Similarly, in patient 3, the primary cluster was compared to each of the LN clusters, “LN-cluster A”, made up of primarily clone 3 and “LN-cluster B”, primarily made up of clones 4 and 5 (denoted comparison 3 A and 3B). Differentially expressed genes (DEG) analysis was performed for each patient and analysed with ReactomePA [[87]36] for enriched pathways. Custom plots for were designed by Dr Muskovic, see code repository. Visium spatial gene expression O.C.T embedded tissue samples were processed using the Visium Spatial Gene Expression slide and reagent kit (10 × Genomics, PN-1000186), following the manufacturer’s instruction. Briefly, 10 μm cryosections were placed into the capture areas of the Visium slide. Tissue morphology was assessed with H&E staining and imaging using a Leica DM6000 Power Mosaic microscope equipped with a 20 × lens (Leica). The imaged sections were then permeabilized for 18 min using the supplied reagents. The permeabilization condition was previously optimised using the Visium Spatial Tissue Optimization reagent kit (10 × Genomics, PN-1000192), and Visium Spatial Tissue Optimization slide kit (10 × Genomics, PN-1000191). After permeabilization, cDNA libraries were prepared using the Library Construction Kit (10 × Genomics, PN-1000190), checked for quality and sequenced on a NovaSeq 6000 platform (Illumina, US). Sequencing targeted 50,000 reads per occupied spot on the Visium slide. Around 300 million pair-ended reads were obtained for each tissue section. Read 1, i7 index and Read 2 were sequenced with 28, 8 and 91 cycles respectively. Visium spatial transcriptomics data processing Three out of four patients had suitable tissue for spatial sequencing for both the primary and lymph node; however, in patient one, only the lymph node sample had sufficient tissue. Reads were demultiplexed and mapped to the reference genome GRCh38 using the Space Ranger Software v1.0.0 (10 × Genomics). Count matrices were loaded into the Seurat v3.2 ([88]https://github.com/satijalab/seurat/tree/spatial) and STutility ([89]https://github.com/jbergenstrahle/STUtility) R packages for all subsequent data filtering, normalisation, filtering, dimensional reduction and visualisation. All spatial spots determined to be over tissue regions by Space Ranger were kept for subsequent analysis. Poor quality tissue locations were then filtered out based on a cut off of 500 unique genes. Genes detected in more than 10 locations were also kept for analysis. Data normalisation was performed on independent tissue sections using the variance stabilizing transformation method implemented in the SCTransform function in Seurat. We applied non-negative matrix factorization (NMF) to the normalised expression matrix using the STutility package (nfactors = 20). NMF reduction was then used for clustering using Seurat with all 20 factors as input (RunUMAP, FindNeighbors and FindClusters functions). Deconvolution of Visium spots To deconvolute cell-type compositions, we applied the SPOTlight algorithm [[90]37], using non-negative matrix factorisation (NMF) to decompose the single-cell expression (SCE) count matrix V into matrices W and H. We standardised V and initialised W with cell-type-specific marker genes to guide biologically relevant results. Using non-negative least squares (NNLS) regression, we mapped transcriptomes from spatial locations (V’) to H’ using W as a basis, creating cell-type distribution profiles. The resulting deconvolution matrix represents cell-type proportions in each spatial capture location, enabling spatial composition analysis and reuse with additional datasets via a pre-trained model. This was converted to a correlation matrix using tidyverse [[91]38]. Xenium transcriptomics Xenium spatial transcriptomics was performed on the three patients that had suitable tissue samples using a custom probe panel consisting of 262 genes (Additional file 2: Table S2). Patients 3 and 4 had both primary and lymph node analysed, patient 2 had only lymph node and there was insufficient tissue for patient 1. The Xenium Onboard Analysis software (2.0.0.10) was run simultaneously with Xenium Analyzer (2.0.1.0) to decode the reads and process DAPI stain morphology images. The resulting count matrices, along with transcript spatial data for two patients (patients 3 and 4), were loaded into the Seurat R package (5.0.1) ([92]https://github.com/satijalab/seurat). Patient 2 required cell segmentation to be conducted using a molecule-based approach with Baysor package (0.7.1) ([93]https://github.com/kharchenkolab/Baysor) and a minimum transcript threshold of 15. The results were then converted to a Seurat object. Cells that had less than 20 transcripts or less than 5 genes detected were filtered. Normalisation was run using Seurat’s SCTransform followed by PCA, clustering and UMAP. Since we had matched scRNA sequencing data for these tissues, cell annotation was performed by Seurat’s integration and label transfer approach (FindTransferAnchors and TransferData). The annotations were further validated through the expression of marker genes, particularly for the immune subtypes. To define major histocompatibility complex (MHC) activity in cancer cells, we examined the aggregated expression of the HLA complex genes (HLA-A, HLA-B, HLA-C, HLA-DRA and HLA-DQB2), along with the stand-alone expression of the PSMB8 gene. Thresholds were set at two standard deviations below the mean for both the expression of HLA genes and PSMB8. A cell needed to pass both thresholds to be called “MHC positive”. Correlation plots To visualise differences between clone statuses, these tables were combined and transformed into a long format for plotting. Bar plots were generated with ggbarplot [[94]39] to compare cell-type values between “Expanding” and “Non-Expanding” clones, with significance assessed using t tests. Quantification and statistical analysis Statistical analysis performed and software used Fisher’s exact test was used to determine a difference between the percentage of clones in the primary versus the lymph node. Reactome PA identified enriched pathways; P values were adjusted using an FDR control method. The difference in gene expression between expanding and non-expanding cancer clones was compared. A normality test was performed and the data was not normally distributed, therefore, statistics were performed using the Wilcoxon signed rank test. Immune cell and cancer clone co-occurrence was established using the Visium spatial data. A T test was performed to ascertain the difference between the expanding and non-expanding clones. The distance between T cells and MHC + versus MHC − cancer cells were compared. Statistics were performed using the Wilcoxon signed rank test. Where multiple tests were performed in a single analysis, a Bonferroni correction was performed to obtain an adjusted P value. P values are reported with asterix: * < 0.05, ** < 0.01, *** < 0.001. Results Evidence of clonal selection favouring phenotypic predisposition to metastasis It has long been recognised that the genomic profiles of cancer clones influence cellular phenotypes, impacting clinical outcomes [[95]40]. Using haplotypes and single-cell RNA-sequence (scRNA) data, we called CNVs from matched primary and regional lymph node metastases from four HPV + patients (Fig. [96]1A, B) [[97]34]. An overview of the CNV distribution across all individual cells highlights various genomic alterations indicative of the genomic instability typically observed in cancers (Additional file 3: Fig. S1). We did not observe a consistent enrichment of oropharyngeal cancer oncogenes identified by the cancer genome atlas (TCGA) [[98]41] in the CNV regions (PIK3CA, TRAF3, E2F1) (Additional file 3: Fig. S1). A possible explanation is that TCGA data comprises HPV + and HPV − cancer samples. Nevertheless, the data illustrates the heterogeneous nature of HPV + cancer across patients. To understand clonal dynamics, subclonal phylogenies were generated by Numbat [[99]42] by aggregating data from single cells into a pseudobulk profile, capturing the evolutionary relationships between cells. From 14,157 single-cell CNV profiles, three to seven distinct clusters per patient were identified (Fig. [100]2A). Each cluster, suggesting a unique cancer clone, exhibited a characteristic CNV signature. By constructing a phylogenetic tree based on shared CNV events [[101]34], we identified the ancestral clones and genomic evolution of clones by acquiring new somatic events. As expected, the clonal evolution of cancers was unique for each patient and is summarised in detail in Fig. [102]2A and B. Using patient four as an example, we observe the development of clone two arising from the original ancestral clone through deletions at 3p and 15p. Subsequent deletions at 10q and 11q and a loss of heterozygosity (LOH) at 20q led to clone three, followed by a divergent event into clones four and six through a new LOH at four (clone six) and amplifications at 3q and 5q plus a deletion at 7q (clone four). We observe a final clonal evolution event, with clone five arising from clone four via somatic mutations at 4p^+, 8q^−, 19^+ and 20p^+ (Fig. [103]2). Similar patterns of clonal evolutionary events are observed across patients. Fig. 1. [104]Fig. 1 [105]Open in a new tab Study design and cell annotation. A Description of the patient samples. Patients with node positive, HPV + oropharyngeal cancer amenable to surgical resection were asked to participate. Primary sites included the tonsil or base of the tongue with affected nodal stations shown. B Study schematic. In parallel, the samples were processed for scRNA-seq and spatial RNA sequencing using the Visium platform. Single cells were annotated using a combination of automatic and manual processes and used to deconvolute the spatial data. The cancer cells were partitioned into clones based on single-cell CNV calls. C Canonical markers demonstrate cancer cells (EPCAM), Bcells (MS4A1), Tcells (CD4, CD8B), NK cells (NCAM1) and monocytes (CD14, FCER1A, SERPINF1) Fig. 2. [106]Fig. 2 [107]Open in a new tab Selection leading to clonal expansion in metastatic lymph nodes. A Clonal phylogenies and the chromosomal events between clones are shown for each patient. B CNV events which define each clone. Each column is a chromosome. Each row displays the CNV profile for a clone. Deletion, amplification and loss of heterozygosity (LOH) events are shown in blue, red and green, respectively. C Percentage representation of each clone in the primary and nodal metastasis for each patient. For each clone, the percentage make up of total cancer cells is shown for the primary (in a dark colour) as compared to the same clone in the nodal metastasis (lighter colour). Statistically significant changes in the percentage representation of a particular clone between the primary and LN are shown We next sought to understand the clonal diversity in primary and lymph node sites and identify cancer clones that show significant expansions into the metastatic lymph node sites. As shown in multiple cancers [[108]11], the clonal diversity was greater in the primary site compared to the lymph metastasis (Fig. [109]2C). This difference in diversity can be due to several factors, such as selection pressures on cancer clones, where only a subset of cells from the primary tumour possess the necessary genomic and phenotypic characteristics required to metastasise; tumour microenvironment and adaptation, where differences in immune evasion in the primary and metastatic site favour growth and expansion of certain clones. We observed significant differences in the clonal proportions between the primary and lymph node sites, demonstrating clonal expansion into the metastatic site, which consequently enabled us to identify genomic factors underlying metastatic behaviour. The results revealed a clear pattern, with one or two clones displaying substantial expansion from rare populations in the primary to dominant clones in the metastatic lymph node (Fig. [110]2C). We refer to these clones as expanding or metastatic clones. Correspondingly, clones common in the primary site are frequently absent or observed in very low frequencies in the lymph node site, suggesting an absence of metastatic phenotypes. We subsequently analysed the genomic differences between these clones to identify pathways and microenvironmental contexts that contribute to the metastatic phenotype. Virally induced dysregulation of protein translation is a defining characteristic in metastasising cancer clones In most cases, cancer mortality is primarily attributed to metastatic disease [[111]43], due to the emergence of cancer cell clones with aggressive phenotypes through a selective process. To identify the genomic phenotypes of expanding clones, we overlapped the clonal data with cell clustering to delineate the expanding and non-expanding cancer cells (Fig. [112]3A). We first examined the expression of HPV genes across patients and clones (Additional file 4: Fig. S2). Most cancer cells had zero counts for the E6 and E7 genes (Additional file 4: Fig. S2) and a summation of HPV gene counts showed generally low levels, though some cancer cells exhibited high overall HPV counts. There was no clear link between high HPV counts and clonal expansion. The strongest association was observed in Patient 4, where expanding clones (4 and 5) displayed significantly higher HPV counts in the LN (Additional files 3, 4:Fig. S1, S2). There was also no significant association with HPV gene counts and expanding clones when examined at a patient level (Additional files 5–8: Fig. S3–S6). Next, we identified the genes significantly differentially expressed between the expanding and non-expanding clones for each individual patient (Additional file 9: Table S3). We performed pathway enrichment analyses individually, with two separate analyses conducted in patient 1 and 3 as outlined in the methods (Fig. [113]3A, B and Additional file 10: Table S4). With each patient result compared side-by-side,Fig. [114]3B identifies three broad cellular pathway groups represented by genes whose expression levels significantly differ between metastatic expanding clones and non-expanding clones: cell cycle, immune regulation, and protein translation (Fig. [115]3B and Additional file 10: Table S4). In contrast to previous data [[116]16] in the HPV − population, we did not identify epithelial-to-mesenchymal transition as an enriched pathway. In the top ten enriched pathways for each patient comparison, protein translation pathways showed consistent downregulation in the expanding compared to non-expanding clones (Fig. [117]3B, Additional files 11–16: Fig. S7–S12). For example, the pathway “Formation of a pool of free 40S subunits” is enriched in comparison A for patient 1 with 59 differentially expressed genes from this pathway (P value for enrichment = 4.24 × 10^−23). Additional file 17: Fig. S13 shows that the genes found in this pathway in comparison 1 A consist of protein translation initiating factors and ribosomal subunit protein genes. Further analysis (Additional file 9:Table S3) demonstrates that many genes identified in comparison 1 A within this pathway are downregulated, like RPL35A (Log[2](FC) = − 0.89, P = 9 × 10^−251) and RPS26 (Log[2](FC) = − 0.59, P = 1.45 × 10^−107), indicating that expanding clones are downregulating protein translation. Given that comparison 3B exhibited distinct cell cycle and immune regulation pathways, we investigated whether it also displayed unique patterns of HPV gene expression (Additional file 18: Fig. S14). Overall, HPV expression was low across all clones and clusters, with no discernible pattern observed. Fig. 3. [118]Fig. 3 [119]Open in a new tab Downregulation of protein translation pathways in expanding metastatic clones. A The primary and LN cancer cells were clustered into transcriptionally similar groups, with the clones overlayed in their colours shown in the bar graphs. In most cases, the clones formed transcriptionally distinct clusters, which formed the comparisons for pathway enrichment analyses. Multiple comparisons have been done in patients 1 and 3 using these distinct clusters. B Reactome pathways which were significantly (study-wide P < 0.05) enriched, are shown. The colour relates to the P value of enrichment. These are grouped into three distinct pathways of cell cycle, immune regulation and protein translation. C Percentage of ribosomal protein gene expression in the expanding versus non-expanding clones. There is less expression in the expanding clones. A T test shows that this difference is highly statistically significant. P-cluster primary cluster, LN-cluster lymph node cluster The consistent downregulation of protein translational machinery in expanding clones is accompanied by a small but significant decrease in the expression levels of ribosomal protein-encoding genes in the expanding clones (P = 2.22 × 10^−16) (Fig. [120]3C). The presence of 18 enriched protein pathways, as revealed in Fig. [121]3B, all characterised by an abundance of protein translation initiation factors and ribosomal protein genes, collectively underscores the pivotal role of protein translation downregulation as a defining phenotypic factor in expanding clones. To understand the relationship between viral infection and dysregulation of protein translation in metastatic expanding clones, we tested for the relative expression levels of translation-initiating factors and regulatory kinases associated with virally activated translational relief, a process by which virally infected cells escape normal anti-viral mechanisms [[122]44, [123]45]. Under normal circumstances, a cell will downregulate protein translation and upregulate apoptosis once a virus is detected. We identified significant upregulation of two translation-initiating factors, EIF4E (P = 7.76 × 10^−14) and EIFG1 (P = 1.49 × 10^−31), in expanded cancer clones of the lymph node compared with non-expanding clones and corresponding significant downregulation of regulatory kinases EIF4EBP1 (P = 7.17 × 10^−09), EIF2AK2 (P = 4.4 × 10^−07) and EIF2S1 (P = 9.2 × 10^−06) (Fig. [124]4, Additional file 19: Table S5). Further evidence of translational relief is seen with upregulation of the stress kinases PPP1R15A (P = 4.36 × 10^–81) and PPP1CB (P = 2.27 × 10^−13) and downregulation of apoptosis associated genes TP53 (P = 3.69 × 10^−25), FAS (P = 2.74 × 10^−06) and BAX (P = 1.82 × 10^−13) (Fig. [125]4, Additional file 19: Table S5). Given that the patient 3 was the only patient to demonstrate immune dysregulation in the pathway enrichment analysis (Fig. [126]3), we sort to confirm if the proteasome and IFN inducible gene expression changes were uniform across all patients. Additional file 20: Table S6 shows that there was a shared direction of effect for the majority of the INF inducible genes across all 4 patients. Dysregulation of protein translation has previously been described in both viral infection [[127]44, [128]46, [129]47] and cancer [[130]48]. After an acute viral infection, the host cell attempts to reduce virus replication through a process called “host shut-off” [[131]49, [132]50], where ribosomal proteins involved in mRNA translation are downregulated and apoptosis is triggered. In normal circumstances, this process limits viral replication and induces apoptosis in infected cells, thereby containing the infection. However, in persistent viral infection like HPV, these mechanisms fail to eradicate the virus. Our results provide evidence of virally induced translational relief in expanding clones, where host cells’ shut-off mechanisms are normalised using stress kinases to keep EIF2AK in its dephosphorylated state. This maintains protein translational machinery and viral replication and turns off apoptotic pathways. Fig. 4. Fig. 4 [133]Open in a new tab Gene expression differences between metastasising expanding clones and non-expanding clones. Genes are grouped into eight key pathways: Protein translation initiating factors, protein translation regulatory kinases, protein translation stress kinases, apoptosis, cap-dependent translation, cap-independent translation, Proteasome, IFN-inducible genes. The non-expanding cancer clone’s gene expression is in the dark colours and expanding clone’s expression in the light colours. Genes coloured in blue are downregulated, red are upregulated and green are changes which are not significant. Line within boxplot = median, dot = mean Virally induced immunosuppression through interferon and immunoproteasome pathways In HPV + cervical cancer, viral infection triggers significant alterations in the microenvironment [[134]51]. During persistent infection, there is a notable decline in the presence of cytotoxic CD8^+ and helper CD4^+ T cells within the epithelial tissue. In the precancerous stage, cytotoxic T cell populations maintain an immunosuppressed profile, characterised by increased regulatory T cells. Invasive cancers demonstrate a high immunogenic state marked by elevated levels of cytotoxic and helper T cells and occasionally includes a surge in regulatory T cell numbers. Notably, an abundance of regulatory T cells is associated with a more unfavourable prognosis [[135]51]. Viral infection generally results in immune evasion via suppression of IFN-stimulated gene transcription [[136]47] and in HPV-infection, downregulation of the immunoproteasome [[137]45]. Silencing the IFN regulatory pathway has also been demonstrated to be a feature of metastatic cancer cells [[138]52]. We next sought to understand the consequences of HPV infection on immune evasion and its relationship to clonal expansion. Comparing the expression levels of IFN-inducible genes, we observed a downregulation of the IFN regulatory pathway in expanding metastatic cancer clones compared to non-expanding clones (Fig. [139]4, Additional file 19: Table S5). The effect is observed in genes involved in the JAK/STAT pathway, with JAK2 (P = 0.007), STAT1 (P = 1.19 × 10^−06) and STAT2 (ns) all downregulated amongst the expanding clones relative to non-expanding ones. In support of this, we also observed downregulation of regulators of antiviral immune response MX1 (ns), MX2 (ns), ISG15 (P = 3.52 × 10^−05) and IFIT3 (ns) amongst expanded cancer clones. Furthermore, the main immunoproteasome genes PSMB8 (P = 1.45 × 10^−18) and PSMB9 (P = 1.32 × 10^−56) are downregulated, along with the associated regulatory protein TAP1 (P = 4.95 × 10^−08). These results suggest that through mechanisms to dampen the IFN inflammatory response and reduce antigen presentation, expanding clones can avoid the host immune response, metastasise and expand. Upregulation of cap-independent translational pathways is related to metastatic clonal behaviour The process of mRNA production in healthy cells involves cap-dependent translation mediated by the protein complex EIF4F. However, in cancer cells, cap-dependent translation is suppressed in favour of cap-independent translation, mediated by EIF4G1. This transition enables the sustained expression of genes that promote cell survival and proliferation [[140]53]. Compared with healthy cells, we observe a higher ratio of EIF4G1:EIF4F as expected. However, we hypothesise that cap-independent translation pathway variation would impact metastatic cell phenotypes. To evaluate this, we compared the expression levels of cap-dependent translation genes between the expanded metastatic clones and non-expanding clones. The results display a clear pattern of upregulation of EIF4G1 (P = 1.49 × 10^−31), coupled with the downregulation of genes RPS2 (P = 5.5 × 10^−110), RPS15 (P = 1.19 × 10^−05) and RPL35A (P = 0.0098) controlled by cap-dependent translation (Fig. [141]4). Additionally, we observed the upregulation of genes involved in cancer cell proliferation and angiogenesis, with HIF1A (P = 7.76 × 10^−09), VEGFA (P = 3.44 × 10^−58) and PCBP1 (ns) all upregulated in expanded versus non-expanded clones. Interestingly, whereas in other cancers, acquisition of somatic mutations has been linked to dysregulation of initiating factor genes [[142]54], we do not observe any evidence for that here (EIF4A1, EIF4E3, EIF4G1, EIF4G2, EIF4EBP1; Additional file 3: Fig. S1). Our results identify the relationship between genomic phenotypes and clonal expansion into lymph node metastases, with expanding clones displaying a combination of dysregulated pathways favouring their survival. These can be summarised as the downregulation of cap-dependent translation and apoptosis, alongside the upregulation of cap-independent translation, co-expression with genes associated with cancer cell proliferation and metastatic processes, and evasion of immune response through the proteasome and IFN pathways. Evasion of immune cells within the primary tumour is a distinct cellular phenotype of metastatic clones Our analysis of expanding clones that result in clonal dominance in the lymph node metastasis identified immune evasion through virally induced interferon and immunoproteasome pathways. To provide supportive evidence for this observation, we generated in-situ spatial RNA-seq data from matched primary and lymph node biopsies. We interrogated the relationship between cancer clonal subtypes and immune cell interactions within the cancer microenvironmental context. Numerous studies have highlighted the importance of the immune microenvironment within the primary site in influencing the metastatic potential of cancer cells [[143]52, [144]55], and we aimed to test this in our cohort. After classifying cancer clones and immune cell subtypes, we tested for the correlation in co-occurrence of pairwise combination of cells. We identified distinct differences between the T cell correlation with expanding and non-expanding cancer clones (Fig. [145]5). Expanding, metastatic clones predominately exhibit negative or neutral correlations (− 0.361 < R < 0.01) with T cell subsets compared to more positive correlations in the non-expanding clones (− 0.334 < R < 0.678). This observation is more pronounced within primary tumours. For example, the mean correlation in the primary between CD8 Exhausted cells (CD8Ex) and expanding and non-expanding clones (non-expanding = 0.599; expanding = − 0.12) is more disparate than in the lymph node (non-expanding = 0.295, expanding = 0.01). To evaluate this further, we tested for a significant difference in the median correlation between the expanding and non-expanding cancer clones across all patients. Figure [146]6A shows a difference in the co-occurrence of three T cell subtypes between expanding and non-expanding cancer clones. Particularly in the primary, there is a dominant negative association of the expanding clones and T cells. A very similar pattern is seen in the B cell subsets (Fig. [147]7). The correlation analysis identified a distinct trend where expanding clones exhibit negative or neutral correlations with B cells, in contrast to the behaviour observed in non-expanding clones, which have consistently strong co-occurrences. Summary correlations using mean correlation scores (Fig. [148]6B) confirm this pattern. Notably, within the primary tumour, a substantial disparity exists in the correlations between expanding and non-expanding clones with B memory cells and plasmablasts. Similarly, significant differences emerge in the correlations with B-memory (P = 0.002) and B-naïve (P = 0.026) cells within the lymph node. In the monocyte populations CD14 monocytes (CD14Mono), classical dendritic cells (cDC), plasmacytoid dendritic cells (pDC), a similar relationship to that observed in T and B cell subsets (Additional file 21: Fig. S15). The mean correlation scores (Fig. [149]6C) identify a tendency to have positive correlations with monocytes and non-expanding clones, in contrast to the negative correlations noted with expanding clones. This trend is particularly pronounced in the lymph nodes, where significant differences between expanding and non-expanding clones exist, within the CD14 monocyte (P = 0.0182) and cDC subsets (P = 0.0104). Fig. 5. [150]Fig. 5 [151]Open in a new tab Relationship between expanding and non-expanding cancer clones and T cell subtypes. The cell subtype signatures from annotated single cell data were used to determine the cell composition of each Visium spot using a seeded NMF regression [[152]37]. A weighted combination of cell types which reside in each spot were calculated and visualised as a pie chart. With these weighted combinations, correlation analyses were performed, examining the likelihood of a certain cell-types co-locating with one another, generating a correlation plot. Patients 2–4 have a correlation plot for the primary (left) and LN (right). Patient 1 has LN only. Each correlation plot shows the likelihood of each cell-type co-locating within a Visium spot. Orange indicates a positive correlation, blue colour indications negative correlation. CD8TEM CD8 T effector memory cells, CD4Prol CD4 proliferating cells, T-reg T regulatory cells, MAIT mucosal-associated invariant T cells, dnT double negative t cells, CD8Ex CD8 exhausted cells, gdT gammadelta T cells, ILC innate lymphoid cells Fig. 6. [153]Fig. 6 [154]Open in a new tab Relationship between immune cell and cancer clone co-occurrence across primary and lymph node sites. The spatial deconvolution matrix was used to compute immune cell subgroup correlations across spatial locations for each patient. The merged correlation matrices from four patients were separated data by cancer clone status (“Expanding” vs. “Non-Expanding”). A positive correlation between cell types is above 0, a negative correlation is below 0. For visualisation, the data was transformed into a long format and created bar plots with ggplot2 to compare cell-type values, using T tests to assess significance Fig. 7. [155]Fig. 7 [156]Open in a new tab Spatial data and correlative analysis for cancer clones and B cell subsets. Correlative analysis and pie chart plots are shown for the primary (left) and LN (right). B-intermed = B intermediate cells Results from the in-situ data demonstrate clone-specific immune evasion within the tumour microenvironment in cells which metastasise. The observed dysregulation in the interferon and immunoproteasome pathways identified in differential gene expression analysis, offers a credible explanation for this observed phenotype. Suppression of the IFN pathway affects antigen presentation and T cell engagement of cancer cells Given our findings of downregulation of IFN genes and immune evasion, we sought to further investigate the downstream effects of these gene changes on antigen presentation. Using a custom designed panel derived from scRNA-seq data (Additional file 2: Table S2), we were able to conduct in-situ spatial profiling on 3 out of the 4 patients. This true cellular resolution enabled evaluation of the distance between distinct cell types. Using an aggregate of HLA and PSMB expression, we dichotomised cancer cells into MHC positive (MHC^+) and MHC negative (MHC^−). Figure [157]8 demonstrates that cells which are MHC^+ are more closely associated with T cells than MHC^− cells, confirming that cancer cells processing and presenting antigen are more likely to be recognised by T cells. This pattern is consistent with all specimens tested (Additional file 22: Fig. S16). Fig. 8. [158]Fig. 8 [159]Open in a new tab In-situ spatial data depicting cancer and T cells in the primary (A) and lymph node (B) of patient 3. The MHC^+ cancer cells (red) are more closely associated with T cells (green) compared with the MHC^− cancer cells (blue). Distance between individual cells has been calculated and boxplots show a significantly shorter distance between MHC^+ cancer cells and T cells compared with the MHC^− cancer cells Discussion It is widely acknowledged that cancer cells exhibit a range of diverse phenotypic properties [[160]56]. Moreover, distinct cancer clones within tumours significantly influence the responses to anti-cancer treatments, subsequently impacting overall survival outcomes [[161]57]. In locally advanced cancers, surgical resection aims to remove all the macroscopically identified cancer, with the addition of adjuvant therapies like radiation and chemotherapy to reduce recurrence rates by eradicating microscopic residual disease [[162]58]. It has been shown that clones present in recurrent cancer are often sparsely represented in the primary tumour and that clonal complexity decreases in metastases [[163]12, [164]54]. While the mechanisms are not always resolved, it is clear that clonal selection occurs for cell phenotypes that enhance survival and metastatic processes. In the case of HPV + oropharyngeal cancer, persistent HPV infection leads to alterations in the host immune response [[165]45] which likely results in significant selection pressure on cells, which shapes the tumour microenvironment [[166]59]. This is evident when observing immune cell microenvironment changes through acute and chronic viral infection phases and pre-cancer and invasive cancer development with HPV infection [[167]51], highlighting the importance of examining cancer clones in their spatial context. Considering that HPV infection serves as the primary driver of carcinogenesis, our findings delineating metastatic clones are understandably distinct from those observed in the HPV − population [[168]16]. Our initial results confirm observations presented in other cancers [[169]60], namely that there is reduced clonal complexity in metastases and that metastasising clones are sparsely represented in the primary tumour, confirming that clonal selection occurs. Through genomic analysis of cancer clones, we identified that the dysregulation of protein translation phenotypically defines expanding metastatic clones. Given that protein translation dysregulation is seen in both acute [[170]47] and chronic [[171]44, [172]46] viral infections and cancer [[173]53], we took a deeper look into the regulation of genes and identified several important mechanisms that define expanding cancer clones in HPV + oropharyngeal cancer. Our results suggest that virally induced translational relief is a mechanism that defines metastasising cancer clones in HPV + oropharyngeal cancer. This observation aligns with prior work demonstrating HPV E6 protein is linked to translational control in infected fibrosarcoma cancer cell lines [[174]44], with E6 integration inducing the production of EIF2AK2, which in turn leads to phosphorylation of EIF2S1 and a decrease in protein translation and E6 production rapidly decreases. However, despite significant downregulation, the small amount of remaining E6 can initiate translational relief or rescue mechanisms. E6 can promote dephosphorylation of EIF2AK2 through stress kinase PPP1R15A, allowing protein translation to normalise and apoptosis to be inhibited [[175]44]. Furthermore, PPP1R15A is associated with tight regulation of the IFN-mediated anti-viral response [[176]61]. In an acute viral infection, PPP1R15A is required to relieve protein translation inhibition to allow IFN and stress granule production in a pulsatile fashion. This allows the cell to restrict protein translation while releasing inflammatory mediators in periodic bursts. The dynamics of this process are different in chronic infection. When cells are exposed to stress, such as chronic viral infection [[177]62], PPP1R15A levels are continually elevated with fewer bursts, reducing stress granule release over time and creating an environment conducive to viral maintenance. Our results show that metastasising clones have increased expression of PPP1R15A and decreased levels of IFN-inducible genes, suggesting that these compensatory mechanisms favour viral replication and are critical in defining the metastatic phenotype. In addition to its role in viral response, PPP1R15A has previously been implicated in carcinogenesis and cancer growth. In mouse models of colorectal cancer, PPP1R15A potentiates carcinogenesis by enhancing IL-6 production and STAT4 activation [[178]63]. It has also been shown to promote autophagy in liver cancer [[179]64] and upregulate the RAC1-GTPase pathway in breast cancer [[180]65]. This highlights a potential treatment target for the prevention of carcinogenesis in patients with chronic viral infection and established cancer. Further perturbations of protein translation were identified in the metastatic cell phenotype, with the upregulation of cap-independent protein translation. Cap-independent mechanisms do not rely on the normal ribosomal machinery and, therefore, are not as susceptible to changes in functional subunit availability [[181]66]. Therefore, cancer clones that can utilise these cap-independent pathways to maintain protein translation may have an evolutionary advantage. Our data suggests that cap-independent translation is used to maintain cancer-related proteins in the expanding metastatic clones. This observation is supported by cell-specific co-expression with cancer survival genes VEGFA and HIF1A. These results align with previous research in breast cancer models that have demonstrated how cap-independent mechanisms selectively translate mRNAs that encode proteins critical for cancer cell survival like VEGFC, HIF1A and BCL2 [[182]67, [183]68]. Previous work shows that lung cancers rely on cap-independent mechanisms to maintain cell replication and that EIF4G1 is an adverse prognostic feature [[184]53]. Given that cap-independent translation appears to be a mechanism underlying selection of HPV + oropharyngeal cancer metastatic clones, we propose this may be a useful biomarker to identify patients with poor prognostic disease. Immune cell evasion was identified as a defining feature of metastatic cancer clones both as a cellular phenotype and through analysis of the tumour microenvironment. For HPV infection to persist, viral proteins E5, E6 and E7 mediate anti-inflammatory and immune-evading functions. For example, in human keratinocytes, E5 can actively suppress IFN and IFN-stimulated gene expression, key components of the innate anti-viral response [[185]69]. Here, we show that the downregulation of immunoproteasomes is an important component of immune evasion in the expansion and metastasis of clones in HPV + oropharyngeal cancer. The immunoproteasome is involved in antigen presentation and is a critical step in the inflammatory response to viral infection and cancer. E5 HPV protein is a powerful negative regulator of the anti-viral immune response by limiting the MHC I antigen repertoire via suppression of the immunoproteasome [[186]45], allowing infected cells to escape host immune surveillance [[187]70, [188]71]. The importance of the immunoproteasome in antigen recognition is further demonstrated, with high expression being a positive predictive biomarker of immune checkpoint inhibitor blockade response [[189]72]. Our data shows that immune evasion is a second phenotypic feature of metastatic clones, perhaps mediated through dysfunctional antigen processing and presentation. Although HPV + oropharyngeal cancer is considered an immunotherapy-sensitive cancer, the response is not universal. In the metastatic setting, survival is less than 2 years despite checkpoint inhibitor use. Ineffective antigen presentation via the immunoproteasome may be a particularly attractive therapeutic target in HPV + oropharyngeal cancer, given that HPV infection and carcinogenesis are dysregulators of this process. Restoration of the immunoproteasome by IFN-γ has been shown to improve immune system recognition of renal cancer cells [[190]73] and cervical cancer [[191]74]. Interestingly, when this approach was used in head and neck cancer [[192]75], highly heterogeneous responses were seen ex vivo. This heterogeneity could be explained by the mixture of HPV + and HPV − cancers analysed. Challenges for managing HPV + oropharyngeal cancer include the lack of reliable prognostic markers and paucity of druggable targets. This limits the ability to personalise treatment decisions in curable disease and offer effective treatment options in the metastatic setting. Conclusions We have identified mechanisms defining the metastatic clonal phenotype unique to HPV + oropharyngeal cancer. Given the aetiology of HPV-related cancer, it is expected that many of the cellular mechanisms are shared with those observed in chronic viral infection. EIF4G1 expression may be useful as a prognostic marker, which could be used to select patients more likely to metastasise and benefit from more aggressive curative therapies. Finally, the immunoproteasome likely plays a significant role in immune evasion in metastasising clones. Further investigation of immunoproteasome proteins as predictive biomarkers for checkpoint inhibitor response and direct targeting as a therapeutic strategy should be undertaken. Additional files. Supplementary Information [193]12916_2025_4392_MOESM1_ESM.pdf^ (65.6KB, pdf) Additional file 1: Table S1. Patient characteristics and survival outcomes. [194]12916_2025_4392_MOESM2_ESM.xls^ (38KB, xls) Additional file 2: Table S2. Custom Xenium probe list. [195]12916_2025_4392_MOESM3_ESM.pdf^ (1MB, pdf) Additional file 3: Fig. S1. Karyotype plots. For each patient, the karyotype plots, including the primary and the LN, are shown—amplification in red, deletion in blue and LOH in green. Chromosome 8 was the only chromosomal location which contained alterations in all patients. Locations of genes of interest are shown. The location of genes essential to HPV infection and cancer, show that CNVs are not consistently seen at these locations, except for MYC which is located in Chromosome 8. At the locations of pivotal genes identified in the ICGC cohortand translation initiation, there are not consistent alterations in these sites common across all patients. [196]12916_2025_4392_MOESM4_ESM.pdf^ (398.1KB, pdf) Additional file 4: Fig. S2. HPV gene counts. Viral counts displayed by clone. E6, E7 and all HPV viral counts displayed. Counts in the primary displayed in red, LN displayed in blue. The clones which were dominant in the primary are marked with a red line, the clones which were dominant in the LN are marked with a blue line. Statistically significant differences in the viral counts between the primary and LN within a clone are marked. [197]12916_2025_4392_MOESM5_ESM.zip^ (2.1MB, zip) Additional files 5–8: Fig. S3-S6. HPV gene counts for each individual patient. Each HPV gene is graphed separately. Red lines dennote expanding clones, blue lines dennote non-expanding clones. Piecharts to show the percentage of HPV genes in both the primary and lymph node. [198]12916_2025_4392_MOESM6_ESM.xls^ (1.5MB, xls) Additional file 9: Table S3. Differential gene expression analysis for the comparisons in Fig. [199]3. Each comparison is listed separately. [200]12916_2025_4392_MOESM7_ESM.xls^ (630.5KB, xls) Additional file 10: Table S4. Gene enrichment analysis results for each comparison in Fig. [201]3. Each comparison is listed separately. [202]12916_2025_4392_MOESM8_ESM.pdf^ (215.4KB, pdf) Additional file 11: Fig. S7. Differentially expressed genes in Patient 1A. The top ten enriched pathways, revealing the direction and extent of gene expression changes. [203]12916_2025_4392_MOESM9_ESM.pdf^ (211.7KB, pdf) Additional file 12: Fig. S8. Differentially expressed genes in Patient 1B. The top ten enriched pathways, revealing the direction and extent of gene expression changes. [204]12916_2025_4392_MOESM10_ESM.pdf^ (214.9KB, pdf) Additional file 13: Fig. S9. Differentially expressed genes in Patient 2. The top ten enriched pathways, revealing the direction and extent of gene expression changes. [205]12916_2025_4392_MOESM11_ESM.pdf^ (205.6KB, pdf) Additional file 14: Fig. S10. Differentially expressed genes in Patient 3A. The top ten enriched pathways, revealing the direction and extent of gene expression changes. [206]12916_2025_4392_MOESM12_ESM.pdf^ (210.8KB, pdf) Additional file 15: Fig. S11. Differentially expressed genes in Patient 3B. The top ten enriched pathways, revealing the direction and extent of gene expression changes. [207]12916_2025_4392_MOESM13_ESM.pdf^ (205.4KB, pdf) Additional file 16: Fig. S12. Differentially expressed genes in Patient 4. The top ten enriched pathways, revealing the direction and extent of gene expression changes. [208]12916_2025_4392_MOESM14_ESM.pdf^ (5.9MB, pdf) Additional file 17: Fig. S13. Functional protein association network from patient comparison 1A. [209]12916_2025_4392_MOESM15_ESM.pdf^ (1.2MB, pdf) Additional file 18: Fig. S14. Patient clones from Fig. [210]3.UMAP plot of the 7 clones shown in Fig. [211]3.HPV counts across clones in a violin plot. [212]12916_2025_4392_MOESM16_ESM.xls^ (22.5KB, xls) Additional file 19: Table S5. Mean expression levels of genes shown in Fig. [213]4. [214]12916_2025_4392_MOESM17_ESM.xls^ (23KB, xls) Additional file 20: Table S6. Shared direction of effect of the IFN inducible genes in Fig. [215]4. [216]12916_2025_4392_MOESM18_ESM.pdf^ (11.1MB, pdf) Additional file 21: Fig. S15. Spatial data and correlative analysis for cancer clones and monocyte subsets. Correlative and pie chart plots are shown for the primaryand the LN. CD14 Mono = CD14 positive monocytes, cDC = classical dendritic cells, pDC = plasmacytoid dendritic cells. [217]12916_2025_4392_MOESM19_ESM.pdf^ (498.8KB, pdf) Additional file 22: Fig. S16. In-situ spatial data for patient 2. Lymph node, patient 4 primary and patient 4 lymph node. MHC^+ cancer cells in red, MHC^− cells in blue and T-cells in green. Boxplots show the distance between cells. Acknowledgements