Abstract

   The single‐cell RNA sequencing (scRNA‐seq) quantifies the gene
   expression of individual cells, while the bulk RNA sequencing (bulk
   RNA‐seq) characterizes the mixed transcriptome of cells. The inference
   of drug sensitivities for individual cells can provide new insights to
   understand the mechanism of anti‐cancer response heterogeneity and drug
   resistance at the cellular resolution. However, pharmacogenomic
   information related to their corresponding scRNA‐Seq is often limited.
   Therefore, a transfer learning model is proposed to infer the drug
   sensitivities at single‐cell level. This framework learns bulk
   transcriptome profiles and pharmacogenomics information from population
   cell lines in a large public dataset and transfers the knowledge to
   infer drug efficacy of individual cells. The results suggest that it is
   suitable to learn knowledge from pre‐clinical cell lines to infer
   pre‐existing cell subpopulations with different drug sensitivities
   prior to drug exposure. In addition, the model offers a new perspective
   on drug combinations. It is observed that drug‐resistant subpopulation
   can be sensitive to other drugs (e.g., a subset of JHU006 is
   Vorinostat‐resistant while Gefitinib‐sensitive); such finding
   corroborates the previously reported drug combination (Gefitinib +
   Vorinostat) strategy in several cancer types. The identified drug
   sensitivity biomarkers reveal insights into the tumor heterogeneity and
   treatment at cellular resolution.

   Keywords: drug response annotation, single‐cell sequencing, transfer
   learning
     __________________________________________________________________

   A transfer learning framework for single cell drug response prediction.
   This framework integrates domain adaptation to learn cell line bulk
   pharmacogenomics and transfers the knowledge to infer drug
   sensitivities at single cells before treatment. This ranking‐based
   framework for drug sensitivity inference provides new strategies to
   account for intratumoral heterogeneity; providing new perspectives for
   biomarker discovery and drug combination applications.

   graphic file with name ADVS-10-2204113-g003.jpg

1. Introduction

   The personalized anti‐cancer therapies have achieved successes in
   several malignancies.^[ [38]^1 ^] However, even with positive responses
   after treatment, the existence of intratumoral heterogeneity can
   ultimately drive tumor progression or metastasis as one of the key
   factors contributing to lethal outcomes.^[ [39]^2 ^] The intratumoral
   heterogeneity is caused by many factors, such as genetic heterogeneity
   of cells^[ [40]^3 ^] and the existence of multiple independent cell
   subpopulations within a tumor.^[ [41]^4 ^] The intratumor heterogeneity
   is a well‐known issue leading to treatment failure.^[ [42]^4 ^]
   Currently, there are past studies which have evaluated the selective
   pressure after drug treatment using single cell data; for instance,
   genomics and single‐cell phosphoprotomics analysis on patient‐derived
   epidermal growth factor receptor (EGFR)‐mutated in vivo mTOR kinase
   inhibitor (mTORki) resistance glioblastoma (GBM) model have been
   proposed. Researchers identified promising drug combinations by
   resolving the related protein signal networks.^[ [43]^5 ^] Moreover,
   the drug combination efficacy was validated with the complete and
   lasting tumor suppression on GBM39 implanted mice model.^[ [44]^5 ^]
   Another research uses single‐cell proteomics to study the drug‐induced
   (anti‐EGFR therapy) resistance and tumor phenotypic switch (a switch
   from one signaling state to another state) in triple‐negative breast
   cancer (TNBC).^[ [45]^6 ^] EGFR is highly expressed in TNBC and was
   suggested as a therapeutic target.^[ [46]^7 ^] However, anti‐EGFR
   monotherapies for TNBC are normally failed in clinical.^[ [47]^6 ^]
   Researchers demonstrated that distinct cell subpopulations could
   independently grow in response to different anti‐cancer treatments.
   Such kind of anti‐cancer progression may be due to the expansion of
   previously untargeted (no tailored accurately) or undetected cell
   subclones.^[ [48]^6 ^] In addition, the treatment that does not target
   estrogen receptor (ER) may result in incomplete suppression of cancer
   cells and lead to a phenotype switch from TNBC to ER‐positive
   enrichment.^[ [49]^6 ^] In contrast, anti‐EGFR therapy may benefit when
   combined with anti‐ER treatment; it is suggested by the
   patient‐specific signaling signature (PaSSS) based predictions and was
   experimentally validated in two TNBC cell lines.^[ [50]^6 ^]

   However, unlike studies that were currently limited to a few cancer
   types and a small subset of drugs at the single‐cell level,^[ [51]^5 ^]
   more studies are mainly based on the tumor data profiled in bulk
   RNA‐seq. Among them, cancer cell lines are still the most widely
   applied laboratory models for oncology study and anti‐cancer drug
   discovery.^[ [52]^8 ^] In contrast to the limited drug sensitivity
   information in single cells, there are numerous drug efficacy profiling
   and studies in cancer cell lines. For example, in the Genomics of Drug
   Sensitivity in Cancer database (GDSC), over 1000 human cancer cell
   lines were screened with a variety of anti‐cancer compounds including
   both cytotoxic chemotherapeutics and targeted agents that were
   clinically approved or undergoing clinical development,^[ [53]^9 ^]
   allowing the identification of transcriptome‐based (bulk RNA‐seq) drug
   efficacy predictors and biomarkers. Other related perturbational
   databases such as Connectivity Map (CMAP),^[ [54]^10 ^] the Library of
   Integrated Network‐Based Cellular Signatures (LINCS‐L1000),^[ [55]^11
   ^] and Cancer Therapeutics Response Portal (CTRP)^[ [56]^12 ^] also
   include numerous compound sensitivity characteristics of cell lines for
   drug perturbation analysis. Those large‐scale pharmacogenomic datasets
   offer researchers an opportunity to better characterize the
   heterogeneity of perturbational biological features through machine
   learning technologies.^[ [57]^13 ^] For instance, Peilin et al.
   developed a deep learning model to impute drug responses in both tumor
   cell lines and cancer patients, demonstrating the clinical
   translational potential of the deep learning (DL) method for drug
   response biomarker discoveries.^[ [58]^14 ^] Hostallero et al. built a
   tumor tissue‐informed normalization deep learning (TINDL) model to
   impute the compound efficacy on tumor patients, which integrated
   transcriptome profiles of both the GDSC database and TCGA primary
   samples.^[ [59]^15 ^] To alleviate the discrepancies between cell lines
   and tumor tissues, TINDL integrates tumor type information of test
   samples to enhance the performance of drug efficacy prediction on
   patients.^[ [60]^15 ^] Similarly, Hossein et al. proposed an
   adversarial inductive transfer learning model for the drug efficacy
   prediction on four anti‐cancer compounds.^[ [61]^16 ^] The adversarial
   learning is an approach to improve data recognition despite the
   existence of dataset bias^[ [62]^17 ^] and was applied to bridge the
   gap between cell lines and clinical datasets (TCGA patient, clinical
   trials) to improve model generalization.^[ [63]^16 ^]

   We noticed that the above models were constructed based on bulk RNA‐seq
   data and many therapeutic strategies of patients are derived from mean
   recommendations informed by meta‐analysis. The major drawback of bulk
   RNA‐seq is that it only gives the average and a mixture profile of
   heterogeneous cells. Its limited capability conceals the uniqueness as
   well as the actual diversity of individual cells.^[ [64]^18 ^] It was
   suggested that a multitude of cancer cells with distinct molecular
   characteristics could coexist in the same tissue. In addition, a small
   proportion of pre‐existing or acquired resistant cells may evolve and
   gradually weaken the original medical strategy, and ultimately cause
   treatment failure.^[ [65]^19 , [66]^20 ^] These therapeutic resistances
   due to tumor heterogeneity suggest that those “one‐size‐fits‐all”
   evidence‐based medicine approaches, which is informed by bulk RNA‐seq
   analysis, may provide inappropriate medical solutions for outlier
   patients.^[ [67]^21 ^]

   It is noteworthy that this “one‐size‐fits‐all” treatment situation is
   now shifting to an era of precision medicine in many tumors such as
   breast cancer,^[ [68]^22 ^] metastatic colorectal cancer, and non‐small
   cell lung cancer.^[ [69]^23 ^] New diagnostics such as molecular
   imaging and gene expression quantification,^[ [70]^22 ^] as well as
   novel clinical trial designs, resulting in a reduction of treatment
   failure and a decreased morbidity rate.^[ [71]^23 ^] Scientists are now
   calling for more precise methodologies for personalized treatments.^[
   [72]^24 ^] The emergence and development of single‐cell sequencing
   technology offers us an opportunity to decipher previously undiscovered
   heterogeneity of cancer,^[ [73]^25 ^] facilitating the identification
   of previously hidden cell subpopulations with distinct
   characteristics.^[ [74]^26 ^] Recently, a variety of machine learning
   (ML) and statistical methods have been developed for different
   single‐cell transcriptomics tasks, such as subtype clustering^[ [75]^27
   ^] and cell type distinction.^[ [76]^28 ^] Nevertheless, due to the
   limited amount of scRNA‐seq data with both sequencing and drug efficacy
   information (e.g., cell viability assay), there are few ML models
   focused on single‐cell drug efficacy prediction, especially those cells
   sequenced prior to drug treatment. The scRNA‐seq is currently more
   expensive than bulk sequencing, prolonging its application in
   large‐scale pharmacogenomic research and clinical studies.^[ [77]^29 ^]
   Although the latest released version of Single Cell Portal from Broad
   Institute includes over four hundred single‐cell studies, few of them
   contain drug sensitivity data.^[ [78]^30 , [79]^31 ^] Based on the fact
   that multiple ML models have elevated the drug efficacy prediction
   performance on tumor patients by integrating transcriptome information
   from large cell line pharmacogenomic databases,^[ [80]^15 , [81]^16 ^]
   we hypothesized that the technical shortcoming of scRNA‐seq could be
   alleviated by integrating the plentiful transcriptome profiles from
   cell lines of large pharmacogenomic databases while keeping the
   microscopic resolution features of scRNA‐seq data. In summary, the drug
   perturbation effect information could be learned and migrated to the
   scRNA‐seq drug sensitivity prediction task and biomarker discovery.

   In this study, we leveraged transfer learning to learn the
   pharmacogenomic characteristics of cell lines in the GDSC database to
   infer the drug sensitivities of samples at the single cell level
   (Figure [82] 1 ). The source domain that we extract features is the
   bulk RNA‐seq from the GDSC database; and the target domain is the
   scRNA‐seq datasets for the learned model to infer drug sensitivities.
   Specifically, the drug response predictor was trained based on the
   expression profiles and binarized response labels (sensitive or
   resistant) of cell lines from the GDSC dataset. Direct predictions
   generated by the drug response predictor on scRNA‐seq (as observed in
   non‐ADDA results of Table [83]5, Table [84]6) are tested to exhibit
   poor performance, due to dataset bias. Such kind of dataset bias is
   known as domain shift.^[ [85]^32 ^] Therefore, we proposed the single
   cell drug response prediction framework by integrating adversarial
   domain adaptation (SCAD) to eliminate the domain shift problem. The
   model adopts the idea of adversarial discriminative domain adaptation
   (ADDA) composed of both adversarial learning and domain adaptation. The
   adversarial learning pits two networks against each other.^[ [86]^32 ^]
   In our SCAD framework, the feature extractor and domain (datasets)
   discriminator are trained in an adversarial manner.^[ [87]^32 ^] In the
   end, the feature extractor can extract domain‐invariant feature
   representations that the domain discriminator cannot distinguish
   whether the features come from the source domain or the target
   domain.^[ [88]^32 ^] Moreover, the adversarial learning is an approach
   to improve data recognition despite the existence of dataset bias.^[
   [89]^17 ^] Domain adaptation is a method to alleviate the unfavorable
   effects of dataset bias, reducing the differences between the source
   and target domain distributions to enhance the model generalization.^[
   [90]^32 ^] The domain‐invariant feature representations could guarantee
   the transferability.^[ [91]^32 ^] Based on this benefit, we proposed
   and adopted ADDA to minimize the domain discrepancy between cell lines
   (source domain) and scRNA‐seq dataset (target domain).^[ [92]^32 ^] In
   addition, we considered transfer learning as two independent tasks for
   the cells with different tolerant origins based on the mechanism that
   drug tolerance may occur due to pre‐existing cell populations before
   treatment or resistance‐acquired cells after drug exposure.^[ [93]^20
   ^] The details of the source domain and target domain datasets with
   shared compounds, which are proposed to train a model for drug
   sensitivity prediction at cells after drug exposure, are provided in
   Table [94] 1 . Similarly, the datasets for the inference of
   pre‐existing drug‐resistant cells prior to treatment are listed in
   Table [95] 2 .

Figure 1.

   Figure 1
   [96]Open in a new tab

   Architecture of SCAD model. a) Data splitting of the source domain and
   target domain. The SCAD applied a five‐fold cross validation resampling
   method. For each split, four‐fold of the source domain and target
   domain are used to train the model, the details of training of process
   are described in part (b). Another fold of the source domain data is
   selected as validation set for hyper‐parameter selection. The
   performance of SCAD is quantified by the average AUC and AUPR scores on
   the isolated testing set of the target domain. b) Domain adaptation of
   SCAD, includes a shared feature extractor that extracts the latent
   feature representations of the source domain and target domain samples;
   a domain discriminator to help the feature extractor to learn domain
   invariant features from both the source domain and target domain; a
   domain communal drug response predictor which was trained by minimizing
   the loss of the predicted and ground truth drug sensitivities of the
   source domain data.

Table 5.

   Average AUC results of SCAD among three different feature selections
   (prior to treat)
   Drugs Weight_all (non‐ADDA) Smote_all (non‐ADDA) Weight_all Smote_all
   Weight_tp4k Smote_tp4k Weight_PPI Smote_PPI
   Gefitinib 0.888 0.886 0.967 0.773 0.862 0.685 0.767 0.487
   Vorinostat 0.877 0.685 0.902 0.942 0.827 0.826 0.805 0.615
   AR‐42 0.568 0.922 0.968 0.801 0.784 0.854 0.775 0.737
   NVP‐TAE684 0.744 0.699 0.598 0.654 0.651 0.564 0.556 0.481
   Afatinib 0.684 0.733 0.880 0.925 0.746 0.878 0.564 0.547
   Sorafenib 0.737 0.696 0.582 0.771 0.739 0.666 0.609 0.620
   Cetuximab 0.812 0.666 0.923 0.798 0.854 0.648 0.638 0.582
   Average 0.759 0.755 0.831 0.809 0.780 0.732 0.673 0.581
   [97]Open in a new tab

Table 6.

   Average AUPR results of SCAD among three different feature selections
   (prior to treat)
   Drugs Weight_all (non‐ADDA) Smote_all (non‐ADDA) Weight_all Smote_all
   Weight_tp4k Smote_tp4k Weight_PPI Smote_PPI
   Gefitinib 0.909 0.915 0.973 0.824 0.907 0.755 0.833 0.587
   Vorinostat 0.869 0.723 0.904 0.958 0.830 0.873 0.828 0.708
   AR‐42 0.635 0.928 0.970 0.826 0.798 0.894 0.810 0.785
   NVP‐TAE684 0.766 0.719 0.664 0.695 0.664 0.638 0.624 0.534
   Afatinib 0.730 0.758 0.889 0.938 0.794 0.912 0.617 0.629
   Sorafenib 0.744 0.720 0.627 0.794 0.753 0.703 0.656 0.674
   Cetuximab 0.838 0.731 0.921 0.843 0.883 0.711 0.686 0.649
   Average 0.784 0.785 0.850 0.840 0.804 0.784 0.722 0.652
   [98]Open in a new tab

Table 1.

   scRNA‐seq after drug exposures and corresponding source domain data
   Dataset          Drug    Cell line   Type    No. Res No. Sens No. Genes
   Source domain
   GDSC           Etoposide  Pan‐Can    bulk      811      53      9738
   GDSC            PLX4720   Pan‐Can    bulk      746      88     11 937
   Target domain
   [99]GSE149215  Etoposide    PC9       10x      764     629      9738
   [100]GSE108383  PLX4720    451Lu   SMART‐seq   113      84     11 937
   [101]GSE108383  PLX4720    A375    SMART‐seq   46       62     11 937
   [102]Open in a new tab

   (Cell line: the cell lines that the bulk RNA‐seq or scRNA‐seq profiled;
   Type: Sequencing types (bulk: bulk RNA‐seq; 10x: 10x scRNA‐seq;
   SMART‐seq: SMART‐seq scRNA‐seq); No. Genes: the number of genes that
   shared by the source domain and target domain. For the source domain
   dataset, No. Res and No. Sens represent the number of drug‐resistant
   cell lines and the number of drug‐sensitive cell lines, respectively.
   For the target domain dataset, No. Res and No. Sens represent the
   number of drug‐resistant (single) cells and the number of
   drug‐sensitive (single) cells, respectively.)

Table 2.

   scRNA‐seq prior to drug exposure and corresponding source domain data
   Dataset          Drug    Cell line Type No. Res No. Sens N.A. No. Genes
   Source domain
   GDSC          Gefitinib   Pan‐Can  bulk   714     115     /    10 610
   GDSC          Vorinostat  Pan‐Can  bulk   774      60     /    10 610
   GDSC            AR‐42     Pan‐Can  bulk   811      80     /    10 610
   GDSC          NVP‐TAE684  Pan‐Can  bulk   358      37     /    10 684
   GDSC           Afatinib   Pan‐Can  bulk   682     150     /    10 684
   GDSC          Sorafenib   Pan‐Can  bulk   362      31     /    10 684
   GDSC          Cetuximab   Pan‐Can  bulk   739     122     /    10 684
   Target domain
   CCLE          Gefitinib   JUH006   10x    33       33    259   10 610
   CCLE          Vorinostat  JUH006   10x    33       33    259   10 610
   CCLE            AR‐42     JUH006   10x    33       33    259   10 610
   CCLE          NVP‐TAE684   SCC47   10x    60       60    472   10 684
   CCLE           Afatinib    SCC47   10x    60       60    472   10 684
   CCLE          Sorafenib    SCC47   10x    60       60    472   10 684
   CCLE          Cetuximab    SCC47   10x    60       60    472   10 684
   [103]Open in a new tab

   (For source domain dataset, No. Res and No. Sens represent the number
   of drug‐resistant cell lines and the number of drug‐sensitive cell
   lines, respectively. For target domain dataset, No. Res and No. Sens
   represent the number of drug‐resistant (single) cells and the number of
   drug‐sensitive (single) cells, respectively. N.A.: Drug sensitivity
   label is not available.)

2. Results

   It was suggested that the resistance to drug treatment can occur due to
   the growth of the acquired drug‐tolerant cells or the expansion of
   pre‐existing subclones.^[ [104]^19 ^] In this study, we evaluated the
   prediction performance of our SCAD model in two scenarios
   independently. Due to the unbalanced ratio of sensitive and resistant
   cell lines in the source domain, we conducted both weight sampling and
   smote sampling in model training processes (abbreviated as Weight and
   Smote, respectively). In addition, we considered three different
   strategies to explore the impact of performance in the feature
   selection step before training, namely “all” (which includes all genes
   shared by source domain and target domain), “tp4k” (the top four
   thousand highly variable genes in scRNA‐seq by the function of
   highly_variable_genes of Scanpy, which is a simple but popular method
   for denoising and feature selection strategy prior to downstream
   analysis on scRNA‐seq data^[ [105]^33 ^]), and “PPI” (a subset of 2128
   protein–protein interaction genes, a prior knowledge that is considered
   as the most informative gene set for drug sensitivity prediction^[
   [106]^34 ^]).

   In this study, we relied on average AUC and average AUPR scores as the
   performance metrics. We also applied a ranking‐based strategy to sort
   single cells from the inferred most positive (drug‐sensitive) cells to
   the most negative (drug‐resistant) one for cell stratification and
   biomarker exploration. Interestingly, we observed that drug‐resistant
   cell subpopulation to a specific compound may be sensitive to other
   drugs. This phenomenon offers us an opportunity to explore and identify
   promising drug pairs for drug‐combination.

2.1. Drug Sensitivity Prediction in Drug‐Tolerant Cells after Treatment

   Patients may suffer from drug tolerance within a short term due to the
   evolution of acquired drug‐resistant cells after treatments.^[ [107]^19
   ^] Therefore, parental cells (sequenced from drug untreated cell lines)
   are regarded as sensitive cells, while drug‐tolerant cells that
   survived after drug exposures are selected and considered as resistant
   cells.^[ [108]^35 ^] We conducted transfer learning on two GSE datasets
   with both scRNA‐seq profiles and drug response experiments from three
   different cell lines (see datasets of Experimental Section). The means
   and standard deviations (SD) of the AUC and AUPR scores for each drug
   are summarized in Table [109]S1, Supporting Information. We observed
   that both two metrics under “all” and “tp4k” feature selections were
   increased after domain adaptation, comparing to two baseline (non‐ADDA)
   models (Tables [110] 3 and [111]4 ). Among all three feature selection
   strategies, smote sampling which consists of all genes shared by the
   source domain and target domain obtained the highest prediction results
   for Etoposide (Smote_all: average AUC = 0.694, average AUPR = 0.736,
   (Tables [112]3, and [113]4). As for the prediction on scRNA‐seq from
   451Lu cell line after PLX4720 treatment, we observed that both average
   AUC and average AUPR scores are close to 0.5 (Weight_tp4k and
   Smote_tp4k), more like a random prediction. In contrast, better
   performance values (Smote_all: average AUC = 0.825, average AUPR =
   0.875; Smote_tp4k: average AUC = 0.937, average AUPR = 0.948) were
   achieved in scRNA‐seq from A375 cell line after PLX4720 treatment,
   comparing to 451Lu cell line (Table [114]3 and [115]4; Table [116]S1,
   Supporting Information). The distinct prediction performance of the
   same drug PLX4720 in the two cell lines suggests that other confounding
   factors exist in the drug sensitivity inference. In the study, the
   researchers identified biomarkers of PLX4720 (BRAFi) resistance in
   melanoma single cells; 451Lu and A375 cell lines were found to have
   distinct transcriptome profiles.^[ [117]^35 ^] In addition, they
   clustered the gene expression profiles of 451Lu_BR (BRAFi resistant),
   451Lu_Par (parental, drug untreated), A375_BR (BRAFi resistant), and
   A375_Par (parental, drug untreated) cells to observe the expression
   patterns among those BRAFi treated and untreated cells. They found that
   the untreated 451Lu cells are clustered with BRAFi‐resistant 451Lu
   cells. Similarly, the untreated A375 cells are clustered with
   BRAFi‐resistant A375 cells.^[ [118]^35 ^] The clustering result
   suggested that the cell type diversity could be more prominent than the
   drug perturbational effect (The clustering result can be seen in
   Figure [119]3c of ref. [[120]35].) Such phenomenon partially explains
   the distinct performance for drug PLX4720 between A375 cells and 451Lu
   cells (Tables [121]3 and [122]4).

Table 3.

   Average AUC scores of SCAD among three different feature selection
   strategies (post‐treat)
   Drugs Weight_all (non‐ADDA) Smote_all (non‐ADDA) Weight_all Smote_all
   Weight_tp4k Smote_tp4k Weight_PPI Smote_PPI
   Etoposide 0.566 0.546 0.669 0.694 0.583 0.671 0.500 0.544
   PLX4720 (451Lu) 0.450 0.334 0.381 0.381 0.508 0.504 0.382 0.354
   PLX4720 (A375) 0.727 0.845 0.694 0.825 0.78 0.937 0.767 0.798
   Average 0.581 0.575 0.581 0.633 0.624 0.704 0.550 0.565
   [123]Open in a new tab

   (Weight: weight sampling; Smote: smote sampling; all: all genes shared
   by source domain and target domain, tp4k: the top four thousand highly
   variable genes in scRNA‐seq, a simple but popular method for denoising
   and feature selection prior to downstream analysis on scRNAseq
   datasets,^[ [124]^36 ^] and PPI: a subset of 2128 protein‐protein
   interaction genes described in a recent study.^[ [125]^34 ^])

Table 4.

   Average AUPR results of SCAD among three different feature selections
   (post‐treat)
   Drugs Weight_all (non‐ADDA) Smote_all (non‐ADDA) Weight_all Smote_all
   Weight_tp4k Smote_tp4k Weight_PPI Smote_PPI
   Etoposide 0.450 0.582 0.694 0.736 0.588 0.688 0.55 0.578
   PLX4720 (451Lu) 0.511 0.424 0.478 0.461 0.535 0.531 0.453 0.429
   PLX4720 (A375) 0.793 0.875 0.769 0.875 0.805 0.948 0.799 0.861
   Average 0.584 0.627 0.647 0.691 0.643 0.722 0.601 0.623
   [126]Open in a new tab

Figure 3.

   Figure 3
   [127]Open in a new tab

   Identification of gene biomarker attributions to drug (Cetuximab)
   response heterogeneity. a) Density plot for IntegratedGradient
   attribution score of each gene in the scRNA‐seq expression data of the
   target domain. b) Heatmap for the top15 up‐regulated genes in p_Sens
   cells and the top15 up‐regulated genes in p_Res cells. In order to
   identify high confidence biomarkers, only genes with the largest
   absolute IntegratedGradient attribution scores (the subset of genes
   whose contribution values fall in the <5% interval or >95% interval in
   (a)) are retained for differential expression analysis. c) The gene
   enrichment analysis (Go Oncology Biological Process v2021) for
   identified candidate drug sensitive‐related genes for compound
   Cetuximab. d) Venn plot for the overlap of differentially expressed
   genes that is reported in a previous study^[ [128]^20 ^] and
   differentially expressed genes (p_Sens vs. p_Res cells) among genes
   with largest absolute IntegratedGradient attribution scores (those
   genes whose contribution value falls in the <5% interval or >95%
   interval in (a)) in scRNA‐seq. The bulk_de is the shorthand for the
   differentially expressed genes from the bulk RNA‐seq result in the
   previous study;^[ [129]^20 ^] sc_sens refers to the predicted gene set
   that may make cells susceptible to drugs; sc_res refers to the
   potential gene biomarkers that contribute to drug resistance. e)
   Progression free survival (PFS) status prediction on [130]GSE65021
   cohort based on the expression level of biomarker genes on
   [131]GSE65021 dataset; the biomarkers are identified from scRNA‐seq
   data.

2.2. Drug Sensitivity Prediction for Pre‐Existing Drug‐Resistant Cells before
Treatment

2.2.1. Evaluation of SCAD Based on Source Domain from All Cancer Types

   As mentioned in our introduction, several studies have evaluated
   single‐cell data on selection pressure following drug treatment.^[
   [132]^5 , [133]^6 ^] However, there are limited research that evaluate
   the feasibility of drug response prediction on single cells prior to
   drug treatment using machine learning. Therefore, we focus more on
   exploring the feasibility of machine learning in predicting single‐cell
   drug sensitivities before drug treatment. It was suggested that
   co‐existing distinct subpopulation prior to drug exposure in cancer may
   respond differently to drug treatments, facilitating treatment failure,
   and disease progression.^[ [134]^18 ^] Albeit an effective response may
   occur within a short time after drug exposure, the growth of
   pre‐existed intratumoral drug‐resistant subclones will eventually cause
   tumor progression or lethal outcome.^[ [135]^2 , [136]^19 ^] Therefore,
   successfully deciphering the intratumoral heterogeneity before drug
   treatment, which is more probable to be achieved by scRNA‐seq than bulk
   RNA‐seq due to technical advantage, may give informative references for