Abstract

   ScRNA-seq has the ability to reveal accurate and precise cell types and
   states. Existing scRNA-seq platforms utilize bead-based technologies
   uniquely barcoding individual cells, facing practical challenges for
   precious samples with limited cell number. Here, we present a scRNA-seq
   platform, named Paired-seq, with high cells/beads utilization
   efficiency, cell-free RNAs removal capability, high gene detection
   ability and low cost. We utilize the differential flow resistance
   principle to achieve single cell/barcoded bead pairing with high cell
   utilization efficiency (95%). The integration of valves and pumps
   enables the complete removal of cell-free RNAs, efficient cell lysis
   and mRNA capture, achieving highest mRNA detection accuracy (R = 0.955)
   and comparable sensitivity. Lower reaction volume and higher mRNA
   capture and barcoding efficiency significantly reduce the cost of
   reagents and sequencing. The single-cell expression profile of mES and
   drug treated cells reveal cell heterogeneity, demonstrating the
   enormous potential of Paired-seq for cell biology, developmental
   biology and precision medicine.

   Subject terms: Lab-on-a-chip, Sequencing, RNA sequencing, Stem-cell
   differentiation
     __________________________________________________________________

   Single-cell RNA-seq can reveal accurate and precise cell types and
   states. Here the authors present an scRNA-seq platform, Paired-seq,
   which uses differential flow resistance to achieve 95% cell utilisation
   efficiency for improved cell-free RNA removal and gene detection.

Introduction

   Many physiological functions of multicellular organisms are reflected
   in the temporal and spatial changes in gene expression between
   constituent cells^[46]1. Cellular heterogeneity presented by different
   gene expression profiles, functions and morphologies occurs not only in
   different tissues but also even within the same cell type.
   Transcriptomic profiling of individual cells has emerged as an
   essential tool for characterizing cellular diversities to have a
   complete catalog of cell types or their functions. However, traditional
   single-cell analysis methods can monitor only a few types of molecules
   for each cell^[47]2,[48]3. In 2009, single cell mRNA sequencing
   (scRNA-seq) was first introduced by Tang to analyze the whole
   transcriptome in single cells^[49]4. As one of the most powerful tools
   to understand the heterogeneity of biology^[50]5–[51]10, scRNA-seq
   contributes to discovering the cellular and molecular driving forces of
   biology, unveiling new biological insights about cell
   types^[52]11–[53]16, which has a broad impact on diverse biology
   fields, including development^[54]17,[55]18, immunology^[56]19,[57]20,
   neurobiology^[58]20, cancer^[59]8,[60]21–[61]23, gene
   regulation^[62]24, and epigenetics^[63]6,[64]25.

   For scRNA-seq, it comes first with the isolation of single cells from
   their native environment, such as a culture dish or cell suspension.
   Traditional methods, including limiting dilution^[65]26, capillary
   picking^[66]27, and laser capture microdissection (LCM)^[67]28, suffer
   from time and labor consumption, cell damage, and low throughput. In
   recent years, microfluidic devices characterized by their manipulation
   integration, low reagent consumption, size/volume compatibility, and
   external contamination isolation have demonstrated their capability in
   high-efficiency, high-viability, and low-cost single-cell isolation.
   After single-cell isolation, each cell must be processed and sequenced
   individually to obtain transcriptome information, which is labor
   intensive and cost prohibitive, especially when a large population of
   cells is needed to be processed. To address this problem, several novel
   high-throughput platforms have been reported, including
   Drop-seq^[68]13, inDrop^[69]12, Seq-well^[70]15, and
   Microwell-seq^[71]14, etc., which used barcoded beads to label
   individual cells during reverse transcription so that cDNAs from all
   the cells could be simultaneously pooled for amplification and
   sequencing^[72]29. By identifying the cell barcode and molecular index,
   the cell origin of cDNA could be inferred and the amplification bias
   could be corrected.

   Successful barcoding of individual cells relies on co-encapsulation of
   a single cell and barcoded bead within a single droplet or
   microwell^[73]30. Current high throughput scRNA-seq platforms utilize a
   limited dilution strategy for cell/bead encapsulation to ensure that
   there is no more than one cell or one bead in each reaction compartment
   based on Poisson statistics. Unfortunately, such a limiting dilution
   strategy for both cell and bead is wasteful of reagents and causes loss
   of cells, which is unacceptable when only a limited number of cells,
   such as stem cells, neuron cells, or circulating tumor cells (CTCs) are
   available^[74]10. Additionally, how to avoid the interference of
   cell-free RNAs produced during the preparation of cell suspension to
   achieve information about the true original cell is another challenge
   for scRNA-seq^[75]13. For the preparation of solid tissues, enzymatic
   digestion will destroy the extracellular matrix and disrupt the
   cell–cell junctions, releasing RNAs into the extracellular “soup”.
   Furthermore, cell death also results in the release of cellular RNAs in
   both tissues and blood samples. These cell-free RNAs would lead to
   noise in the data produced by scRNA-seq experiments. Thus, it is
   difficult to evaluate how faithfully the tissue and blood samples are
   represented by the scRNA-seq analysis.

   In order to realize both a parallel and an efficient processing for a
   limited number of cells, it is very necessary to develop a
   high-efficient single cell manipulation platform for scRNA-seq. Herein,
   we present a scRNA-seq platform, named Paired-seq, with high
   cells/beads utilization efficiency^[76]31, as well as excellent
   sequencing accuracy and sensitivity by integrating barcoding technology
   for cell tagging, droplet strategy for parallel compartmentalization,
   hydrodynamic differential flow resistance based isolation for single
   cell/bead, and micro-pumping structure for active fluidic control. Our
   Paired-seq chip allows automatic isolation enabling the pairing of
   single cell and single bead in a reaction unit with an efficiency up to
   95%. Thus, Paired-seq achieves efficient utilization of precious cells.
   After cell/bead capture and pairing, formation of picoliter droplets
   allows highly parallel processing of dozens to thousands of cells.
   Integration of valves and pumps enables the on-chip removal of
   cell-free RNAs in the cell chambers, making it possible to identify the
   true composition of the original sample. Cell lysis, mRNA capture and
   reversed transcription can be efficiently carried out in the tiny
   droplet by virtue of active pumping for fluid transportation and rapid
   mixing. Analysis results of sequencing data for External RNA Controls
   Consortium (ERCC) suggests that our method offers high accuracy
   (R = 0.955) and comparable sensitivity compared to other current
   scRNA-seq platforms. What is more, the lower reaction volume and higher
   mRNA capture and barcoding efficiency significantly reduce the cost of
   reagent and the sequencing cost. Using Paired-seq, we analyze the
   single-cell expression profile of mES cells and anti-cancer drug
   treated cells, revealing the heterogeneity of the cell population
   during differentiation and drug treatment processes which show an
   enormous potential of our platform for cell biology, developmental
   biology and precision medicine.

Results

Workflow of Paired-seq

   We designed and fabricated a microfluidic chip (Fig. [77]1a and
   Supplementary Figs. [78]1–[79]3) which contained hundreds to thousands
   of reaction units (Fig. [80]1c and Supplementary Fig. [81]2) for
   parallel single cell and single barcoded bead pairing and sample
   processing (Supplementary Movie [82]1). Each reaction unit is designed
   based on the hydrodynamic differential flow resistance principle to
   allow no more than one bead and cell to be captured in each bead
   capture chamber and cell capture chamber, respectively (Fig. [83]1d,
   a). The cell-free RNAs can be easily removed by injecting washing
   buffer while maintaining the single cells in the chambers. After bead
   and cell isolation, gas is introduced to form two droplets in each
   reaction unit containing one cell and one bead, respectively
   (Fig. [84]1d, b). These two droplets are then merged by turning off a
   separation valve located in between, thus forming a larger picoliter
   droplet containing exactly one bead and one cell (Fig. [85]1d, c).
   Because the bead-loading solution contains cell lysis buffer, mixing of
   the bead droplet with the cell droplet leads to cell lysis. The
   barcoded bead contains cell label and molecular index for
   cell/molecular barcoding, poly-(dT)[30] for mRNA capture and universal
   primer for cDNA amplification (Fig. [86]1b). Once released from the
   cell, poly A-tail mRNAs are captured by poly-(dT)[30] on the beads and
   reverse-transcribed to form the cDNAs (Fig. [87]1e). After subsequent
   bead recovery, cDNA amplification, library preparation and sequencing,
   the original information about the cell/molecule can be inferred to
   achieve an expression matrix for dozens to thousands of single cells
   (Fig. [88]1f, g).

Fig. 1. Paired-seq: a Platform for DNA Barcoding scRNA-Seq.

   [89]Fig. 1
   [90]Open in a new tab

   a Photograph of Paired-seq chip with a Quarter dollar coin. b Sequence
   of primers on the barcoded beads. The primers on beads contain mRNA
   capture poly-(dT)[30], molecule index, cell label, and universal
   primer. c Schematic diagram of cell and barcoded bead pairing on
   Paired-seq chip. Scale bar is 200 μm. d Schematic of the basic workflow
   for single cell and bead parallel manipulation on Paired-seq chip,
   including (a) capture and pairing of single cells and single beads (b)
   droplets generation to separate adjacent units (c) cell lysis and mRNA
   captured on barcoded beads. e After hybridizing to the primers on the
   barcoded beads, mRNAs are reverse-transcribed to produce cDNAs. All the
   cDNAs-attached beads are recovered from the chip and (f) subsequently
   amplified for library preparation in bulk. g Data analysis to generate
   single cell expression matrix. Millions of paired-end reads are
   generated from a Paired-seq library on a high-throughput sequencer. The
   reads are first aligned to a reference genome to identify the gene of
   origin of the cDNA. Next, reads are grouped by their cell barcodes, and
   individual UMIs are counted for each gene in each cell. The result is a
   “digital expression matrix” that each column corresponds to a cell,
   each row corresponds to a gene, and each entry is the integer number of
   transcripts detected from that gene, in that cell.

Chip design for Paired-seq

   The three-layer chip consists of a capture channel layer, a valve/pump
   actuation control layer, and an elastomeric membrane layer in between
   (Fig. [91]2a and Supplementary Figs. [92]1–[93]3). Each chip contains
   hundreds of reaction units consisting of a cell flow channel and bead
   flow channel connected by a connection channel (Fig. [94]2b). To break
   the limitation of limiting dilution and avoid wasting precious cells,
   Paired-seq chip is designed based on a paired differential flow
   resistance capture principle, so that a single cell

Fig. 2. Design criterion and structure characterization of Paired-seq chip.

   [95]Fig. 2
   [96]Open in a new tab

   a The 3D cartoon diagram of the capture layer and the control layer of
   the chip. b, d The simulation results of trapping mode (b), bypassing
   mode (c) and droplets forming mode (d). e The cartoon diagram of cross
   section for one unit in Paired-seq chip. f, g Structure
   characterization of Paired-seq chip. Top view image of a Paired-seq
   chip with 800 units (f) and one paired unit (g). h The 3D surface
   profiling of SU8 silicon model of the flow layer for one paired unit by
   3D laser scanning confocal microscopy.

   and a single bead can be isolated and paired with high efficiency. In
   the sample loading process, when a capture chamber is empty, flow
   resistance along the straight channel is lower than that in the long
   loop bypass channel, and the main stream flows along the straight
   channel, leading to a single cell/bead in the flow being trapped in the
   chamber (Fig. [97]2b, Trapping mode). The size of the trapped cell/bead
   is larger than that of the orifice of the capture chamber and thus will
   block the local flow and then dramatically increase the flow resistance
   along the straight channel. Consequently, the main flow redirects to
   the bypassing channel and subsequent cells/beads will flow into the
   bypassing stream, going to the next paired unit (Fig. [98]2c, Bypassing
   mode). This capture mechanism ensures that there is no more than one
   cell/bead captured in one chamber. Because the diameter of beads
   (20–40 μm) is larger than that of cells, an asymmetrical paired unit is
   designed with a wider channel for beads and a narrower channel for
   cells.

   To allow efficient cell lysis and mRNA capture to afford high mRNA
   detection sensitivity, valves and pumps were integrated in the
   Paired-seq chip. Firstly, to enable independent loading of cell and
   bead solutions, a blocking valve is designed orthogonally below the
   connection channel for each paired unit (Supplementary Movie [99]2).
   Secondly, as each chip consists of hundreds to thousands of reaction
   units, to avoid cross-contamination, the reaction unit is separated by
   air. This can be realized by reversely introducing an air flow in the
   capture channel and extra solution outside capture chamber is
   dispelled, forming water-in-air droplets and effectively separating
   each individual reaction unit (Fig. [100]2d, Droplets forming mode and
   Supplementary Movie [101]3). Formation of droplets allows hundreds of
   cells to be processed in parallel for high-throughput analysis.
   Finally, to facilitate the exchange of reagents between the paired
   chambers droplets, there is a driving pump below each capture chamber
   (Fig. [102]2a, e). By alternately activating the driving pumps for the
   cells and the beads, solutions in the two chambers can be easily
   transferred back and forth, thus allowing efficient mixing of the
   paired droplets.

   To better understand the flow characteristics around the microfluidic
   traps and to determine the optimal parameters for microfluidic channel
   design, a computational fluid dynamics (CFD) analysis was carried out
   using COMSOL 4.3 (COMSOL Multiphysics) to simulate the hydraulic
   resistance in the channel of the paired unit in the mode of trapping,
   bypassing and droplets forming, respectively (Fig. [103]2b–d).

High efficiency of single-cell assays on Paired-seq chip

   In order to demonstrate the feasibility of the chip design for
   scRNA-Seq, a Paired-seq chip with 800 and 2000 units was first
   fabricated (Fig. [104]2f–h and Supplementary Fig. [105]2). Based on the
   dimensions of the fabricated chip, the total volume of each reaction
   unit was calculated to be less than 400 picoliter.

   To test the single cell/bead isolation and pairing efficiency of the
   Paired-seq chip, barcoded beads and Calcein AM-strained K562 cells were
   loaded. Supplementary Movie [106]4 and Movie [107]5 illustrate the
   dynamic process of single-bead and single-cell trapping, confirming
   that the trapped cells/beads work as plugs to block the local flow and
   prevent the incoming of subsequent cells/beads. As expected, successful
   single cell/bead trapping and pairing was observed with high efficiency
   (Fig. [108]3a and Supplementary Fig. [109]4). Overall, the
   single-particle chamber occupancy ratio was found to be as high as 97%
   (Fig. [110]3b). The statistics of cell/bead occupancy rate and pairing
   rate are shown in Fig. [111]3c. A pairing rate of about 95% was
   achieved, which is a significant improvement compared to other
   scRNA-seq platforms. Finally, with a high speed flow of solution in the
   reverse direction, nearly 100% of the trapped barcoded beads could be
   recovered for downstream processing (Fig. [112]3c and Supplementary
   Fig. [113]5), which outperformed other platforms such as
   Drop-seq^[114]13. The combination of high loading rate, high pairing
   rate and remarkable recovery rate avoids loss of cell information.

Fig. 3. Profiling of Paired-seq chip.

   [115]Fig. 3
   [116]Open in a new tab

   a Image of Paired-seq chip loaded with single cells and single beads,
   which were compartmented in water-in-gas droplets. b The occupation
   ratio of single particle in Paired-seq chip. Error bars, mean ± s.d., n
   = 4. c The statistical chart of bead and cell occupation ratio, pairing
   ratio and bead recovery ratio. Error bars, mean ± s.d., n = 3. d Single
   cell capture efficiency with different numbers of input cells. Error
   bars, mean ± s.d., n = 3. e Change of cell chamber fluorescence
   intensity indication mixing efficiency of TAMRA dye solution in bead
   chamber with PBS in cell chamber under conditions of free diffusion and
   pump driving. f Characterization of DNA hybridization on the surface of
   barcoded beads with target DNA and random DNA. Source data are provided
   as a Source Data file.

   In addition to the capacity of compartmentalization of single
   cells/beads with high efficiency, Paired-seq chip was designed to
   capture cells with minimum loss even with low-input cell number.
   Different low numbers (40, 80, 100, 200, 300, 400, 500, 800) of input
   cells were injected, and the capture efficiency (Fig. [117]3d) was
   calculated. The result showed that as high as 90% of input cells could
   be captured. Such a high capture efficiency for a low input number of
   cells will be of great significance in dealing with precious cell
   samples.

Cell-free RNAs removal capability

   Preparation of a single-cell suspension sample remains one of the most
   difficult tasks for scRNA-seq to generate meaningful biological
   representative data. It is difficult to identify the true composition
   of the original sample because of the presence of cell-free RNAs
   derived from tissue digestion and cell death. Paired-seq chip allows
   independent loading and washing of cells and beads independently which
   can prevent the barcoded beads from being contaminated by cell-free
   RNAs in the cell solution. To verify the capability of cell-free RNAs
   removal on Paired-seq platform, TAMRA fluorescent dye and PBS solutions
   were loaded into the cell capture channel and bead capture channel,
   respectively. The connection channel was kept blocked for 6 h, and
   there was no observable increase of fluorescence intensity in the bead
   capture channel (Supplementary Fig. [118]6A, B, Supplementary
   Movie [119]6), indicating the excellent isolation effect of the
   blocking valve to avoid contamination from cell-free RNAs during
   cell/bead solution loading. Considering the low sensitivity of
   fluorescence imaging, a small number of RNA molecules could also be
   amplified in the subsequent reactions, such as PCR amplification and
   sequencing, which would affect the experimental results seriously.
   Therefore, we also used the sequencing method to further verify the
   isolation effect of the blocking valve and the cleaning effect. Total
   RNAs extracted from the same number of cells with a different species,
   considered as cell-free RNAs, were doped into human/mouse cell loading
   solution. Cells were captured in the chambers and washed with 1 × DPBS
   as the blocking valves were still activated. In neither test (mouse
   cells with human RNAs contamination or human cells with mouse RNAs
   contamination) did we detect obvious cell-free RNAs contamination from
   the other species (Supplementary Fig. [120]6C, D). Our results verified
   the complete isolation achieved by the blocking valve and the complete
   removal of background cell-free mRNAs in the cell suspension after
   washing, which is a significant advantage over other scRNA-seq
   platforms, such as Drop-seq and Seq-well.

Rapid cell lysis and mRNA capture with active pumping

   Other challenges in the preparation of single-cell transcriptome
   sequencing samples, such as the lengthy time for single-cell lysis and
   poor mRNA capture efficiency, will affect the quality of the sequencing
   library and further affect the accuracy and sensitivity of the
   sequencing results. To evaluate the efficiency of mixing between paired
   droplets on our Paired-seq chip, a paired-droplets array was generated
   with one droplet containing TAMRA solution and PBS in the other. The
   blocking valve was then turned off to allow mixing of solutions in the
   cell and bead chambers. A very slow increase of fluoresce intensity was
   observed in the bead capture chamber due to free diffusion. In
   contrast, immediately after activating the two driving pumps, the
   fluorescence intensity in the cell capture chamber increased sharply
   and reached saturation in a few seconds, demonstrating rapid mixing
   between paired droplets enabled by the driving pumps (Fig. [121]3e,
   Supplementary Fig. [122]7, and Supplementary Movie [123]7). As a
   result, in the picoliter reactor, a cell can be completely lysed within
   2 min (Supplementary Fig. [124]8 and Supplementary Movie [125]8) and
   FITC-labeled poly(A) DNA can be pumped from the cell chamber to the
   bead chamber and captured on beads within 20 s (Fig. [126]3f),
   indicating that the specific hybridization between mRNAs and barcoded
   bead can be performed rapidly on Paired-seq chip. To test the influence
   of shear forces on RNA quality or transcription, we compared the gene
   detection ability reflecting RNA integrity with different loading time.
   Our results suggested that at the loading time of 15 min and 40 min,
   the number of detected genes show no significant difference, suggesting
   that loading time does not cause spurious/stress to transcription
   (Supplementary Fig. [127]9A). We also analyzed the expression levels of
   9 genes (ARF1, CAST, CDK7, DBI, DDIT3, ENO2, ETF1, PLOD2, and RGS2)
   reported to have correlations with mechanical stress^[128]32. Herein,
   the nine genes were biologically well characterized in terms of protein
   function, including cell communication, cell signaling, cell cycle,
   stress response and calcium release. There were no remarkable
   differences of the gene expression described above between the samples
   with different loading time, indicating that the shear force did no
   damage to the cells (Supplementary Fig. [129]9B). Controllable, rapid,
   and efficient cell lysis and mRNA capture in picoliter chambers enabled
   by active pumping is essential for high sequencing accuracy and
   sensitivity.

Single-cell mRNA sequencing

   To assess the feasibility of scRNA-seq on this platform, we performed a
   mixed-species experiment with cultured human (K562) and mouse (3T3)
   cells, and the sequencing result of cell barcodes is shown in
   Supplementary Fig. [130]10. By avoiding the limiting dilution of cell
   and barcoded bead compartmentalization, 768 cell barcodes were
   successfully harvested with high quality in an 800-array Paired-seq
   chip, demonstrating very high efficiency on both the cells and the
   barcoded beads utilization (Fig. [131]4a). The result of the
   human-mouse experiment is shown in Fig. [132]4b, and each dot
   represents a cell barcode and number of UMIs derived from the
   human/mouse source. The closer the dots to the x-axis/y-axis, the
   higher purity of the cell barcode for the corresponding single species.
   Among all the 768 harvested cell barcodes, 386 were identified as human
   species and 376 as mouse, yielding less than 0.8% mixed-species dots
   (while 2.4% mixed-species dots with 2000-unit Paired-seq
   chip)(Fig. [133]4b and Supplementary Fig. [134]2). Compared with other
   available scRNA-seq platforms, our Paired-seq showed a very low doublet
   rate (Supplementary Fig. [135]10B). The results established excellent
   single-cell integrity for scRNA-seq and indicate an obvious advantage
   in detection of transcripts and genes of Paired-seq.

Fig. 4. Characterization of Paired-seq platform.

   [136]Fig. 4
   [137]Open in a new tab

   a Plot of the cumulative fraction of reads vs barcodes accumulates
   which are arranged in decreasing order of size (number of transcripts)
   for the human-mouse mixture experiment. b Human-mouse mixture
   experiment using Paired-seq. c The technical repeats from two
   independent experiments indicate high correlation of 0.979. d
   Cell-cycle state of K562 was measured by Paired-seq. The cells were
   ordered by their phase scores. e Accuracy and mRNA capture efficiency
   evaluated by ERCC sample. f, g Models of accuracy and sensitivity with
   a global dependency on sequencing depth. Each model has 26 parameters
   and is fitted to n = 20,772 samples. Bulk data (pink triangles) are
   displayed only for context. Solid curves show the predicted dependence
   on sequencing depth. f Accuracy is only marginally dependent on
   sequencing depth. Saturation occurs at 270,000 reads per cell in the
   model (dashed red line). Methods are ordered by performance on the
   basis of predicted correlation (R) at 1 million reads. g Sensitivity is
   critically dependent on sequencing depth. Saturation occurs at 4.6
   million reads per cell (dashed red line). The gain from 1 to 4 million
   reads per sample is marginal, whereas moving from 100,000 reads to 1
   million reads corresponds to an order-of-magnitude gain in sensitivity
   (dashed black lines). Methods are ordered by performance on the basis
   of predicted detection limit (#M, number of molecules at 1 million
   reads). h Accuracy of single-cell resolution for mES cells. Each dot
   represents a cell and each box represents the median and first and
   third quartiles per replicate and method. 72, 77, 53, 38, and 70 cells
   were used for Paired-seq, CEL-seq, Drop-seq, MARS-seq and SCRB-seq. i
   Fitted (solid line) and predicted (dashed line) curve of median genes
   detected for single mES cell to varying mapped reads according to
   experiment results of five different platforms. Two-tailed F-test was
   performed to generate P-value to assess the accuracy of the curve
   (P-value > 0.05). j The number of mapped reads for five different
   platforms when the detected number of genes were 2000. Source data are
   provided as a Source Data file.

   To test the reproducibility of Paired-seq, different numbers of cells
   were harvested at different sequencing depths and culture times. We
   collected 188 and 248 K562 cells at an average sequencing depth of
   18 × 10^3 and 39  ×10^3 mapped reads per cell, respectively. Technical
   replicates showed very high reproducibility (Pearson correlation,
   R = 0.979, Fig. [138]4c). In addition, our platform has the ability to
   evaluate the individual cell state according to cell-cycle scores,
   which were calculated for each human K562 cell based on previously
   reported phase-specific genes and methods^[139]13. Cells at different
   cell-cycle stages were clearly separated based on their cell-cycle
   scores (Fig. [140]4d). In general, Paired-seq presented high-efficient
   single-cell mRNA sequencing with reliable reproducibility and detection
   ability.

Excellent sequencing accuracy and sensitivity with low cost

   Deeper sequencing depth can enhance the sensitivity of gene detection,
   but it can also significantly increase the cost. In order to balance
   sensitivity, accuracy, and cost, we analyzed the relationship between
   the sequencing depth and accuracy/sensitivity at single cell level of
   different platforms at the same time. To estimate the accuracy and
   sensitivity of Paired-seq, we compared the results with recent
   scRNA-seq platforms using ERCC and mES cells. About 100k molecules of
   ERCC and single mES cells were compartmentalized in the cell capture
   chamber and paired with individual barcoded beads to generate scRNA-seq
   libraries from ERCC and mES cells. In order to further optimize the
   operation on chip, we firstly compared the quality of sequencing data
   for ERCC experiment by using Paired-seq with enzymatic process on and
   off chip (in tube). The results showed that the percentages of mapped
   reads for enzymatic process on chip were significantly higher than that
   in tube. Most of the unmapped reads (due to too short sequence) in
   off-chip sample could be traced back to primer on the barcoded beads,
   which confirmed that the insufficient enzymatic reaction brought in
   technical noise (Supplementary Fig. [141]11). Then we followed the same
   protocol for subsequent sample preparation on Paired-seq chip. A total
   of 55 ERCC captured barcoded beads were sequenced at a depth of 1
   million reads per bead. The ERCC sequencing data were processed with
   umis^[142]33 software based on the existing benchmark. Additionally, 70
   mES cells were sequenced at a saturated sequencing depth (over 0.5
   million mapped reads per cell) and downsampled to a normalized depth of
   0.5 million mapped reads per cell. Then the normalized data were
   randomly subsampled to reveal the corresponding changes in accuracy and
   sensitivity for each platform. The pipeline shown in Supplementary
   Fig. [143]12 was used to process the data for accuracy/sensitivity
   comparison with other platforms.

   The accuracy here is defined as the Pearson product-moment correlation
   coefficient (R) between log-transformed detected ERCC counts and input
   ERCC spike-in counts per droplet which had been previously
   established^[144]34. The high accuracy achieved (R = 0.973) indicates
   that Paired-seq is a reliable platform to identify marker genes with
   low expression level (Fig. [145]4e and Supplementary Fig. [146]12c).
   ERCC sequencing data from Paired-seq were analyzed with established
   data analysis methods, with UMI merging and without merging, yielding
   capture efficiencies of 16.8% and 50%, respectively. Both values are
   slightly higher than those of Drop-seq (12.8% and 47%),
   respectively^[147]13. Considering a global effect of sequencing depth,
   we also used a linear model, including an individual corrected
   performance parameter for each platform that could be ranked to account
   for the sequencing depth^[148]34. According to the model, we found that
   Paired-seq had the highest accuracy (R = 0.955) for ERCC detection
   (Fig. [149]4f) among 16 different kinds of scRNA-seq platforms
   (Fig. [150]4f and Supplementary Figs. [151]13 and [152]15). In
   addition, we compared the accuracy of Paired-seq and a series of other
   scRNA-seq methods, including CEL-seq2/C1, Drop-seq, MARS-seq, and
   SCRB-seq, at the normalized sequencing depth (Fig. [153]4h) with data
   for mES cells. The Pearson correlation coefficient (R) of reference
   gene expression values for each cell and average expression of all the
   cells were calculated and shown in the plot by different
   methods^[154]15. Paired-seq showed a very high accuracy (R = 0.991)
   possibly due to the small reaction volume, high mRNA capture efficiency
   and noises reduction with on-chip enzymatic operation. The results
   indicated that Paired-seq possessed superior performances in
   quantification of transcripts in single cells.

   Similarly, the sensitivity was compared with other platforms using ERCC
   and data for mES cells (Supplementary Fig. [155]14)^[156]34. According
   to the logistic regression model^[157]34 with ERCC, we achieved the
   detection limit of such as “10 molecules” for Drop-seq, “5 molecules”
   for inDrop, and also “2 molecules” for Paired-seq (Fig. [158]4g,
   Supplementary Fig. [159]15). The algorithm provided a fair comparison
   and demonstrated that sensitivity of Paired-seq was comparable to other
   modern single-cell approaches. In addition, in data processing for mES
   cells, we downsampled reads with normalized depth of each cell to
   varying lower mapped reads for each method, and drew the fitted curve
   of median genes detected for single mES cell versus different mapped
   reads (Fig. [160]4I). Although conventional methods, including
   CEL-seq2/C1 and SCRB-seq, have higher gene detection ability due to the
   use of liquid barcoding primers, the large reagent consumption greatly
   increases the cost and the complexity of manual operations, thus it is
   unpractical for high-throughput single-cell analysis. Compared with the
   high-throughput scRNA-seq platform (Drop-seq), Paired-seq detected more
   genes per cell than Drop-seq at different sequencing depths, which may
   be attributed to the effective mixing of reagents and enzymatic process
   on Paired-seq chip and the smaller reaction chamber. Most importantly,
   based on the analyses of the sequencing depth (mapped reads per cell)
   vs. the number of detected genes of mES cells for five different
   sequencing platforms, only 2673 mapped reads were needed for Paired-seq
   which was the lowest compared with others when the number of detected
   genes is 2000 (Fig. [161]4i, j). This result shows that the cost of
   sequencing for Paired-seq is the lowest.

Heterogeneous cellular subpopulations

   ScRNA-seq is a promising technology to identify and describe cellular
   subpopulations from heterogeneous populations of cells. ES cells are
   derived from a stage in which key early lineage specification events
   are occurring. Specifically, upon Leukemia Inhibitory Factor (LIF)
   withdrawal, ES cells will experience unguided differentiation and
   generate various subpopulations^[162]35. Compared to fully
   differentiated cell types, ES cells in serum are relatively
   homogeneous,

   with only some well-characterized fluctuations even in a short time
   after LIF withdrawal. Study of heterogeneity information from such
   relatively homogeneous cell populations poses a challenge for single
   cell sequencing. For further verification, the ability to distinguish
   such relatively weak heterogeneity by Paired-seq, mES cells were
   collected and analyzed in nine batches over 10 days after LIF
   withdrawal (Supplementary Figs. [163]16 and [164]17). Upon LIF
   withdrawal, the time series samples collected at Day 0, 2, 4, 7, and
   10, were assayed for the single-cell transcriptomes using Paired-seq.
   Replicate experiments were performed by different people on a few of
   the time points of this study. In the comparison of the biological
   replicates (Day0_1/2, Day7_1/2 and Day10_1/2), Paired-seq data makes
   their t-Distributed Stochastic Neighbor Embedding (t-SNE) points go
   together (Fig. [165]5a), suggesting the similar expression profiles of
   these replicates.

Fig. 5. Heterogeneity of differentiated mouse ES cells.

   [166]Fig. 5
   [167]Open in a new tab

   a t-SNE maps of mES samples from different days after LIF withdrawal.
   Different experimental batches are labeled with different colors. b
   t-SNE maps of mES samples by unsupervised clustering ID. Five distinct
   clusters are labeled with different colors. c Average and d
   distribution of key pluripotent factors and differentiated markers of
   different time points after LIF withdrawal. e KEGG pathway enrichment
   analysis for five clusters. f Biological process analysis of gene
   ontology enrichment for five clusters Source data are provided as a
   Source Data file.

   Overall, the combined single-cell expression profiles of these time
   points give five predominant cell clusters, which were readily
   correlated to the post-LIF times (Fig. [168]5b). The pseudo-time
   algorithm plots the trajectory that is concordant to the order of the
   sampling time (Supplementary Fig. [169]18). Some of the clusters enrich
   the markers of the differentiated cell types of expectation, such as
   Cytokeratin and Otx2 etc., and reflects the fluctuation of pluripotency
   factors, Zfp42, Pou5f1 and Sox2 etc., which validate the capability of
   Paired-seq^[170]36. (Fig. [171]5c, d). In addition to those well-known
   transcription factor and markers, 1594 genes of fluctuated expressions
   were identified. These genes were differentially expressed
   (p-value < 0.05, Supplementary Table [172]S3) among the five cell
   populations. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway
   enrichment analysis (Fig. [173]5e) and Gene Ontology (GO) enrichment
   analysis for biological process (Fig. [174]5f) revealed that these
   genes were mainly involved in some fundamental biological processes and
   pathways during cell differentiation (p-value < 0.05). In summary,
   Paired-seq is able to track down the population development and detect
   the fluctuated expressions of the key markers in the differentiation
   process. This is concordant to what has been described in
   inDrop^[175]12.

Heterogeneity of drug treated cancer cells

   To characterize the drug resistance and disease recurrence after
   anti-cancer treatments, Paired-seq was used to study the heterogeneity
   of anti-cancer drug treated cancer cells. Nocodazole, an antineoplastic
   agent and known as a cell cycle inhibitor that inhibits polymerization
   of microtubules^[176]37, and possibly influences the differential mRNA
   transcription related to cell cycle, was used as a model drug to study
   the drug response at single cell drug to study the drug response at
   single cell level. ScRNA-seq samples of K562 cells before and after
   drug treatment were processed on Paired-seq chip and the sequencing
   data were analyzed by t-SNE (Supplementary Figs. [177]19 and [178]20).
   Default unsupervised clustering on Seurat^[179]38 gives two putative
   clusters, which are readily associated to the treatment response,
   indicating an obvious phenotypic variability in response to the drug
   (Fig. [180]6A, B). As we can see, the untreated replicates share the
   same clusters whereas the treated sample goes to a relatively
   segregated cluster indicating the phenotypic change. We characterize
   total 1179 differential expression genes (DEG) across the treatment
   conditions. Figure [181]6c, d shows GO term enrichment analysis of the
   top 50 genes that are elevated in the treated cluster. These terms
   guide us to the genes that are correlated to the activated mitosis,
   such as ASPM, AURKA, CENPF, KIF208, TOP1, RNF8, SEPT7, SMC3, TPX2, and
   CENPE, which validates the accuracy of our single-cell RNA-seq assay
   (Fig. [182]6e). All of these results consistent with previous
   knowledge^[183]39 proved the heterogeneity of cancer cells in
   anti-cancer ability and drug resistance. The results show that
   Paired-seq can provide comprehensive genetic expression analysis of
   individual cells to reveal the heterogeneity in anti-cancer drug
   responses, thereby facilitating the development of optimized clinical
   anti-cancer strategies.

Fig. 6. Heterogeneity of Nocodazole treated cancer cells.

   [184]Fig. 6
   [185]Open in a new tab

   a t-SNE visualization of K562 cells (RNA-seq) colored by Nocodazole
   treatment time or b unsupervised clustering ID. c Biological process
   and d cellular components analysis of gene ontology enrichment for the
   top 50 genes that are elevated in the treated cluster. e t-SNE maps of
   Nocodazole treated and untreated K562 cells. Gene expression levels are
   indicated by shades of red. All the genes shown in the map are
   correlated to the activated mitosis. Source data are provided as a
   Source Data file.

Discussion

   In summary, we proposed a high-throughput single-cell RNA sequencing
   platform named Paired-seq with high cells/beads utilization efficiency,
   cell-free RNAs removal capability, high gene detection ability, and low
   cost. By using the differential flow resistance principle, Paired-seq
   overcomes the waste of reagents and loss of cells caused by traditional
   limiting dilution methods and achieved utilization efficiency up to 95%
   in both single cell/barcoded bead isolation and pairing.
   High-efficiency single cell/bead paring is a promising technology for
   analysis of precious and rare cells, such as stem cells, neuron cells,
   or CTCs. Furthermore, Paired-seq allows real-time observation of single
   cells that cannot be available in droplet-based platforms like Drop-seq
   or InDrop. Integration of controllable valves and pumps enables
   complete removal of cell-free RNAs, efficient cell lysis and mRNA
   capture. The clear background without cell-free RNAs helps to eliminate
   noise and faithfully reflects the true composition of the original
   sample. The efficient cell lysis and mRNA capture endow the highest
   mRNA detection accuracy (R = 0.955) and comparable sensitivity compared
   to other scRNA-seq platforms. The high gene detection capacity allows
   lower sequencing costs because it requires less sequencing depth to
   achieve the same number of detected genes. By using Paired-seq to
   investigate the single-cell expression profile of mES cells and
   anti-cancer drug treated cells, we verified the reproducibility and
   significant detection ability for varying genetic expression,
   presenting great potential for cell biology, developmental biology and
   precision medicine.

Methods

Chip fabrication

   The silicon mold for single cells and single beads manipulation was
   fabricated by conventional photolithography. Mask fabrication with
   twice overlay exposure was applied to produce three different heights
   of the flow layer. First, SU-8 3010 photo-resist (MicroChem) was coated
   on a silica wafer to produce the connection channel with 8 μm height.
   Then, GM 1070 photo-resist (Gersteltec) was coated on the same wafer to
   produce the cell capture channel with 30 μm height. Next, the
   micro-sphere capture channel was produced with SU8 3050 photo-resist
   (MicroChem). The control layer was fabricated with one step exposure
   using GM 1070 photo-resist. Finally, the mask was coated with 0.7% 1 H,
   1 H, 2 H, 2H-perfluorooctyldimethyl-chlorosilane/GH-135 (v/v) solution
   and dried to make the surface hydrophobic. The silica wafer with flow
   layer pattern was placed in a 60 mm plastic petri dish and PDMS
   precursor solution (10:1 of polydimethylsiloxane and curing agent) was
   poured on the silica wafer and cured at 75 °C for 10 min to the 80%
   repolymerization. The silicon wafer with control layer pattern was spun
   with PDMS precursor solution (23:1 of polydimethylsiloxane and curing
   agent) and then put on a horizontal heater at 48 °C for 7.5 min to the
   80% repolymerization. The flow layer and the control layer were
   perfectly aligned under the microscope, and bound at 48 °C for 25 min
   for the complete bond between two layers. Then the PDMS with flow layer
   and control layer pattern was peeled off, and punched for the inlets
   and outlets with a 1.0 mm puncher. The final chip was fabricated by
   bonding the integrated PDMS with a 2.5 mm × 7.5 mm glass immediately
   after oxygen plasma treatment.

Cell culture and preparation

   Human K562 cells (ATCC CCL-243, purchased from National Infrastructure
   of Cell Line Resource) and mouse 3T3 cells (ATCC CRL-1658, purchased
   from National Infrastructure of Cell Line Resource) were cultured in
   Dulbecco’s Modified Eagle Medium (DMEM, ThermoFisher) supplemented with
   10% Fetal Bovine Serum (FBS, ThermoFisher) and 1%
   penicillin-streptomycin (ThermoFisher) at 37 °C and 5% CO[2]. 3T3 cells
   were harvested by 0.25% trypsin-EDTA (Life Technologies) and
   re-suspended with 1 mL 1× DPBS in a 1.5 mL centrifuge tube. Unlike 3T3
   adherent cells, suspended K562 cells were pipetted up and down gently
   several times and then directly pipetted out, followed by
   centrifugation and suspension with 1 mL 1× DPBS in a 1.5 mL centrifuge
   tube. For mixed-species experiments, human K562 cells and mouse 3T3
   cells were mixed in a 1:1 ratio with a final concentration of 0.2%
   Poloxamer 188 (F68, ThermoFisher) and 4% FBS in 1× DPBS buffer.

   The mouse embryonic stem (mES) cells were J1 mouse embryonic stem cell
   (J1 mES cell): derived from the mouse 129 s4/SvJae strains called J1.
   They were kindly provided by Stem Cell Bank, Chinese Academy of
   Sciences. The culture flasks were pre-treated with gelatin at 37 °C and
   5% CO[2]. For the undifferentiated stage, mES cells were cultured in
   DMEM supplemented with 15% FBS, 2 mM l-glutamine, 0.1 mM non-essential
   amino acids (NEAA), 0.1 mM 2-mercaptoethnol and 1000 U mL^−1 LIF. LIF
   was removed for unguided mES differentiation. We collected mES cells on
   the 0th, 2nd, 4th, 7th, 10th days after LIF withdrawal for subsequent
   experiments. Before injecting into the chip, the mES cells were washed
   and re-suspended in 1× DPBS with a final concentration of 0.2% F68 and
   4% FBS.

Preparation of barcoded

   Commercial barcoded beads were purchased from ChemGenes Company
   (Wilmington, Massachusetts, USA; cat. Macosko-2011-10(V+)) described in
   Drop-seq^[186]13. The oligo synthesis scale was 10 μmole. Subsequently
   commercial barcoded beads were washed twice with 30 mL of TE/TW (10 mM
   Tris pH 8.0, 1 mM EDTA, 0.01% Tween), re-suspended in 10 mL TE/TW and
   passed through a 40 µm strainer (PluriSelect) into a 50 mL Falcon tube.
   Then they were placed at 4 °C for long-term storage. Before
   experiments, 1000 barcoded beads were used and re-suspended in 10 µL 2%
   sodium alga acid solution with 0.2% Triton X-100 for subsequent capture
   in the chip.

Paired-seq operation

   All aqueous suspensions were loaded into 1 mL plastic syringes. The
   blocking valve was turned on to disconnect the cell and bead chambers,
   resulting in no fluid exchange between them. Barcoded beads suspended
   in 2% sodium alga acid and 0.2% Triton X-100 were injected into the
   chip through the bead inlet at a flow rate of 0.2 mL h^−1, while 1 x
   DPBS was injected through the cell inlet buffer inlet at a flow rate of
   0.06 mL h^−1. After finishing the bead capture, the bead channel was
   washed with 1× DPBS to replace sodium alga acid and Triton X-100 while
   the driving pump for bead was activated to prevent bead escape. Then,
   the cell suspension and DPBS buffer were respectively injected into the
   chip through the cell inlet at 0.015 mL h^−1, with the speed of
   0.03 mL h^−1 of 1× DPBS in the bead channel. After finishing single
   cells capture, the cell driving pump for cell was pressure-forced to
   prevent the cells from escaping, and then the cell channel was washed
   with 1× DPBS to remove residual cells in the channel. Next, the buffer
   in the bead channel was replaced with lysis buffer (160 mM Tris pH 7.5
   (ThermoFisher), 0.16% Sarkosyl (Sigma), 16 mM EDTA, 0.5 U μL^−1 RNase
   Inhibitor (TransGen Biotech), 0.12% F68). Then the cell and bead inlets
   were unplugged and both bead and cell outlets were blocked. Gas was
   reversely injected into the channels, generating water-in-gas droplets,
   which contained single beads and single cells. Then the blocking valve
   was turned off to enable solution exchange between the paired chambers.
   The whole procedure could be real-time monitored to ensure complete
   lysis of cells. At the same time, the released mRNA molecules were
   captured by the paired beads. By alternately activating the driving
   pump for cells and the driving pump for beads, solutions in two
   chambers could be easily transferred back and forth, thus allowing
   efficient cell lysis and mRNA capture. After turning on the blocking
   valve, the cell channel and bead channel were washed with 1× DPBS
   independently. The driving pump for barcoded beads was activated to
   keep the trapping of mRNA captured beads, and the reverse transcription
   mix (1x RT buffer (Fermentas), 1 mM dNTPs (TransGen Biotech), 1 U μL^−1
   RNase Inhibitor, 2.5 μM Template_Switch_Oligo (Life Technologies), and
   10 U μL^−1 Maxima H-RT (Fermentas)) was injected in both channels. The
   chip was incubated at room temperature for 30 min followed by 42 °C for
   90 min.

   After reverse transcription, the beads were washed with TE-SDS (10 mM
   Tris pH 8.0, 1 mM EDTA, 0.5% Sodium Dodecyl Sulfate (Sigma)), 20 μL
   TE/TW, and 20 μL TE (10 mM Tris pH 8.0), with driving pump activated to
   trap the barcoded beads in the original position. Then 20 μL
   Exonuclease I mix (1x Exonuclease I Buffer and 1 U μL^−1 Exonuclease I
   (NEB)) was injected into the chip to remove the excess primers by
   incubating the chip at 37 °C for 45 min.

   The channels were then washed with 10 μL TE/SDS, 10 μL TE/TW, 10 μL
   ddH[2]O to remove Exonuclease I mix. After reducing the pressure of the
   bead driving pump, a high speed of solution was introduced to push the
   advance of beads, making them gather at the end of channel. With the
   help of water phase flow and gas phase flow in the direction of bead
   outlet, the barcoded beads could be collected from outlet into tubes
   without remnant.

Feasibility testing

   Total RNAs were extracted from human K562 and mouse 3T3 cells by using
   GeneJET RNA Purification Kit (Thermo Fisher) according to the manuals
   and protocols. The products were quantified by NanoDrop ND-2000. The
   RNAs released by 10^6 3T3 cells were mixed with 10^6 K562 cells and
   injected into the chip through the cell inlet, while the same amount of
   cell-free RNAs of K562 cells mixed with 10^6 3T3 cells were injected
   into another chip. Cell capture continued 30 min, and then the cell
   channel was washed by 1× DPBS. The cleaning process was also set at
   30 min. Subsequent manipulation was the same as the normal process. The
   sequencing data were aligned to hg19_mm10 to test the presence of
   cell-free mRNA information, which verified the capacity of the blocking
   valve and cleaning effect.

ERCC experiment

   External RNAs (ERCC RNA Spike-In Mix) were purchased from ThermoFisher.
   The originating ERCCs were diluted to 1.2 × 10^9 μL^−1 with 1x PBS + 1
   U μL^−1 RNase Inhibitor (Lucigen). After processing the ERCCs in a
   Paired-seq chip, the theoretical number of ERCCs contained in each cell
   chamber was about 10^5 molecules. In order to reduce low quality of
   ERCC reads by STAR, sequencing reads were aligned to a dual ERCC-human
   reference, where human sequences were used as “bait”.

cDNA amplification and library preparation

   All the collected beads were aliquoted into one PCR tube for PCR
   amplification. The PCR program was as follows: 95 °C for 3 min; and
   then four cycles of: 98 °C for 20 s, 65 °C for 45 s, 72 °C for 3 min;
   then 10 cycles of 98 °C for 20 s, 67 °C for 20 s, 72 °C for 3 min; then
   a final extension step of 5 min. The PCR products were purified using
   0.6x VAHTS DNA Clean Beads (Vazyme Biotech) according to the manual
   twice, and eluted in 11 μL H[2]O. The concentration of the purified
   products was quantified by qubit3.0.

   The 3′-end enriched sequencing library was prepared using a TruePrep
   DNA Library Prep Kit V2 for Illumina (Vazyme Biotech), according to the
   manufacturer’s instructions, except that the custom primer P5 was used
   in place of the kit’s oligos. The samples were then amplified as
   follows: 72 °C for 3 min, 98 °C for 30 s; and 12 cycles of: 98 °C for
   15 s, 55 °C for 30 s, 72 °C for 30 s; then a final extension step of
   5 min. The 3′-end enriched library products were purified using 0.6x
   VAHTS DNA Clean Beads (Vazyme Biotech), and eluted in 11 μL H[2]O. The
   concentration was quantified by qubit3.0. The fragment size of the
   3′-end enriched sequencing library was analyzed by Qsep-100, and the
   average size was between 450 and 650 bp. The libraries were sequenced
   on the Illumina Nextseq 550 according to the manufacturer’s
   instructions, except that Custom read 1 was used for priming of read 1.
   Read 1 was 21 bp; read 2 was 60 bp for all the experiments.

Single-cell responses to Nocodazole

   Nocodazole was purchased from Selleck (#S2775) and dissolved in DMSO at
   the concentration of 2 nM. Before the experiment, K562 cells were
   cultured in DMEM supplemented with 10% Fetal Bovine Serum, 1%
   Penicillin-streptomycin and 1 nM Nocodazole for 20 h. For flow
   cytometry analysis, the medium containing Nocodazole was removed, and
   the cells were re-suspended in 400 μL cold 1× PBS and 1100 μL cold
   fixing solution (100% ethyl alcohol) and stored overnight at 4 °C. The
   next day cells were centrifuged to remove the fixing solution, washed
   three times with 1× PBS and re-suspended in 500 μL 1× PBS. RNase A
   (ThermoFisher, 20 mg L^−1) was introduced to remove the interference of
   RNA at 37 °C for 1 h. Then the nuclear DNAs of K562 cells were stained
   by PI (ThermoFisher, 50 mg L^−1) in a dark place at 4 °C for 1 h. One
   million K562 cells in total were detected by flow cytometry, with the
   obvious fluorescence peak corresponding to 2n/4n DNAs, which
   represented the cell cycle position. For Paired-seq, after removing the
   Nocodazole, K562 cells were washed three times and re-suspended in DMEM
   supplemented with 0.2% F68 and 4% FBS.

Data sources

   Raw read data from published studies were downloaded from either ENA or
   SRA, as listed in Supplementary Table [187]5. We followed the same
   protocol of the data analysis and confirmed with the authors about the
   details of the data analysis process^[188]33.

Data analysis workflow for ERCC sample

   ERCC sequencing data prepared on the Paired-seq platform was processed
   with umis workflow to obtain a digital expression matrix for
   performance estimation and comparison. The raw sequencing data fastq
   files were first transformed to a single fastq file with “UMIs fastq
   transform” using Drop-seq mode. Then pseudo-alignment was performed
   with Rapmap to obtain the sam file. After that, a digital expression
   matrix was produced by “UMIs tagcount”. For bulk RNA-seq and other
   scRNA-seq platforms, we used the processed data provided by Svensson et
   al.^[189]33.

   For each individual cell or sample, specific ERCC spike-in molecules
   were proved to be detected with at least one copy observed. After
   discarding the undetected ERCC spike-in types, we calculated the
   Pearson correlation coefficient (R) between detected ERCC counts (UMI)
   and input ERCC counts from the equation below as the accuracy of each
   sample:
   [MATH: <msub><mrow><mi
   mathvariant="normal">log</mi></mrow><mrow><mn>10</mn></mrow></msub><mfe
   nced close=")" open="(" separators=""><mrow><msub><mrow><mi
   mathvariant="bold">UMI</mi></mrow><mrow><mi
   mathvariant="bold">i</mi></mrow></msub></mrow></mfenced><mo>=</mo><mi>α
   </mi><mo>⋅</mo><msub><mrow><mi
   mathvariant="normal">log</mi></mrow><mrow><mn>10</mn></mrow></msub><mfe
   nced close=")" open="(" separators=""><mrow><msub><mrow><mi
   mathvariant="bold">Input</mi></mrow><mrow><mi
   mathvariant="bold">i</mi></mrow></msub></mrow></mfenced><mo>+</mo><mi>c
   </mi><mo>.</mo> :MATH]
   1

   We can get the UMI efficiency (mRNA capture efficiency) of UMI based
   protocols at the same time, namely β in the following equation:
   [MATH: <msub><mrow><mi mathvariant="bold">UMI</mi></mrow><mrow><mi
   mathvariant="bold">i</mi></mrow></msub><mo>=</mo><mi>β</mi><mo>⋅</mo><m
   sup><mrow><mrow><mo>[</mo><mrow><mi
   mathvariant="bold">Input</mi></mrow><mo>]</mo></mrow></mrow><mrow><mi>α
   </mi></mrow></msup><mo>.</mo> :MATH]
   2

   When we model the relation between read depth and performance metrics
   for individual protocols, we use a linear model with a quadratic term
   for read depth to capture diminishing returns on investment. The model
   considers the read depth effect to be global, and has a categorical
   performance parameter for each protocol:
   [MATH: <mi
   mathvariant="bold">metric</mi><mo>=</mo><msup><mrow><mi>a</mi></mrow><m
   row><mn>2</mn></mrow></msup><mo>⋅</mo><msub><mrow><mi
   mathvariant="normal">log</mi></mrow><mrow><mn>10</mn></mrow></msub><mfe
   nced close=")" open="(" separators=""><mrow><msub><mrow><mi
   mathvariant="bold">reads</mi></mrow><mrow><mi
   mathvariant="bold">i</mi></mrow></msub></mrow></mfenced><mo>+</mo><mi>b
   </mi><mo>⋅</mo><msub><mrow><mi
   mathvariant="normal">log</mi></mrow><mrow><mn>10</mn></mrow></msub><mfe
   nced close=")" open="(" separators=""><mrow><msub><mrow><mi
   mathvariant="bold">reads</mi></mrow><mrow><mi
   mathvariant="bold">i</mi></mrow></msub></mrow></mfenced><mo>+</mo><msub
   ><mrow><mi mathvariant="normal">performance</mi></mrow><mrow><mi
   mathvariant="normal">protocol</mi></mrow></msub><mo>+</mo><mi>ε</mi><mo
   >.</mo> :MATH]
   3

   Here the performance metric will plateau and saturate when
   [MATH: <msub><mrow><mi
   mathvariant="normal">log</mi></mrow><mrow><mn>10</mn></mrow></msub><mfe
   nced close=")" open="(" separators=""><mrow><msub><mrow><mi
   mathvariant="bold">reads</mi></mrow><mrow><mi
   mathvariant="bold">i</mi></mrow></msub></mrow></mfenced><mo>=</mo><mo>−
   </mo><mfrac><mrow><mi>b</mi></mrow><mrow><mn>2</mn><mi>a</mi></mrow></m
   frac><mo>.</mo> :MATH]
   4

   For sensitivity calculation, we transformed the detected spike-in count
   into a binary variable (detected (1) or undetected (0)). Then we built
   a logistic regression model with Python scikit-learn package for each
   sample:
   [MATH: <mi mathvariant="normal">p</mi><mfenced close=")" open="("
   separators=""><mrow><msub><mrow><mi
   mathvariant="bold">detected</mi></mrow><mrow><mi
   mathvariant="bold">i</mi></mrow></msub></mrow></mfenced><mo>=</mo><mfra
   c><mrow><mn>1</mn></mrow><mrow><mn>1</mn><mo>+</mo><msup><mrow><mi>e</m
   i></mrow><mrow><mo>−</mo><mfenced close=")" open="("
   separators=""><mrow><mi>a</mi><mo>×</mo><mi>log</mi><mfenced close=")"
   open="(" separators=""><mrow><msub><mrow><mi
   mathvariant="bold">M</mi></mrow><mrow><mi
   mathvariant="bold">i</mi></mrow></msub></mrow></mfenced><mo>+</mo><mi>b
   </mi></mrow></mfenced></mrow></msup></mrow></mfrac><mo>+</mo><mi>ε</mi>
   <mo>.</mo> :MATH]
   5

   The sensitivity was calculated as the molecule count when the detection
   probability equals to 0.5, namely:
   [MATH: <mi mathvariant="normal">detection</mi><mspace
   width="0.25em"></mspace><mi
   mathvariant="normal">limit</mi><mo>=</mo><mo>−</mo><mfrac><mrow><mi>b</
   mi></mrow><mrow><mi>a</mi></mrow></mfrac><mo>.</mo> :MATH]
   6

Processing of the Paired-seq data

   For all the sequencing results, each dataset was generated from one
   single chip. One single experiment was pooled to generated one data.
   Paired-seq sequencing libraries produce paired-end reads: Read 1
   contained a cell barcode (12 bases) and a UMI (8 bases); Read 2
   contained mRNA information. The reads would be preprocessed with the
   following steps, correcting bead, filtering low-quality reads, trimming
   read 2 including polyA, adapter and primer, alignment, assigning gene
   tags, generating digital gene expressing. The beads with the
   twenty-first base as A, C or G only were used in our experiments. If
   the number of continuous T bases at the end of read 1 was less than or
   equal to 12, we inserted “N” bases before T bases. Otherwise, the pair
   of reads was dropped. Filtering low-quality reads was based on the base
   quality of the cell barcode and UMI. Respectively, cell barcode and UMI
   should have only one base with quality lower than 20 at most.
   Otherwise, the read pair was discarded. At least 5 contiguous bases of
   TSO and at least 6 contiguous bases of A with no mismatch were examined
   for read 2 and were hard clipped off the read. At least 6 contiguous
   bases of primer with one mismatch allowed considered for read 2 and
   hard clipped off read. The read pair was discarded, if the length of
   read 2 was less than 26 after trimmed. STAR alignment tool was used to
   align read 2 with the reference genome. For human and mouse mixed
   cells, we used hg19_mm10 mentioned in Drop-seq as reference genome.
   This program from Drop-seq added a tag “GE”. We kept the unique mapping
   with gene tags. Then unique UMIs for each gene of each cell were
   counted to generate digital gene expression.

Theory supplement

   Using the Darcy–Weisbach equation to determine pressure difference in a
   microchannel and solve the continuity and momentum equations for the
   Hagen–Poiseuille flow problem, we obtained the pressure difference ∆P =
   ƒLρV2/2D, where ƒ is the Darcy friction factor, L is the length of the
   channel, ρ is the fluid density, V is the average velocity of the
   fluid, and D is the hydraulic diameter, respectively. D can be further
   expressed as 4 A/R for a rectangular channel, and V as Q/A, where A and
   R are the cross-sectional area and perimeter of the channel, and Q is
   the volumetric flow rate. The Darcy friction factor, ƒ, is related to
   aspect ratio, α, and Reynolds number, Re = ρVD/μ, where μ is the fluid
   viscosity. The aspect ratio is defined as either height/width or
   width/height such that 0 ≤ α ≤ 1. The product of the Darcy friction
   factor and Reynolds number is a constant that depends on the aspect
   ratio, i.e., ƒ × Re = C(α), where C(α) denotes a constant that is a
   function of the aspect ratio, α. After simplifications, by applying the
   Darcy–Weisbach equation to a rectangular channel, we obtain the
   expression:
   [MATH: <mi
   mathvariant="normal">Δ</mi><mi>P</mi><mo>=</mo><mfrac><mrow><mi>C</mi><
   mrow><mo>(</mo><mrow><mi
   mathvariant="normal">α</mi></mrow><mo>)</mo></mrow></mrow><mrow><mn>32<
   /mn></mrow></mfrac><mfrac><mrow><mi>μ</mi><msup><mrow><mi>L</mi><mi>Q</
   mi><mi>R</mi></mrow><mrow><mn>2</mn></mrow></msup></mrow><mrow><msup><m
   row><mi>m</mi><mi>A</mi></mrow><mrow><mn>3</mn></mrow></msup></mrow></m
   frac><mo>.</mo> :MATH]
   7

   In the simplified circuit diagram of the trap., fluid can flow from
   junction a to b (c to d) via path 1 or 3 (path 2 or 4) (Supplementary
   Fig. [190]21). Ignoring minor losses due to bends, widening/narrowing,
   etc., Eq. [191](7) is applied separately for paths 1 and 3 (path 2 or
   4), and because the pressure drop is the same for both paths, we equate
   both expressions to yield
   [MATH:
   <mfrac><mrow><msub><mrow><mi>Q</mi></mrow><mrow><mn>1</mn></mrow></msub
   ></mrow><mrow><msub><mrow><mi>Q</mi></mrow><mrow><mn>3</mn></mrow></msu
   b></mrow></mfrac><mo>=</mo><mfenced close=")" open="("
   separators=""><mrow><mfrac><mrow><msub><mrow><mi>C</mi></mrow><mrow><mn
   >3</mn></mrow></msub><mfenced close=")" open="("
   separators=""><mrow><msub><mrow><mi>α</mi></mrow><mrow><mn>3</mn></mrow
   ></msub></mrow></mfenced></mrow><mrow><msub><mrow><mi>C</mi></mrow><mro
   w><mn>1</mn></mrow></msub><mfenced close=")" open="("
   separators=""><mrow><msub><mrow><mi>α</mi></mrow><mrow><mn>1</mn></mrow
   ></msub></mrow></mfenced></mrow></mfrac></mrow></mfenced><mo>⋅</mo><mfe
   nced close=")" open="("
   separators=""><mrow><mfrac><mrow><msub><mrow><mi>L</mi></mrow><mrow><mn
   >3</mn></mrow></msub></mrow><mrow><msub><mrow><mi>L</mi></mrow><mrow><m
   n>1</mn></mrow></msub></mrow></mfrac></mrow></mfenced><mo>⋅</mo><msup><
   mrow><mfenced close=")" open="("
   separators=""><mrow><mfrac><mrow><msub><mrow><mi>R</mi></mrow><mrow><mn
   >3</mn></mrow></msub></mrow><mrow><msub><mrow><mi>R</mi></mrow><mrow><m
   n>1</mn></mrow></msub></mrow></mfrac></mrow></mfenced></mrow><mrow><mn>
   2</mn></mrow></msup><mo>⋅</mo><msup><mrow><mfenced close=")" open="("
   separators=""><mrow><mfrac><mrow><msub><mrow><mi>A</mi></mrow><mrow><mn
   >1</mn></mrow></msub></mrow><mrow><msub><mrow><mi>A</mi></mrow><mrow><m
   n>3</mn></mrow></msub></mrow></mfrac></mrow></mfenced></mrow><mrow><mn>
   3</mn></mrow></msup><mspace width="0.25em"></mspace><mfenced close=")"
   open="(" separators=""><mrow><mi mathvariant="normal">a</mi><mspace
   width="0.25em"></mspace><mi mathvariant="normal">to</mi><mspace
   width="0.25em"></mspace><mi
   mathvariant="normal">b</mi></mrow></mfenced> :MATH]
   8

   and
   [MATH:
   <mfrac><mrow><msub><mrow><mi>Q</mi></mrow><mrow><mn>2</mn></mrow></msub
   ></mrow><mrow><msub><mrow><mi>Q</mi></mrow><mrow><mn>4</mn></mrow></msu
   b></mrow></mfrac><mo>=</mo><mfenced close=")" open="("
   separators=""><mrow><mfrac><mrow><msub><mrow><mi>C</mi></mrow><mrow><mn
   >4</mn></mrow></msub><mfenced close=")" open="("
   separators=""><mrow><msub><mrow><mi>α</mi></mrow><mrow><mn>4</mn></mrow
   ></msub></mrow></mfenced></mrow><mrow><msub><mrow><mi>C</mi></mrow><mro
   w><mn>2</mn></mrow></msub><mfenced close=")" open="("
   separators=""><mrow><msub><mrow><mi>α</mi></mrow><mrow><mn>2</mn></mrow
   ></msub></mrow></mfenced></mrow></mfrac></mrow></mfenced><mo>⋅</mo><mfe
   nced close=")" open="("
   separators=""><mrow><mfrac><mrow><msub><mrow><mi>L</mi></mrow><mrow><mn
   >4</mn></mrow></msub></mrow><mrow><msub><mrow><mi>L</mi></mrow><mrow><m
   n>2</mn></mrow></msub></mrow></mfrac></mrow></mfenced><mo>⋅</mo><msup><
   mrow><mfenced close=")" open="("
   separators=""><mrow><mfrac><mrow><msub><mrow><mi>R</mi></mrow><mrow><mn
   >4</mn></mrow></msub></mrow><mrow><msub><mrow><mi>R</mi></mrow><mrow><m
   n>2</mn></mrow></msub></mrow></mfrac></mrow></mfenced></mrow><mrow><mn>
   2</mn></mrow></msup><mo>⋅</mo><msup><mrow><mfenced close=")" open="("
   separators=""><mrow><mfrac><mrow><msub><mrow><mi>A</mi></mrow><mrow><mn
   >2</mn></mrow></msub></mrow><mrow><msub><mrow><mi>A</mi></mrow><mrow><m
   n>4</mn></mrow></msub></mrow></mfrac></mrow></mfenced></mrow><mrow><mn>
   3</mn></mrow></msup><mspace width="0.25em"></mspace><mfenced close=")"
   open="(" separators=""><mrow><mi mathvariant="normal">c</mi><mspace
   width="0.25em"></mspace><mi mathvariant="normal">to</mi><mspace
   width="0.25em"></mspace><mi
   mathvariant="normal">d</mi></mrow></mfenced><mo>,</mo> :MATH]
   9

   where subscripts 1 and 3 denote paths 1 and 3, respectively. For path
   1, the length, L1, is assumed to be that of the narrow channel to
   simplify analysis. This is valid because most of the pressure drop
   occurs along the narrow channel. For the trap to work, the volumetric
   flow rate along path 1 must be greater than that of path 3, i.e.,
   Q1 > Q3. Using the relationships A = W × H and P = 2•(W + H), where H
   is the height of the channel, we arrive at
   [MATH:
   <mfrac><mrow><msub><mrow><mi>Q</mi></mrow><mrow><mn>1</mn></mrow></msub
   ></mrow><mrow><msub><mrow><mi>Q</mi></mrow><mrow><mn>3</mn></mrow></msu
   b></mrow></mfrac><mo>=</mo><mfenced close=")" open="("
   separators=""><mrow><mfrac><mrow><mi>C</mi><mfenced close=")" open="("
   separators=""><mrow><msub><mrow><mi>α</mi></mrow><mrow><mn>3</mn></mrow
   ></msub></mrow></mfenced></mrow><mrow><mi>C</mi><mfenced close=")"
   open="("
   separators=""><mrow><msub><mrow><mi>α</mi></mrow><mrow><mn>1</mn></mrow
   ></msub></mrow></mfenced></mrow></mfrac></mrow></mfenced><mo>⋅</mo><mfe
   nced close=")" open="("
   separators=""><mrow><mfrac><mrow><msub><mrow><mi>L</mi></mrow><mrow><mn
   >3</mn></mrow></msub></mrow><mrow><msub><mrow><mi>L</mi></mrow><mrow><m
   n>1</mn></mrow></msub></mrow></mfrac></mrow></mfenced><mo>⋅</mo><msup><
   mrow><mfenced close=")" open="("
   separators=""><mrow><mfrac><mrow><msub><mrow><mi>W</mi></mrow><mrow><mn
   >3</mn></mrow></msub><mo>+</mo><msub><mrow><mi>H</mi></mrow><mrow><mi>C
   </mi></mrow></msub></mrow><mrow><msub><mrow><mi>W</mi></mrow><mrow><mn>
   1</mn></mrow></msub><mo>+</mo><msub><mrow><mi>H</mi></mrow><mrow><mi>C<
   /mi></mrow></msub></mrow></mfrac></mrow></mfenced></mrow><mrow><mn>2</m
   n></mrow></msup><mo>⋅</mo><msup><mrow><mfenced close=")" open="("
   separators=""><mrow><mfrac><mrow><msub><mrow><mi>W</mi></mrow><mrow><mn
   >1</mn></mrow></msub></mrow><mrow><msub><mrow><mi>W</mi></mrow><mrow><m
   n>3</mn></mrow></msub></mrow></mfrac></mrow></mfenced></mrow><mrow><mn>
   3</mn></mrow></msup><mspace width="0.25em"></mspace><mo>></mo><mspace
   width="0.25em"></mspace><mn>1</mn> :MATH]
   10

   and
   [MATH:
   <mfrac><mrow><msub><mrow><mi>Q</mi></mrow><mrow><mn>2</mn></mrow></msub
   ></mrow><mrow><msub><mrow><mi>Q</mi></mrow><mrow><mn>4</mn></mrow></msu
   b></mrow></mfrac><mo>=</mo><mfenced close=")" open="("
   separators=""><mrow><mfrac><mrow><mi>C</mi><mfenced close=")" open="("
   separators=""><mrow><msub><mrow><mi>α</mi></mrow><mrow><mn>4</mn></mrow
   ></msub></mrow></mfenced></mrow><mrow><mi>C</mi><mfenced close=")"
   open="("
   separators=""><mrow><msub><mrow><mi>α</mi></mrow><mrow><mn>2</mn></mrow
   ></msub></mrow></mfenced></mrow></mfrac></mrow></mfenced><mo>⋅</mo><mfe
   nced close=")" open="("
   separators=""><mrow><mfrac><mrow><msub><mrow><mi>L</mi></mrow><mrow><mn
   >4</mn></mrow></msub></mrow><mrow><msub><mrow><mi>L</mi></mrow><mrow><m
   n>2</mn></mrow></msub></mrow></mfrac></mrow></mfenced><mo>⋅</mo><msup><
   mrow><mfenced close=")" open="("
   separators=""><mrow><mfrac><mrow><msub><mrow><mi>W</mi></mrow><mrow><mn
   >4</mn></mrow></msub><mo>+</mo><msub><mrow><mi>H</mi></mrow><mrow><mi>B
   </mi></mrow></msub></mrow><mrow><msub><mrow><mi>W</mi></mrow><mrow><mn>
   2</mn></mrow></msub><mo>+</mo><msub><mrow><mi>H</mi></mrow><mrow><mi>B<
   /mi></mrow></msub></mrow></mfrac></mrow></mfenced></mrow><mrow><mn>2</m
   n></mrow></msup><mo>⋅</mo><msup><mrow><mfenced close=")" open="("
   separators=""><mrow><mfrac><mrow><msub><mrow><mi>W</mi></mrow><mrow><mn
   >2</mn></mrow></msub></mrow><mrow><msub><mrow><mi>W</mi></mrow><mrow><m
   n>4</mn></mrow></msub></mrow></mfrac></mrow></mfenced></mrow><mrow><mn>
   3</mn></mrow></msup><mspace width="0.25em"></mspace><mo>></mo><mspace
   width="0.25em"></mspace><mn>1</mn><mo>,</mo> :MATH]
   11
   [MATH:
   <msub><mrow><mi>C</mi></mrow><mrow><mrow><mo>(</mo><mrow><mi>α</mi></mr
   ow><mo>)</mo></mrow></mrow></msub><mo>=</mo><mspace
   width="0.33em"></mspace><mi>f</mi><mo>×</mo><mi>R</mi><mi>e</mi><mo>=</
   mo><mn>96</mn><mo>×</mo><mfenced close="" open="("
   separators=""><mrow><mn>1</mn><mo>−</mo><mn>1.3553</mn><mo>⋅</mo><mi>α<
   /mi><mo>+</mo><mn>1.9467</mn><mo>⋅</mo><msup><mrow><mi>α</mi></mrow><mr
   ow><mn>2</mn></mrow></msup></mrow></mfenced><mo>−</mo><mfenced
   close=")" open=""
   separators=""><mrow><mn>1.7012</mn><mo>⋅</mo><msup><mrow><mi>α</mi></mr
   ow><mrow><mn>3</mn></mrow></msup><mo>+</mo><mn>0.9564</mn><mo>⋅</mo><ms
   up><mrow><mi>α</mi></mrow><mrow><mn>4</mn></mrow></msup><mo>−</mo><mn>0
   .2537</mn><mo>⋅</mo><msup><mrow><mi>α</mi></mrow><mrow><mn>5</mn></mrow
   ></msup></mrow></mfenced><mo>.</mo> :MATH]
   12

   Note that this final expression does not contain any fluid velocity
   term, implying that a properly designed trap will work for all
   velocities in the laminar flow regime.

Reporting summary

   Further information on research design is available in the [192]Nature
   Research Reporting Summary linked to this article.

Supplementary information

   [193]41467_2020_15765_MOESM1_ESM.pdf^ (5.7KB, pdf)

   Description of Additional Supplementary Files
   [194]Supplementary Information^ (4.5MB, pdf)
   [195]Reporting Summary^ (202KB, pdf)
   [196]Peer Review File^ (515.5KB, pdf)
   [197]Download video file^ (800.9KB, mp4)

   Supplementary Movie 1. Paired-seq operation on chip
   [198]Download video file^ (688.4KB, mp4)

   Supplementary Movie 2. Function of control layer
   [199]Download video file^ (272.6KB, mp4)

   Supplementary Movie 3. Paired unit isolation
   [200]Download video file^ (276.2KB, mp4)

   Supplementary Movie 4. Beads capure-1 row
   [201]Supplementary Movie 5. Cell capture^ (199KB, mp4)
   [202]Supplementary Movie 6. Blocking effect^ (672.2KB, avi)
   [203]Supplementary Movie 7. Mixing^ (325.4KB, mp4)
   [204]Download video file^ (4.1MB, avi)

   Supplementary Movie 8. single cell lysis
   [205]Supplementary Data 1^ (204.4KB, xlsx)
   [206]Supplementary Data 2^ (83.2KB, xlsx)

Acknowledgements