
Monday, October 1, 2018


Single cell sequencing examines the sequence information from individual cells with optimized next generation sequencing (NGS) technologies, providing a higher resolution of cellular differences and a better understanding of the function of an individual cell in the context of its microenvironment.





Background

A typical human cell contains about 2 x 3.3 billion base pairs of DNA and 600 million bases of mRNA. Traditional methods such as Sanger sequencing or Illumina sequencing usually sequence DNA or RNA from a mix of millions of cells. Deep sequencing of DNA and RNA from a single cell allows cellular functions to be investigated extensively. Like typical NGS experiments, single cell sequencing protocols generally contain the following steps: isolation of a single cell, nucleic acid extraction and amplification, sequencing library preparation, sequencing, and bioinformatic data analysis. Single cell sequencing is more challenging than sequencing cells in bulk. Because a single cell provides so little starting material, degradation, sample loss and contamination have pronounced effects on the quality of the sequencing data. In addition, because only picogram amounts of nucleic acid are available, heavy amplification is often needed during sample preparation, which results in uneven coverage, noise and inaccurate quantification in the sequencing data.
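The picogram scale mentioned above can be checked with a short calculation. The sketch below converts the diploid genome size given in the text into a DNA mass. The average mass of about 650 g/mol per base pair is an assumption (a commonly used approximation, not a figure from this article), and the function name is illustrative.

```python
AVOGADRO = 6.022e23           # molecules per mole
AVG_BP_MASS_G_PER_MOL = 650   # assumed average mass of one base pair, in g/mol


def dna_mass_picograms(base_pairs: float) -> float:
    """Approximate mass of double-stranded DNA of the given length, in picograms."""
    grams = base_pairs * AVG_BP_MASS_G_PER_MOL / AVOGADRO
    return grams * 1e12  # 1 g = 1e12 pg


# Diploid human genome size as stated in the text: 2 x 3.3 billion base pairs.
diploid_bp = 2 * 3.3e9
mass_pg = dna_mass_picograms(diploid_bp)
print(f"Approximate DNA mass per cell: {mass_pg:.1f} pg")
```

Under these assumptions a single human cell carries only a few picograms of genomic DNA, which is why amplification is needed before library preparation.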

Recent technical improvements make single cell sequencing a promising tool for approaching a set of seemingly inaccessible problems. For example, heterogeneous samples, rare cell types, cell lineage relationships, mosaicism of somatic tissues, analyses of microbes that cannot be cultured, and disease evolution can all be elucidated through single cell sequencing. Single cell sequencing was selected as the method of the year 2013 by Nature Publishing Group.





Single cell genome (DNA) sequencing

Single cell genome sequencing involves isolating a single cell, performing whole-genome amplification (WGA), constructing sequencing libraries and then sequencing the DNA using a next-generation sequencer (e.g. Ion Torrent, Illumina). A genome constructed in this fashion is commonly referred to as a single amplified genome (SAG). The approach can be used in microbiome studies to obtain genomic data from uncultured microorganisms. It can also be combined with high-throughput cell sorting of microorganisms and cancer cells. One popular method for single cell genome sequencing is multiple displacement amplification, which enables research in areas such as microbial genetics, ecology and infectious diseases. Furthermore, genomic data obtained from uncultured microorganisms might help establish methods for culturing them in the future. Genome assembly tools that can be used in single cell genome sequencing include SPAdes, IDBA-UD, Cortex and HyDA.

Method

Multiple displacement amplification (MDA) is a widely used technique that can amplify femtograms of DNA from a bacterium to micrograms for use in sequencing. The reagents required for an MDA reaction include random primers and the DNA polymerase from bacteriophage phi29. The DNA is amplified in an isothermal reaction at 30 °C. As the polymerase synthesizes new strands, a strand displacement reaction takes place: previously extended strands are displaced, and multiple copies are synthesized from each template DNA. MDA products are about 12 kb long on average and range up to around 100 kb, which makes them suitable for DNA sequencing. In 2017, a major improvement to this technique, called WGA-X, was introduced. It takes advantage of a thermostable mutant of the phi29 polymerase and leads to better genome recovery from individual cells, in particular those with high G+C content. Other methods include MALBAC.

Limitations

MDA of individual cell genomes results in highly uneven genome coverage, i.e. relative overrepresentation and underrepresentation of various regions of the template, leading to loss of some sequences. This unevenness has two components: a) stochastic over- and under-amplification of random regions; and b) systematic bias against high %GC regions. The stochastic component may be addressed by pooling single-cell MDA reactions from the same cell type, by employing fluorescent in situ hybridization (FISH), and/or by post-sequencing confirmation. The bias of MDA against high %GC regions can be addressed by using thermostable polymerases, as in the process called WGA-X.

Single-nucleotide polymorphisms (SNPs), which account for a large part of the genetic variation in the human genome, and copy number variations (CNVs) are difficult to detect in single cell sequencing because of the limited amount of DNA extracted from a single cell. Even after amplification, accurate analysis remains difficult because coverage is low and error-prone. With MDA, average genome coverage is less than 80%, and SNPs that are not covered by sequencing reads are missed. In addition, MDA shows a high rate of allele dropout, in which one allele of a heterozygous site is not detected. Various SNP-calling algorithms are currently in use, but none is specific to single cell sequencing. MDA also produces false CNVs that can conceal real ones. When the false CNVs follow recognizable patterns, algorithms can detect and remove this noise to recover the true variants.
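The two quality measures discussed above, genome coverage and allele dropout, can be illustrated with a minimal sketch. The function names, the depth threshold and the toy data are all hypothetical and chosen only for illustration; real pipelines compute these values from alignment files and known heterozygous sites.

```python
def coverage_breadth(depths, min_depth=1):
    """Fraction of genome positions covered by at least min_depth reads."""
    if not depths:
        return 0.0
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths)


def allele_dropout_rate(het_sites):
    """Fraction of covered heterozygous sites where only one allele was seen.

    het_sites is a list of (ref_reads, alt_reads) pairs at sites known to be
    heterozygous. Sites with no reads at all are treated as uncovered rather
    than as dropouts.
    """
    covered = [(ref, alt) for ref, alt in het_sites if ref + alt > 0]
    if not covered:
        return 0.0
    dropped = sum(1 for ref, alt in covered if ref == 0 or alt == 0)
    return dropped / len(covered)


# Toy per-position read depths and allele counts (made-up values).
depths = [0, 12, 30, 0, 5, 8, 0, 40, 3, 9]
het_sites = [(10, 8), (0, 15), (7, 0), (0, 0), (4, 6)]

breadth = coverage_breadth(depths)
ado = allele_dropout_rate(het_sites)
print(f"coverage breadth: {breadth:.0%}, allele dropout: {ado:.0%}")
```

Treating uncovered sites separately from dropouts is a design choice made here so that low coverage and allele dropout are not counted twice.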

Applications

Microbiomes are among the main targets of single cell genomics because most microorganisms in most environments are difficult to culture. Single cell genomics is a powerful way to obtain microbial genome sequences without cultivation. This approach has been widely applied to marine, soil, subsurface, organismal, and other types of microbiomes in order to address a wide array of questions related to microbial ecology, evolution, public health and biotechnology potential.

Cancer sequencing is also an emerging application of scDNAseq. Fresh or frozen tumors can be analyzed and categorized quite well with respect to SCNAs, SNVs, and rearrangements using whole-genome DNA sequencing (DNAS) approaches. Cancer scDNAseq is particularly useful for examining the depth of complexity and the compound mutations present in amplified therapeutic targets such as receptor tyrosine kinase genes (EGFR, PDGFRA etc.), where conventional population-level analysis of the bulk tumor cannot resolve the co-occurrence patterns of these mutations within single cells of the tumor. Such overlap may provide redundancy of pathway activation and tumor cell resistance.




Single cell DNA methylome sequencing

Single cell DNA methylome sequencing quantifies DNA methylation. It is similar to single cell genome sequencing, but with the addition of a bisulfite treatment before sequencing. Forms include whole genome bisulfite sequencing and reduced representation bisulfite sequencing.
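The principle behind the bisulfite treatment can be sketched in a few lines: unmethylated cytosines are converted and read as T after sequencing, while methylated cytosines remain C, so the methylation level at a site can be estimated from the ratio of C to C+T reads. The sketch below is an illustrative simplification that models only the forward strand; the function names and the toy sequence are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate the forward-strand read-out after bisulfite treatment.

    Unmethylated C is read as T; methylated C (0-based positions listed in
    methylated_positions) stays C. Other bases are unchanged.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def methylation_level(c_reads, t_reads):
    """Estimated methylation at one cytosine: C reads / (C + T reads)."""
    total = c_reads + t_reads
    return c_reads / total if total else None


# Toy sequence with cytosines at positions 1, 4 and 5; only position 1 is methylated.
converted = bisulfite_convert("ACGTCCGA", methylated_positions={1})
level = methylation_level(c_reads=3, t_reads=1)
print(converted, level)
```

In a single cell only one or two copies of each locus exist, so per-site estimates are often close to binary rather than the graded values seen in bulk data.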




Single-cell RNA sequencing (scRNA-seq)

Current methods for quantifying molecular states of cells, from microarray to standard RNA-seq analysis, mostly estimate a mean value across millions of cells, averaging the signal of individual cells. Given the heterogeneity of cell populations, measuring mean signal values overlooks the internal interactions and differences within a population that may be crucial for maintaining normal tissue functions and facilitating disease progression. Thus cell-averaging experiments provide only partial information about the molecular state of the system.
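A toy calculation shows how averaging can hide heterogeneity. In the sketch below, two made-up populations of ten cells have the same mean expression for a gene, yet one is uniform and the other contains a single rare, highly expressing cell. The numbers are invented for illustration only.

```python
from statistics import mean, pstdev

# Expression of one gene (arbitrary units) in ten cells, made-up values.
uniform_population = [50] * 10          # every cell expresses the same amount
mixed_population = [5] * 9 + [455]      # nine low cells and one rare high cell

bulk_uniform = mean(uniform_population)
bulk_mixed = mean(mixed_population)

# A bulk measurement reports only the mean, which is identical here.
print(bulk_uniform, bulk_mixed)

# Per-cell data reveal the difference through the spread of values.
print(pstdev(uniform_population), pstdev(mixed_population))
```

Both populations would look the same in a cell-averaged experiment, while the per-cell values make the rare cell obvious.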

Single-cell RNA sequencing (scRNA-seq) provides the expression profile of individual cells. Through gene clustering analyses, rare cell types within a cell population can be identified, which makes it possible to characterize the subpopulation structure of a heterogeneous cell population. While tumor heterogeneity can be attributed to accumulated mutations, even genetically identical cells in the same environment display high variability in gene and protein expression levels. However, RNA with low copy number, which may exert important functions in the cells, is usually undetectable or regarded as noise in traditional cell-averaging methods. Single-cell RNA sequencing of a large number of single cells can identify such uncommon RNA and also reveal the copy-number distribution of the whole mRNA population in individual cells.

Experimental procedures

Despite the advances in sequencing technologies, it is currently not possible to sequence RNA directly from a single cell. Thus, in current scRNA-seq protocols, RNA still needs to be converted to cDNA for sequencing. In principle, current scRNA-seq methods contain the following steps: isolation of a single cell and its RNA, reverse transcription (RT), amplification, library generation and sequencing.

The ideal scRNA-seq method preserves and accurately quantifies the initial relative abundance of mRNA in a cell, covers the entire transcript length with equal representation at each position, and retains strand information. However, noise and bias may be introduced at various steps of the scRNA-seq protocol. For example, the reverse transcription step is critical because the efficiency of the RT reaction determines the percentage of a cell's RNA population that is eventually analyzed by the sequencer. The processivity of the reverse transcriptase and the priming strategy used affect full-length cDNA production and can generate libraries biased toward the 3' or 5' end of genes.

In the amplification step, either PCR or in vitro transcription (IVT) is currently used to amplify cDNA. One advantage of PCR-based methods is the ability to generate full-length cDNA. However, differences in PCR efficiency between particular sequences (for instance, due to GC content or snapback structure) are amplified exponentially, producing libraries with uneven coverage. Libraries generated by IVT avoid PCR-induced sequence bias, but specific sequences may be transcribed inefficiently, causing sequence drop-out or incomplete sequences. Several scRNA-seq protocols have been published: Tang et al., STRT, SMART-seq, CEL-seq, and Quartz-seq. These protocols differ in their strategies for reverse transcription, cDNA synthesis and amplification, and in whether they accommodate sequence-specific barcodes (i.e. unique molecular identifiers, UMIs) or can process pooled samples.
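One way barcodes such as UMIs can reduce amplification bias is by counting distinct molecules instead of reads: reads that share the same cell, gene and UMI are assumed to be copies of one original molecule. The sketch below is a simplified illustration, not the method of any protocol named above. It collapses exact-match UMIs only and ignores sequencing errors in the UMI; the function name, cell labels and read tuples are hypothetical.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs per (cell, gene) pair.

    reads is an iterable of (cell_barcode, gene, umi) tuples. Reads sharing
    all three values are treated as amplification duplicates of one molecule.
    """
    umis_seen = defaultdict(set)
    for cell, gene, umi in reads:
        umis_seen[(cell, gene)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}


# Toy reads: the first three are duplicates of the same original molecule.
reads = [
    ("cellA", "GAPDH", "AACG"),
    ("cellA", "GAPDH", "AACG"),
    ("cellA", "GAPDH", "AACG"),
    ("cellA", "GAPDH", "TTGC"),
    ("cellA", "EGFR", "CCAT"),
    ("cellB", "GAPDH", "AACG"),
]

molecule_counts = count_molecules(reads)
print(molecule_counts)
```

Here four reads for one gene in one cell collapse to two molecules, so a sequence that amplifies efficiently no longer inflates the expression estimate.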

Applications

The number of circulating tumor cells (CTCs) in the peripheral blood of cancer patients has been shown to correlate with prognosis. However, it is challenging to enumerate and characterize isolated CTCs because they are often contaminated with a large number of leukocytes and erythrocytes. Single cell RNA-seq could be applied to differentiate cancer cells from normal blood cells and obtain the expression profiles of tumor cells at the same time. Similarly, single cell RNA-seq can be used to analyze rare cell types in early human embryos and adult stem cells, both of which exist transiently and are difficult to characterize with current technologies. Finally, single cell analysis can be applied to the study of infectious diseases.




Considerations

Isolation of single cells

There are several ways to isolate individual cells prior to whole genome amplification and sequencing. Fluorescence-activated cell sorting (FACS) is the most widely used approach and is employed by several high-throughput core facilities, such as the Bigelow Laboratory Single Cell Genomics Center. Individual cells can also be collected by micromanipulation, for example by serial dilution or by using a patch pipette or nanotube to harvest a single cell. Micromanipulation is simple and inexpensive, but it is laborious and susceptible to misidentification of cell types under the microscope. Laser-capture microdissection (LCM) can also be used to collect single cells. Although LCM preserves knowledge of the spatial location of a sampled cell within a tissue, it is hard to capture a whole single cell without also collecting material from neighboring cells. High-throughput methods for single cell isolation also include microfluidics. Both FACS and microfluidics are accurate, automatic and capable of isolating unbiased samples. However, both methods require detaching cells from their microenvironments first, which perturbs the transcriptional profiles measured in RNA expression analysis.

Number of cells to be analyzed

scRNA-Seq

Generally speaking, in a typical bulk RNA-sequencing (RNA-seq) experiment, ten million reads are generated, and a gene is considered expressed if it exceeds a threshold of 50 reads per kb per million reads (RPKM). For a gene that is 1 kb long, this corresponds to 500 reads and, assuming a Poisson distribution, a minimum coefficient of variation (CV) of about 4%. For a typical mammalian cell containing 200,000 mRNA molecules, sequencing data from at least 50 single cells need to be pooled in order to achieve this minimum CV value. However, because of the limited efficiency of reverse transcription and other noise introduced in the experiments, more cells are required for accurate expression analyses and cell type identification.
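The arithmetic in this paragraph can be reproduced with a short sketch. For a Poisson-distributed count with mean n, the CV is 1/sqrt(n), so 500 reads give a CV of roughly 4 to 5%. The step from reads to molecules assumes that the fraction of reads from a gene equals its fraction of mRNA molecules in the cell, which holds only if transcripts are about 1 kb long on average. That assumption and the function names are illustrative, not taken from the source.

```python
import math


def expected_reads(rpkm, gene_length_kb, total_reads):
    """Reads expected for a gene at a given RPKM, gene length and sequencing depth."""
    return rpkm * gene_length_kb * total_reads / 1e6


def poisson_cv(count):
    """Coefficient of variation of a Poisson count with the given mean."""
    return 1 / math.sqrt(count)


# Bulk experiment from the text: 10 million reads, 50 RPKM threshold, 1 kb gene.
bulk_reads = expected_reads(rpkm=50, gene_length_kb=1.0, total_reads=10_000_000)
bulk_cv = poisson_cv(bulk_reads)

# Single cell with 200,000 mRNA molecules. Assumption: 50 RPKM corresponds to
# 50 molecules per million molecules (average transcript length of about 1 kb).
mrna_per_cell = 200_000
molecules_per_cell = 50 * mrna_per_cell / 1e6

# Cells to pool so that the molecule count matches the bulk read count.
cells_needed = math.ceil(bulk_reads / molecules_per_cell)

print(f"reads: {bulk_reads:.0f}, CV: {bulk_cv:.1%}")
print(f"molecules per cell: {molecules_per_cell:.0f}, cells to pool: {cells_needed}")
```

The result matches the figure of 50 cells given in the text, and it is a lower bound because it assumes every mRNA molecule is captured and counted.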




See also

  • Single-cell analysis
  • Single-cell transcriptomics
  • Single cell epigenomics
  • DNA sequencing



References

Source of article : Wikipedia