Search Thermo Fisher Scientific
Search Thermo Fisher Scientific
Next-generation sequencing (NGS) is a high-throughput sequencing method that enables sequencing of large and complex genomes (e.g., human genome) in a single day. In Illumina NGS systems, high-throughput generation of data is made possible by massively parallel sequencing of nucleic acid samples. Next generation sequencing protocol includes isolation of desired nucleic acids, fragmentation of isolated nucleic acids and preparation of samples for the sequencers (library preparation), sequencing reactions, and bioinformatic processing and analysis of sequencing data. This section covers an overview and key considerations for each main step of the workflow, as illustrated in Figure 1.
Figure 1. Next-generation sequencing workflow using Illumina systems.
Nucleic acid isolation is a crucial first step in the NGS workflow, regardless of whether you are sequencing genomic DNA (gDNA), total RNA, or different RNA types. It is important to select an isolation method or kit that enables proper lysis of the cells and tissue. This will, in turn, help you obtain the yield, purity, and quality needed for subsequent library preparation steps.
Considerations for nucleic acid isolation for NGS include the following to help ensure success in downstream steps.
Yield, purity, and quality of isolated nucleic acids should be assessed before proceeding to NGS library preparation. The following are methods commonly used for examination of these attributes:
In assessing RNA integrity, the RNA integrity number (RIN), obtained using an algorithm and microfluidic-based electrophoresis [1], and the integrity and quality (IQ) score, based on fluorometry [2], can provide quantitative values of the intact RNA population in the isolated sample.
In cases of low quantities of nucleic acids (e.g., when using single cells as the source) isolated DNA and RNA may be amplified using polymerases appropriate for whole genome amplification (WGA) and whole transcriptome amplification (WTA), respectively, to increase the amount of starting template prior to NGS library preparation. WGA and WTA can help obtain more sequencing reads, better coverage, improved sensitivity, and better variant detection from limited sample amounts. Phi29 DNA polymerase is commonly used for WGA because of its high processivity, reduced bias, high fidelity, and ability to synthesize DNA isothermally at a low temperature [3].
After isolation and purification, nucleic acids are prepared so that they can be processed and read by the sequencers. These prepared, ready-to-sequence samples are commonly known as “libraries” because they represent a collection of molecules that are sequenceable. The library preparation procedure may vary depending on the methods and reagents used, but the general steps in library preparation for Illumina systems are as follows:
Figure 2. Workflow of NGS library preparation for Illumina systems.
Note that additional steps may be performed in different library preparation workflows. For instance, size selection or purification of fragments of desired sizes is a common step to enhance sample quality. Library amplification by PCR is usually performed after adapter ligation, especially when working with a low quantity of starting material. When RNA is used, cDNA synthesis is part of the library preparation workflow. Target enrichment may be performed when sequencing is needed only for a defined set of genes or genomic regions (rather than the whole genome or transcriptome). (Learn more: DNA sequencing library preparation)
Regardless of the steps involved, the final prepared NGS libraries should consist of DNA fragments of desired lengths with adapters at both ends. When relying on a commercial kit for NGS library preparation, look for reagents or protocols that can offer a simple procedure with less hands-on time while still ensuring high-quality libraries with good yield.
Prior to the sequencing reactions, DNA libraries undergo clonal amplification. In this process, DNA fragments of the libraries are amplified so that fluorescent signals of single-base incorporation in the subsequent sequencing reaction are strong enough to be detected by the sequencers.
The Illumina platform utilizes solid-phase amplification in which each fragment in the library first anneals to the primers on the sequencing chip (known as the flow cell) via the adapters. Through a series of amplification reactions known as bridge amplification [4] (Figure 3A), each fragment forms a cluster of identical molecules called clonal clusters (Figure 3B); therefore, every cluster represents one primary library molecule. Note that clonal amplification on a patterned flow cell with predefined arrays employs a different method called exclusion amplification (ExAmp) chemistry. The ExAmp technology involves instantaneous amplification of a DNA fragment after binding to the primer on the patterned flow cell, excluding other DNA fragments from forming a polyclonal cluster [5].
This process of clonal amplification should not be confused with library amplification, which is carried out to increase library input before loading onto a flow cell.
Figure 3. Clonal amplification steps. (A) Bridge amplification. (1) The complementary strand of a DNA fragment in the library is synthesized from the flow cell’s priming oligo. (2) After removal of the original strand, the complementary strand folds over and anneals with the other type of flow cell oligo. A double-stranded bridge is formed after synthesis of its complementary strand. (3) The double-stranded bridge is denatured, forming two single strands attached to the flow cell. (4) The process of bridge amplification repeats, and (5) more clones of double-stranded bridges are formed. (B) Cluster generation. The double-stranded clonal bridges are denatured (only one strand is shown here for simplicity), the reverse strands are removed, and the forward strands remain as clusters for sequencing.
The step after clonal amplification is sequencing by synthesis (SBS), in which nucleotides incorporated by a DNA polymerase into the complementary DNA strand of the clonal clusters are detected one base at a time.
The Illumina sequencing technology utilizes fluorescent dye–labeled dNTPs with a reversible terminator to capture fluorescent signals in each cycle, relying on a process called cyclic reversible termination [6] (Figure 4). In each cycle, only one of four fluorescent dNTPs is incorporated by the DNA polymerase, based on complementarity, and then unbound dNTPs are washed away. Images of the clusters are captured after the incorporation of each nucleotide; the emission wavelength and fluorescence intensity of the incorporated nucleotide are measured to identify the base that was incorporated in each cluster during that cycle. After imaging, the fluorescent dye and the terminator are cleaved and released, followed by the next cycle of synthesis, imaging, and deprotection. Since each base is sequenced one cycle at a time, this process is repeated “n” cycles to achieve a read length of “n” bases.
Figure 4. Sequencing by cyclic reversible termination. For simplicity, sequencing primers are not shown. Note that some Illumina systems may use two-channel and one-channel SBS chemistry, instead of four-channel chemistry (four fluorophore colors) as illustrated in this figure [7].
The final step in the NGS workflow is processing, analysis, and interpretation of the sequencing data generated. Bioinformatic tools are used to convert raw sequencing data into meaningful results. As NGS generates gigabases of raw data, the ability and availability of computing power to process and analyze such massive amounts of data is one of the bottlenecks of the workflow.
This step of the NGS workflow can be roughly categorized into three stages as shown in Table 1. Applications and goals of NGS experiments often dictate how the data are processed and analyzed, as well as which bioinformatic tools are used. (Learn more: NGS data analysis)
Stage | Undertaking |
---|---|
Processing: Cleanup of sequencing data |
|
Analysis: Investigation of sequence relevance, variance, distinctiveness, novelty, etc. |
|
Interpretation: Prediction of gene functions and biological relevance |
|
In conclusion, NGS is a powerful technique that generates massive amounts of data that can lead to new biological insights. Although its workflow involves a number of processes and considerations, understanding the basic principles of the key steps can help you plan NGS experiments, obtain high-quality data, and achieve meaningful results.
For Research Use Only. Not for use in diagnostic procedures.