The Smart Deep Basecaller (SDB) is an innovative new basecalling algorithm that allows you to obtain improved Sanger sequencing output with reduced manual review time. The Smart Deep Basecaller is available for use in Sequencing Analysis Software 8.
Figure 1. KB vs SDB in dye blob region
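Compared to KB Basecaller, Smart Deep Basecaller provides longer read lengths after quality trimming, more robust basecalling through common artifacts and difficult sequences, support for basecalling through heterozygous insertion/deletion variants, an optional Enhanced View, and reduced manual review time.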
Note: SDB basecalling is enabled for data from SeqStudio Flex, SeqStudio, 3730, and 3500 series genetic analyzers.
Automated, command-line basecalling capabilities are available to allow batch processing. When used with 3730 series genetic analyzers, the Smart Deep Basecaller can be used for automatic basecalling as part of the instrument run.
By producing more accurate basecalls with higher quality scores, Smart Deep Basecaller generates longer read lengths after quality trimming. As shown below for SeqStudio Flex, 3730, SeqStudio, and 3500 datasets, read lengths increased by 9-16%. The advanced algorithm in SDB provides greater accuracy at the 5’ and 3’ ends, optimizing the number of bases per read.
Figure 2. KB vs SDB Read Length Comparison Graph. Q20 CRL (Quality value 20 contiguous read length)
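To make the Q20 CRL metric concrete, the following minimal sketch computes a contiguous read length from per-base quality values. It assumes a simplified definition (the longest run of consecutive bases with QV of at least 20); the analysis software may compute CRL differently, for example with a windowed average. The function names and the quality values are hypothetical and are not taken from any Thermo Fisher tool.

```python
def contiguous_read_length(quality_values, threshold=20):
    """Return the length of the longest run of consecutive bases
    whose quality value is at least `threshold` (simplified CRL)."""
    best = 0
    current = 0
    for qv in quality_values:
        if qv >= threshold:
            current += 1
            best = max(best, current)
        else:
            # A low-quality base breaks the contiguous run.
            current = 0
    return best


def percent_increase(baseline, improved):
    """Percent change of `improved` relative to `baseline`."""
    return 100.0 * (improved - baseline) / baseline


# Hypothetical per-base quality values for the same read from two basecallers.
qv_a = [12, 25, 31, 40, 18, 35, 36, 37, 9]
qv_b = [22, 25, 31, 40, 28, 35, 36, 37, 9]

crl_a = contiguous_read_length(qv_a)
crl_b = contiguous_read_length(qv_b)
print(crl_a, crl_b, f"{percent_increase(crl_a, crl_b):.1f}%")
```

In this toy example a single low-quality base in the middle of the first read splits the high-quality region, which is why higher per-base quality scores translate into longer contiguous reads.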
Smart Deep Basecaller provides greater robustness for basecalling through common artifacts, such as dye blobs, N-1 peaks, and mobility shifts, as well as difficult sequences, such as GC-rich templates.
| Metric | KB (Minimal QV, QV 1 Trimming) | SDB (Minimal QV, QV 1 Trimming) | KB (QV 20 Trimming) | SDB (QV 20 Trimming) | KB (QV 30 Trimming) | SDB (QV 30 Trimming) |
|---|---|---|---|---|---|---|
| Mixed Reference Positions | 185 | 185 | 182 | 185 | 173 | 183 |
| Mixed Called Positions | 749 | 211 | 441 | 202 | 307 | 192 |
| Correct Mixed Calls | 166 | 185 | 163 | 185 | 158 | 183 |
| False negatives | 19 (10.3%) | 0 (0.0%) | 19 (10.4%) | 0 (0.0%) | 15 (8.7%) | 0 (0.0%) |
| False positives | 583 (74.7%) | 26 (11.5%) | 278 | 17 (8.0%) | 149 | 9 (1.4%) |
| Insertion/deletion errors | 0 | 0 | 0 | 0 | 0 | 0 |
| False negative reduction (SDB vs KB) | | 100% | | 100% | | 100% |
| False positive reduction (SDB vs KB) | | 95.5% | | 93.9% | | 94.0% |
Based on 164 SeqStudio samples containing artifacts such as dye blobs, N-1 peaks, and mobility shifts, the Smart Deep Basecaller results provide a 100% false negative reduction and 94% false positive reduction compared to KB.
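The reduction percentages in the table can be reproduced from the raw false negative and false positive counts. The sketch below is only an illustration of that arithmetic: the counts are copied from the table, while the helper name `percent_reduction` and the data layout are assumptions made for this example.

```python
# (KB count, SDB count) pairs copied from the table, per trimming level.
counts = {
    "QV 1": {"false_negatives": (19, 0), "false_positives": (583, 26)},
    "QV 20": {"false_negatives": (19, 0), "false_positives": (278, 17)},
    "QV 30": {"false_negatives": (15, 0), "false_positives": (149, 9)},
}


def percent_reduction(kb_count, sdb_count):
    """Percent reduction in errors going from KB to SDB."""
    return 100.0 * (kb_count - sdb_count) / kb_count


for level, row in counts.items():
    fn = percent_reduction(*row["false_negatives"])
    fp = percent_reduction(*row["false_positives"])
    print(f"{level}: FN reduction {fn:.1f}%, FP reduction {fp:.1f}%")
```

Run as written, this prints a 100.0% false negative reduction at every trimming level and false positive reductions of 95.5%, 93.9%, and 94.0%, matching the last two rows of the table.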
Smart Deep Basecaller’s advanced algorithm also provides new functionality to support basecalling through heterozygous insertion/deletion (het indel) variants.
Figure 3. KB vs SDB in heterozygous insertion/deletion region. In the example above, Smart Deep Basecaller outperforms KB by correctly basecalling through the het indel region with high quality values.
The optional Enhanced View for Smart Deep Basecaller provides greater resolution at the 3’ end of plasmid sequencing data by using the high-quality basecalls that SDB produces. This provides greater clarity and confidence in results.
Figure 4. SDB Enhanced View trace visualization
Manual review time is reduced with Smart Deep Basecaller because it produces fewer low-quality basecalls and fewer false positive calls. This eases Sanger sequencing data review, freeing senior staff time for other tasks and reducing training time for new staff.
Figure 5. Number of edits needed
Data from Applied Biosystems SeqStudio Flex, SeqStudio, 3500, and 3730 Genetic Analyzers can be basecalled with Smart Deep Basecaller.
Sridar Chittur, Lab Director, and Andrew Hayden, Research Support Specialist
Microarray & HT Sequencing Core, Center for Functional Genomics, University at Albany, State University of New York
Beta testing of Smart Deep Basecaller showed superior performance over traditional methods, delivering more accurate results. CFG's commitment to innovation in Sanger sequencing highlights their support for scientific progress and global research empowerment.
Werner Sterr, Sequencing Manager at Thermo Fisher Scientific GeneArt GmbH
Watch this on-demand webinar and learn about the testing process implemented to introduce the new Smart Deep Basecaller for sequencing QC. Discover how this innovative tool can provide optimized basecalling results, including longer reads and more robust basecalling in challenging sequences.