The Smart Deep Basecaller (SDB) is a new basecalling algorithm that improves Sanger sequencing output and reduces manual review time. It is available in Sequencing Analysis Software 8.

Two electropherograms with basecalls and quality value bars, one for KB Basecaller and one for SDB Basecaller

Figure 1. KB vs SDB in dye blob region


Compared to KB Basecaller, Smart Deep Basecaller provides:

  • Increased read lengths—more high quality basecalls at 5’ and 3’ ends
  • More accurate pure and mixed base calls 
  • Improved basecalling accuracy through artifacts (such as dye blobs, N-1 peaks, and mobility shifts) and through difficult sequences (such as GC-rich templates)
  • New functionality to support basecalling through heterozygous insertion/deletion (het indel) variants
  • Reduced manual review time—fewer edits and fewer false positives
  • Optional Enhanced View trace visualization with a cleaned-up baseline and increased resolution in the 3’ end of plasmid (pure base) sequences

Note: SDB basecalling is enabled for data from SeqStudio Flex, SeqStudio, 3730, and 3500 series genetic analyzers.

Automated, command-line basecalling capabilities are available to allow batch processing. When used with 3730 series genetic analyzers, the Smart Deep Basecaller can be used for automatic basecalling as part of the instrument run.


Increased throughput with longer reads
 

By producing more accurate basecalls with higher quality scores, Smart Deep Basecaller generates longer reads after quality trimming. Across the SeqStudio Flex, 3730, SeqStudio, and 3500 datasets shown below, read lengths increased by 9% to 16%. SDB's greater accuracy at the 5’ and 3’ ends increases the number of usable bases per read.

Bar graph showing Q20 CRL read lengths for SeqStudio Flex StdSeq, 3500 StdSeq, 3730 XLRSeq, 3730 LongRead Seq, and SeqStudio LongSeq

Figure 2. KB vs SDB Read Length Comparison Graph. Q20 CRL (Quality value 20 contiguous read length)
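To make the Q20 CRL metric concrete, here is a minimal Python sketch. It assumes the simplest reading of the term: the longest uninterrupted run of bases whose quality value is at least 20. The function name and the sample quality values are hypothetical, and the calculation used by the sequencing software may differ (for example, by averaging over a window of neighboring bases).

```python
def contiguous_read_length(quality_values, threshold=20):
    """Longest uninterrupted run of bases with QV >= threshold.

    Assumption: a plain per-base comparison, not the vendor's exact method.
    """
    longest = 0
    current = 0
    for qv in quality_values:
        # Extend the current run on a passing base, otherwise reset it.
        current = current + 1 if qv >= threshold else 0
        longest = max(longest, current)
    return longest


# Hypothetical per-base quality values for a short read (illustration only).
example_qvs = [8, 12, 25, 31, 40, 42, 38, 19, 35, 36, 22, 9]
print(contiguous_read_length(example_qvs))
```

Under this reading, raising more bases at the read ends above QV 20 extends the run directly, which is how higher quality scores translate into longer reads.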


Greater robustness
 

Smart Deep Basecaller provides greater robustness for basecalling through common artifacts, such as dye blobs, N-1 peaks, and mobility shifts, as well as difficult sequences, such as GC-rich templates.

 

Trimming                    Minimal QV (QV 1)         QV 20                     QV 30
Basecaller                  KB           SDB          KB           SDB          KB           SDB
Mixed reference positions   185          185          182          185          173          183
Mixed called positions      749          211          441          202          307          192
Correct mixed calls         166          185          163          185          158          183
False negatives             19 (10.3%)   0 (0.0%)     19 (10.4%)   0 (0.0%)     15 (8.7%)    0 (0.0%)
False positives             583 (74.7%)  26 (11.5%)   278 (54.1%)  17 (8.0%)    149 (28.6%)  9 (1.4%)
Insertion/deletion errors   0            0            0            0            0            0
False negative reduction    100%                      100%                      100%
False positive reduction    95.5%                     93.9%                     94.0%


In a set of 164 SeqStudio samples containing artifacts such as dye blobs, N-1 peaks, and mobility shifts, Smart Deep Basecaller eliminated false negatives (a 100% reduction) and reduced false positives by roughly 94% to 95.5% compared to KB, depending on the quality trimming threshold.
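The reduction figures can be checked from the reported counts. The Python sketch below assumes that false negatives equal reference positions minus correct calls, and that false positives equal called positions minus correct calls. These relationships are inferred from the numbers rather than stated in the source, and the variable and function names are illustrative. The parenthesized false positive percentages are not recomputed, because their denominator is not given.

```python
# Counts from the comparison table, per trimming level and basecaller:
# (mixed reference positions, mixed called positions, correct mixed calls)
counts = {
    "QV 1": {"KB": (185, 749, 166), "SDB": (185, 211, 185)},
    "QV 20": {"KB": (182, 441, 163), "SDB": (185, 202, 185)},
    "QV 30": {"KB": (173, 307, 158), "SDB": (183, 192, 183)},
}


def errors(reference, called, correct):
    """Return (false negatives, false positives) under the stated assumption."""
    return reference - correct, called - correct


def summarize(trim):
    """KB false negative rate and SDB-vs-KB reductions, in percent."""
    kb_fn, kb_fp = errors(*counts[trim]["KB"])
    sdb_fn, sdb_fp = errors(*counts[trim]["SDB"])
    kb_reference = counts[trim]["KB"][0]
    return {
        "kb_fn_rate_pct": round(100 * kb_fn / kb_reference, 1),
        "fn_reduction_pct": round(100 * (kb_fn - sdb_fn) / kb_fn, 1),
        "fp_reduction_pct": round(100 * (kb_fp - sdb_fp) / kb_fp, 1),
    }


for trim in counts:
    print(trim, summarize(trim))
```

With these inputs the sketch reproduces the reported false negative rates for KB and the false negative and false positive reduction rows.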

Smart Deep Basecaller’s advanced algorithm also provides new functionality to support basecalling through heterozygous insertion/deletion (het indel) variants.

Figure 3. KB vs SDB in heterozygous insertion/deletion region. In this example, Smart Deep Basecaller outperforms KB by correctly basecalling through the het indel region with high quality values.


Improved 3’ end resolution with Enhanced View
 

The optional Enhanced View for Smart Deep Basecaller uses the high-quality basecalls it produces to provide greater resolution at the 3’ end of plasmid sequencing data, giving greater clarity and confidence in results.

SDB Enhanced View electropherogram and KB view electropherogram

Figure 4. SDB Enhanced View trace visualization


Fewer edits = less manual review time
 

Smart Deep Basecaller reduces manual review time because it produces fewer low-quality basecalls and fewer false-positive calls. This eases Sanger sequencing data review, freeing senior staff to work on other tasks and reducing training time for new staff.

A line graph showing the number of sequence edits needed using KB and SDB Basecallers

Figure 5. Number of edits needed


Instrument requirements
 

Data from Applied Biosystems SeqStudio Flex, SeqStudio, 3500, and 3730 Genetic Analyzers can be basecalled with Smart Deep Basecaller.



Uncovering the Science Behind Success: SUNY CFG’s Journey testing Smart Deep Basecaller

Sridar Chittur, Lab Director, and Andrew Hayden, Research Support Specialist
Microarray & HT Sequencing Core, Center for Functional Genomics, University at Albany, State University of New York

Beta testing of Smart Deep Basecaller showed superior performance over traditional methods, delivering more accurate results. CFG's commitment to innovation in Sanger sequencing highlights their support for scientific progress and global research empowerment.



Webinar: Testing Applied Biosystems Smart Deep Basecaller for Sanger Sequencing QC

Werner Sterr, Sequencing Manager at Thermo Fisher Scientific GeneArt GmbH

Watch this on-demand webinar and learn about the testing process implemented to introduce the new Smart Deep Basecaller for sequencing QC. Discover how this innovative tool can provide optimized basecalling results, including longer reads and more robust basecalling in challenging sequences.

