Epitope tags are a useful tool across many types of research. However, several factors should be considered before using them to ensure that experiments go as planned. This article covers how to select an epitope tag, design an epitope tag fusion construct, improve detection sensitivity, combine tags, and remove tags with proteases. It ends with a troubleshooting guide for cases where no or low expression of the epitope-tagged protein is observed.



Selecting an epitope tag

Choice of an epitope tag depends on many factors, but downstream application is the most important one. If the desired downstream application is recombinant protein purification, epitope tags like 6X-His, GST (Glutathione-S-transferase), and MBP (Maltose Binding Protein) are widely used in affinity purification. Peptide tags, such as HA (Hemagglutinin), Myc, DYKDDDDK, and V5 are popularly used for detection in western blot, immunocytochemistry, and co-immunoprecipitation. Fluorescent protein tags (GFP, RFP, mCherry) are often used for visualization of recombinant proteins in a cellular context and have been used in live cell imaging, reporter assays, Fluorescence Resonance Energy Transfer (FRET), and cell sorting.

The size of the tag relative to the size of the protein of interest is another very important factor in tag selection. Usually, it is not desirable to tag a protein of large size with large polypeptides like GFP, GST, or MBP. Such a bulky design might interfere with protein conformation and function. Use of smaller peptide tags offers more flexibility and minimizes the effect on protein function.

Another factor governing epitope tag selection is the availability of specific antibodies against the desired epitope tag or a well-established purification method of the tagged proteins.


Introducing an epitope tag to a protein of interest

Two standard approaches used to introduce epitope tags in cloned genes are:

  • Oligonucleotide with an epitope tag: A PCR-based approach employs an oligonucleotide having an epitope tag sequence, which is then used to amplify the gene of interest in-frame for expression.
  • Use of epitope tag vector: Another method commonly used to introduce an epitope tag is to clone your gene of interest in-frame in an expression vector containing the epitope tag that is most suitable for the desired host.
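
The PCR-based approach can be illustrated with a minimal Python sketch that assembles a reverse primer for adding a C-terminal HA tag. The HA coding sequence shown is one common encoding of the peptide YPYDVPDYA. The function names, the 18-nucleotide annealing length, and the choice of TAA as the stop codon are assumptions made for illustration. This is not a validated primer design: restriction sites, melting temperature, and codon usage are not considered.

```python
# Illustrative sketch only: names and defaults below are hypothetical.
HA_TAG_DNA = "TACCCATACGATGTTCCAGATTACGCT"  # one common encoding of YPYDVPDYA
STOP_CODON = "TAA"                          # assumed stop codon
STOP_CODONS = {"TAA", "TAG", "TGA"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def c_terminal_tag_reverse_primer(gene_cds: str, anneal_len: int = 18) -> str:
    """Build a reverse primer that appends an HA tag and a stop codon in-frame."""
    cds = gene_cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # Drop the gene's own stop codon so translation continues into the tag.
    if cds[-3:] in STOP_CODONS:
        cds = cds[:-3]
    annealing = cds[-anneal_len:]
    # Sense-strand layout is: gene 3' end + tag + stop.
    # The reverse primer is the reverse complement of that layout.
    return reverse_complement(annealing + HA_TAG_DNA + STOP_CODON)


if __name__ == "__main__":
    toy_gene = "ATGGCTAGCAAAGGTGAAGAACTGTTTACCGGTTAA"  # made-up 12-codon gene
    print(c_terminal_tag_reverse_primer(toy_gene))
```

Removing the gene's native stop codon before appending the tag is what keeps the tag in the same reading frame as the gene. A forward primer for an N-terminal tag would follow the same idea without the reverse complement step.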


Factors to consider when designing an epitope tag fusion construct

  • In-frame fusion: The epitope tag must be in-frame with the coding sequence of the target gene, whether the tag is introduced by PCR or supplied by a tag vector. A frame error can result in no expression or aberrant expression of the target protein. The frame can be confirmed by sequencing the cloned vector and running the nucleotide sequence through an online translation tool to check that the desired protein is encoded.
  • Stop codon: It is critical to ensure the presence of a stop codon for translation termination in any coding sequence. Without a stop codon, the signal to release the ribosome from the transcript is missing, which can lead to stalling of the ribosome at the end of the transcript. Such aberrant transcripts are degraded, so the desired protein is not expressed.
  • Positioning of the tag: Epitope tags are commonly positioned at or near the N-terminus or C-terminus of a target protein. There is minimal risk of interference with protein function and conformation if the tag is introduced at the N- or C-terminus, unless that terminus is known to have a specific function. Depending upon the requirements of the experiment, epitope tags can also be introduced internally in the target gene, if necessary.
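
The in-frame and stop codon checks above can be sketched in a few lines of Python, as a stand-in for an online translation tool. The sketch uses the standard genetic code. The function names, the report keys, and the toy construct are hypothetical, and the check assumes the sequence starts at the intended start codon.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate from the first base; '*' marks a stop codon.

    Trailing bases that do not form a full codon are ignored.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def check_fusion(dna: str, tag_peptide: str) -> dict:
    """Report basic frame and stop codon checks for a tagged construct."""
    protein = translate(dna)
    before_first_stop = protein.split("*")[0]
    return {
        "starts_with_met": protein.startswith("M"),
        "length_multiple_of_3": len(dna) % 3 == 0,
        "tag_in_frame": tag_peptide in before_first_stop,
        "single_terminal_stop": protein.endswith("*") and protein.count("*") == 1,
    }


if __name__ == "__main__":
    ha_dna = "TACCCATACGATGTTCCAGATTACGCT"  # one common encoding of YPYDVPDYA
    construct = "ATG" + "GCTAGCAAA" + ha_dna + "TAA"  # made-up toy construct
    print(translate(construct))
    print(check_fusion(construct, "YPYDVPDYA"))
```

A single inserted or deleted base upstream of the tag makes `tag_in_frame` false, which is the situation sequencing is meant to catch.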


Using a linker sequence between epitope tags and the target gene while cloning

Selection of a suitable linker sequence to join the target protein and the epitope tag is an important aspect of a recombinant fusion. Direct fusion of the protein and tag domains without a linker (especially with a large protein tag) may lead to undesirable outcomes, such as misfolding of the fusion protein, low protein yield, or impaired function.

Researchers often use flexible linkers containing small non-polar glycine (Gly) or polar serine (Ser) residues, which allow a certain degree of movement between the joined domains. Stretches of Gly and Ser residues (“GS” linkers) are primarily used to improve folding and stability of a fusion protein. The most widely used flexible linker has the sequence (Gly-Gly-Gly-Gly-Ser)n. This linker does not have any repeats that can lead to homologous recombination. The length of the GS linker can be optimized by adjusting the copy number “n” to attain the appropriate separation of the target protein and epitope tag. Another flexible linker used for improved stability and folding is Gly-Gly-Gly-Gly (GGGG). Various other linkers are chosen for their function: different linkers can increase protein expression, improve biological activity, enable targeting, or provide a site that can be cleaved by proteases.
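
As a small illustration of adjusting the copy number “n”, the following Python sketch builds a (Gly-Gly-Gly-Gly-Ser)n linker at the protein level and places it between a target protein and a tag. The function names and the toy sequences are hypothetical, and the sketch says nothing about which value of n suits a given protein.

```python
def gs_linker(n: int) -> str:
    """Return the one-letter sequence of a (GGGGS)n flexible linker."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return "GGGGS" * n


def fuse_with_linker(protein: str, tag: str, n: int,
                     tag_at_c_terminus: bool = True) -> str:
    """Join protein and tag with a (GGGGS)n linker in the chosen orientation."""
    linker = gs_linker(n)
    if tag_at_c_terminus:
        return protein + linker + tag
    return tag + linker + protein


if __name__ == "__main__":
    toy_protein = "MASK"      # made-up placeholder, not a real protein
    ha_tag = "YPYDVPDYA"      # HA peptide tag
    for n in (1, 2, 3):
        fusion = fuse_with_linker(toy_protein, ha_tag, n)
        print(n, len(gs_linker(n)), fusion)
```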


Improving the sensitivity of epitope tags

To improve the sensitivity of epitope tag detection, many researchers use tandem repeats (usually 3X) of peptide tags within the coding sequence. There are commercially available vectors with tandem repeats of epitope tags for various applications. The most frequently used tags in tandem are 6X-His for better purification and HA, DYKDDDDK, or V5 tags for improved detection in western blot, immunocytochemistry, and co-immunoprecipitation.

However, while tandem epitope tags do improve sensitivity in detection techniques, they may also make the tagged protein difficult to elute from an affinity resin. This should be taken into account if the main purpose is purification rather than detection.


Epitope tags that are used for improving the solubility of proteins

Using fusion partners to improve the solubility of a recombinant protein is a common strategy. It works by improving the folding of the protein, which leads to higher solubility. While several tags are available for this purpose, the best characterized and most commonly used are GST, MBP, and Trx (thioredoxin).

Selecting a suitable fusion partner is an empirical process, as there are no set rules for which tag will work best for which protein. However, it is important that the selected tag is both highly soluble and stable. A literature review of tags used for solubilizing similar proteins, or proteins of the same family, can be a useful indicator of what is likely to work best.

While a solubility tag can be fused at either the N- or the C-terminus, N-terminal fusions are usually preferred.


Combinatorial tagging

Often, a single tag is not enough. Combinatorial tagging, or using more than one tag, is employed when it is important for the fusion protein to have multiple functionalities. Depending upon the characteristics that are desired, different combinations of tags can be explored. For example, if the recombinant protein needs a solubility partner as well as affinity purification, then the fusion protein can be tagged with MBP and an affinity tag (e.g., 6X-His). On the other hand, if the protein is needed for intracellular localization studies, then the fusion protein can be tagged with a fluorophore, such as GFP.

Another case where combinatorial tagging is used is when a high-purity recombinant protein is needed for crystallization or mass spectrometry studies. Tagging the protein with dual-affinity tags enables it to undergo multiple rounds of purification, which helps ensure the purity as well as the integrity of the full-length protein.

Tags can also be chosen based on experimental utility. For example, if the protein is needed for IP-based studies, then a tag such as Myc can be used in combination with MBP, which can serve as the affinity and/or solubility tag.

Choosing a combination of tags, much like choosing a tag itself, is driven largely by the needs of the end user. Although there are no hard and fast rules for dual tagging, the use of more than 2 tags is generally avoided. This reduces both the metabolic burden on the host and the complexity of the final protein.


Removing tags with proteases

Despite the tremendous utility of epitope tags, sometimes it is necessary to remove them once their purpose is served. This is especially important in cases where:

  • The tag affects downstream applications by impacting protein activity/conformation.
  • The tag hinders protein crystallization or mass spectroscopic studies, which require pure protein in the untagged form.
  • The purified protein is required for therapeutic purposes.
     

Some of the commonly used proteases are listed below:

Protease | Cleavage site | Size
TEV Protease | Glu-Asn-Leu-Tyr-Phe-Gln/Gly | 27 kDa
PreScission Protease | Leu-Phe-Gln/Gly-Pro | 46 kDa
Thrombin | Leu-Val-Pro-Arg/Gly-Ser | 31+6 kDa
Factor Xa | Ile-(Glu or Asp)-Gly-Arg/ | 42+17 kDa
Enterokinase | Asp-Asp-Asp-Asp-Lys/ | 31 kDa


To cleave the tag, a suitable cleavage site needs to be incorporated between the tag and the protein. Most commercial vectors offer multiple cleavage options, but a cleavage site can also be incorporated manually.
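
The placement of a cleavage site between the tag and the protein can be sketched in Python as follows. The sketch assumes that the “/” in a cleavage site marks the bond that is cut, and it uses a made-up 6X-His tagged toy protein with a TEV site. The function name and sequences are hypothetical, and real protease specificity is more nuanced than an exact string match.

```python
import re


def cleave(sequence: str, site: str) -> list:
    """Cut a one-letter protein sequence at every occurrence of a site.

    `site` is written with '/' at the cut position, for example 'ENLYFQ/G'.
    """
    upstream, downstream = site.split("/")
    # Match the upstream residues only when followed by the downstream ones,
    # so each match ends exactly at the cut position.
    pattern = upstream + (f"(?={downstream})" if downstream else "")
    cuts = [m.end() for m in re.finditer(pattern, sequence)]

    fragments, start = [], 0
    for cut in cuts:
        fragments.append(sequence[start:cut])
        start = cut
    fragments.append(sequence[start:])
    return fragments


if __name__ == "__main__":
    his_tag = "MHHHHHH"        # start Met plus 6X-His
    tev_site = "ENLYFQG"       # Glu-Asn-Leu-Tyr-Phe-Gln/Gly
    toy_protein = "MASK"       # made-up placeholder
    fusion = his_tag + tev_site + toy_protein
    print(cleave(fusion, "ENLYFQ/G"))
```

In this toy case the residue that follows the cut (Gly for the TEV site) stays on the released protein, which is worth keeping in mind when the exact N-terminus matters.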

Tag removal is done after purification of the fusion protein and involves incubation of the protein with the protease. A second purification step is then needed to separate the protease from the pure protein. Size-exclusion chromatography (SEC) is commonly used for this purpose. As an alternative, affinity separation can be used as many commercially available proteases are affinity-tagged to facilitate removal. For ease of use, the tag on the protease can be matched with that of the protein so that both uncleaved tagged protein and protease can be removed in a single purification step.

To have a successful cleavage reaction:

  • It is essential to confirm that the cleavage site is not present in the protein of interest.
  • Time, temperature, and concentration of the protease need to be carefully calibrated to avoid cleavage at secondary sites.
  • If using SEC to remove the protease, the mass of the protein should be sufficiently different from that of the protease.
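
The first check in this list, confirming that the cleavage site does not occur in the protein of interest, can be sketched as a simple sequence scan. The patterns below are the cleavage sites from the protease table written in one-letter code with the cut marker removed. The dictionary, the function name, and the example sequences are hypothetical, and an exact-match scan will not find the secondary sites that some proteases cut less efficiently.

```python
import re

# Recognition sequences from the protease table, in one-letter code.
PROTEASE_SITES = {
    "TEV Protease": "ENLYFQG",
    "PreScission Protease": "LFQGP",
    "Thrombin": "LVPRGS",
    "Factor Xa": "I[ED]GR",
    "Enterokinase": "DDDDK",
}


def find_internal_sites(protein: str) -> dict:
    """Map each protease to the 1-based start positions of its site, if any."""
    hits = {}
    for protease, pattern in PROTEASE_SITES.items():
        # A lookahead lets overlapping occurrences be reported as well.
        positions = [m.start() + 1
                     for m in re.finditer(f"(?={pattern})", protein.upper())]
        if positions:
            hits[protease] = positions
    return hits


if __name__ == "__main__":
    toy_protein = "MKTAIEGRLLDDDDKQ"  # made-up sequence with two sites
    print(find_internal_sites(toy_protein))
```

An empty result for the chosen protease suggests that the exact recognition sequence is absent, so a different protease would be the first thing to consider whenever a hit is reported.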


Troubleshooting guide for no/low expression of epitope-tagged protein

If the epitope-tagged protein is not detected, several issues could be responsible. The table below lists the possible issues, the likely reason for each, and solutions.

Issue | Reason | Solution
No detection of target protein or tag on WB | Reading frame error | Sequence the construct to rule out internal start sites or premature stop codons that prevent in-frame translation of the tagged protein.
No detection of target protein or tag on WB | Sub-optimal transfer to blotting membrane | Standardize transfer conditions to improve transfer efficiency.
No detection of target protein or tag on WB | Not enough sample was loaded on gel | Increase amount of lysate loaded.
No detection of target protein or tag on WB | Expression of the protein may be too low | Standardize induction conditions to increase expression and monitor using SDS-PAGE.
No detection of target protein or tag on WB | Insufficient exposure | Increase exposure time of blot to get signal.
Detection of target protein only; tagged protein not detected | Reading frame error | Sequence the construct to rule out internal start sites or premature stop codons that prevent in-frame translation of tagged protein.
Detection of target protein only; tagged protein not detected | Tagged fusion protein may have degraded | Include protease inhibitors in buffer. Check the construct for proteolytic sites.
Detection of either N- or C-tagged protein, not both | Reading frame error | Sequence the construct to rule out internal start sites or premature stop codons that prevent in-frame translation of tagged protein.
Detection of either N- or C-tagged protein, not both | Partial proteolytic degradation | Add protease inhibitor to buffers. Check construct for proteolytic sites.
Target and tagged proteins showing different localizations | Reading frame error | Sequence the construct to rule out internal start sites or premature stop codons that prevent in-frame translation of tagged protein.
Target and tagged proteins showing different localizations | Partial proteolytic degradation | Add protease inhibitor to buffers. Check construct for proteolytic sites.

For Research Use Only. Not for use in diagnostic procedures.