Overview of Protein Expression Systems



Introduction to protein expression

Proteins are synthesized and regulated depending upon the functional need in the cell. The blueprints for proteins are stored in DNA and decoded by highly regulated transcriptional processes to produce messenger RNA (mRNA). The message coded by an mRNA is then translated into a protein. Transcription is the transfer of information from DNA to mRNA, and translation is the synthesis of protein based on a sequence specified by mRNA.

Simple diagram of transcription and translation. This describes the general flow of information from DNA base-pair sequence (gene) to amino acid polypeptide sequence (protein).
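As a rough illustration of the first half of this information flow, the sketch below models transcription as complementary base pairing against a template strand. The sequences and the function name `transcribe` are hypothetical and chosen only for the example; real transcription also involves promoters, regulation and termination signals that are ignored here.

```python
# Toy model of transcription: the template strand is read 3'->5' and the
# mRNA is built 5'->3' by complementary base pairing (A-U, T-A, G-C, C-G).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_5to3: str) -> str:
    """Return the mRNA (written 5'->3') for a template strand written 5'->3'."""
    # Reading the template 3'->5' means walking the 5'->3' string in reverse.
    return "".join(DNA_TO_RNA[base] for base in reversed(template_5to3.upper()))


coding_strand = "ATGGCCTAA"    # hypothetical coding strand, 5'->3'
template_strand = "TTAGGCCAT"  # its reverse complement, 5'->3'
mrna = transcribe(template_strand)
print(mrna)
```

The resulting mRNA has the same sequence as the coding strand with U in place of T, which is why gene sequences are conventionally reported as the coding strand.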

In prokaryotes, the processes of transcription and translation occur simultaneously. The translation of mRNA starts even before a mature mRNA transcript is fully synthesized. This simultaneous transcription and translation of a gene is termed coupled transcription and translation. In eukaryotes, the processes are spatially separated and occur sequentially, with transcription happening in the nucleus and translation, or protein synthesis, occurring in the cytoplasm.


Transcription and translation

Transcription occurs in three steps in both prokaryotes and eukaryotes: initiation, elongation and termination. Transcription begins when the double-stranded DNA is unwound to allow the binding of RNA polymerase. Once transcription is terminated, RNA polymerase is released from the DNA. Transcription is regulated at various levels by activators and repressors and, in eukaryotes, also by chromatin structure. In prokaryotes, no special modification of mRNA is required, and translation of the message starts even before transcription is complete. In eukaryotes, however, the mRNA is further processed: introns are removed (splicing), a cap is added at the 5' end, and multiple adenines are added at the 3' end to generate a poly(A) tail. The modified mRNA is then exported to the cytoplasm, where it is translated.
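The eukaryotic processing steps described above (splicing, 5' capping and polyadenylation) can be sketched as simple string operations. This is a minimal illustration under stated assumptions: the intron coordinates are supplied by hand rather than detected, the `m7G-` prefix is only a notational marker for the cap, and the sequence, tail length and function name are hypothetical.

```python
def process_pre_mrna(pre_mrna, introns, poly_a_length=20):
    """Splice out introns, then mark a 5' cap and append a 3' poly(A) tail.

    introns: list of (start, end) 0-based, end-exclusive positions on pre_mrna.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # keep the final exon
    spliced = "".join(exons)
    # "m7G-" is a text marker for the cap, not a real nucleotide sequence.
    return "m7G-" + spliced + "A" * poly_a_length


# Hypothetical pre-mRNA: exon 1 + intron (begins GU, ends AG) + exon 2
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "UGGUAA"
mature_mrna = process_pre_mrna(pre_mrna, introns=[(6, 18)], poly_a_length=10)
print(mature_mrna)
```

Prokaryotic transcripts would skip this function entirely, which mirrors the statement that they need no special modification before translation.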

 

Translation or protein synthesis is a multi-step process that requires macromolecules like ribosomes, transfer RNAs (tRNA), mRNA and protein factors as well as small molecules like amino acids, ATP, GTP and other cofactors. There are specific protein factors for each step of translation (see table below). The overall process is similar in both prokaryotes and eukaryotes, although particular differences exist.

 

During initiation, the small subunit of the ribosome, bound to initiator tRNA, scans the mRNA starting at the 5' end to identify and bind the initiation codon (AUG). The large subunit of the ribosome joins the small ribosomal subunit to generate the initiation complex at the initiation codon. Protein factors as well as sequences in mRNA are involved in the recognition of the initiation codon and formation of the initiation complex. During elongation, tRNAs bind to their designated amino acids (known as tRNA charging) and shuttle them to the ribosome, where they are polymerized to form a peptide. The sequence of amino acids added to the growing peptide is dependent on the mRNA sequence of the transcript. Finally, the nascent polypeptide is released in the termination step when the ribosome reaches the termination codon. At this point, the ribosome is released from the mRNA and is ready to initiate another round of translation.
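The three phases described above can be mimicked in a few lines: scan from the 5' end for the first AUG (initiation), read codons in frame (elongation), and stop at a termination codon (termination). The sketch below is a simplification that uses the standard genetic code and one-letter amino acid symbols; the function name and test sequence are hypothetical, and initiation factors, tRNA charging and sequence context are not modeled.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")              # initiation: scan from the 5' end
    if start == -1:
        return ""                         # no initiation codon found
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":             # termination: release the peptide
            break
        peptide.append(amino_acid)        # elongation: extend the chain
    return "".join(peptide)


print(translate("GGAUGGCCUGGUAAGG"))
```

Here the leading `GG` is skipped as untranslated sequence, and the codons AUG, GCC and UGG are read before UAA ends translation.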


Protein synthesis machinery

Summary of the primary components and features of prokaryotic and eukaryotic translational apparatus.

Ribosomes
  • Prokaryotes: 30S and 50S subunits
  • Eukaryotes: 40S and 60S subunits

Template or mRNA
  • Prokaryotes: No further processing of the mRNA transcript occurs after transcription. The mRNA is polycistronic and contains multiple initiation sites.
  • Eukaryotes: After transcription, the mRNA transcript is spliced to remove the noncoding regions (introns), and a cap structure (7-methylguanosine) and a polyadenosine sequence are added at the 5' and 3' ends of the message, respectively. The cap structure and the poly(A) tail are important for export of mRNA to the cytoplasm, proper initiation of translation and stability of mRNA, among other functions. The mRNA is usually monocistronic.

Features of translation
  • Prokaryotes: The Shine-Dalgarno sequence is present on the mRNA transcript, and a complementary sequence is present in the small ribosomal subunit. This facilitates binding and alignment of the ribosome on the mRNA at the translation initiation site (AUG). The first amino acid of the nascent polypeptide is formylated methionine.
  • Eukaryotes: Translation initiation occurs in two ways:
    • Cap-dependent translation: The cap structure and the cap-binding proteins are responsible for proper ribosome binding to mRNA and recognition of the correct initiation codon. The first AUG codon from the 5' end of the mRNA functions as the initiation codon. A Kozak sequence may be present around the initiation codon.
    • Cap-independent translation: Ribosome binding to mRNA occurs through an internal ribosome entry site (IRES) on the mRNA.

Initiation factors
  • Prokaryotes: Three initiation factors are known: IF1, IF2 and IF3.
  • Eukaryotes: More than three initiation factors, which are regulated by phosphorylation. The initiation step is the rate-limiting step in eukaryotic translation.

Elongation factors
  • Prokaryotes: EF-Tu, EF-Ts and EF-G
  • Eukaryotes: EF1 (α, β, γ) and EF2

Termination or release factors
  • Prokaryotes: RF1 and RF2
  • Eukaryotes: eRF1
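To make the prokaryotic entries in this summary more concrete, the following sketch looks for AUG codons that sit a short distance downstream of a Shine-Dalgarno-like motif, which is how a polycistronic mRNA can carry several initiation sites. The motif `AGGAGG` is a commonly cited consensus, but real sites vary; the spacer window, the function name and the example sequence are assumptions made for illustration only.

```python
SD_CONSENSUS = "AGGAGG"  # commonly cited consensus; real sites differ from it


def find_sd_start_sites(mrna, min_spacer=4, max_spacer=10):
    """Return positions of AUG codons preceded by the SD motif.

    The spacer is the number of bases between the motif and the AUG;
    the default window is an illustrative assumption, not a measured rule.
    """
    sites = []
    for start in range(len(mrna) - 2):
        if mrna[start:start + 3] != "AUG":
            continue
        for spacer in range(min_spacer, max_spacer + 1):
            sd_start = start - spacer - len(SD_CONSENSUS)
            if sd_start >= 0 and mrna[sd_start:sd_start + len(SD_CONSENSUS)] == SD_CONSENSUS:
                sites.append(start)
                break
    return sites


# Hypothetical polycistronic mRNA with two SD-led start codons and one AUG without a motif
polycistronic = (
    "UU" + "AGGAGG" + "ACAGCU" + "AUGAAAUAA"
    + "CCC" + "AUG" + "CCC"
    + "AGGAGG" + "UUUUU" + "AUGGGGUAG"
)
print(find_sd_start_sites(polycistronic))
```

The middle AUG is not reported because no motif lies within the assumed window upstream of it, whereas a eukaryotic scanning model would simply take the first AUG from the 5' end.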


Post-translational modification

After translation, polypeptides are modified in various ways to complete their structure, designate their location or regulate their activity within the cell. Post-translational modifications (PTMs) are additions or alterations to a protein's chemical structure and are critical to its function in the cell.

Types of post-translational modifications include:

  • Polypeptide folding into a globular protein with the help of chaperone proteins to arrive at the lowest energy state
  • Modifications of the amino acids present, such as removal of the first methionine residue
  • Disulfide bridge formation or reduction
  • Protein modifications that facilitate binding functions:
    • Glycosylation
    • Prenylation of proteins for membrane localization
    • Acetylation of histones to modify DNA–histone interactions
  • Addition of functional groups that regulate protein activity:
    • Phosphorylation
    • Nitrosylation
    • GTP binding

Recombinant protein expression methods

In general, proteomics research involves investigating any aspect of a protein such as structure, function, modifications, localization or protein interactions. To investigate how particular proteins regulate biology, researchers usually require a means of producing (manufacturing) functional proteins of interest.

 

Given the size and complexity of proteins, chemical synthesis is not a viable option for this endeavor. Instead, living cells and their cellular machinery are usually harnessed as factories to build and construct proteins based on supplied genetic templates.

 

Unlike proteins, DNA is simple to construct synthetically or in vitro using well established recombinant DNA techniques. Therefore, DNA templates of specific genes, with or without add-on reporter or affinity tag sequences, can be constructed as templates for protein expression. Proteins produced from such DNA templates are called recombinant proteins.

 

Traditional strategies for recombinant protein expression involve transfecting cells with a DNA vector that contains the template and then culturing the cells so that they transcribe and translate the desired protein. Typically, the cells are then lysed to extract the expressed protein for subsequent purification. Both prokaryotic and eukaryotic in vivo protein expression systems are widely used. The selection of the system depends on the type of protein, the requirements for functional activity and the desired yield. These expression systems, summarized in the table below, include mammalian, insect, yeast, bacterial, algal and cell-free systems. Each system has advantages and challenges, and choosing the right system for the specific application is important for successful recombinant protein expression.

A general recombinant protein expression workflow consists of four steps:

  • Cloning/construct design—the construction of expression vectors to carry the target DNA fragment or gene of interest (GOI)
  • Expression—the delivery of the expression construct to the chosen host cells, the growth of those cells, and expression of the recombinant protein 
  • Purification—the sample preparation and purification of the target protein 
  • Analytics/characterization—the identification and quantitation of the isolated protein of interest

Critical parameters to select a host system

A number of recombinant protein expression host options exist for prokaryotic and eukaryotic proteins, including mammalian, insect, yeast, bacterial, algal, and cell-free systems.

 

The choice of an optimal expression system for a specific protein primarily depends on:

  • The origin of the target protein: Bacterial proteins are better expressed in bacterial systems, while mammalian proteins are better expressed in mammalian systems. For example, mammalian cell lines such as Chinese Hamster Ovary (CHO) and Human Embryonic Kidney (HEK) are the systems of choice for human protein production.
  • Required post-translational modifications: Proteins devoid of complex modifications (e.g., glycosylation, alkylation, phosphorylation, or specific proteolytic processing) are more easily produced in simpler systems with short turnaround times, such as bacteria and yeast cells. In contrast, complex proteins with significant modifications should be produced in either insect or mammalian cells.
  • Solubility of the recombinant protein: Some proteins are not properly folded in bacterial systems. As a result, they tend to form insoluble aggregates (inclusion bodies) that are difficult to extract. In these circumstances, higher eukaryotic systems should be preferred for the recombinant production of the target protein.
  • The intended application of the recombinant protein: Some downstream applications require large amounts of target protein. In these instances, simpler systems with short growth cycles (e.g., bacterial or yeast) might be the better solution. However, if the downstream application requires a functional protein that is reliant on the final protein conformation, then the choice might be insect or mammalian hosts.

Each system has advantages and challenges, making host choice very important for successful recombinant protein expression. The enclosed table compares the different host systems on each critical parameter.
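The four selection criteria above can be read as a simple set of votes for or against each host. The sketch below encodes them that way, purely as an illustration: the equal weighting, the parameter names and the function `rank_hosts` are assumptions, not a validated selection tool, and algal and cell-free systems are left out because the criteria above do not rank them.

```python
HOSTS = ("bacterial", "yeast", "insect", "mammalian")
SIMPLE_HOSTS = ("bacterial", "yeast")
HIGHER_EUKARYOTES = ("insect", "mammalian")


def rank_hosts(origin, needs_complex_ptms, aggregates_in_bacteria,
               needs_large_amounts, conformation_critical):
    """Rank hosts by how many of the selection criteria favor them."""
    scores = dict.fromkeys(HOSTS, 0)
    # Criterion 1: origin of the target protein
    if origin in scores:
        scores[origin] += 1
    # Criterion 2: required post-translational modifications
    for host in (HIGHER_EUKARYOTES if needs_complex_ptms else SIMPLE_HOSTS):
        scores[host] += 1
    # Criterion 3: solubility (inclusion bodies in bacteria)
    if aggregates_in_bacteria:
        for host in HIGHER_EUKARYOTES:
            scores[host] += 1
    # Criterion 4: intended application
    if needs_large_amounts:
        for host in SIMPLE_HOSTS:
            scores[host] += 1
    if conformation_critical:
        for host in HIGHER_EUKARYOTES:
            scores[host] += 1
    # sorted() is stable, so tied hosts keep the order given in HOSTS
    return sorted(HOSTS, key=lambda host: -scores[host])


# Hypothetical targets: a human glycoprotein and a bacterial enzyme
print(rank_hosts("mammalian", True, True, False, True))
print(rank_hosts("bacterial", False, False, True, False))
```

In practice the criteria interact and are rarely equally important, so a ranking like this is at most a starting point for comparing hosts.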


Mammalian protein expression

Mammalian expression systems can be used to produce proteins transiently or through stable cell lines, where the expression construct is integrated into the host genome. While stable cell lines can be used over several experiments, transient production can generate large amounts of protein in one to two weeks. These transient, high-yield mammalian expression systems utilize suspension cultures and can produce gram-per-liter yields. Furthermore, these proteins have more native folding and post-translational modifications, such as glycosylation, as compared to other expression systems. In the example that follows, 3 different mammalian expression systems were used to express recombinant proteins. 

Mammalian expression systems can be used to produce mammalian proteins that have the most native structure and activity due to their physiologically relevant environment. This results in high levels of post-translational processing and functional activity. Mammalian expression systems are the preferred system for the expression of mammalian proteins and can be used for the production of antibodies, complex proteins and proteins for use in functional cell-based assays. However, these benefits are coupled with more demanding culture conditions.

Expi293 expression system

The Expi293 expression system is a major advance in transient expression technology for rapid and ultrahigh-yield protein production in human cells. It is based on high-density culture of Gibco Expi293 Cells in Gibco Expi293 expression medium. Transient expression is powered by the cationic lipid-based Gibco ExpiFectamine 293 transfection reagent in combination with optimized transfection enhancers designed to work specifically with this transfection reagent. All components work in concert to generate 2- to 10-fold higher protein yields than are attained with previous 293-transient expression systems. Expression levels of greater than 1 g/L can be achieved for IgG and non-IgG proteins.

ExpiCHO expression system

The ExpiCHO expression system has revolutionized the use of CHO cells for transient protein expression during early phase drug candidate screening. The glycosylation patterns of recombinant IgG produced by the Expi293 and ExpiCHO transient expression systems were compared to the same protein expressed in stable CHO cells. Glycosylation of recombinant IgG produced in the ExpiCHO system closely resembles that of the stable CHO cell system (Figure 2.3), which provides a very strong correlation between transiently expressed drug candidates and downstream biotherapeutics manufactured in CHO.


Insect protein expression

Insect cells can be used for high level protein expression with modifications similar to mammalian systems. There are several systems that can be used to produce recombinant baculovirus, which can then be utilized to express the protein of interest in insect cells. These systems can be easily scaled up and adapted to high-density suspension culture for large-scale expression of protein that is more functionally similar to native mammalian protein. Though yields can be up to 500 mg/L, recombinant baculovirus production can be time consuming and culture conditions more challenging than prokaryotic systems.

In the section that follows, we compare a standard protocol with the ExpiSf protocol, from baculovirus production to protein expression. Bacmid DNA transfection in suspension using the ExpiSf system allows for efficient baculovirus generation without the need for virus amplification, enabling protein delivery in half the time compared to a standard protocol.


Yeast protein expression

Yeast strains are extremely useful for the expression and analysis of recombinant eukaryotic proteins and are ideally suited for large-scale production. Yeast systems also are typically used for structural and functional research such as protein interaction studies. These single-celled eukaryotic organisms are genetically well characterized and are known to perform many post-translational modifications. They grow quickly in defined medium and are easily adapted to fermentation methods. Yeast systems are also easier and less expensive to work with than insect or mammalian cells.


Bacterial expression

Bacterial protein expression systems are popular because bacteria are easy to culture, grow fast and produce high yields of recombinant protein. However, multi-domain eukaryotic proteins expressed in bacteria often are non-functional because the cells are not equipped to accomplish the required post-translational modifications or molecular folding. Also, many proteins become insoluble as inclusion bodies that are very difficult to recover without harsh denaturants and subsequent cumbersome protein-refolding procedures. In the example that follows, a bacterial cell-based system was used to express 8 different recombinant proteins. 

Protein expression in bacterial cells. Gateway cloning was used to clone 8 human proteins into the Invitrogen Champion pET300/NT-DEST vector. BL21(DE3) E. coli were utilized to express positive clones in either LB + IPTG (1), ready-to-use Invitrogen MagicMedia medium (2), or MagicMedia medium prepared from powder (3). Samples were lysed and analyzed on a Coomassie blue dye–stained Invitrogen NuPAGE 4-12% Bis-Tris Protein Gel. M = Invitrogen SeeBlue Protein Standard. Use of MagicMedia E. coli medium results in higher protein yield across different samples.


Cell-free expression

Cell-free protein expression is the in vitro synthesis of a protein using translation-compatible extracts of whole cells. In principle, whole cell extracts contain all the macromolecules and components needed for transcription, translation and even post-translational modification. These components include RNA polymerase, regulatory protein factors, transcription factors, ribosomes and tRNA. When supplemented with cofactors, nucleotides and the specific gene template, these extracts can synthesize proteins of interest in a few hours.

 

Although not sustainable for large-scale production, cell-free, or in vitro translation (IVT), protein expression systems have several advantages over traditional in vivo systems. Cell-free expression allows for fast synthesis of recombinant proteins without the hassle of cell culture. Cell-free systems enable protein labeling with modified amino acids, as well as expression of proteins that undergo rapid proteolytic degradation by intracellular proteases. Also, with the cell-free method, it is simpler to express many different proteins simultaneously (e.g., testing protein mutations by expression on a small scale from many different recombinant DNA templates). In this representative experiment, an IVT system was used to express human caspase-3 protein.

Caspase-3 expression in a human IVT system. Caspase-3 was expressed using the Thermo Scientific 1-Step Human High-Yield IVT Kit (Human IVT) and in E. coli (Recombinant). Caspase-3 activity was assayed using equal amounts of protein. Caspase-3 expressed using the IVT system was more active than the protein expressed in bacteria.


Setting up an effective protein expression lab

Points to consider in setting up a protein expression lab

Whether you are setting up a new lab or introducing protein expression into your current workflow, here are some of the short- and long-term planning considerations for your project.

Protein expression essentials checklist

The specific requirements of a protein expression laboratory depend mainly on the type of research conducted. However, if you're planning to express proteins in mammalian cells, note that all cell culture laboratories share the requirement of being free from pathogenic microorganisms (i.e., asepsis), as well as some of the same basic equipment that is essential for culturing cells.

The following checklist provides a concise list of recommended consumables, instruments, and equipment to guide you in starting your protein expression lab.


Overview of downstream applications for recombinant proteins

Recombinant proteins are used in applications ranging from research and development to industrial production, including disease treatment, protein engineering, high-throughput screening, and more. Here we highlight some of the key applications that rely on recombinant protein expression.