Difference between revisions of "20.109(S16):In situ cloning (Day1)"

(Created page with "{{Template:20.109(S16)}} <div style="padding: 10px; width: 790px; border: 5px solid #33CC66;"> ==Introduction== The process of scientific inquiry encompasses much more than...")
 
(Part 5: Confirmation digest)
 
(121 intermediate revisions by 3 users not shown)
Line 5: Line 5:
 
==Introduction==
 
==Introduction==
  
The process of scientific inquiry encompasses much more than the collection and interpretation of data. A key part of the process is design – specifically of experiments that address a hypothesis and of new materials or technologies. Moreover, any design is subject to continued revision. You might redesign an experiment or tool based on your own research, or you might consult the vast body of scientific literature for other perspectives. As the old graduate student saying (sarcastically) goes, “A month in the lab might save you a day in the library!” In other words, although the process of combing the literature can be arduous or even tedious at times, it beats wasting a month of your time repeating experiments already proven not to work or reinventing the wheel.
+
Though the theme of Module 1 is protein engineering, today will focus on a few key techniques used in DNA engineering. Because the sequence of proteins is determined by the sequence of the genes that encode them, learning how to manipulate DNA is an important first step. Today you will complete a cloning reaction to generate a protein expression vector that contains a gene that encodes a calcium-sensing protein.  This process is illustrated in the schematic below. Later you will use this construct to engineer a new calcium-sensing protein.
  
During this module, you will generate and test a new version of inverse pericam (IPC). Today, we will refer to a few primary research articles to familiarize ourselves with this recombinant protein and its constituent parts. The fluorescent component of IPC is an enhanced yellow fluorescent protein (abbreviated EYFP), one of the many derivatives of green fluorescent protein (GFP). GFP is naturally produced by jellyfish and was cloned into other organisms in the early 1990s. It has since been exploited as a genetically encodable reporter and mutagenized to vary its excitation and emission spectra. The other key component of inverse pericam is the protein calmodulin (CaM), a natural calcium sensor that is present in all eukaryotes (and briefly reviewed [http://www.ncbi.nlm.nih.gov/pubmed/10884684 here]). Calmodulin has many ligands that it binds only in the presence of calcium ion, including the peptide fragment M13. This conditional specificity for M13 binding is enabled by the change in CaM’s conformation when it binds calcium.
+
[[Image:Sp16 M1D1 cloning schematic.png|thumb|center|450px|'''Schematic of pRSET-IPC cloning.'''  First, the IPC insert is PCR amplified to generate multiple copies of the fragment that are flanked by restriction enzyme sites. Next, this fragment and the pRSET vector are digested to create compatible ends. Last, the compatible ends of the digested insert and vector are 'glued together' in a ligation reaction.]]
  
Within inverse pericam, M13 and CaM are located at opposite ends, surrounding a permuted (i.e., rearranged) version of EYFP. In the absence of calcium, this EYFP exhibits strong fluorescence. However, when enough calcium is added to a solution of inverse pericam, CaM and M13 interact, disrupting the conformation and, as a result, the fluorescence of EYFP. The transition from bright to dim fluorescence occurs over a particular concentration range of calcium. Your goal today is to propose a mutation that will shift the concentration range over which IPC fluorescence decreases. Specifically, you will modify the calcium sensor portion of inverse pericam in a manner that is likely to increase or decrease its affinity for calcium ion.  
+
The cloning vector we will use is pRSET. This vector has several features that make it ideal for cloning and protein expression -- both of which are important for this module. The calcium-sensing protein we will study in Module 1 is inverse pericam (IPC).  We will discuss this protein in much more detail later; for now it is sufficient to know that IPC was engineered to measure calcium concentrations. To generate your final product you will use three common DNA engineering techniques: PCR amplification, restriction enzyme digestion, and ligation.
  
To examine potential modifications to inverse pericam, we will use several protein analysis tools. Proteins are modular materials that may be described and examined at multiple levels of a structural hierarchy (from primary to quaternary in the classical paradigm). Primary structure refers to a protein’s amino acid sequence, which might reveal a cluster of charged residues or a pattern of alternating polar and nonpolar residues. One cannot predict off-hand the conformation of a protein merely from its linear sequence; however, due to rotational flexibility of bonds and non-covalent interactions between non-adjacent amino acids (as well as covalent disulfide bonds) some structural characteristics can be inferred.
+
===PCR amplification===
  
Physical methods used to interrogate 3D protein structure include X-ray diffraction (XRD), electron microscopy (EM), and nuclear magnetic resonance (NMR) spectroscopy. The paper by [http://www.ncbi.nlm.nih.gov/pubmed/7552747 Zhang et al.] that you will refer to today describes the decoding of calmodulin’s structure using NMR, which depends on subjecting molecules to electromagnetic fields and analyzing the resulting energy absorption spectra of their nuclei. Scientists who elucidate protein structures, in addition to publishing their results, will often add them to public databases such as the Protein Data Bank ([http://www.pdb.org/pdb/home/home.do PDB]). Because many proteins have structural motifs in common (e.g., alpha helices and beta sheets at the secondary level, or leucine-rich repeats at the tertiary level), which ultimately arise from their amino acid sequences, such databases can be useful for making predictions about proteins with known amino acid sequences but unknown structures. Today we will use a computer program that harnesses information in the Protein Data Bank to display interactive 3D models.
+
The applications of PCR (polymerase chain reaction) are widespread, from forensics to molecular biology to evolution, but the goal of any PCR is the same: to generate many copies of DNA from a single or a few specific sequence(s) (called the “target” or “template”).  
  
[[Image:20.109_SDM-Nobel.png|thumb|right|150px| Michael Smith, 1993 Chemistry Nobel Prize co-winner (with Kary Mullis, inventor of PCR) for developing site-directed mutagenesis.]]
+
In addition to the target, PCR requires only three components: primers to bind sequence flanking the target, dNTPs to polymerize, and a heat-stable polymerase to carry out the synthesis reaction over and over and over. DNA polymerases require short initiating pieces of DNA (or RNA) called primers in order to copy DNA. In PCR amplification, forward and reverse primers that target the non-coding and coding strands of DNA, respectively, are separated by a distance equal to the length of the DNA to be copied. Good primers must meet several design criteria. Length is one important design feature. Primers that are too short may lack requisite specificity for the desired sequence, and thus amplify an unrelated sequence. The longer a primer is, the more favorable are its energetics for annealing to the template DNA, due to increased hydrogen bonding. On the other hand, longer primers are more likely to form secondary structures such as hairpins, leading to inefficient template priming. Two other important features are G/C content and placement. Having a G or C base at the end of each primer increases priming efficiency, because a GC pair (three hydrogen bonds) binds more strongly than an AT pair (two hydrogen bonds). AT pairs at the primer ends therefore decrease the stability of the primer-template complex. Overall G/C content should ideally be 50 +/- 10%, because long stretches of G/C or A/T bases are both difficult to copy. The G/C content also affects the melting temperature. PCR is a three-step process (denature, anneal, extend) and these steps are repeated 20 or more times. Because each cycle can at most double the number of copies, 30 cycles of PCR could yield as many as a billion (2<sup>30</sup>) copies of the original target sequence.
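The "billion copies" figure follows from repeated doubling. The short Python sketch below works through that arithmetic; the function name <code>pcr_copies</code> and the <code>efficiency</code> parameter are illustrative assumptions, not part of the lab protocol, and real reactions plateau well before the ideal value.

```python
def pcr_copies(cycles: int, templates: int = 1, efficiency: float = 1.0) -> float:
    """Estimate copies of the target after a number of PCR cycles.

    efficiency=1.0 assumes every template is copied in every cycle
    (perfect doubling), so the result is an upper bound.
    """
    return templates * (1.0 + efficiency) ** cycles


# Compare a few cycle numbers, starting from a single template molecule.
for n in (20, 25, 30):
    print(f"{n} cycles: {pcr_copies(n):.3g} copies")
```

Lowering <code>efficiency</code> (for example to 0.9) shows how quickly the yield falls away from the ideal case.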
  
After examining both two- and three-dimensional protein information, you will select primers that incorporate a mutation to the wild-type inverse pericam protein and use site-directed mutagenesis to incorporate the corresponding base pair changes. The  site-directed mutagenesis (SDM) strategy you will use shares some features with the polymerase chain reaction (PCR) for DNA amplification. Recall from Module 1 that PCR amplification involves multiple cycles of melting, annealing, and extending. To create one or more base-pair mutations in the product DNA, primers that have a slight mismatch to the original template can be used. At a low enough annealing temperature (~25 &deg;C below the primer melting temperature as defined for mutagenesis), these nearly-complementary primers will still anneal to the template DNA, but the copies created during the extension phase will contain the mutation.
+
[[Image:Be109karymullis.jpg|thumb|left|150px|'''Kary Mullis.''']]
  
You will combine the mutagenic primers of your choice with plasmid DNA encoding wild-type inverse pericam. These will be acted upon by a DNA polymerase to generate a plasmid that carries the mutated inverse pericam gene. Even more copies of the mutant plasmid can be made by introducing it into bacteria in a process called transformation, which you are familiar with from Module 1. Remember that there is still parental -- that is, non-mutant -- DNA present in your SDM reaction mixture. In order to propagate ''only'' the mutant plasmid upon introduction into bacteria, the parental DNA is selectively digested using the ''DpnI'' enzyme prior to bacterial transformation. The underlying selective property is that ''DpnI'' only digests methylated DNA. Therefore, the synthetically made (and thus non-methylated) mutant DNA is not digested, while the parental DNA is digested due to methylation by the host bacterial strain originally used to amplify it. The resulting small linear pieces of parental DNA are simply degraded by the bacteria, whereas the intact (due to a ligation reaction) mutant DNA is amplified by the bacteria.
+
Based on the numerous applications of PCR, it may seem that the technique has been around forever. In fact it is just over 30 years old. In 1984, Kary Mullis described this technique for amplifying DNA of known or unknown sequence, realizing immediately the significance of his insight.  
  
[[File:Fa15 NEB Q5 SDM Schematic.png|thumb|center|550px| Schematic from NEB Q5 Site Directed Mutagenesis Kit Manual]]
+
''"Dear Thor!," I exclaimed. I had solved the most annoying problems in DNA chemistry in a single lightning bolt. Abundance and distinction. With two oligonucleotides, DNA polymerase, and the four nucleosidetriphosphates I could make as much of a DNA sequence as I wanted and I could make it on a fragment of a specific size that I could distinguish easily. Somehow, I thought, it had to be an illusion. Otherwise it would change DNA chemistry forever. Otherwise it would make me famous. It was too easy. Someone else would have done it and I would surely have heard of it. We would be doing it all the time. What was I failing to see? "Jennifer, wake up. I've thought of something incredible." '' --Kary Mullis from his Nobel lecture; December 8, 1993
  
<br style="clear:both;"/>  
+
<br style="clear:both;"/>
  
Now might be a good time to mention why we care about measuring intracellular calcium in the first place. Calcium is involved in many signal transduction cascades, which regulate everything from immune cell activation to muscle contraction, from adhesion to apoptosis - see for example [http://www.ncbi.nlm.nih.gov/pubmed/18083096?ordinalpos=1&itool=EntrezSystem2.PEntrez.Pubmed.Pubmed_ResultsPanel.Pubmed_RVDocSum  this review by David Clapham in Cell], or [http://www.ncbi.nlm.nih.gov/pubmed/11830654?ordinalpos=6&itool=EntrezSystem2.PEntrez.Pubmed.Pubmed_ResultsPanel.Pubmed_RVDocSum this one by Ernesto Carafoli in PNAS]. Intracellular calcium (Ca<sup>2+</sup>) is normally maintained at ~100 nM, orders of magnitude less than the ~mM concentration outside the cell. ATPase pumps act to keep the basal concentration of cytoplasmic calcium low. Often calcium acts as a secondary messenger, i.e., it relays a message from the cell surface to its cytoplasm. For example, a particular ligand may bind a cell surface receptor, causing a flood of calcium ions to be released from the intracellular compartments in which they are usually sequestered. These free ions in turn may promote phosphorylation or other downstream signaling.
+
===Restriction enzyme digest===
+
  
The proteins that bind calcium do so with a great variety of affinities, and have roles ranging from sequestration to sensing. Some calcium responses may have long-term effects, particularly in the case of transcription factors that can bind calcium.  As discussed in lecture, calmodulin works as a calcium sensor by undergoing a conformational change upon calcium binding. Your goal today is to generate mutant calmodulin (in the context of inverse pericam) DNA, in an attempt to alter the affinity of the resulting protein for calcium.
+
[[Image:Mod1 1 eco ri.jpg|thumb|right|300px|'''Restriction enzyme digest with EcoRI.''' ''EcoRI'' cuts between the G and the A on each strand of DNA, leaving a single stranded DNA overhang (also called a “sticky end”) when the phosphate backbone is cleaved.]]
  
After today's lab session, the teaching faculty will transform your mutated plasmids into cells that are able to generate multiple copies.   When you return you will receive your purified (and hopefully mutated) plasmid. The details will be discussed further during prelab.
+
Restriction endonucleases, also called restriction enzymes, 'cut' or 'digest' DNA at specific sequences of bases. The restriction enzymes are named according to the prokaryotic organism from which they were isolated. For example, the restriction endonuclease ''EcoRI'' (pronounced “echo-are-one”) was originally isolated from ''E. coli'', giving it the “Eco” part of the name. The “R” indicates the particular ''E. coli'' strain (RY13), and the “I” indicates that it was the first restriction enzyme isolated from this strain.
  
==Protocols==
+
The sequence of DNA that is bound and cleaved by an endonuclease is called the recognition sequence or restriction site. These sequences are usually four or six base pairs long and palindromic, that is, they read the same 5’ to 3’ on the top and bottom strand of DNA. For example, the recognition sequence for ''EcoRI'' is below (see also figure at right).  ''EcoRI'' cleaves the phosphate backbone of DNA between the G and A of the recognition sequence on each strand, which generates single-stranded overhangs, or 'sticky ends', on the double-stranded DNA.
  
===Part 1: Protein backbone===
+
<font face="courier">
 +
5’ GAATTC 3’<br>
 +
3’ CTTAAG 5’
 +
</font>
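To make the idea of a palindromic recognition sequence concrete, here is a minimal Python sketch. It checks whether a site reads the same 5’ to 3’ on both strands and splits the top strand of a linear sequence at the ''EcoRI'' cut position (between the G and the A). The function names and the toy sequence are illustrative assumptions; the sketch ignores the bottom strand, circular DNA, and ambiguous bases.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """True if the site reads the same 5' to 3' on both strands."""
    return site.upper() == reverse_complement(site)


def digest_top_strand(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> list:
    """Split the top strand of a linear sequence at every occurrence of `site`.

    cut_offset is the number of bases from the start of the site to the cut
    (1 for EcoRI, which cuts between the G and the A).
    """
    seq = seq.upper()
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


print(is_palindromic("GAATTC"))           # EcoRI site from the text
print(digest_top_strand("AAAGAATTCTTT"))  # toy sequence with one EcoRI site
```

The second fragment of the toy digest begins with AATTC; the first four of those bases (AATT) are the single-stranded overhang on that end.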
  
Perhaps nothing is so conducive to a feeling of intimate familiarity with a protein as studying it at the amino acid level (primary structure). For the first part of lab today, you will examine a two-dimensional representation of inverse pericam.  
+
Unlike ''EcoRI'', some other restriction enzymes cut precisely in the middle of the palindromic DNA sequence, thus leaving no overhangs after digestion. The single-stranded overhangs resulting from DNA digestion by enzymes such as ''EcoRI'' are called sticky ends, while double-stranded ends resulting from digestion by enzymes such as ''HaeIII'' are called blunt ends. ''HaeIII'' recognizes the following sequence:
  
[[Image:20.109_IPC_Nagai_Fig1.jpeg|thumb|right|400px|Figure 1 from Nagai ''et al.'', ''PNAS'' '''98''':3197 (2001)]]
+
<font face="courier">
#Begin by downloading [[Media:20109_IPC_class-version.gb | this file]], which contains the DNA sequence of inverse pericam (IPC) in GenBank format. Open the file in ApE (''A plasmid Editor'', created by M. Wayne Davis at the University of Utah), which you used during Module 1, and save as a new file called 20109_IPC_YourTeamDay-YourTeamColor.
+
5’ GGCC 3’<br>
#In the sequence file, the M13 peptide is highlighted in magenta, the EYFP sequence in yellow, and calmodulin (CaM) in green. Linker sequences connecting these three parts are shown in blue lettering. Refer to Figure 1 of the paper by [http://www.ncbi.nlm.nih.gov/pubmed/11248055 Nagai et al.], which depicts the inverse pericam construct in schematic form, to cement your understanding of how the different components of IPC are connected.
+
3’ CCGG 5’
#As you go through the steps below, use the ''New Feature'' option on the ''Features'' menu to mark the calcium binding sites (and whatever else you deem relevant) in your IPC file. First, you will probably find it helpful to select ''ORFs'' &rarr; ''Translate'' for an amino acid level view of IPC.
+
</font>
#*Select the following options for the translation: ''1 letter'' code, line numbers ''left'', DNA ''below'', and copy highlight ''checked on''.
+
#To help you locate the binding sites for calcium (in the calmodulin portion of IPC), read the following portions of the [http://www.nature.com/nsmb/journal/v2/n9/abs/nsb0995-758.html Zhang paper], along with skimming whatever else you find useful: abstract, first two paragraphs, “Linker and loop flexibility” section.
+
#In  your IPC sequence document, you will mark the amino acid residues that make up the calcium-binding loops in CaM in orange. Begin by looking for the DNA or amino acid sequence of CaM on a site such as [http://www.ncbi.nlm.nih.gov/ NCBI]: choose ''Proteins'' &rarr; ''Protein Database'', search for calmodulin, and scroll down for the amino acid sequence. The CaM sequence is highly conserved across species, so you can refer to almost any sequence and compare it to the one in your file. Are any residues of calmodulin missing in IPC? Why might this be? If you get stuck, use the fact that the CaM within inverse pericam is an E104Q mutant, that is, the 104th residue of calmodulin is Q, to keep yourself oriented.
+
#Do the four calcium binding loops share any common features? You might imagine that negating or enhancing such features could decrease or increase calcium affinity, respectively.
+
#If you find other areas of calmodulin that you may be interested in mutagenizing (e.g., hydrophobic pockets), mark these as well. You may find the “Loss of hydrophobic cavities” section in Zhang ''et al.'' helpful.
+
#As you consider sites that may alter calcium binding, keep the following in mind:
+
#*When this module first debuted, everyone mutated residues directly in the calcium binding loops, and very few groups saw dramatic changes in affinity or cooperativity of calmodulin with respect to calcium. In some years, class-wide results suggested that mutations in the first two binding loops were more likely to have an effect than mutations in the latter two binding loops. Some folks also targeted non-binding structural areas, but results were inconclusive. You may repeat or otherwise build upon prior results as long as you give your own reasoning.
+
  
Print out your annotated document and hang on to it for reference. Now let’s put some visuals to all those letters!
+
===Ligation===
  
===Part 2: Higher-order protein features===
+
In a ligation reaction, DNA ends are covalently attached to one another via the ligase enzyme.  The efficiency of the reaction is related to the type of DNA ends: compatible sticky ends will ligate more efficiently than blunt ends, and non-compatible sticky ends will not be ligated because their overhangs cannot form hydrogen bonds with one another.  To initiate the ligation reaction, hydrogen bonds are formed between the compatible overhangs of DNA fragments.  The ligase enzyme then forms a covalent phosphodiester bond between the 3' hydroxyl end of the 'acceptor' nucleotide and the 5' phosphate end of the 'donor' nucleotide.
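Whether two sticky ends are "compatible" can be checked by asking whether their overhangs are reverse complements of one another. The Python sketch below illustrates this under simplifying assumptions: both overhangs are written 5' to 3', both are the same type (5' overhangs), and the helper name <code>overhangs_anneal</code> and the non-''EcoRI'' example overhangs are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def overhangs_anneal(overhang_a: str, overhang_b: str) -> bool:
    """True if two single-stranded overhangs (each written 5' to 3') can base-pair.

    An empty string stands for a blunt end: there is nothing to anneal,
    so the function returns False even though blunt ends can still be
    ligated (less efficiently).
    """
    if not overhang_a or not overhang_b:
        return False
    return overhang_a.upper() == reverse_complement(overhang_b)


# EcoRI leaves a 5'-AATT overhang on both fragments, so EcoRI ends pair with each other.
print(overhangs_anneal("AATT", "AATT"))
# A made-up, non-matching overhang cannot pair with an EcoRI end.
print(overhangs_anneal("AATT", "GGAT"))
```

Because the ''EcoRI'' overhang is its own reverse complement, any two ''EcoRI''-cut ends can anneal, which is why a vector cut with a single enzyme can also close back on itself.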
  
Unless we are precocious bioengineers indeed, looking at the amino acid sequence alone is unlikely to tell us too much about the protein. We might be left wondering where the binding sites for M13 and for calcium ions are located in calmodulin, for example. In the previous section, you read some primary scientific literature to locate these features. Now you will use a tool called Protein Explorer to visualize them. As you work, you can ask yourself why these stretches of the protein might work the way that they do, and how they might be changed.
+
[[Image:Mod1 3 dnaligatn.jpg|thumb|left|400px|'''Schematic of DNA ligation.''']]
  
#Protein Explorer is a free web-based viewer for biological molecules. To access it, open the Firefox browser and load [http://proteinexplorer.org proteinexplorer.org]. Choose “FirstGlance in Jmol” to proceed.
+
The first step in ligation is the addition of AMP (adenylation) to a lysine residue within the active site of DNA ligase, which releases a pyrophosphate. Next, the AMP is transferred to the 5' phosphate of the donor nucleotide, resulting in the formation of a pyrophosphate bond. Lastly, a phosphodiester bond is formed between the 5' phosphate of the donor nucleotide and the 3' hydroxyl of the acceptor nucleotide, releasing the AMP.
#Structures are organized according to [http://www.pdb.org/pdb/home/home.do PDB] (Protein Data Bank) identification codes, which may be input at the prompt at the top of the page. Begin by looking at the molecule with PDB ID number 1CLL, which is a calcium-bound form of calmodulin. Later you will search for an example of the ligand-free form, also called apo calmodulin.
+
#The program will open in FirstView mode for the structure you’ve chosen (ensure that popup blockers are off if the structure fails to load). On the right is the image panel, which shows your protein along with associated ligands (in this case, calcium). Try clicking and dragging on the rotating image to see what happens.
+
#Now look at the control panel on the upper left: here you can modify the image. Try adding and removing water molecules and ligands to see where they interact with the protein.
+
#As you explore the features of the control panel and image panel, be sure to observe the message frame window on the lower left for any relevant information that may pop up. If you click on an atom in the image panel, its atomic identity will be displayed in the message frame, along with its encompassing amino acid residue and position.
+
#From the control panel, click on the PDB icon, which leads to detailed information about the publication upon which the model image is based.
+
#To find further options for modifying how you view the image, or search for particular atoms, click on ''More Views'' in the control panel, or on ''Jmol'' at the bottom right of the image panel. For example, you can highlight specific amino acids, or change from a backbone trace to a space-filling model. Explore these features. For example, you might use color to highlight all the acidic amino acids in calmodulin.
+
#Be sure to note any useful information in your notebook as you go. You might ask:
+
#*what method was used to elucidate the structure of this protein?
+
#*how good is the image resolution?
+
#*which species did this protein come from?
+
#*when did the authors publish their results?
+
#*what are the major components of the molecule’s secondary structure?
+
#*what do the calcium binding loops (or other areas of interest you found) look like?
+
#Once you are satisfied with your understanding of calcium-bound calmodulin, bring up an apo calmodulin structure (or two) for comparison. You might find the structure directly by using [http://www.pdb.org/pdb/home/home.do PDB], or by using the [http://www.ncbi.nlm.nih.gov/ NCBI] Structure database. Write a few sentences in your lab notebook describing the differences between the calcium-bound and apo forms of calmodulin.
+
  
===Part 3: Primer design for mutagenesis===
 
 
[[File:Fa15 NEB Q5 SDM Primer Design.png|thumb|right|500px| Primer design for NEB Q5 Site Directed Mutagenesis]]
 
 
====Mutations====
 
 
It wouldn’t be very experimentally efficient to somehow pick out and modify a single residue on inverse pericam post-translationally. Instead, researchers genetically encode desired mutations, by making mutated copies of a plasmid originally containing inverse pericam DNA. As you learned in Module 1, DNA polymerases require short initiating pieces of DNA (or RNA) called primers in order to copy DNA. Besides non-mutagenic amplification of a specific piece of DNA, synthetic primers can be used for incorporating desired mutations into DNA. For amplification, forward and reverse primers that target the non-coding and coding strands of DNA, respectively, are separated by a distance equal to the length of the DNA to be copied (see figure, part A). In contrast, primer design for site-directed mutagenesis is quite straightforward: both primers are directed at the same location on each strand, and thus will be precisely complementary (see figure, part B). Both direct and mutagenic amplification require cycles of DNA melting, annealing, and extension.
 
 
<br style="clear:both;"/>
 
<br style="clear:both;"/>
  
As you know from Module 1, good primers must meet several design criteria in order to promote specificity and efficiency of the desired amplification. Length is one important design feature. Primers that are too short may lack requisite specificity for the desired sequence, and thus amplify an unrelated sequence. The longer a primer is, the more favorable are its energetics for annealing to the template DNA, due to increased hydrogen bonding. On the other hand, longer primers are more likely to form secondary structures such as hairpins, leading to inefficient template priming. Two other important features are G/C content and placement. Having a G or C base at the end of each primer increases priming efficiency, due to the greater energy of a GC pair compared to an AT pair. The latter decrease the stability of the primer-template complex. Overall G/C content should ideally be 50 +/- 10%, because long stretches of G/C or A/T bases are both difficult to copy. The G/C content also affects the melting temperature, which should be high for mutagenesis.
+
==Protocols==
  
'''Consider the following design guidelines for mutagenesis primers and think about how these differ from the guidelines for amplification:'''
+
===Part 1: Laboratory orientation quiz===
  
#The desired mutation (1-3 bp) must be present in the middle of the forward primer.
+
Complete the orientation quiz with your partner. Though you are working with your partner, each student should record all answers on the provided quiz. If you disagree with your partner on an answer, you should write what you think is the correct answer on your quiz.
#The forward and reverse primers should 'face' away from the mutation and be 'back-to-back' when annealed to the template.
+
#The primers should be 25-45 bp long.
+
#A G/C content of > 40% is desired.
+
#Both primers should terminate in at least one G or C base.
+
#The melting temperature should exceed 78&deg;C, according to:
+
#*T<sub>m</sub> = 81.5 + 0.41 (%GC) – 675/N - %mismatch
+
#*where N is primer length, and the two percentages should be integers.
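The melting-temperature guideline above can be evaluated with a few lines of Python. This is a sketch of that one formula only, with %GC and %mismatch rounded to integers as stated; the function name and the 30-base example primer are made up for illustration and are not taken from inverse pericam.

```python
def mutagenesis_tm(primer: str, n_mismatches: int) -> float:
    """T_m = 81.5 + 0.41 (%GC) - 675/N - %mismatch, for a mutagenic primer.

    N is the primer length; both percentages are rounded to integers,
    as the guideline specifies.
    """
    primer = primer.upper()
    n = len(primer)
    pct_gc = round(100 * (primer.count("G") + primer.count("C")) / n)
    pct_mismatch = round(100 * n_mismatches / n)
    return 81.5 + 0.41 * pct_gc - 675 / n - pct_mismatch


# Hypothetical 30-base primer (50% GC) carrying a 2-base mismatch.
example = "ACGT" * 7 + "AC"
print(f"{mutagenesis_tm(example, 2):.1f}")
```

For this made-up primer the result is 72.5 &deg;C, below the 78 &deg;C target, which suggests lengthening the primer or raising its G/C content.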
+
  
To demonstrate primer design, the illustration below uses S101L, which is an uninteresting mutation but is a straightforward teaching example.
+
Good luck!
  
<div style="padding: 10px; width: 760px; border: 5px solid #996600;">
+
===Part 2: PCR amplification and restriction enzyme digest of IPC insert===
Residue 101 of calmodulin is serine, encoded by the AGC codon. This is residue 379 with respect to the entire inverse pericam construct,
+
and we can find it and some flanking code in the DNA sequence from Part 1:
+
  
<font face="courier">
+
Because DNA engineering at the benchtop can take days, if not weeks, you will generate your clone ''in silico'' today.  You can use any DNA manipulation software you choose to complete the protocols, but the instructions provided are for APE (A Plasmid Editor, created by M. Wayne Davis at the University of Utah).  The software can be downloaded free-of-charge from [http://bioweb.biology.utah.edu/jorgensen/wayned/ape/ this site] onto your personal computer or you can use the 20.109 laboratory computers.  Please note that if you use a different program the teaching faculty may not be able to assist you.
<small>
+
  
361 (5') GAG GAA ATC CGA GAA GCA TTC CGT GTT TTT GAC AAG GAT GGG AAC GGC TAC ATC AGC GCT
+
'''Be sure to document your work and answer all questions in your lab notebook as you progress through the exercises below.'''
  
381 (5') GCT CAG TTA CGT CAC GTC ATG ACA AAC CTC GGG GAG AAG TTA ACA GAT GAA GAA GTT GAT
+
[[Image:Sp16 M1D1 Part1 IPC insert.png|thumb|right|300px|'''PCR amplification and restriction enzyme digest of IPC insert.''']]
</small>
+
</font>
+
  
To change from serine to leucine, one might choose TTA, TTG, or CTN (where N = T, A, G, or C). Because CTC requires only two mutations (rather than three as for the other options), we choose this codon.
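The codon choice can be double-checked by counting, for each leucine codon, how many bases differ from the original AGC. A minimal Python sketch (the helper name <code>base_changes</code> is an illustrative assumption):

```python
LEUCINE_CODONS = ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"]


def base_changes(original: str, candidate: str) -> int:
    """Count the positions at which two codons differ."""
    return sum(a != b for a, b in zip(original, candidate))


# Rank the leucine codons by how many bases must change from serine AGC.
ranked = sorted(LEUCINE_CODONS, key=lambda codon: base_changes("AGC", codon))
for codon in ranked:
    print(codon, base_changes("AGC", codon))
```

CTC sorts first with two changes; every other leucine codon needs three.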
+
To amplify a specific sequence of DNA, you first need to design primers -- one primer that anneals at the start of the sequence of interest and a second primer that anneals at the end of the sequence of interest. Today you will design a 'forward' primer that anneals to the non-coding DNA strand and reads toward the IPC gene and a 'reverse' primer that anneals to the coding DNA strand at the end of the IPC gene and reads back into it.  Each primer will consist of two parts:  the 'landing sequence' will anneal to the sequence of interest and the 'flap sequence' will be used to add a restriction enzyme recognition sequence to your IPC insert.
  
Now we must keep 15-20 bp of sequence on each side in a way that meets all our requirements. To quickly find G/C content and see secondary structures, look at the [http://www.idtdna.com/calc/analyzer IDT website]. (Note that the T<sub>m</sub> listed at this site is '''''not''''' one that is relevant for mutagenesis.)  
+
#Find the IPC insert sequence [[Media:Sp16 M1D1 IPC sequence.docx| here]].
 +
#*Open APE then copy and paste the sequence into a new workspace.
 +
#*Record the size of the IPC gene in your notebook.
 +
#Because we want to amplify the entire gene, the landing sequence of the forward primer will begin with the first basepair of the sequence.
 +
#*Record the first 20 basepairs of the IPC gene sequence in your notebook.
 +
#Several websites are available to help you evaluate the characteristics of your primer.  We will use the Oligoanalyzer tool provided by Integrated DNA Technologies (IDT).
 +
#*Copy and paste the 20 basepair sequence into Sequence box at the [https://www.idtdna.com/calc/analyzer IDT website].  
 +
#*Leave the defaults for stems and loops as they are and then click Analyze.
 +
#*Use the following guidelines to evaluate your primer:
 +
#**length:  17-28 basepairs
 +
#**GC Content:  50-60%
 +
#**Tm:  60-65&deg;C
 +
#**avoid hairpins, complementation between primers, and repetitive sequences
 +
#*If your primer does not fit the guidelines provided above, try altering the length. '''Remember''' that the 5’ end of the landing sequence must not change or you will delete basepairs from your gene.
 +
#*When you are satisfied with the landing sequence, use the Features tool to label the forward primer sequence on your APE file (''Features'' &rarr; ''New Feature'').
 +
#Now that you have your landing sequence you will add a flap sequence that introduces a restriction enzyme recognition sequence.
 +
#*As shown in the schematic of our cloning strategy, we need to add a BamHI recognition sequence to our forward primer.  Search the [https://www.neb.com/products/restriction-endonucleases/restriction-endonucleases NEB catalog] to find the BamHI recognition sequence.  Record the recognition sequence and the cleavage sites within the sequence.
 +
#*Add the recognition sequence for the BamHI restriction enzyme to the landing sequence.  Consider the direction in which PCR amplification occurs to determine which end of your primer should carry the flap sequence.
 +
#**For reasons that will be evident later, you must also include an extra basepair between the BamHI recognition sequence and your landing sequence.  Add a 'T' at this location in your primer.
 +
#*In addition to the recognition sequence, it is important to include a 6 basepair 'tail' or 'junk' sequence to ensure the restriction enzyme is able to bind and cleave the DNA. Learn more about why this is necessary from scientists at [https://www.neb.com/tools-and-resources/usage-guidelines/cleavage-close-to-the-end-of-dna-fragments NEB]. Add any sequence of 6 basepairs to your primer flap sequence.  Carefully consider where this sequence should appear in your primer.
 +
#Record the sequence (5' &rarr; 3') of your forward primer in your notebook.
 +
#Use steps 2-5 to design your reverse primer.  Please keep the following notes in mind:
 +
#*Because you want to amplify the entire gene you should start with the last basepair of the sequence.
 +
#*Do NOT add a 'T' between the enzyme recognition sequence and your landing sequence.
 +
#*You will add an EcoRI restriction recognition site to your reverse primer.
 +
#*Remember that the reverse primer anneals to the coding DNA strand at the end of the IPC gene and reads back into it.  Keep this in mind when you add the flap sequence and when you record the sequence (5' &rarr; 3') of your primer in your notebook.
 +
#Create a new APE file that depicts the IPC product you would expect if you used your primers in a PCR amplification reaction.
 +
#*What is the size of your PCR product?  How does this compare to the size of the gene you recorded in Step #1?
 +
#*Add the sequence information to your notebook (it may be easiest to screen capture your work station in APE and embed the image in your notebook).
 +
#Now that you have your amplified IPC insert, you need to digest with BamHI and EcoRI to generate 'sticky ends' that will enable you to ligate your insert into the pRSET vector.
 +
#Create another new APE file that depicts your amplified IPC product following a BamHI and EcoRI double-digest.
 +
#*What is the size of your digest product?  How does this compare to the size of your PCR product?
  
Ultimately,  your forward primer might look like the following, which has a T<sub>m</sub> of almost 81&deg;C, and a G/C content of ~58%.
+
===Part 3: Restriction enzyme digest of pRSET vector===
  
<font face="courier">
+
To prepare for the ligation step, it is important to generate compatible 'sticky ends' on the insert and vector.  Above, you digested your IPC amplicon (PCR amplification product) with BamHI and EcoRI in a double-digest.  Here you will digest your pRSET vector to create compatible ends that can be ligated together.
5’ GG AAC GGC TAC ATC CTC GCT GCT CAG TTA CGT CAC G
+
</font><br>
+
The reverse primer is the inverse complement of a sequence just preceding the forward primer in the IPC gene.  The forward and reverse primers are set up back-to-back.
+
</div style>
+
  
====Insertions====
+
[[Image:Sp16 M1D1 Part2 pRSET vector.png|thumb|right|250px|'''Restriction enzyme digest of pRSET vector.''']]
  
It is also possible to incorporate insertions using SDM. Here, the purpose of the insertion is to limit alterations to the protein while still adding a recognizable DNA tag. Thus, the amino acids that are added should, ideally, be small and neutral. For example, note the structure of Glycine:
+
#Find the pRSET vector sequence [[Media:Sp16 M1D1 PRSET sequence.docx| here]].
 +
#*Copy and paste the vector sequence into a new APE workspace.
 +
#Commercially available cloning vectors are engineered to contain a Multiple Cloning Site (MCS). The MCS is a short segment of DNA that encodes several restriction enzyme recognition sites.  These restriction enzyme recognition sites are provided so that researchers can clone their genes of interest into a specific location of the vector.
 +
#*Using the Feature tool, label basepairs 192-248 as the MCS.
 +
#To locate the BamHI and EcoRI recognition sequences within the MCS, go to ''Enzymes'' &rarr; ''Enzyme selector''.
 +
#*Select EcoRI and click Graphic Map.  An image of the plasmid should appear in a separate window with the recognition sites marked. In addition, the recognition sequences should be highlighted in the sequence that is in your workspace.
 +
#*Using the feature tool, label the BamHI and EcoRI recognition sequences.
 +
#Save your labelled pRSET file.
  
[[File:Fa15 glycine structure.png|thumb|center|250px| Glycine structure]]
+
<br style="clear:both" />
  
 +
===Part 4: Ligation of IPC insert and pRSET vector===
  
*Why is it important to consider the size and charge of the amino acid residues that you add to your protein?
+
When you complete a ligation at the bench, one very important step is to calculate the amounts of DNA you will use in the reaction.  Use the steps below to calculate the amount of IPC insert and pRSET vector you would use to complete this ligation in the laboratory.
  
One category of useful DNA tags is restriction enzyme sites. Recall from Module 1 that these sequences, usually short and palindromic, are recognized by enzymes that cut the sites in unique and specific ways.
+
[[Image:Sp16 M1D1 recovery gel.png|thumb|right|200px|'''Recovery gel for ligation calculations.''' Lane 1 = pRSET vector, Lane 2 = molecular weight ladder, and Lane 3 = IPC insert.]]
  
*Why might it be useful to add a restriction enzyme recognition sequence?
+
#Calculate the concentration of backbone and of insert you would use in a ligation reaction based on the recovery gel posted on the right.
 +
#*For both the insert and the vector, 5 &mu;L of DNA was loaded into the gel.
 +
#*Refer to the [https://www.neb.com/products/n3232-1-kb-dna-ladder NEB marker] definitions to determine the ''ng'' of DNA in each lane.  Note that the ''ng'' listed are for 10 &mu;L of ladder and in the gel shown we loaded 20 &mu;L of ladder.
 +
#Convert the mass concentration to a molar concentration, using the fact that the average mass of a DNA base pair is about 660 g/mol. This conversion will mostly cancel out between the insert and the backbone, except for the difference in number of base pairs. Feel free to either omit steps that will cancel if you are comfortable doing so, or to keep them if you follow the math better that way.
 +
#Ideally, you will use 50-100 ng of backbone in this ligation.
 +
#*Referring to the mass concentration, what volume of DNA will this amount require?
 +
#Ideally, you will use a 4:1 '''''molar''''' ratio of insert to backbone.
 +
#*Referring to the molar concentrations, how much insert do you need per &mu;L of backbone?
 +
#A 15 &mu;L scale ligation should not include more than 13.5 &mu;L of DNA because you must leave enough volume to add buffer and the ligase enzyme.
 +
#*If your backbone and insert volumes total to greater than this amount, you must (1) scale down both DNA amounts, using less than 50 ng backbone and/or (2) stray from the ideal 4:1 molar ratio. You may ask the teaching faculty for advice during class if you are unsure what choice is best.
 +
#Be sure to record all of your work for the ligation calculations in your notebook.
 +
#*Feel free to take a picture of your hand-written work and embed the image in your notebook.
 +
#Next you will complete this ligation ''in silico'' to generate a plasmid map of your pRSET-IPC plasmid.[[Image:Sp16 M1D1 Part3 ligation.png|thumb|right|400px|'''Ligation of IPC insert and pRSET vector.''']]
 +
#To ligate your IPC fragment into your pRSET vector, copy the digested IPC sequence you generated above and paste it into your vector sequence.
 +
#*Recall where BamHI and EcoRI cut within their recognition sequences as you consider the exact basepairs between which you should paste your IPC insert.
 +
#*'''Hint:''' the IPC insert should be flanked by intact BamHI and EcoRI recognition sequences in your final cloning product.
 +
#Save the file of your pRSET-IPC and embed the plasmid map image in your notebook.
 +
#Now that you have generated your pRSET-IPC clone, discuss the following questions with your partner and record your answers in your notebook.
 +
#*Recall the 'T' that you added between the landing sequence and the BamHI recognition sequence of your forward primer.  What is the purpose of this additional basepair?
 +
#**'''Hint:''' Think about the spacing between the His tag (CATCATCATCATCATCAT) and the first codon of the IPC gene in your plasmid map.  This His tag will be incorporated onto the translated protein sequence and enable you to purify your protein using affinity chromatography.
 +
#*Why was a 'T' not added between the landing sequence and the EcoRI recognition sequence of your reverse primer?
 +
#*Why did you use two different restriction enzymes in the cloning strategy?
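The ligation arithmetic described in the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up for this example, and the fragment sizes and concentrations in the usage lines are placeholders, not values read from the recovery gel or from your APE files.

```python
# Average molar mass of one DNA base pair (g/mol), as used in the protocol.
BP_MASS = 660.0


def ng_to_fmol(mass_ng: float, length_bp: int) -> float:
    """Convert a mass of double-stranded DNA (ng) to femtomoles."""
    # ng -> g is 1e-9, mol -> fmol is 1e15, so the net factor is 1e6.
    return mass_ng * 1e6 / (BP_MASS * length_bp)


def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   ratio: float = 4.0) -> float:
    """Mass of insert (ng) giving the requested insert:vector molar ratio."""
    # The 660 g/mol factor cancels; only the length ratio remains.
    return ratio * vector_ng * insert_bp / vector_bp


def ligation_volumes(vector_ng, vector_ng_per_ul, insert_ng_per_ul,
                     vector_bp, insert_bp, ratio=4.0, max_dna_ul=13.5):
    """Return (vector uL, insert uL, whether both fit in the DNA volume limit)."""
    vector_ul = vector_ng / vector_ng_per_ul
    insert_ul = insert_mass_ng(vector_ng, vector_bp, insert_bp, ratio) / insert_ng_per_ul
    return vector_ul, insert_ul, (vector_ul + insert_ul) <= max_dna_ul


if __name__ == "__main__":
    # Placeholder numbers only: 50 ng of a 3000 bp backbone at 10 ng/uL,
    # a 1000 bp insert at 20 ng/uL, and a 4:1 insert:backbone molar ratio.
    v_ul, i_ul, fits = ligation_volumes(50, 10, 20, 3000, 1000)
    print(f"backbone: {v_ul:.2f} uL, insert: {i_ul:.2f} uL, fits: {fits}")
```

If the two volumes do not fit, the same functions can be rerun with a smaller backbone mass or a lower ratio, mirroring the two options given in the protocol.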
  
===Part 4: Primer selection for mutagenesis===
+
===Part 5: Confirmation digest===
  
You will now integrate the information you learned about inverse pericam (especially calmodulin) at the structural and residue levels. Examine the primer sequences below and consider the mutations that each incorporates into inverse pericam.  Note: the mutations are written as X#Z, where X is the original amino acid, Z is the modified amino acid, and # is the residue number with respect to calmodulin (not IPC as a whole). '''For example, residue 379 of inverse pericam is residue 101 of calmodulin, which happens to be a serine, so a mutation at that site to leucine is written S101L.'''
+
To confirm the pRSET-IPC construct that we will use for this module, you will perform a 'diagnostic' or 'confirmation' digest. Recall from lecture that this step is important as a control -- you want to be sure that the products you use in your research are correct. This is an important step to check products you clone yourself and, perhaps more importantly, those that you may receive from another researcher.   
*Consider the mutation primer sequences below and compare the mutated residues to those that you identified as potentially interesting in Part 1.   
+
  
<center>
+
Ideally you will use a single enzyme that cuts once within the vector and once within your insert.  Unfortunately, this is rarely an option and you instead need to select an enzyme that cuts once within the vector and a second, compatible enzyme that cuts once within the insert.  Enzyme compatibility is determined by the buffer.  If two enzymes are able to function (cleave DNA) in the same buffer, they are compatible.  The [https://www.neb.com/tools-and-resources/interactive-tools/double-digest-finder NEB double digest online tool] will prove very helpful!
{| border="1"
+
|'''Mutation (X#Z)'''
+
|'''Forward primer (5' - 3')'''
+
|-
+
| D21A
+
| <font color=red>'''GCC GGC'''</font color> ATT CGA CAA GGC TGG GGA CGG CA
+
|-
+
| E30K
+
| <font color=red>'''CCA GGG'''</font color> CAC CAC AAA GAA ACT TGG CAC CG
+
|-
+
| L31R
+
| <font color=red>'''GCC GGC'''</font color> CAC AAA GGA ACG TGG CAC CGT TAT G
+
|-
+
| D57H
+
| <font color=red>'''GCC GGC'''</font color> AGT CGA TGC TCA TGG CAA TGG AA
+
|-
+
| T78P
+
| <font color=red>'''CCA GGG'''</font color> AAT GAA GGA CCC AGA CAG CGA AG
+
|-
+
| D93V
+
| <font color=red>'''CCA GGG'''</font color> CCG TGT TTT TGT CAA GGA TGG GAA C
+
|-
+
| M123L
+
| <font color=red>'''GCC GGC'''</font color> AGT TGA TGA ATT GAT AAG GGA AGC
+
|-
+
| D130G
+
| <font color=red>'''GCC GGC'''</font color> AGC AGA TAT CGG TGG TGA TGG CC
+
|-
+
| D132H
+
| <font color=red>'''GCC GGC'''</font color> TAT CGA TGG TCA TGG CCA AGT AAA C
+
|-
+
|}
+
  
<br style="clear:both;"/>
+
Use information from the lab manual, the [https://www.neb.com/products/restriction-endonucleases/restriction-endonucleases NEB catalog] and the plasmid map you generated above to choose the enzymes you will use. The following table may be helpful as you plan your work. 
</center>
+
  
<font color=red>'''NOTE:  The forward primers used to generate your mutations were altered such that the restriction enzyme flap sequence is no longer present!'''</font color>
+
Keep the following in mind as you consider which enzymes to use:
 
+
*Each enzyme should be present in 2.5 U quantity. As an example, the ''XbaI'' vial contains 20,000 U/mL, or 20 U/&mu;L, that is to say 8 times the desired working quantity in one microliter; therefore one reaction will require 0.125 &mu;L.
#Choose one modification that you hypothesize might increase or decrease CaM’s affinity for calcium (or M13), or might affect cooperativity among the four calcium binding sites. Be sure to include your hypothesis for the effect of this mutation in your lab notebook.
+
*Because the lower limit of your pipet is 0.5 &mu;L, you will need to dilute the enzyme in its appropriate buffer prior to adding it to your master mix.
#Using the primer design diagram above (Part 3) and the general primer design guidelines, what do you think the reverse primer sequence might be?  '''Check your reverse primer sequence with the teaching faculty to be sure you have the correct sequence for your report.'''
+
*The 20.109 enzyme stocks are always the "S" size and concentration.
#Compare the sequence of the forward primer to the native (or wild-type) sequence of inverse pericam.  Locate the restriction enzyme sequence that will be incorporated with this primer. Where does the amino acid substitution occur within the IPC protein?  What restriction enzyme recognition sequence does the flap sequence insertion create within the mutated sequence?
+
#*'''Hint:''' use APE to compare the sequences and identify the restriction enzyme site.
+
#Lastly, use the design guidelines in Part 3 to examine the primer.  Is this a 'good' mutation primer?  Use the information you collect to support your decision to use this mutation primer.
+
#*Feel free to select a different mutation primer if you are not satisfied with your first choice at this point.
+
#Before you leave today, note which primer you chose on today's [[Talk:20.109(F15):Evaluate mutations and site-directed mutagenesis (Day1)| Discussion page]].  Also, include the information concerning the flap sequence insertion of a restriction enzyme recognition site.
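The enzyme-volume arithmetic in the notes on choosing enzymes (2.5 U per reaction, stock concentrations quoted in U/mL, and a 0.5 &mu;L pipetting limit) can be checked with a short Python sketch. The function names are hypothetical and the only numbers used are the ones quoted in the text for ''XbaI''.

```python
def enzyme_volume_ul(units_needed: float, stock_u_per_ml: float) -> float:
    """Volume of undiluted enzyme stock (uL) that contains the requested units."""
    stock_u_per_ul = stock_u_per_ml / 1000.0
    return units_needed / stock_u_per_ul


def min_dilution_factor(volume_ul: float, min_pipet_ul: float = 0.5) -> float:
    """Smallest fold-dilution that brings the volume up to the pipetting limit."""
    return max(1.0, min_pipet_ul / volume_ul)


if __name__ == "__main__":
    # Worked example from the text: 2.5 U from a 20,000 U/mL stock.
    vol = enzyme_volume_ul(2.5, 20000)
    print(f"undiluted volume: {vol} uL")
    print(f"dilute at least {min_dilution_factor(vol):g}-fold to pipet 0.5 uL")
```

A larger dilution than the minimum may be more convenient in practice; the sketch only reports the lower bound.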
+
 
+
===Part 5: Site-directed mutagenesis===
+
 
+
We will be using the Q5 Site Directed Mutagenesis Kit from NEB to perform your site-directed mutagenesis reactions. Each group will set up one reaction, for their chosen X#Z mutation. Meanwhile, the teaching faculty will set up a single positive control reaction, to ensure that all the reagents are working properly. You should work quickly but carefully, and keep your tube in a chilled container at all times. '''Please return shared reagents to the ice bucket(s) from which you took them as soon as you are done with each one.'''
+
 
+
#Get a PCR tube and label the top with your mutation and lab section (write small!).
+
#Add 10.25 &mu;L of nuclease-free water.
+
#Add 1.25 μL of your mutagenesis primer mix (both primers are at a stock concentration of 10 &mu;M).
+
#Add 1 &mu;L of IPC template DNA (at a stock concentration of 25 ng/&mu;L).
+
#Lastly, use a filter tip to add 12.5 &mu;L of Q5 Hot Start High-Fidelity 2X Master Mix - containing buffer, dNTPs, and polymerase - to your tube.
+
#Once each group is ready, we will begin the thermocycler, under the following conditions:
+
  
 
<center>
 
<center>
 
{| border="1"
 
{| border="1"
! Segment
+
|
! Cycles
+
! Diagnostic digest
! Temperature
+
! Enzyme 1 only
! Time
+
! Enzyme 2 only
 +
! No enzyme (uncut)
 
|-
 
|-
| Initial denaturation
+
| pRSET-IPC
| 1
+
| 5 &mu;L
| 98 &deg;C
+
| 5 &mu;L
| 30 s
+
| 5 &mu;L
 +
| 5 &mu;L
 
|-
 
|-
| Amplification
+
| 10X NEB buffer
| 25
+
| 2.5 &mu;L of buffer#_____
| 98 &deg;C
+
| 2.5 &mu;L of buffer#_____
| 10 s
+
| 2.5 &mu;L of buffer#_____
 +
| 2.5 &mu;L of buffer#_____
 
|-
 
|-
|  
+
| 1st Enzyme (2.5 U)
 +
| ____ &mu;L of _____
 +
| ____ &mu;L of _____
 +
|
 
|
 
|
| 55 &deg;C
 
| 30 s
 
 
|-
 
|-
 +
| 2nd Enzyme (2.5 U)
 +
| ____ &mu;L of _____
 
|
 
|
 +
| _____ &mu;L of _____
 
|
 
|
| 72 &deg;C
 
| 2 min
 
 
|-
 
|-
| Final extension
+
| H<sub>2</sub>O
| 1
+
! colspan="4"| to a final volume of 25 &mu;L (not including volume of enzyme)
| 72 &deg;C
+
| 2 min
+
|-
+
| Hold
+
| 1
+
| 4 &deg;C
+
| indefinite
+
 
|}
 
|}
 
</center>
 
</center>
  
*After the cycling is completed, the teaching faculty will complete the KLD reaction (which stands for "kinase, ligase, ''DpnI''") using 1 &mu;L of your amplification product, 5 &mu;L 2X KLD Reaction Buffer, 1 &mu;L KLD Enzyme Mix, and 3 &mu;L nuclease-free water. The reactions will be incubated for 5 min at room temperature.
+
#Unlike the cloning steps you completed above, the diagnostic digest will be performed at the benchtop.
 
+
#Prepare a reaction cocktail for each of the above reactions (uncut, singly cut with enzyme 1, singly cut with enzyme 2 and doubly cut with enzyme 1 and enzyme 2) that includes (in that order) water, buffer and enzyme.  
*The teaching faculty will then use 5 &mu;L of the KLD reaction product to complete a transformation into an ''E. coli'' strain (NEB 5&alpha; cells of genotype ''fhuA2 Δ(argF-lacZ)U169 phoA glnV44 Φ80 Δ(lacZ)M15 gyrA96 recA1 relA1 endA1 thi-1 hsdR17'') that will amplify the plasmid such that you are able to confirm the appropriate mutation was incorporated. The transformation procedure will be as follows:
+
#Aliquot 5 &mu;L of pRSET-IPC into four well-labeled eppendorf tubes.  
#Add 5 &mu;L of KLD mix to 50 &mu;L of chemically-competent NEB 5&alpha;.
+
#*The labels should include the plasmid name, the enzymes to be added, and your team color.  
#Incubate on ice for 30 min.
+
#Add 20 &mu;L of the appropriate cocktail to each tube. Flick the tubes to mix the contents then gather the liquid in the bottom of the tube with a short spin down.
#Heat shock at 42 &deg;C for 30 sec.
+
#Incubate your digests at 37 &deg;C.
#Incubate on ice for 5 min.
+
#Add 950 &mu;L SOC and gently shake at 37 &deg;C for 1 hour.
+
#Spread 50 &mu;L onto LB amp plate and incubate overnight at 37 &deg;C.
+
  
=Reagent list=
+
The teaching faculty will leave your digests at 37&deg;C for one hour, then move them to -20&deg;C.
  
*Q5 Site Directed Mutagenesis Kit from NEB
+
==Reagents==
**Q5 Hot Start High-Fidelity 2X Master Mix
+
*pRSET-IPC (concentration: 0.5 &mu;g / &mu;L )
***Proprietary mix of Q5 Hot Start High-Fidelity DNA Polymerase, buffer, dNTPs, and Mg<sup>2+</sup>.
+
*NEB buffer
**2X KLD Reaction Buffer
+
**The buffer you use will depend on the enzymes you use for your confirmation digest, but all NEB buffers are supplied at a 10X concentration.
**10X KLD Enzyme Mix
+
*NEB enzymes
***Proprietary mix of kinase, ligase, and ''DpnI'' enzymes.
+
**The concentrations of the enzymes are listed on the product information page of the website.
*The SOC medium contains 2% tryptone, 0.5% yeast extract, 10 mM NaCl, 2.5 mM KCl, 10 mM MgCl2, 10 mM MgSO4, and 20 mM glucose.
+
*Ampicillin: 100 mg/mL, aqueous, sterile-filtered
+
  
=Navigation links=
+
==Navigation links==
Next day:  
+
Next day: [[20.109(S16):Design mutation primers (Day2)| Design mutation primers]]
  
Previous day: Orientation
+
Previous day: [[20.109(S16):Lab tour| Orientation]]

Latest revision as of 21:29, 4 February 2016

20.109(S16): Laboratory Fundamentals of Biological Engineering


Introduction

Though the theme of Module 1 is protein engineering, today will focus on a few key techniques used in DNA engineering. Because the sequence of proteins is determined by the sequence of the genes that encode them, learning how to manipulate DNA is an important first step. Today you will complete a cloning reaction to generate a protein expression vector that contains a gene that encodes a calcium-sensing protein. This process is illustrated in the schematic below. Later you will use this construct to engineer a new calcium-sensing protein.

Schematic of pRSET-IPC cloning. First, the EGFP insert is PCR amplified to generate multiple copies of the fragment that are flanked by restriction enzyme sites. Next, this fragment and the pRSET vector are digested to create compatible ends. Last, the compatible ends of the digested insert and vector are 'glued together' in a ligation reaction.

The cloning vector we will use is pRSET. This vector has several features that make it ideal for cloning and protein expression -- both of which are important for this module. The calcium-sensing protein we will study in Module 1 is inverse pericam (IPC). We will discuss this protein in much more detail later; for now it is sufficient to know that IPC was engineered to measure calcium concentrations. To generate your final product you will use three common DNA engineering techniques: PCR amplification, restriction enzyme digestion, and ligation.

PCR amplification

The applications of PCR (polymerase chain reaction) are widespread, from forensics to molecular biology to evolution, but the goal of any PCR is the same: to generate many copies of DNA from a single or a few specific sequence(s) (called the “target” or “template”).

In addition to the target, PCR requires only three components: primers to bind sequence flanking the target, dNTPs to polymerize, and a heat-stable polymerase to carry out the synthesis reaction over and over and over. DNA polymerases require short initiating pieces of DNA (or RNA) called primers in order to copy DNA. In PCR amplification, forward and reverse primers that target the non-coding and coding strands of DNA, respectively, are separated by a distance equal to the length of the DNA to be copied. Primer length is one important design feature. Primers that are too short may lack requisite specificity for the desired sequence, and thus amplify an unrelated sequence. The longer a primer is, the more favorable are its energetics for annealing to the template DNA, due to increased hydrogen bonding. On the other hand, longer primers are more likely to form secondary structures such as hairpins, leading to inefficient template priming. Two other important features are G/C content and placement. Having a G or C base at the end of each primer increases priming efficiency, because a GC pair binds more strongly than an AT pair; AT pairs at the end decrease the stability of the primer-template complex. Overall G/C content should ideally be 50 +/- 10%, because long stretches of G/C or A/T bases are both difficult to copy. The G/C content also affects the melting temperature. PCR is a three-step process (denature, anneal, extend) and these steps are repeated 20 or more times. After 30 cycles of PCR, there could be as many as a billion copies of the original target sequence.
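The "billion copies after 30 cycles" figure follows from ideal doubling in every cycle. The short Python sketch below makes that arithmetic explicit; the function name and the efficiency parameter are illustrative assumptions, and real reactions plateau well before the theoretical maximum.

```python
def pcr_copies(initial_templates: int, cycles: int, efficiency: float = 1.0) -> float:
    """Theoretical copy number after a given number of PCR cycles.

    efficiency = 1.0 means every template is duplicated in every cycle
    (perfect doubling); lower values model less efficient reactions.
    """
    return initial_templates * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(f"{n} cycles: {pcr_copies(1, n):.3g} copies from one template")
```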

Kary Mullis.

Based on the numerous applications of PCR, it may seem that the technique has been around forever. In fact it is just over 30 years old. In 1984, Kary Mullis described this technique for amplifying DNA of known or unknown sequence, realizing immediately the significance of his insight.

"Dear Thor!," I exclaimed. I had solved the most annoying problems in DNA chemistry in a single lightening bolt. Abundance and distinction. With two oligonucleotides, DNA polymerase, and the four nucleosidetriphosphates I could make as much of a DNA sequence as I wanted and I could make it on a fragment of a specific size that I could distinguish easily. Somehow, I thought, it had to be an illusion. Otherwise it would change DNA chemistry forever. Otherwise it would make me famous. It was too easy. Someone else would have done it and I would surely have heard of it. We would be doing it all the time. What was I failing to see? "Jennifer, wake up. I've thought of something incredible." --Kary Mullis from his Nobel lecture; December 8, 1993


Restriction enzyme digest

Restriction enzyme digest with EcoRI. EcoRI cuts between the G and the A on each strand of DNA, leaving a single stranded DNA overhang (also called a “sticky end”) when the phosphate backbone is cleaved.

Restriction endonucleases, also called restriction enzymes, 'cut' or 'digest' DNA at specific sequences of bases. The restriction enzymes are named according to the prokaryotic organism from which they were isolated. For example, the restriction endonuclease EcoRI (pronounced “echo-are-one”) was originally isolated from E. coli giving it the “Eco” part of the name. The “R” indicates the particular E. coli strain (RY13) and the “I” indicates that it was the first restriction enzyme isolated from this strain.

The sequence of DNA that is bound and cleaved by an endonuclease is called the recognition sequence or restriction site. These sequences are usually four or six base pairs long and palindromic, that is, they read the same 5’ to 3’ on the top and bottom strand of DNA. For example, the recognition sequence for EcoRI is below (see also figure at right). EcoRI cleaves the phosphate backbone of DNA between the G and A of the recognition sequence, which generates single-stranded overhangs, or 'sticky ends', on the double-stranded DNA.

5’ GAATTC 3’
3’ CTTAAG 5’

Unlike EcoRI, some other restriction enzymes cut precisely in the middle of the palindromic DNA sequence, thus leaving no overhangs after digestion. The single-stranded overhangs resulting from DNA digestion by enzymes such as EcoRI are called sticky ends, while double-stranded ends resulting from digestion by enzymes such as HaeIII are called blunt ends. HaeIII recognizes

5’ GGCC 3’
3’ CCGG 5’
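The idea that a recognition sequence is palindromic (it equals its own reverse complement) can be checked directly. The following Python sketch is a minimal illustration; the helper names are made up for this example, and the short test string in the usage lines is an invented sequence, not part of the IPC or pRSET files.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic(site: str) -> bool:
    """True if the site reads the same 5' to 3' on both strands."""
    return site.upper() == reverse_complement(site)


def find_sites(seq: str, site: str) -> list:
    """0-based start positions of every occurrence of site in a linear sequence."""
    seq, site = seq.upper(), site.upper()
    return [i for i in range(len(seq) - len(site) + 1)
            if seq[i:i + len(site)] == site]


if __name__ == "__main__":
    for name, site in (("EcoRI", "GAATTC"), ("HaeIII", "GGCC"), ("BamHI", "GGATCC")):
        print(name, site, "palindromic:", is_palindromic(site))
    print(find_sites("AAGAATTCTTGAATTC", "GAATTC"))
```

Because the sites are palindromic, searching one strand is enough to find every occurrence on both strands.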

Ligation

In a ligation reaction, DNA ends are covalently attached to one another via the ligase enzyme. The efficiency of the reaction is related to the type of DNA ends: compatible sticky ends will ligate more efficiently than blunt ends, and non-compatible sticky ends will not be ligated due to the lack of hydrogen bonding between the basepairs. To initiate the ligation reaction, hydrogen bonds are formed between the compatible overhangs of DNA fragments. The ligase enzyme then forms a covalent phosphodiester bond between the 3' hydroxyl end of the 'acceptor' nucleotide and the 5' phosphate end of the 'donor' nucleotide.

Schematic of DNA ligation.

The first step in this process is the addition of AMP (adenylation) to a lysine residue within the active site of DNA ligase, which releases a pyrophosphate. Next, the AMP is transferred to the 5' phosphate of the donor nucleotide, resulting in the formation of a pyrophosphate bond. Lastly, a phosphodiester bond is formed between the 5' phosphate of the donor nucleotide and the 3' hydroxyl of the acceptor nucleotide, which releases the AMP.


Protocols

Part 1: Laboratory orientation quiz

Complete the orientation quiz with your partner. Though you are working with your partner, each student should record all answers on the provided quiz. If you disagree with your partner on an answer, you should write what you think is the correct answer on your quiz.

Good luck!

Part 2: PCR amplification and restriction enzyme digest of IPC insert

Because DNA engineering at the benchtop can take days, if not weeks, you will generate your clone in silico today. You can use any DNA manipulation software you choose to complete the protocols, but the instructions provided are for APE (A Plasmid Editor, created by M. Wayne Davis at the University of Utah). The software can be downloaded free-of-charge from this site onto your personal computer or you can use the 20.109 laboratory computers. Please note that if you use a different program the teaching faculty may not be able to assist you.

Be sure to document your work and answer all questions in your lab notebook as you progress through the exercises below.

PCR amplification and restriction enzyme digest of IPC insert.

To amplify a specific sequence of DNA, you first need to design primers -- one primer that anneals at the start of the sequence of interest and a second primer that anneals at the end of the sequence of interest. Today you will design a 'forward' primer that anneals to the non-coding DNA strand and reads toward the IPC gene and a 'reverse' primer that anneals to the coding DNA strand at the end of the IPC gene and reads back into it. Each primer will consist of two parts: the 'landing sequence' will anneal to the sequence of interest and the 'flap sequence' will be used to add a restriction enzyme recognition sequence to your IPC insert.

  1. Find the IPC insert sequence here.
    • Open APE then copy and paste the sequence into a new workspace.
    • Record the size of the IPC gene in your notebook.
  2. Because we want to amplify the entire gene, the landing sequence of the forward primer will begin with the first basepair of the sequence.
    • Record the first 20 basepairs of the IPC gene sequence in your notebook.
  3. Several websites are available to help you evaluate the characteristics of your primer. We will use the Oligoanalyzer tool provided by Integrated DNA Technologies (IDT).
    • Copy and paste the 20 basepair sequence into the Sequence box at the IDT website.
    • Leave the defaults for stems and loops as they are and then click Analyze.
    • Use the following guidelines to evaluate your primer:
      • length: 17-28 basepairs
      • GC Content: 50-60%
      • Tm: 60-65°C
      • avoid hairpins, complementation between primers, and repetitive sequences
    • If your primer does not fit the guidelines provided above, try altering the length. Remember that the 5’ end of the landing sequence must not change or you will delete basepairs from your gene.
    • When you are satisfied with the landing sequence, use the Features tool to label the forward primer sequence on your APE file (Features → New Feature).
  4. Now that you have your landing sequence you will add a flap sequence that introduces a restriction enzyme recognition sequence.
    • As shown in the schematic of our cloning strategy, we need to add a BamHI recognition sequence to our forward primer. Search the NEB catalog to find the BamHI recognition sequence. Record the recognition sequence and the cleavage sites within the sequence.
    • Add the recognition sequence for the BamHI restriction enzyme to the landing sequence. Consider the direction in which PCR amplification occurs to determine which end of your primer should carry the flap sequence.
      • For reasons that will be evident later, you must also include an extra basepair between the BamHI recognition sequence and your landing sequence. Add a 'T' at this location in your primer.
    • In addition to the recognition sequence, it is important to include a 6 basepair 'tail' or 'junk' sequence to ensure the restriction enzyme is able to bind and cleave the DNA. Learn more about why this is necessary from scientists at NEB. Add any sequence of 6 basepairs to your primer flap sequence. Carefully consider where this sequence should appear in your primer.
  5. Record the sequence (5' → 3') of your forward primer in your notebook.
  6. Use steps 2-5 to design your reverse primer. Please keep the following notes in mind:
    • Because you want to amplify the entire gene you should start with the last basepair of the sequence.
    • Do NOT add a 'T' between the enzyme recognition sequence and your landing sequence.
    • You will add an EcoRI restriction recognition site to your reverse primer.
    • Remember that the reverse primer anneals to the coding DNA strand at the end of the IPC gene and reads back into it. Keep this in mind when you add the flap sequence and when you record the sequence (5' → 3') of your primer in your notebook.
  7. Create a new APE file that depicts the IPC product you would expect if you used your primers in a PCR amplification reaction.
    • What is the size of your PCR product? How does this compare to the size of the gene you recorded in Step #1?
    • Add the sequence information to your notebook (it may be easiest to screen capture your work station in APE and embed the image in your notebook).
  8. Now that you have your amplified IPC insert, you need to digest with BamHI and EcoRI to generate 'sticky ends' that will enable you to ligate your insert into the pRSET vector.
  9. Create another new APE file that depicts your amplified IPC product following a BamHI and EcoRI double-digest.
    • What is the size of your digest product? How does this compare to the size of your PCR product?
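
The primer design and digest steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene sequence, the 6-base tail, and the 12-base landing length are hypothetical placeholders (not the real IPC sequence or a recommended primer), and the digest function tracks the top strand only. The BamHI (G^GATCC) and EcoRI (G^AATTC) recognition sequences and cut positions are the standard ones.

```python
# Recognition sequences (both palindromic). Each enzyme cuts after the first G.
BAMHI = "GGATCC"  # G^GATCC
ECORI = "GAATTC"  # G^AATTC

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, written 5' -> 3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))

def forward_primer(gene, landing_len, tail):
    # 5' tail + BamHI site + extra 'T' + landing sequence from the start of the gene
    return tail + BAMHI + "T" + gene[:landing_len]

def reverse_primer(gene, landing_len, tail):
    # 5' tail + EcoRI site + reverse complement of the end of the gene (no extra 'T')
    return tail + ECORI + reverse_complement(gene[-landing_len:])

def pcr_product(gene, tail_fwd, tail_rev):
    # Top strand of the amplicon: the forward flap, the whole gene, then the
    # reverse complement of the reverse primer's flap.
    return tail_fwd + BAMHI + "T" + gene + ECORI + reverse_complement(tail_rev)

def double_digest_top_strand(product):
    # Keep the top-strand bases between the BamHI cut and the EcoRI cut.
    start = product.find(BAMHI) + 1   # cut falls after the G of GGATCC
    end = product.rfind(ECORI) + 1    # cut falls after the G of GAATTC
    return product[start:end]

# Hypothetical inputs, for illustration only.
gene = "ATGGCTAGCAAAGGTGAACTGTTTACCGGTTAA"
tail = "ACTGAC"

fwd = forward_primer(gene, 12, tail)
rev = reverse_primer(gene, 12, tail)
product = pcr_product(gene, tail, tail)
digested = double_digest_top_strand(product)

print(fwd, rev)
print(len(gene), len(product), len(digested))
```

With these placeholder inputs, the PCR product is longer than the gene by the length of the two flaps, and the digest removes the tails plus the bases outside each cut site. The simple `find`/`rfind` approach assumes the gene itself contains no internal BamHI or EcoRI sites, which you should check in APE for the real sequence.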

Part 3: Restriction enzyme digest of pRSET vector

To prepare for the ligation step, it is important to generate compatible 'sticky ends' on the insert and vector. Above, you digested your IPC amplicon (PCR amplification product) with BamHI and EcoRI in a double-digest. Here you will digest your pRSET vector to create compatible ends that can be ligated together.

Restriction enzyme digest of pRSET vector.
  1. Find the pRSET vector sequence here.
    • Copy and paste the vector sequence into a new APE workspace.
  2. Commercially available cloning vectors are engineered to contain a Multiple Cloning Site (MCS). The MCS is a short segment of DNA that contains several restriction enzyme recognition sites, which allow researchers to clone their genes of interest into a specific location in the vector.
    • Using the Feature tool, label basepairs 192-248 as the MCS.
  3. To locate the BamHI and EcoRI recognition sequences within the MCS, go to Enzymes → Enzyme selector.
    • Select BamHI and EcoRI and click Graphic Map. An image of the plasmid should appear in a separate window with the recognition sites marked. In addition, the recognition sequences should be highlighted in the sequence in your workspace.
    • Using the feature tool, label the BamHI and EcoRI recognition sequences.
  4. Save your labelled pRSET file.
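
What the Enzyme selector does when it marks recognition sites can be approximated with a simple string search. The sketch below is a rough stand-in, not how APE is implemented: the `demo_mcs` sequence is a hypothetical fragment invented for illustration, and the search treats the sequence as linear, so a site spanning the start/end junction of a circular plasmid would be missed. Because GGATCC and GAATTC are both palindromic, searching one strand is enough.

```python
def find_sites(seq, site):
    """Return the 1-based start position of every occurrence of site in seq."""
    seq, site = seq.upper(), site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + 1)          # convert 0-based index to 1-based
        start = seq.find(site, start + 1)    # keep searching past this hit
    return positions

# Hypothetical MCS-like fragment, for illustration only.
demo_mcs = "GCTAGCATGACTGGATCCGAGCTCGAATTCGAAGCTT"

bamhi_sites = find_sites(demo_mcs, "GGATCC")
ecori_sites = find_sites(demo_mcs, "GAATTC")
print("BamHI:", bamhi_sites, "EcoRI:", ecori_sites)
```

An enzyme that returns exactly one position is a single cutter in that sequence, which is what you want for both BamHI and EcoRI in the vector.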


Part 4: Ligation of IPC insert and pRSET vector

When you complete a ligation at the bench, one very important step is to calculate the amounts of DNA you will use in the reaction. Use the steps below to calculate the amount of IPC insert and pRSET vector you would use to complete this ligation in the laboratory.

Recovery gel for ligation calculations. Lane 1 = pRSET vector, Lane 2 = molecular weight ladder, and Lane 3 = IPC insert.
  1. Calculate the concentration of backbone and of insert you would use in a ligation reaction based on the recovery gel posted on the right.
    • For both the insert and the vector, 5 μL of DNA was loaded into the gel.
    • Refer to the NEB marker definitions to determine the ng of DNA in each lane. Note that the ng listed are for 10 μL of ladder and in the gel shown we loaded 20 μL of ladder.
  2. Convert the mass concentration to a molar concentration, using the fact that a typical DNA base pair has a mass of about 660 g/mol. This conversion will mostly cancel out between the insert and the backbone, except for the difference in the number of base pairs. Feel free to either omit steps that will cancel if you are comfortable doing so, or to keep them if you follow the math better that way.
  3. Ideally, you will use 50-100 ng of backbone in this ligation.
    • Referring to the mass concentration, what volume of DNA will this amount require?
  4. Ideally, you will use a 4:1 molar ratio of insert to backbone.
    • Referring to the molar concentrations, how much insert do you need per μL of backbone?
  5. A 15 μL scale ligation should not include more than 13.5 μL of DNA because you must leave enough volume to add buffer and the ligase enzyme.
    • If your backbone and insert volumes total more than this amount, you must (1) scale down both DNA amounts, using less than 50 ng of backbone, and/or (2) stray from the ideal 4:1 molar ratio. You may ask the teaching faculty for advice during class if you are unsure what choice is best.
  6. Be sure to record all of your work for the ligation calculations in your notebook.
    • Feel free to take a picture of your hand-written work and embed the image in your notebook.
  7. Next you will complete this ligation in silico to generate a plasmid map of your pRSET-IPC plasmid.
    Ligation of IPC insert and pRSET vector.
  8. To ligate your IPC fragment into your pRSET vector, copy the digested IPC sequence you generated above and paste it into your vector sequence.
    • Recall where BamHI and EcoRI cut within their recognition sequences as you consider the exact basepairs between which you should paste your IPC insert.
    • Hint: the IPC insert should be flanked by intact BamHI and EcoRI recognition sequences in your final cloning product.
  9. Save the file of your pRSET-IPC and embed the plasmid map image in your notebook.
  10. Now that you have generated your pRSET-IPC clone, discuss the following questions with your partner and record your answers in your notebook.
    • Recall the 'T' that you added between the landing sequence and the BamHI recognition sequence of your forward primer. What is the purpose of this additional basepair?
      • Hint: Think about the spacing between the His tag (CATCATCATCATCATCAT) and the first codon of the IPC gene in your plasmid map. This His tag will be incorporated onto the translated protein sequence and enable you to purify your protein using affinity chromatography.
    • Why was a 'T' not added between the landing sequence and the EcoRI recognition sequence of your reverse primer?
    • Why did you use two different restriction enzymes in the cloning strategy?
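
The ligation calculations in steps 1-5 can be sketched as follows. Every input value here is a hypothetical placeholder (the band masses and the 2900 bp and 1200 bp lengths are not read from the recovery gel or from the real sequences), and the sketch assumes 660 g/mol per base pair. When the total DNA volume exceeds 13.5 μL, it takes option (1) and scales both volumes down while keeping the 4:1 ratio.

```python
BP_MASS = 660.0  # g/mol per base pair (assumed average)

def fmol_per_ul(ng_per_ul, length_bp):
    """Convert a mass concentration (ng/uL) to a molar concentration (fmol/uL)."""
    return ng_per_ul / (BP_MASS * length_bp) * 1e6

def ligation_volumes(backbone_ng_per_ul, backbone_bp, insert_ng_per_ul, insert_bp,
                     backbone_ng=50.0, ratio=4.0, max_dna_ul=13.5):
    """Return (backbone_ul, insert_ul) for the requested insert:backbone molar ratio."""
    backbone_ul = backbone_ng / backbone_ng_per_ul
    backbone_fmol = backbone_ul * fmol_per_ul(backbone_ng_per_ul, backbone_bp)
    insert_ul = ratio * backbone_fmol / fmol_per_ul(insert_ng_per_ul, insert_bp)
    total = backbone_ul + insert_ul
    if total > max_dna_ul:
        # Scale both volumes by the same factor so the molar ratio is preserved.
        scale = max_dna_ul / total
        backbone_ul *= scale
        insert_ul *= scale
    return backbone_ul, insert_ul

# Hypothetical gel estimates: 50 ng backbone and 25 ng insert, each in 5 uL loaded.
backbone_conc = 50.0 / 5.0   # ng/uL
insert_conc = 25.0 / 5.0     # ng/uL

b_ul, i_ul = ligation_volumes(backbone_conc, 2900, insert_conc, 1200)
print(f"backbone: {b_ul:.2f} uL, insert: {i_ul:.2f} uL")
```

With these placeholder numbers the unscaled volumes add up to more than 13.5 μL, so the function scales them down. Whether to scale down or to relax the 4:1 ratio instead is a judgment call for your own numbers.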

Part 5: Confirmation digest

To confirm the pRSET-IPC construct that we will use for this module, you will perform a 'diagnostic' or 'confirmation' digest. Recall from lecture that this step is important as a control -- you want to be sure that the products you use in your research are correct. This is an important step to check products you clone yourself and, perhaps more importantly, those that you may receive from another researcher.

Ideally you will use a single enzyme that cuts once within the vector and once within your insert. Unfortunately, this is rarely an option and you instead need to select an enzyme that cuts once within the vector and a second, compatible enzyme that cuts once within the insert. Enzyme compatibility is determined by the buffer. If two enzymes are able to function (cleave DNA) in the same buffer, they are compatible. The NEB double digest online tool will prove very helpful!
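
To predict the band pattern for a candidate enzyme pair, you can compute fragment sizes from the cut positions on the circular plasmid. The sketch below is a minimal illustration: the 4100 bp plasmid length and the cut positions at 500 and 2000 are hypothetical values, and you would substitute the positions shown on your own pRSET-IPC map.

```python
def digest_fragments(plasmid_len, cut_positions):
    """Return fragment sizes (largest first) after cutting a circular plasmid."""
    cuts = sorted(set(cut_positions))
    if not cuts:
        return [plasmid_len]  # uncut plasmid stays one circular molecule
    # Distances between neighbouring cuts...
    sizes = [b - a for a, b in zip(cuts, cuts[1:])]
    # ...plus the fragment that wraps around the origin of the map.
    sizes.append(plasmid_len - cuts[-1] + cuts[0])
    return sorted(sizes, reverse=True)

# Hypothetical example: one enzyme cuts at 500, the other at 2000.
double_cut = digest_fragments(4100, [500, 2000])
single_cut = digest_fragments(4100, [500])
uncut = digest_fragments(4100, [])
print(double_cut, single_cut, uncut)
```

A single cut linearizes the plasmid into one full-length fragment, while the double digest gives two fragments whose sizes sum to the plasmid length. Note that uncut plasmid is the same size but usually migrates differently on a gel because it is supercoiled.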

Use information from the lab manual, the NEB catalog and the plasmid map you generated above to choose the enzymes you will use. The following table may be helpful as you plan your work.

Keep the following in mind as you consider which enzymes to use:

  • Each reaction should contain 2.5 U of each enzyme. As an example, the XbaI vial contains 20,000 U/mL, or 20 U/μL, which is 8 times the desired amount in a single microliter; therefore one reaction requires 0.125 μL.
  • Because the lower limit of your pipet is 0.5 μL, you will need to dilute the enzyme in its appropriate buffer prior to adding it to your master mix.
  • The 20.109 enzyme stocks are always the "S" size and concentration.
| | Diagnostic digest | Enzyme 1 only | Enzyme 2 only | No enzyme (uncut) |
|---|---|---|---|---|
| pRSET-IPC | 5 μL | 5 μL | 5 μL | 5 μL |
| 10X NEB buffer | 2.5 μL of buffer#_____ | 2.5 μL of buffer#_____ | 2.5 μL of buffer#_____ | 2.5 μL of buffer#_____ |
| 1st Enzyme (2.5 U) | ____ μL of _____ | ____ μL of _____ | | |
| 2nd Enzyme (2.5 U) | ____ μL of _____ | | ____ μL of _____ | |

H2O: to a final volume of 25 μL (not including volume of enzyme)
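
The enzyme and water volumes in the table can be worked out as in the sketch below. It uses the figures given above (2.5 U per reaction, a 20 U/μL XbaI stock, a 0.5 μL pipetting limit, 5 μL of DNA, and 2.5 μL of buffer in a 25 μL reaction). The function names are invented for illustration, and the stock concentration of your own enzymes should be taken from the NEB product page.

```python
def enzyme_volume_ul(units_needed, stock_u_per_ul):
    """Volume of undiluted stock that contains the required units."""
    return units_needed / stock_u_per_ul

def min_dilution_factor(units_needed, stock_u_per_ul, min_pipette_ul=0.5):
    """Smallest fold-dilution at which the required units fill a pipettable volume."""
    return max(1.0, stock_u_per_ul * min_pipette_ul / units_needed)

def water_volume_ul(total_ul=25.0, dna_ul=5.0, buffer_ul=2.5):
    """Water per reaction, ignoring the enzyme volume as the table specifies."""
    return total_ul - dna_ul - buffer_ul

undiluted = enzyme_volume_ul(2.5, 20.0)   # XbaI example from the text
dilution = min_dilution_factor(2.5, 20.0)
water = water_volume_ul()
print(f"{undiluted} uL undiluted, dilute at least {dilution}-fold, {water} uL water")
```

For the XbaI example, a 4-fold dilution (for instance 1 μL of enzyme plus 3 μL of 1X buffer) brings the stock to 5 U/μL, so that 0.5 μL delivers 2.5 U.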
  1. Unlike the cloning steps you completed above, the diagnostic digest will be performed at the benchtop.
  2. Prepare a reaction cocktail for each of the above reactions (uncut, singly cut with enzyme 1, singly cut with enzyme 2, and doubly cut with enzyme 1 and enzyme 2). Add the components in this order: water, buffer, then enzyme.
  3. Aliquot 5 μL of pRSET-IPC into four well-labeled eppendorf tubes.
    • The labels should include the plasmid name, the enzymes to be added, and your team color.
  4. Add 20 μL of the appropriate cocktail to each tube. Flick the tubes to mix the contents then gather the liquid in the bottom of the tube with a short spin down.
  5. Incubate your digests at 37 °C.

The teaching faculty will leave your digests at 37 °C for one hour, then move them to -20 °C.

Reagents

  • pRSET-IPC (concentration: 0.5 μg / μL )
  • NEB buffer
    • The buffer you use will depend on the enzymes you use for your confirmation digest, but all NEB buffers are supplied at a 10X concentration.
  • NEB enzymes
    • The concentrations of the enzymes are listed on the product information page of the website.
