The process of scientific inquiry encompasses much more than the collection and interpretation of data. A key part of the process is design – specifically of experiments that address a hypothesis and of new materials or technologies. Moreover, any design is subject to continued revision. You might redesign an experiment or tool based on your own research, or you might consult the vast body of scientific literature for other perspectives. As the old graduate student saying (sarcastically) goes, “A month in the lab might save you a day in the library!” In other words, although the process of combing the literature can be arduous or even tedious at times, it beats wasting a month of your time repeating experiments previously demonstrated not to work or reinventing the wheel.
During this module, you will generate and test a new version of inverse pericam (IPC). Today, you will refer to a few primary research articles to familiarize yourself with this recombinant protein and its constituent parts. The fluorescent component of IPC is an enhanced yellow fluorescent protein (abbreviated EYFP), one of the many derivatives of green fluorescent protein (GFP). GFP is naturally produced by jellyfish and was cloned into other organisms in the early 1990s. It has since been exploited as a genetically encodable reporter and mutagenized to vary its excitation and emission spectra. The other key component of inverse pericam is the protein calmodulin (CaM), a natural calcium sensor that is present in all eukaryotes. Calmodulin has many ligands that it binds only in the presence of calcium ion, including the peptide fragment M13. This conditional specificity for M13 binding is enabled by the change in CaM’s conformation when it binds calcium.
Diagram of inverse pericam.
(A) The EYFP gene within IPC is mutated such that the C and N termini are re-organized then flanked by M13 and CaM. (B) In the absence of Ca2+, inverse pericam fluoresces yellow, and in the presence of Ca2+ fluorescence is quenched.
Within inverse pericam, M13 and CaM are located at opposite ends, surrounding a permuted (i.e., rearranged) version of EYFP. In the absence of calcium, this EYFP exhibits strong fluorescence. However, when enough calcium is added to a solution of inverse pericam, CaM and M13 interact, disrupting the conformation and, as a result, the fluorescence of EYFP. The transition from bright to dim fluorescence occurs over a particular concentration range of calcium. The calcium concentration at which binding to CaM is half-maximal (and fluorescence has dropped about halfway) is referred to as the Kd, which is determined by the affinity of CaM for calcium. In addition, the interaction between CaM and calcium is affected by cooperativity. CaM has four calcium binding sites, and cooperativity means that the affinity of CaM for calcium depends on how many calcium ions are already bound to the protein. Your goal today is to propose a mutation that will modify the calcium sensor portion of inverse pericam in a manner that is likely to change the affinity and/or cooperativity for calcium ions.
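The roles of Kd and cooperativity described above can be pictured with a Hill-type binding curve. The sketch below is only an illustration: the Hill equation is a common simplification and not necessarily the model used later in this module, and the Kd, Hill coefficient, and fluorescence levels are made-up placeholder values, not measured properties of inverse pericam.

```python
def fraction_bound(ca, kd, n):
    """Hill equation: fraction of sensor in the calcium-bound state.

    ca and kd share the same concentration units; n is the Hill
    coefficient (n > 1 indicates positive cooperativity).
    """
    return ca**n / (kd**n + ca**n)


def ipc_fluorescence(ca, kd, n, f_max=1.0, f_min=0.15):
    """Relative fluorescence of an 'inverse' sensor: bright without
    calcium, dimmer as calcium binds. f_max and f_min are placeholders."""
    return f_max - (f_max - f_min) * fraction_bound(ca, kd, n)


# Hypothetical parameters chosen only to show the shape of the curve.
KD = 0.2e-6  # 0.2 uM, an assumed value
for hill_n in (1.0, 2.0):
    print(f"Hill coefficient n = {hill_n}")
    for ca in (0.0, 0.05e-6, 0.2e-6, 0.8e-6, 5e-6):
        f = ipc_fluorescence(ca, KD, hill_n)
        print(f"  [Ca2+] = {ca * 1e6:5.2f} uM -> relative fluorescence {f:.2f}")
```

In this picture, changing the affinity shifts the midpoint of the curve along the calcium axis, while changing the cooperativity makes the bright-to-dim transition sharper or more gradual.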
To examine potential modifications to inverse pericam, we will use several protein analysis tools. Proteins are modular materials that may be described and examined at multiple levels of a structural hierarchy (from primary to quaternary in the classical paradigm). Primary structure refers to a protein’s amino acid sequence, which might reveal a cluster of charged residues or a pattern of alternating polar and nonpolar residues. Because of the rotational flexibility of bonds and the non-covalent interactions between non-adjacent amino acids (as well as covalent disulfide bonds), one cannot predict off-hand the conformation of a protein merely from its linear sequence; however, some structural characteristics can be inferred.
Physical methods used to interrogate 3D protein structure include X-ray diffraction (XRD), electron microscopy (EM), and nuclear magnetic resonance (NMR) spectroscopy. The paper by Zhang et al. that you will refer to today describes the decoding of calmodulin’s structure using NMR, which depends on subjecting molecules to electromagnetic fields and analyzing the resulting energy absorption spectra of their nuclei. Scientists who elucidate protein structures, in addition to publishing their results, will often add them to public databases such as the Protein Data Bank (PDB). Because many proteins have structural motifs in common (e.g., alpha helices and beta sheets at the secondary level, or leucine-rich repeats at the tertiary level), which ultimately arise from their amino acid sequences, such databases can be useful for making predictions about proteins with known amino acid sequences but unknown structures. Today we will use a computer program that harnesses information in the Protein Data Bank to display interactive 3D models.
Now might be a good time to mention why we care about measuring intracellular calcium in the first place. Calcium is involved in many signal transduction cascades, which regulate everything from immune cell activation to muscle contraction, from adhesion to apoptosis; see, for example, this review by David Clapham in Cell, or this one by Ernesto Carafoli in PNAS. Intracellular calcium (Ca2+) is normally maintained at ~100 nM, orders of magnitude less than the ~mM concentration outside the cell. ATPase pumps act to keep the basal concentration of cytoplasmic calcium low. Often calcium acts as a second messenger, i.e., it relays a message from the cell surface to its cytoplasm. For example, a particular ligand may bind a cell surface receptor, causing a flood of calcium ions to be released from the intracellular compartments in which they are usually sequestered. These free ions in turn may promote phosphorylation or other downstream signaling.
The proteins that bind calcium do so with a great variety of affinities, and have roles ranging from sequestration to sensing. Some calcium responses may have long-term effects, particularly in the case of transcription factors that can bind calcium. As discussed in lecture, calmodulin works as a calcium sensor by undergoing a conformational change upon calcium binding. Your goal today is to design mutation primers that will generate mutant calmodulin (in the context of inverse pericam) DNA, in an attempt to alter the affinity and/or cooperativity of the resulting protein for calcium.
Part 1: Agarose gel electrophoresis of confirmation digests
Electrophoresis is a technique that separates large molecules by size using an applied electrical field and a sieving matrix. DNA, RNA and proteins are the molecules most often studied with this technique; agarose and acrylamide gels are the two most common sieves. The molecules to be separated enter the matrix through a well at one end and are pulled through the matrix when a current is applied across it. The larger molecules get entwined in the matrix and are stalled; the smaller molecules wind through the matrix more easily and travel further from the well. The distance a DNA fragment travels is inversely proportional to the log of its length. Over time fragments of similar length accumulate into “bands” in the gel. Higher concentrations of agarose can be used to resolve smaller DNA fragments.
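The size dependence described above is what lets you estimate fragment lengths from a ladder. A minimal sketch, assuming the common approximation that migration distance falls off linearly with the log of fragment length over the useful range of the gel: fit a line to the ladder bands, then invert it for an unknown band. The ladder sizes and distances below are invented for illustration and are not measurements from this lab.

```python
import math


def fit_ladder(sizes_bp, distances_mm):
    """Least-squares fit of distance = a + b * log10(size)."""
    xs = [math.log10(s) for s in sizes_bp]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(distances_mm) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, distances_mm))
    b = sxy / sxx          # slope (negative: larger fragments travel less)
    a = mean_y - b * mean_x
    return a, b


def estimate_size(distance_mm, a, b):
    """Invert the fitted line to get an approximate fragment size in bp."""
    return 10 ** ((distance_mm - a) / b)


# Hypothetical ladder readings (band size in bp, distance from the well in mm).
ladder_sizes = [10000, 3000, 1000, 500]
ladder_distances = [12.0, 22.5, 32.0, 38.5]

a, b = fit_ladder(ladder_sizes, ladder_distances)
print(f"A band at 27 mm is roughly {estimate_size(27.0, a, b):.0f} bp")
```

The fit is only as good as the approximation: very large fragments tend to bunch up near the well, so estimates are most reliable for bands that fall between ladder bands.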
Diagram of gel electrophoresis chamber.
Larger sized DNA molecules will remain close to the well where the sample was loaded and smaller DNA molecules will migrate through the gel toward the positive electrode.
DNA and RNA are negatively charged molecules due to their phosphate backbone, and they naturally travel toward the positive charge at the far end of the gel. Today you will separate DNA fragments using an agarose matrix. Agarose is a polymer that comes from seaweed, and if you’ve ever made Jell-O™, then you already have all the skills needed for pouring an agarose gel! To prepare these gels, agarose and 1X TAE buffer are microwaved until the agarose is melted and fully dissolved. The molten agarose is then poured into a horizontal casting tray, and a comb is added. Once the agarose has solidified, the comb is removed, leaving wells into which the DNA sample can be loaded.
You will use a 1% agarose gel (prepared by the teaching faculty) to separate the DNA fragments in your four digested samples as well as a reference lane of molecular weight markers (also called a DNA ladder).
- Add 5 μL of 6x loading dye to the digests.
- Loading dye contains bromophenol blue as a tracking dye to follow the progress of the electrophoresis (so you don’t run the smallest fragments off the end of your gel!) as well as glycerol to help the samples sink into the well.
- Xylene cyanol may also be used as a tracking dye, but does not migrate as far across the gel.
- Flick the eppendorf tubes to mix the contents, then quick spin them in the microfuge to bring the contents of the tubes to the bottom.
Illustration of proper gel loading technique.
- Load 25 μL of each digest into the gel, as well as 20 μL of 1kb DNA ladder.
- Be sure to record the order in which you load your samples!
- To load your samples, draw the volume listed above into the tip of your P200 or P20. Lower the tip below the surface of the buffer and directly over the well. You risk puncturing the bottom of the well if you lower the tip too far into the well itself (puncturing well = bad!). Expel your sample into the well. Do not release the pipet plunger until after you have removed the tip from the gel box (or you'll draw your sample back into the tip!).
- Once all the samples have been loaded, attach the gel box to the power supply and run the gel at 125 V for no more than 45 minutes.
Part 2: Protein backbone of IPC
Perhaps nothing is so conducive to a feeling of intimate familiarity with a protein as studying it at the amino acid level (primary structure). For the next part of lab today, you will examine a two-dimensional representation of inverse pericam.
Figure 1 from Nagai et al., PNAS 98:3197 (2001) showing schematic representations of Ca2+-sensitive reporter constructs. Your research will explore the properties of the inverse pericam construct (boxed in red).
- Open your APE file of the wild-type IPC sequence. If you do not have the original sequence saved, you can find the sequence attached to the Day 1 page (Part 2, #1).
- Label the following features of the IPC sequence in your APE file:
- M13 peptide: 1-78 bp
- EYFP (C-terminus portion): 91-372 bp
- EYFP (N-terminus portion): 400-831 bp
- CaM: 838-1281 bp
- Linker sequences: 82-90, 373-399, and 832-837 bp
- Refer to Figure 1 of the paper by Nagai et al., shown here on the right, which depicts the inverse pericam construct in schematic form, to cement your understanding of how the different components of IPC are connected.
- As you go through the steps below, label the calcium binding sites (and whatever else you deem relevant) in your IPC file. First, you will probably find it helpful to select ORFs → Translate for an amino acid level view of IPC.
- Select the following options for the translation: 1 letter code, line numbers left, DNA below, and copy highlight checked on.
- To help you locate the binding sites for calcium (in the calmodulin portion of IPC), read the following portions of the Zhang paper, along with skimming whatever else you find useful: abstract, first two paragraphs, “Linker and loop flexibility” section.
- In your IPC sequence document, you will mark the amino acid residues that make up the calcium-binding loops in CaM in orange. Begin by looking for the DNA or amino acid sequence of CaM on a site such as NCBI: choose Proteins → Protein Database, search for calmodulin, and scroll down for the amino acid sequence. The CaM sequence is highly conserved across species, so you can refer to almost any sequence and compare it to the one in your file. Are any residues of calmodulin missing in IPC? Why might this be?
- If you get stuck, use the fact that the CaM within inverse pericam is an E104Q mutant, that is, the 104th residue of calmodulin is Q, to keep yourself oriented.
- Do the four calcium binding loops share any common features? You might imagine that negating or enhancing such features could decrease or increase calcium affinity, respectively.
- If you find other areas of calmodulin that you are interested in mutagenizing (e.g., hydrophobic pockets), mark these as well. You may find the “Loss of hydrophobic cavities” section in Zhang et al. helpful.
- As you consider sites that may alter calcium binding, keep the following in mind:
- When this module debuted, everyone mutated residues directly in the calcium binding loops, and very few groups saw dramatic changes in the affinity or cooperativity of calmodulin with respect to calcium.
- In some years, class-wide results suggested that mutations in the first two binding loops were more likely to have an effect than mutations in the latter two binding loops.
- Some folks also targeted non-binding structural areas, but results were inconclusive.
- You may repeat or otherwise build upon prior results as long as you give your own reasoning.
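When annotating, it can help to convert the base-pair coordinates of each feature into amino acid positions in the translated construct. A minimal sketch, assuming the reading frame starts at bp 1 of the IPC sequence (an assumption; confirm it against the translation in your own file):

```python
# Feature coordinates (1-based, inclusive) as listed for the IPC sequence.
FEATURES_BP = {
    "M13 peptide": (1, 78),
    "EYFP (C-terminus portion)": (91, 372),
    "EYFP (N-terminus portion)": (400, 831),
    "CaM": (838, 1281),
}


def bp_to_codon(bp):
    """1-based codon number containing a 1-based bp position,
    assuming translation begins at bp 1."""
    return (bp - 1) // 3 + 1


def feature_codons(start_bp, end_bp):
    """Codon numbers of the first and last bp of a feature."""
    return bp_to_codon(start_bp), bp_to_codon(end_bp)


for name, (start, end) in FEATURES_BP.items():
    first, last = feature_codons(start, end)
    print(f"{name}: bp {start}-{end} -> codons {first}-{last}")
```

Codon numbers in the construct are not the same as residue numbers in calmodulin itself, so keep track of the offset between the two when you compare your file to a reference CaM sequence.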
Print out your annotated document and hang on to it for reference. Now let’s put some visuals to all those letters!
Part 3: Higher-order protein features
Unless we are precocious bioengineers indeed, looking at the amino acid sequence alone is unlikely to tell us too much about the protein. We might be left wondering where the binding sites for M13 and for calcium ions are located in calmodulin, for example. In the previous section, you read some primary scientific literature to locate these features. Now you will use a tool called Protein Explorer to visualize them. As you work, you can ask yourself why these stretches of the protein might work the way that they do, and how they might be changed.
- Protein Explorer is a free web-based viewer for biological molecules. To access it, open the Firefox browser and load proteinexplorer.org. Choose “FirstGlance in Jmol” to proceed.
- Structures are organized according to PDB (Protein Data Bank) identification codes, which may be input at the prompt at the top of the page. Begin by looking at the molecule with PDB ID number 1CLL, which is a calcium-bound form of calmodulin. Later you will search for an example of the ligand-free form, also called apo calmodulin.
- The program will open in FirstView mode for the structure you’ve chosen (ensure that popup blockers are off if the structure fails to load). On the right is the image panel, which shows your protein along with associated ligands (in this case, calcium). Try clicking and dragging on the rotating image to see what happens.
- Now look at the control panel on the upper left: here you can modify the image. Try adding and removing water molecules and ligands to see where they interact with the protein.
- As you explore the features of the control panel and image panel, be sure to observe the message frame window on the lower left for any relevant information that may pop up. If you click on an atom in the image panel, its atomic identity will be displayed in the message frame, along with its encompassing amino acid residue and position.
- From the control panel, click on the PDB icon, which leads to detailed information about the publication upon which the model image is based.
- To find further options for modifying how you view the image, or search for particular atoms, click on More Views in the control panel, or on Jmol at the bottom right of the image panel. For example, you can highlight specific amino acids, or change from a backbone trace to a space-filling model. Explore these features. For example, you might use color to highlight all the acidic amino acids in calmodulin.
- Be sure to note any useful information in your notebook as you go. You might ask:
- What method was used to elucidate the structure of this protein?
- How good is the image resolution?
- Which species did this protein come from?
- When did the authors publish their results?
- What are the major components of the molecule’s secondary structure?
- What do the calcium binding loops (or other areas of interest you found) look like?
- Once you are satisfied with your understanding of calcium-bound calmodulin, look at an apo calmodulin structure (or two) for comparison. You might find the structure directly by using PDB, or by using the NCBI Structure database.
- Write a few sentences in your lab notebook describing the differences between the calcium-bound and apo forms of calmodulin.
Part 4: Primer design for mutagenesis
Primer design schematic for NEB Q5 Site Directed Mutagenesis.
It wouldn’t be very experimentally efficient to somehow pick out and modify a single residue on inverse pericam post-translationally. Instead, researchers genetically encode desired mutations, by making mutated copies of a plasmid that contains the inverse pericam DNA sequence. In addition to non-mutagenic amplification of a specific piece of DNA, synthetic primers can be used to incorporate desired mutations into the DNA. Primer design for site-directed mutagenesis is quite straightforward: the forward primer introduces a mutation into the coding strand. Both non-mutagenic and mutagenic amplification require cycles of DNA melting, annealing, and extension.
Remember from Day 1 that primers used in PCR amplification must meet several design criteria in order to ensure specificity and efficiency. Consider the following design guidelines for mutagenesis primers and think about how these differ from the guidelines for non-mutagenic amplification:
- The desired mutation (1-2 bp) must be present in the middle of the forward primer.
- The forward and reverse primers should 'face' away from the mutation and be 'back-to-back' when annealed to the template.
- The primers should be 25-45 bp long.
- A G/C content of > 40% is desired.
- Both primers should terminate in at least one G or C base.
- The melting temperature should exceed 78°C, according to:
- Tm = 81.5 + 0.41 (%GC) – 675/N - %mismatch
- where N is primer length, and the two percentages should be integers.
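The melting temperature guideline above is easy to check by hand or with a few lines of code. The sketch below applies that formula directly, rounding both percentages to integers as the guideline asks; the helper names are illustrative, and the example input is the S101L forward primer from the worked example later in this section (two mismatched bases).

```python
def clean(primer):
    """Strip spaces and normalize case so codon-spaced primers can be pasted in."""
    return primer.replace(" ", "").upper()


def gc_percent(primer):
    """G/C content as an integer percentage."""
    seq = clean(primer)
    return round(100 * (seq.count("G") + seq.count("C")) / len(seq))


def mismatch_percent(primer, n_mismatches):
    """Mismatched bases as an integer percentage of primer length."""
    return round(100 * n_mismatches / len(clean(primer)))


def mutagenesis_tm(primer, n_mismatches):
    """Tm = 81.5 + 0.41(%GC) - 675/N - %mismatch, with N the primer length."""
    n = len(clean(primer))
    return (81.5 + 0.41 * gc_percent(primer)
            - 675 / n - mismatch_percent(primer, n_mismatches))


forward = "GG AAC GGC TAC ATC CTC GCT GCT CAG TTA CGT CAC G"
print(f"length = {len(clean(forward))} nt, GC = {gc_percent(forward)}%")
print(f"Tm = {mutagenesis_tm(forward, 2):.1f} C")
```

This formula is specific to the mutagenesis guidelines given here; Tm calculators on vendor websites generally use different models and will report different values.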
To demonstrate primer design, the illustration below uses S101L, which is an uninteresting mutation but is a straightforward teaching example.
Residue 101 of calmodulin is serine, encoded by the AGC codon. This is residue 379 with respect to the entire inverse pericam construct, and we can find it and some flanking code in the DNA sequence from Part 2:
361 (5') GAG GAA ATC CGA GAA GCA TTC CGT GTT TTT GAC AAG GAT GGG AAC GGC TAC ATC AGC GCT (3')
381 (5') GCT CAG TTA CGT CAC GTC ATG ACA AAC CTC GGG GAG AAG TTA ACA GAT GAA GAA GTT GAT (3')
To change from serine to leucine, one might choose TTA, TTG, or CTN (where N = T, A, G, or C). Because CTC requires only two base changes from AGC (rather than three, as for the other options), we choose this codon.
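The codon comparison above can be sketched as a simple count of base differences between the wild-type codon and each candidate. This is only an illustration of the reasoning; the function names are made up for this example.

```python
# The six codons that encode leucine in the standard genetic code.
LEUCINE_CODONS = ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"]


def count_changes(old_codon, new_codon):
    """Number of positions at which two codons differ."""
    return sum(a != b for a, b in zip(old_codon, new_codon))


def rank_codons(wild_type, candidates):
    """Candidates sorted from fewest to most base changes."""
    return sorted(candidates, key=lambda c: count_changes(wild_type, c))


for codon in rank_codons("AGC", LEUCINE_CODONS):
    print(codon, count_changes("AGC", codon))
```

Fewer base changes means fewer mismatches between primer and template, which also helps the melting temperature calculation in the guidelines.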
Now we must keep >10 bp of sequence on each side in a way that meets all our requirements. To quickly find G/C content and see secondary structures, look at the IDT website. (Note that the Tm listed at this site is not one that is relevant for mutagenesis.)
Ultimately, your forward primer might look like the following, which has a Tm of almost 81°C, and a G/C content of ~58%.
5’ GG AAC GGC TAC ATC CTC GCT GCT CAG TTA CGT CAC G 3'
The reverse primer is the reverse complement of a sequence just preceding the forward primer in the IPC gene. The forward and reverse primers are set up back-to-back.
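To make the back-to-back arrangement concrete, the sketch below takes the stretch of IPC sequence that sits immediately upstream of the S101L forward primer (copied from the excerpt above) and computes its reverse complement. It does not choose a primer length or check Tm; those decisions are left to the design tool, and the variable names are illustrative.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return clean_seq(seq).translate(COMPLEMENT)[::-1]


def clean_seq(seq):
    """Remove codon spacing and normalize case."""
    return seq.replace(" ", "").upper()


# Coding-strand sequence ending right where the forward primer begins.
upstream = "GAG GAA ATC CGA GAA GCA TTC CGT GTT TTT GAC AAG GAT G"

# A reverse primer would be drawn from the 5' end of this string,
# so its 5' end sits back-to-back with the 5' end of the forward primer.
print("5'", reverse_complement(upstream), "3'")
```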
Lucky for us, NEB has a tool that can design our mutagenic primers.
- Go to the NEBaseChanger site and click 'Please enter a new sequence to begin.'
- A new window will open. Copy and paste the wild-type IPC sequence.
- Confirm that the 'Substitution' option is selected.
- Highlight the basepairs you want to mutate by scrolling through the sequence, or search the sequence by typing the basepairs into the 'Find' box.
- Type the new DNA sequence (the basepair(s) you want your forward mutagenic primer to incorporate into the IPC sequence) in the 'Desired Sequence' box.
- Under the Result header, a diagram showing where your primers will anneal is provided.
- Under the Required Primers header, the sequences for your forward primer and reverse primer are shown with the characteristics for each.
- Screen capture the information provided in the Result and Required Primers sections.
- Embed the images in your notebook.
- Print the screen capture and submit it to the teaching faculty before you leave today. In addition, record your primer sequences in the table on the Discussion page.
- It is very important that you submit your primer sequences before you leave! The teaching faculty will order your primers from IDT DNA tonight to ensure they arrive by your next class.
- Use the guidelines above to examine the mutagenesis primers designed by NEBaseChanger. Include your thoughts in your notebook.
- Do NOT alter the primers provided by NEB.
Reagents
- NEB Loading Dye
- 2.5% Ficoll®-400
- 11 mM EDTA
- 3.3 mM Tris-HCl
- 0.017% SDS
- 0.015% bromophenol blue
- pH 8.0 @ 25 °C
- 1x TAE
- 40 mM Tris
- 20 mM acetic acid
- 1 mM EDTA, pH 8.3