In the previous laboratory session, we researched how best to target a gene using the CRISPRi system. The goal for today is to review the strategy used in constructing an expression plasmid that encodes the sgRNA_target. This plasmid allows for transcription of the sgRNA_target molecule, which ultimately provides the specificity needed to target dCas9 to a gene of interest. The strategy used in constructing the psgRNA_target plasmid involved two common methods: primer design and polymerase chain reaction.
To amplify a specific sequence of DNA, you first need to design primers -- one primer that anneals at the start of the sequence of interest (the 5' end) and a second primer that anneals at the end of the sequence of interest (the 3' end). The primer that anneals at the start of the sequence is referred to as the 'forward' primer. The forward primer anneals to the non-coding DNA strand and reads toward, or into, the gene of interest. The 'reverse' primer anneals to the coding DNA strand at the end of the sequence and reads back into the sequence. Primers can also be used to add new sequence to the amplified product during the polymerase chain reaction.
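The forward/reverse convention described above can be sketched in a few lines of Python. This is only an illustration: the toy sequence, the 8-base primer length, and the function names are assumptions made for the example, not values used in this module.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def design_primers(target: str, length: int = 8, overhang: str = "") -> tuple[str, str]:
    """Pick a simple primer pair for the coding-strand sequence `target`.

    The forward primer matches the first `length` bases of the coding strand,
    so it anneals to the non-coding strand. The reverse primer is the reverse
    complement of the last `length` bases, so it anneals to the coding strand.
    Any `overhang` sits at the 5' end of the forward primer and is added to
    the product during amplification.
    """
    forward = overhang + target[:length]
    reverse = reverse_complement(target[-length:])
    return forward, reverse

toy_target = "ATGGCTAGCAAAGGTGAATTCTAA"  # hypothetical sequence of interest
fwd, rev = design_primers(toy_target)
print(fwd)  # ATGGCTAG
print(rev)  # TTAGAATT
```

Both primers are written 5' to 3', which is why the reverse primer appears as the reverse complement of the end of the coding strand.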
Polymerase chain reaction (PCR)
The applications of PCR are widespread, from forensics to molecular biology to evolution, but the goal of any PCR is the same: to generate many copies of DNA from one or a few specific sequences (called the “template”). In addition to the template, PCR requires only three components: primers to bind sequence flanking the target, dNTPs to polymerize, and a heat-stable polymerase to catalyze the synthesis reaction over and over and over. DNA polymerases require short initiating pieces of DNA called primers to copy DNA. In PCR amplification, forward and reverse primers that target the non-coding and coding strands of DNA, respectively, are separated by a distance equal to the length of the DNA to be copied. To amplify DNA, the original DNA segment, or template DNA, is denatured using heat. This separates the strands and allows the primers to anneal to the template. Then polymerase extends from the primer to copy the template DNA. How many cycles of PCR are required to achieve the desired double-stranded amplification product?
Schematic of PCR amplification.
PCR amplification results from multiple (typically ~30) cycles of three steps: denaturation, annealing, and extension.
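To get a feel for what roughly 30 cycles means, the sketch below computes an upper bound on copy number. It assumes an idealized reaction in which every cycle exactly doubles the DNA, which real reactions only approximate; the function name is made up for this example.

```python
def max_copies(initial_templates: int, cycles: int) -> int:
    """Upper bound on copy number, assuming every cycle doubles the DNA."""
    return initial_templates * 2 ** cycles

for n in (1, 10, 20, 30):
    print(n, max_copies(1, n))
# 1 2
# 10 1024
# 20 1048576
# 30 1073741824
```

Starting from a single template, 30 ideal cycles give about a billion copies.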
Several features are important to consider when designing primers for PCR. Primers that are too short may lack requisite specificity for the desired sequence, and thus amplify an unrelated sequence. The longer a primer is, the more favorable are its energetics for annealing to the template DNA, due to increased hydrogen bonding. On the other hand, longer primers are more likely to form secondary structures such as hairpins, leading to inefficient template priming. Two other important features are G/C content and placement. Having a G or C base at the end of each primer increases priming efficiency, because a GC pair (three hydrogen bonds) is more stable than an AT pair (two hydrogen bonds); AT pairs at the end decrease the stability of the primer-template complex. Overall G/C content should ideally be 50 +/- 10%, because long stretches of G/C or A/T bases are both difficult to copy. The G/C content also affects the melting temperature. Because the three steps of PCR (denature, anneal, extend) are repeated 20 or more times, amplification is exponential: after 30 cycles of PCR, there could be as many as a billion copies of the original template sequence.
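The primer features above can be checked with a short script. This is a rough sketch under stated assumptions: the example primer is hypothetical, "the end" of the primer is read as the 3' end (where extension begins), and the melting temperature uses the Wallace rule, a rule of thumb for short oligos that is not part of this module's text.

```python
def gc_content(primer: str) -> float:
    """Percentage of bases in the primer that are G or C."""
    primer = primer.upper()
    return 100.0 * sum(base in "GC" for base in primer) / len(primer)

def has_gc_clamp(primer: str) -> bool:
    """True if the 3'-most base is G or C (assumed meaning of 'the end')."""
    return primer[-1].upper() in "GC"

def wallace_tm(primer: str) -> int:
    """Rough melting temperature in deg C: 2 per A/T plus 4 per G/C."""
    primer = primer.upper()
    gc = sum(base in "GC" for base in primer)
    return 2 * (len(primer) - gc) + 4 * gc

toy_primer = "ATGGCTAGCAAAGGTGAATC"  # hypothetical 20-base primer
gc = gc_content(toy_primer)
print(f"G/C content: {gc:.1f}%  (target range 40-60%: {40 <= gc <= 60})")
print(f"G or C at the 3' end: {has_gc_clamp(toy_primer)}")
print(f"Estimated Tm: {wallace_tm(toy_primer)} C")
```

Dedicated primer design tools use more detailed thermodynamic models, so treat these numbers as a first-pass screen only.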
Part 1: Participate in Comm Lab workshop
Our communication instructors, Dr. Prerna Bhargava and Dr. Sean Clarke, will join us today for a discussion on preparing a Research proposal presentation.
Part 2: Research sgRNA expression plasmid
After identifying the sgRNA_target sequence that is best for targeting the gene of interest in the host genome, the sgRNA_target is inserted into an expression plasmid. This is achieved using primers and PCR in the procedure described in Part 3. Before you review the approach used to insert the sgRNA sequence, first familiarize yourself with the important features of the expression plasmid. The expression plasmid used for regulating gene expression via the CRISPRi system is described in the following article:
Larson et al. "CRISPR interference (CRISPRi) for sequence-specific control of gene expression." Nature Protocols. (2013) 8:2180-2196.
In this exercise, you will explore the features present in the plasmid that are necessary to express the sgRNA_target sequence (see plasmid map below).
Image modified from Larson et al. Nature Protocols. (2013) 8:2180-2196.
In your laboratory notebook, complete the following:
- Review the details provided in the caption for Fig. 1B of the Larson et al. article.
- Describe the purpose / role for each of the following features that are present in the bacterial sgRNA plasmid. Please note: you may need to reference resources outside of the article!
- Constitutive - pJ23119
- Base-pairing region
- dCas9 handle
- S. pyogenes terminator
- Term (rrnB)
- restriction enzyme recognition sequences: EcoRI, BglII, and BamHI
Part 3: Review sgRNA expression plasmid construction strategy
Synthesize sgRNA_target sequence
In the previous laboratory session you learned how to best target genes of interest by designing sgRNA sequences that bind within the host genome. The sgRNA sequence is what enables the CRISPRi system to specifically target the promoter or coding region of a gene, and thereby regulate gene expression. Thus far, we have described the sgRNA as only containing sequence complementary to the targeted gene in the genome, but this is only half of the full sgRNA used in CRISPR-based technologies. The second half of the sgRNA is a 'handle' that binds dCas9 (or Cas9 in the native CRISPR system). As shown in the image to the right, the targeting sequence and the dCas9 handle together compose the sgRNA. To synthesize the full sgRNA, the sequences for the targeting sequence and the dCas9 handle were submitted as a single DNA strand to IDT-DNA, a commercial company that specializes in DNA synthesis chemistry.
Before we continue, we should review the process used to generate actual primers that can be used to amplify DNA. Current oligonucleotide, or primer, synthesis uses phosphoramidite monomers, which are simply nucleotides with protection groups added. The protection groups prevent side reactions and promote the formation of the correct DNA product. The DNA product synthesis starts with the 3'-most nucleotide and cycles through four steps: deprotection, coupling, capping, and stabilization. First, deprotection removes the protection groups. Second, during coupling the 5' to 3' linkage is generated with the incoming nucleotide. Next, a capping reaction is completed to prevent uncoupled nucleotides from forming unwanted byproducts. Lastly, stabilization is achieved through an oxidation reaction that makes the phosphate group pentavalent. For a more detailed description of this process, read this article from IDT DNA.
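The order of events in oligonucleotide synthesis can be laid out with a small sketch. It assumes that the 3'-most nucleotide is the starting point and that each later nucleotide is added by one pass through the four steps named above; the function names and the toy oligo are made up for illustration.

```python
SYNTHESIS_STEPS = ("deprotection", "coupling", "capping", "stabilization (oxidation)")

def synthesis_order(oligo: str) -> str:
    """Order in which nucleotides join the chain for an oligo written 5'->3'.

    Synthesis starts with the 3'-most nucleotide, so the order of addition
    is the reverse of the written sequence.
    """
    return oligo.upper()[::-1]

def synthesis_schedule(oligo: str) -> list[tuple[str, str]]:
    """List (incoming nucleotide, step) pairs for every cycle after the first base."""
    schedule = []
    for base in synthesis_order(oligo)[1:]:  # the first base is the starting point
        for step in SYNTHESIS_STEPS:
            schedule.append((base, step))
    return schedule

toy_oligo = "GATC"  # hypothetical 4-base oligo, written 5'->3'
print(synthesis_order(toy_oligo))  # CTAG
for base, step in synthesis_schedule(toy_oligo):
    print(base, step)
```

For the 4-base toy oligo, three nucleotides are added after the starting base, giving 12 step entries in total.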
Insert sgRNA_target sequence into expression plasmid
The Q5 Site Directed Mutagenesis Kit from NEB was used to insert the sgRNA_target sequence into the expression plasmid. For this, a reaction was prepared with the following: the sgRNA_target primer, a universal CRISPRi primer, and the expression plasmid.
Image modified from Larson et al. Nature Protocols. (2013) 8:2180-2196.
As shown in the figure to the right, the steps used to insert the sgRNA_target sequence into the expression vector are based on the principles of PCR. The expression plasmid is the template in this amplification reaction. For amplification to occur, a forward primer and a reverse primer are required. In this reaction, the forward primer (labeled Primer Ec-F in the figure) is the full sgRNA_target that consists of the targeting sequence and the dCas9 handle. The dCas9 handle part of the primer anneals to the complementary sequence in the expression plasmid. The targeting sequence does not anneal to the expression plasmid; instead, this part of the primer is incorporated into the amplification product during PCR. The reverse primer (labeled Primer Ec-R in the figure) anneals to a complementary sequence in the expression plasmid and reads away from the forward primer. The product from each of these primers is a single-stranded DNA molecule. Because these single-stranded products are complementary, the strands will anneal and form a double-stranded product.
In your laboratory notebook, complete the following:
- Draw the single-stranded product (5' - 3') that is generated from the forward primer. Draw the product generated from the reverse primer.
- Include the features that are in each product.
- The single-stranded products will anneal to form a double-stranded product. Is this double-stranded product linear or circular? Why?
A more technical depiction of the protocol used to insert an sgRNA_target sequence into the expression plasmid is included below. Briefly, in Step 1 DNA polymerase copies the plasmid using the sgRNA_target primer to insert the target sequence. Following PCR amplification the product is a linear DNA fragment. In Step 2 circular plasmids that carry the sgRNA_target sequence are generated when the double-stranded DNA is phosphorylated (Step 2A) and then ligated (Step 2B). After the amplification reaction, the original expression plasmid template, which does not contain the insert, is still present in the reaction product. To ensure that only the sgRNA_target-containing expression plasmid is used in the next steps, the parental DNA is selectively digested using the DpnI enzyme (Step 2C). The selection works because DpnI digests only methylated DNA: plasmid DNA that was replicated in host cells is methylated, whereas DNA synthesized in vitro by PCR is not. Lastly, in Step 3 the sgRNA_target-containing expression plasmid is transformed into competent cells that propagate the plasmid.
Schematic for inserting sequences into plasmids using SDM technique.
Image modified from Q5 Site-Directed Mutagenesis Kit Manual published by NEB.
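The amplification and ligation steps described above can be mimicked with plain string operations. The sketch below is a simplified model, not the kit protocol: the plasmid is a short made-up sequence, all sequences and names are hypothetical, and only the top strand of the product is tracked.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def linear_pcr_product(plasmid: str, fwd_primer: str, anneal_len: int, rev_primer: str) -> str:
    """Top strand (5'->3') of the linear product amplified around a circular plasmid.

    The last `anneal_len` bases of the forward primer anneal to the plasmid;
    the remaining 5' bases do not anneal and are carried into the product.
    """
    overhang, anneal = fwd_primer[:-anneal_len], fwd_primer[-anneal_len:]
    doubled = plasmid + plasmid  # lets a match run across the circular origin
    start = doubled.find(anneal)
    rev_site = reverse_complement(rev_primer)  # top-strand sequence paired by the reverse primer
    site = doubled.find(rev_site, start) if start != -1 else -1
    if start == -1 or site == -1:
        raise ValueError("primer does not anneal to the plasmid")
    return overhang + doubled[start:site + len(rev_site)]

# Hypothetical toy plasmid: promoter end, dCas9 handle start, backbone
promoter, handle, backbone = "TTGACAGCTAGC", "GTTTTAGAGC", "CCATGGTACCTT"
plasmid = promoter + handle + backbone
target = "ACGTACGT"                              # toy targeting sequence
fwd = target + handle                            # plays the role of Primer Ec-F
rev = reverse_complement(promoter[-10:])         # plays the role of Primer Ec-R

product = linear_pcr_product(plasmid, fwd, len(handle), rev)
junction = promoter[-6:] + target + handle[:6]   # promoter-target-handle order
print(product)
print(junction in product)            # False: the linear product has free ends
print(junction in product + product)  # True: joining the ends mimics ligation
```

In this model the targeting sequence only ends up between the promoter and the handle once the ends of the linear product are joined, which corresponds to the phosphorylation and ligation in Step 2.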
Confirm sgRNA_target with sequencing
The sgRNA_target sequence that was inserted into the expression plasmid was confirmed using DNA sequencing. The invention of automated sequencing machines has made sequence determination a relatively fast and inexpensive process. The method for sequencing DNA is not new but automation of the process is recent, developed in conjunction with the massive genome sequencing efforts of the 1990s and 2000s. At the heart of sequencing reactions is chemistry worked out by Fred Sanger in the 1970s which uses dideoxynucleotides, or chain-terminating bases. These chain-terminating bases can be added to a growing chain of DNA but cannot be further extended. Performing four reactions, each with a different chain-terminating base, generates fragments of different lengths ending at G, A, T, or C. The fragments, once separated by size, reflect the DNA sequence due to the presence of fluorescent dyes, one color linked to each dideoxy-base. The four colored fragments can be passed through capillaries to a computer that can read the output and trace the color intensities detected.
Principles of Sanger sequencing.
A. Chain-terminating bases are used to halt the DNA synthesis reaction at different lengths and attach a fluorophore that is used to determine the sequence of the DNA strand. B. The sequence of the DNA strand is determined using the fluorescent signature associated with each length of DNA in the reaction; this is visualized as a chromatogram.
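The logic of reading a sequence from chain-terminated fragments can be illustrated with a toy simulation. This is a conceptual sketch only: it assumes one fragment for every possible stopping point and uses the terminal base itself in place of a dye color; the function names and the example strand are hypothetical.

```python
import random

def terminated_fragments(new_strand: str) -> list[tuple[int, str]]:
    """Return (length, terminal base) for every chain-terminated fragment.

    Termination can occur at any position of the newly synthesized strand,
    so a strand of n bases yields fragments of length 1..n.
    """
    return [(i + 1, base) for i, base in enumerate(new_strand.upper())]

def read_by_size(fragments: list[tuple[int, str]]) -> str:
    """Order fragments from shortest to longest and read each terminal base."""
    return "".join(base for _, base in sorted(fragments))

toy_strand = "GATTACA"  # hypothetical newly synthesized strand
pool = terminated_fragments(toy_strand)
random.Random(0).shuffle(pool)  # fragments are mixed together in the reaction
print(read_by_size(pool))       # GATTACA
```

Sorting by length plays the role of the size separation in the capillary, and reading the terminal base of each fragment plays the role of detecting its dye.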
Part 4: Align sgRNA_target sequences to host genome
To ensure that you understand the principles of using sgRNA to target a gene of interest, complete the sgRNA Design Worksheet with your laboratory partner (linked here). It may be helpful to review the work you completed in Part 2 of the previous laboratory session.
In your laboratory notebook, attach the completed sgRNA Design Worksheet.
Next, you will use your knowledge of primer design to align the sgRNA_target sequences that were designed by former 109ers to the targeted genes in the MG1655 genome. Recall that your goal in this module is to optimize the CRISPRi system by building on the data collected by students in previous semesters. The first step in achieving this goal is mining the data that exist! The sgRNA_target sequences that you will assess are included in the table below:
||gRNA sequence (5' → 3')
||Data set ID
|| coding region
|| Fa17_TR Green
|| Fa17_WF Orange
|| coding region
|| Fa18_TR Orange
|| coding region
|| Fa19_TR Orange
|| coding region
|| Fa19_TR Yellow
|| Fa19_TR Purple
|| Fa17_TR Orange
|| coding region
|| Fa17_TR Blue
|| Fa17_WF Red
|| coding region
|| Fa18_TR Red
|| Fa19_TR Green
|| Fa17_WF CyanC
With your laboratory partner, align the sgRNA_target sequences with the targeted genes. Feel free to divide the workload: one partner can align the sequences that target ldhA and the other can align the sequences that target pta-ack.
- Use the KEGG Database to obtain the DNA sequences of the targeted genes (ldhA and pta-ack) in the E. coli K-12 MG1655 strain.
- Enter the name of the targeted gene in the Search genes box and click Go.
- Double click on the linked gene name.
- In your laboratory notebook, use the information provided in the KEGG database to answer the following questions:
- What is the full name of the gene (or Definition)?
- In what pathways is the gene involved?
- The amino acid (AA) sequence and nucleotide sequence (NT) for the gene are provided at the bottom of the page.
- Generate a new DNA file in SnapGene that contains the NT sequence of the gene.
- Because sgRNA_target molecules were generated that target the promoter, enter 50 in the +upstream box to get the 50 basepair sequence immediately preceding the start codon.
- Identify the sgRNA_target sequences from the table above in the MG1655 targeted gene.
- For each sgRNA_target sequence, create a feature in the SnapGene file.
- In your laboratory notebook, complete the following:
- Attach the SnapGene file with the sgRNA_target sequences aligned.
- Draw a simplified schematic that shows sgRNA_target sequence alignments. For an example, refer back to Fig. 2C and 2D from the Lei et al. article.
- Determine which sgRNA_target sequences bind downstream of a PAM sequence and indicate this information on the schematic.
- Speculate on which sgRNA_target sequences might be better at increasing ethanol yield. Include reasons for why!
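The alignment and PAM check in the steps above can also be scripted as a cross-check on the SnapGene work. The sketch below is a hedged example: the gene and target sequences are made up, and it assumes the S. pyogenes NGG PAM sits immediately 3' of the matched sequence on the same strand, a convention you should confirm against the sgRNA Design Worksheet.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def locate_target(gene: str, target: str) -> list[dict]:
    """Find `target` on either strand of `gene` and inspect the next 3 bases.

    `start` is a 0-based position on the strand that was searched; for the
    "-" strand it counts along the reverse complement of `gene`.
    """
    gene, target = gene.upper(), target.upper()
    hits = []
    for strand, seq in (("+", gene), ("-", reverse_complement(gene))):
        start = seq.find(target)
        while start != -1:
            pam = seq[start + len(target): start + len(target) + 3]
            hits.append({
                "strand": strand,
                "start": start,
                "pam": pam,
                "has_ngg_pam": len(pam) == 3 and pam[1:] == "GG",  # assumed NGG PAM
            })
            start = seq.find(target, start + 1)
    return hits

toy_gene = "ATGCCGATTACAGATTCTGGCATTAA"  # hypothetical gene fragment
toy_target = "GATTACAGATTC"              # hypothetical 12-base target
for hit in locate_target(toy_gene, toy_target):
    print(hit)
# {'strand': '+', 'start': 5, 'pam': 'TGG', 'has_ngg_pam': True}
```

To use this with real data, paste the NT sequence obtained from KEGG (including the 50 upstream bases) as `toy_gene` and each sgRNA_target sequence as `toy_target`.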
Reagents list
- Q5 Site Directed Mutagenesis Kit (from NEB)
- Q5 Hot Start High-Fidelity 2X Master Mix: proprietary mix of Q5 Hot Start High-Fidelity DNA Polymerase, buffer, dNTPs, and Mg2+
- 2X KLD Reaction Buffer
- 10X KLD Enzyme Mix: proprietary mix of kinase, ligase, and DpnI enzymes
- Universal CRISPRi reverse primer 5' - ACT AGT ATT ATA CCT AGG ACT GAG CTA GC - 3'
- SOC medium: 2% tryptone, 0.5% yeast extract, 10 mM NaCl, 2.5 mM KCl, 10 mM MgCl2, 10 mM MgSO4, and 20 mM glucose
- LB+Amp plates
- Luria-Bertani (LB) broth: 1% tryptone, 0.5% yeast extract, and 1% NaCl
- Plates prepared by adding 1.5% agar and 100 μg/mL ampicillin (Amp) to LB