20.109(S20):Analyze RNA-seq data and select gene targets for quantitative PCR (qPCR) experiment (Day 6)

20.109(S20): Laboratory Fundamentals of Biological Engineering




Introduction

The transcriptome is the full suite of transcripts within an organism and provides the key link between the genetic code and phenotype. Research focused on the transcriptome has provided important insights into how gene expression is altered in different cell / tissue types, in developmental phases, in disease states, and between species. In this module, you will evaluate gene expression in the DLD-1 cell line to assess the effects of a drug treatment on the transcriptome in these cells.

The gene expression data were generated using RNA-seq. In this method, deep sequencing is completed using reagents and equipment from Illumina. With this technology, transcripts are directly sequenced and the resulting reads are mapped to a reference genome. The reads are then counted to provide information on gene expression levels for a particular portion of the genome (i.e., for a particular gene).
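The counting step can be illustrated with a minimal sketch. The gene names, coordinates, and read positions below are hypothetical, and real pipelines use dedicated alignment and counting tools rather than this simplified interval check.

```python
from collections import Counter

# Hypothetical gene annotation: gene name -> (chromosome, start, end), inclusive.
GENES = {
    "GENE_A": ("chr1", 100, 500),
    "GENE_B": ("chr1", 800, 1200),
    "GENE_C": ("chr2", 50, 300),
}

# Hypothetical mapped reads: (chromosome, start position of the alignment).
READS = [
    ("chr1", 120), ("chr1", 450), ("chr1", 900),
    ("chr2", 75), ("chr2", 299), ("chr2", 1000),
]


def count_reads_per_gene(reads, genes):
    """Count how many read start positions fall inside each gene interval."""
    counts = Counter({name: 0 for name in genes})
    for chrom, pos in reads:
        for name, (g_chrom, start, end) in genes.items():
            if chrom == g_chrom and start <= pos <= end:
                counts[name] += 1
    return dict(counts)


gene_counts = count_reads_per_gene(READS, GENES)
print(gene_counts)
```

In this toy example, the read at chr2 position 1000 falls outside every annotated gene and is therefore not counted.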

Image from Goodwin et al. (2013) Nature Rev. 17:333-351.
In RNA-seq, RNA is purified from cells and reverse transcribed into DNA. The DNA molecules are modified with adapters, which are ligated to both ends of the DNA. Sequences complementary to the adapters are attached to the surface of the flow cell channels, where they facilitate binding of the modified DNA molecules and provide a primer for DNA polymerase. Following the initial binding to the flow cell channel, the DNA molecules form bridges that enable bridge amplification and cluster generation (see figure). Through this process, millions of dense clusters containing double-stranded DNA are generated.

To sequence directly from the clusters, a sequencing-by-synthesis approach is used. In this approach, several rounds of synthesis are performed using modified deoxynucleoside triphosphates (dNTPs). These dNTPs are terminator molecules because the ribose 3'-OH group is blocked, which prevents further elongation by the polymerase. Each terminating base (dATP, dTTP, dCTP, and dGTP) is fluorescently labeled (dATP = red, dTTP = green, dCTP = blue, and dGTP = yellow). For each round of sequencing, a mixture containing all four labeled dNTPs is added and a single base is incorporated into each DNA molecule bound to the flow cell channels. The flow cell is then imaged to capture which dNTP was added at each cluster location. Next, the fluorescent label and 3'-OH blocking group are removed from the incorporated dNTP and another round of sequencing is performed. Repeating this cycle yields the sequence of every DNA molecule bound to the flow cell channels. For example, the sequence of the cluster denoted by the asterisk is GCTGA in the schematic provided below.

Sp18 20.109 M2D4 illumina sequencing.png
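The imaging logic can be sketched in a few lines of Python. The color-to-base mapping is the one given in the text above. The cluster names and the per-cycle images are hypothetical and only illustrate how one color per cycle becomes one base per cycle.

```python
# Color code from the text: dATP = red, dTTP = green, dCTP = blue, dGTP = yellow.
COLOR_TO_BASE = {"red": "A", "green": "T", "blue": "C", "yellow": "G"}


def read_clusters(images):
    """Build one sequence per cluster from a list of per-cycle images.

    Each image is a dict mapping a cluster id to the color observed
    at that cluster during one round of sequencing.
    """
    sequences = {}
    for image in images:
        for cluster_id, color in image.items():
            sequences[cluster_id] = sequences.get(cluster_id, "") + COLOR_TO_BASE[color]
    return sequences


# Hypothetical images from five rounds of sequencing at two clusters.
images = [
    {"cluster_1": "yellow", "cluster_2": "red"},
    {"cluster_1": "blue", "cluster_2": "red"},
    {"cluster_1": "green", "cluster_2": "blue"},
    {"cluster_1": "yellow", "cluster_2": "green"},
    {"cluster_1": "red", "cluster_2": "yellow"},
]

sequences = read_clusters(images)
print(sequences["cluster_1"])  # GCTGA
```

Here the hypothetical cluster_1 shows yellow, blue, green, yellow, and red over five rounds, which decodes to GCTGA, the same sequence as the cluster marked by the asterisk in the schematic.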

As with all technologies, there are positives and negatives to RNA-seq. On the plus side, the ability to directly sequence enables researchers to assess gene expression in organisms for which a full genome sequence is not available or is not fully annotated. Furthermore, this method allows for the quantification of individual isoforms that result from alternative splicing. On the minus side, the cost of RNA-seq can limit the depth of sequencing achieved, and genes that are not highly expressed may not be captured in a data set.

Protocols

Welcome back!! To ensure we are all ready to go, let's review RNA-seq. In addition to reading the text provided in the Introduction, please watch the Illumina sequencing by synthesis (SBS) video, which shows the process used to acquire the RNA-seq data you are analyzing in this module (linked here). Remember, following RNA purification, the samples were submitted to the BioMicro Center for Illumina sequencing. Illumina sequencing technology, SBS, is used for massively parallel sequencing with a proprietary method that detects single bases as they are incorporated into growing DNA strands.

Part 1: Analyze RNA-seq data

Today you will expand on your analysis of the RNA-seq data gathered from the DLD-1 cell line.

Complete Exercise #3 developed by Amanda Kedaigle, Anne Shen & Prof. Ernest Fraenkel. In this exercise, you will first work through a refresher focused on the clustering methods used in the previous analysis. Then you will use the skills you developed to examine a published dataset (A549 cells treated with etoposide) and compare the results to those collected for DLD-1.

Part 2: Select genes for qPCR experiment

During the analysis in Exercise #2, you identified the most prominent gene expression changes in response to etoposide by comparing the DLD-1 and DLD-1 etoposide gene sets in the RNA-seq data. With these results in hand, you are able to select particular genes of interest to pursue using qPCR.
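One common way to rank expression changes is by the magnitude of the log2 fold change between treated and untreated samples. The sketch below illustrates that idea only. It does not reproduce the method used in the course exercises, and the gene names, counts, and pseudocount are hypothetical.

```python
import math

# Hypothetical normalized read counts per gene.
untreated = {"GENE_A": 100.0, "GENE_B": 50.0, "GENE_C": 200.0, "GENE_D": 10.0}
treated = {"GENE_A": 400.0, "GENE_B": 50.0, "GENE_C": 25.0, "GENE_D": 20.0}

# Assumed pseudocount that avoids division by zero for unexpressed genes.
PSEUDOCOUNT = 1.0


def log2_fold_changes(control, treatment, pseudocount=PSEUDOCOUNT):
    """Return log2(treatment / control) for every gene in the control set."""
    return {
        gene: math.log2((treatment[gene] + pseudocount) / (control[gene] + pseudocount))
        for gene in control
    }


lfc = log2_fold_changes(untreated, treated)

# Rank genes by the size of the change, regardless of direction.
ranked = sorted(lfc, key=lambda gene: abs(lfc[gene]), reverse=True)
for gene in ranked:
    print(f"{gene}: {lfc[gene]:+.2f}")
```

A positive value indicates higher expression with treatment and a negative value indicates lower expression. A ranking like this does not account for replicate variability, so it is a starting point for choosing candidates rather than a statistical test.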

Please consult the list of possible target genes (linked here) available from the pilot studies and choose 2-3 genes to quantify. A good place to start learning more about genes of interest is GeneCards (https://www.genecards.org). It gives alternative names for a gene and a basic functional description, as well as links to additional information.

  • Note: Consider your strategy when choosing genes. Are you interested in comparing genes from different GO groups? Are you interested in a particular GO group to analyze in more depth? Analyzing the RNA-seq data gave you a broad idea of the effect of etoposide on DLD-1 cells, but you will need to think about what direction you want your research article to take.

Once you have selected your genes of interest, please see the class data page for qPCR pilot data to analyze. The pilot data have been divided into 3 sheets, with each sheet representing a separate experiment in which each gene had 3 technical replicates. To more closely replicate the experiment you would have done in the lab, please choose your genes from only 1 experiment.
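As a first pass at data organized this way, the technical replicates for each selected gene can be summarized within a single experiment. The following is a minimal sketch under stated assumptions: the sheet names, gene names, and values are hypothetical, and the measurements are assumed to be one number per replicate (for example, a threshold cycle value).

```python
from statistics import mean, stdev

# Hypothetical pilot data: sheet (experiment) -> gene -> 3 technical replicates.
pilot_data = {
    "experiment_1": {
        "GENE_A": [24.1, 24.3, 24.2],
        "GENE_B": [28.0, 27.8, 28.2],
    },
    "experiment_2": {
        "GENE_A": [23.9, 24.0, 24.4],
        "GENE_C": [30.1, 30.5, 30.3],
    },
}


def summarize_replicates(data, experiment, genes):
    """Return (mean, standard deviation) per gene, using one experiment only.

    Raises KeyError if a requested gene was not measured in that experiment,
    which enforces choosing all genes from the same sheet.
    """
    sheet = data[experiment]
    return {gene: (mean(sheet[gene]), stdev(sheet[gene])) for gene in genes}


summary = summarize_replicates(pilot_data, "experiment_1", ["GENE_A", "GENE_B"])
for gene, (avg, sd) in summary.items():
    print(f"{gene}: mean = {avg:.2f}, sd = {sd:.2f}")
```

Requesting a gene that is absent from the chosen sheet, such as GENE_C in experiment_1 here, raises an error instead of silently mixing experiments.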

Navigation links

Next day: Review qPCR experiment and complete statistical analysis

Previous day: Purify RNA from etoposide-treated cells and generate cDNA