Difference between revisions of "20.109(S22):M1D2"

From Course Wiki

Revision as of 18:31, 4 February 2022

20.109(S22): Laboratory Fundamentals of Biological Engineering



Introduction

Though you may be able to qualitatively identify the spots that appear to emit more fluorescence, it is important to complete a quantitative analysis that supports your observations. The microarrayer reads the fluorescence signals emitted from the surface of the SMM slide at two excitation wavelengths. As noted previously, the 532 nm wavelength was used to excite fluorescein, which was printed in an 'X' pattern to assist with alignment. The 635 nm wavelength was used to excite Alexa Fluor 647, which would be associated with TDP43-RRM12 bound to a small molecule on the slide. A hit denotes a spot on the slide that emits a red fluorescence signal significantly higher than the background fluorescence level. In terms of protein binding, a hit denotes that the TDP43-RRM12 protein is bound to a small molecule and is therefore localized to a specific position on the slide. You will analyze the fluorescence signal collected by the microarray scanner using a value termed the robust z-score.

Sp20 M1D6 background, foreground.png
The robust z-score differentiates signal from noise by providing a value that represents the intensity of a signal above background. In the case of the SMM experiment, the intensity of a fluorescent signal above the background fluorescence is calculated. To do this, the fluorescence emitted across the entire slide is grouped to define the Median Absolute Deviation (MAD), which is a measure of the variability of a univariate dataset. Though the derivation is beyond the scope of this class, the equation for calculating the robust z-score assigns a value for how much more intense the fluorescent signal at a spot is than background. The higher the value, the more the signal differs from background.

When the SMM slides were imaged, the microarrayer also produced a GAL file, or GenePix Array List. The GAL file contains information about where each spot was printed, and what compound was printed there. However, the relationship between the GAL file and the actual contact of the print head is very imprecise. Instead, the fluorescein guide spots are used to align the array in the GAL file to the true print location for each pin. Following the alignment, the software quantifies the fluorescence at 635 nm within the deposition region of each spot (foreground) and the fluorescence immediately outside of this region, where nothing was printed (background) as illustrated in the image to the right. These values are used to calculate the robust z-score. From the robust z-score, you can determine the associated probability that the observed fluorescence occurred by chance, and if this probability is sufficiently low, we call the small molecule a 'hit'.
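As a rough illustration of how a robust z-score maps to a probability, the sketch below computes the one-sided tail probability of a standard normal distribution. It assumes that the robust z-scores of non-binding spots are approximately standard normal, which is an assumption made here for illustration and not a statement about the analysis software; the function name `upper_tail_probability` is hypothetical.

```python
import math

def upper_tail_probability(z):
    """One-sided probability that a standard normal value is >= z.

    Assumption: scores of non-binding spots follow a standard normal
    distribution, so a small probability suggests the signal is unlikely
    to have occurred by chance.
    """
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Print the tail probability for a few example z-scores.
for z in (0.0, 2.0, 3.0, 5.0):
    print(f"z = {z}: p = {upper_tail_probability(z):.2e}")
```

The probability shrinks rapidly as the z-score grows, which is why a modest increase in the threshold can remove many marginal spots.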

After hits are identified via the robust z-score, the data are examined by-eye using the criteria discussed in the prelab. In the exercises below you will review this process to gain a better understanding of how the hits identified in the SMM screen that targeted TDP43-RRM12 were validated.

Protocols

Part 1: Analyze SMM results

In the previous laboratory session, you reviewed how the SMM screen for TDP43-RRM12 was completed and how the slides were imaged for analysis. Prior to performing the SMM experiment, small molecule compounds were printed onto slides using the method described in the M1D1 Introduction. Each slide was printed with ~12,000 spots! In addition to ~4,200 small molecules (printed in duplicate), fluorescein and DMSO spots were included on every slide. The fluorescein spots, or sentinel spots, are used to align a grid pattern to the slide so the small molecule compounds at each spot can be identified. The DMSO spots are a negative control. Each slide is arranged as depicted below. The fluorescein spots are printed in an "X" pattern across the slide (panel A). Each "X" section represents a block within the slide (panel B).

Sp22 M1D2 slide, block.png


The goal for today is to more thoroughly explore the analysis steps that were used to identify the hits for TDP43-RRM12. In this exercise you will consider the importance of each of the four steps listed below in the identification of hits.

  1. Align printed small molecule spots using fluorescence on 532 nm channel.
  2. Quantify fluorescence on 635 nm channel.
  3. Identify hits with improbably high fluorescence.
  4. Complete 'by-eye' analysis of hits.

Align printed small molecule spots using fluorescence on 532 nm channel

Sp22 M1D2 P1 align spots.png
To process the slide images, a grid pattern (panel A) is aligned to each block using the fluorescence measured on the 532 nm channel. In the example to the right (panel B), the closed green spots represent the fluorescence signal on the 532 nm channel and the open green circles represent the software alignment. As you can see, the software misaligned the "X" pattern for this block. The orange arrows indicate the manual adjustment made by the user to correct the alignment.

In your laboratory notebook, complete the following:

  • What is the source of the fluorescence measured on the 532 nm channel?
  • Why is it important that the "X" pattern is correctly aligned for each block?

Quantify fluorescence on 635 nm channel

Sp22 M1D2 P1 quantify 635nm.png
After the grid pattern is aligned, the fluorescence is measured on the 635 nm channel. In this step, the slide is 'read' pixel by pixel and each pixel is assigned a numerical value that indicates the intensity of the signal measured there, as illustrated in the image to the right. The computer software uses the numerical values to calculate the signal-to-noise ratio ($ SNR $).

$ SNR = \frac{\mu _{foreground}-\mu _{background}}{\sigma_{background}} $, where $ \mu $ = mean pixel intensity and $ \sigma $ = standard deviation of the pixel intensity
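The SNR calculation for a single spot can be sketched in a few lines of Python. This is a minimal sketch, assuming the denominator is the sample standard deviation of the background pixels; the function name `signal_to_noise` and the pixel values are made up for illustration and do not come from the analysis software.

```python
import statistics

def signal_to_noise(foreground_pixels, background_pixels):
    """(mean foreground - mean background) / spread of the background."""
    mu_fg = statistics.mean(foreground_pixels)
    mu_bg = statistics.mean(background_pixels)
    # Assumption: the spread is the sample standard deviation.
    sd_bg = statistics.stdev(background_pixels)
    if sd_bg == 0:
        raise ValueError("background has zero spread; SNR is undefined")
    return (mu_fg - mu_bg) / sd_bg

# Made-up pixel intensities for one spot and its surrounding background.
foreground = [520, 540, 560, 580]
background = [98, 100, 102, 100]
print(signal_to_noise(foreground, background))
```

Because the difference in means is divided by the background spread, a bright spot on a noisy background scores lower than the same spot on a quiet background.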

The SNR value is then used to calculate the robust Z-score. This value provides a measure of how different the signal is from background.

$ \text{robust Z-score}_{i} = \frac{SNR_{i}-median(SNR)}{1.48 \times median(|SNR_{i}-median(SNR)|)} $, where $ median(|SNR_{i}-median(SNR)|) $ = median absolute deviation (MAD) and $ 1.48 $ = scale factor for the normal distribution
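To make the robust Z-score calculation concrete, here is a minimal Python sketch that applies the equation to a short list of SNR values. The function name `robust_z_scores` and the example values are hypothetical; the scale factor of 1.48 is taken from the equation above.

```python
import statistics

def robust_z_scores(snr_values, scale=1.48):
    """Return one robust Z-score per SNR value."""
    med = statistics.median(snr_values)
    # MAD: the median of the absolute deviations from the median.
    mad = statistics.median([abs(x - med) for x in snr_values])
    if mad == 0:
        raise ValueError("MAD is zero; the robust Z-score is undefined")
    return [(x - med) / (mad * scale) for x in snr_values]

# Made-up SNR values: four ordinary spots and one very bright spot.
snr = [1.0, 2.0, 3.0, 4.0, 50.0]
for value, z in zip(snr, robust_z_scores(snr)):
    print(f"SNR = {value}: robust Z-score = {z:.2f}")
```

Because the median and the MAD are barely affected by a few extreme values, the bright spot stands out clearly instead of inflating the spread it is compared against.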

In your laboratory notebook, complete the following:

  • What is the source of the fluorescence measured on the 635 nm channel?
  • What does the presence of signal on the 635 nm channel indicate?

Identify hits with improbably high fluorescence
The z-score values for the hits identified in an SMM can be ordered and graphed to identify a threshold value. It is reasonable to assume that small molecule compounds that have the highest z-score are the most promising candidates; however, how many of the 'top' hits should be tested? Graphing the z-scores shows how the values compare to the population and to the negative control, which enables the researcher to define a threshold that is likely to include more promising hits and exclude less promising hits.

One method used to graph the z-score results is to simply plot the averaged value for each small molecule compound. In the graph below, the number of small molecules (y-axis) with a particular z-score (x-axis) is shown in purple. The data for DMSO are shown in red. From this graph you can see that the DMSO negative control spots emit a signal equal to or greater than that of some of the small molecule compounds that were screened.

Averaged z-scores for each small molecule compound. The y-axis indicates the number of small molecule compounds with a particular z-score, as indicated on the x-axis.
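The counting behind this type of graph can be sketched as a simple histogram. In the Python sketch below, the function name `histogram`, the bin width, and all z-score values are made up for illustration; they are not the class data.

```python
import math
from collections import Counter

def histogram(z_scores, bin_width=1.0):
    """Count values per bin; each key is the lower edge of a bin."""
    counts = Counter(math.floor(z / bin_width) * bin_width for z in z_scores)
    return dict(sorted(counts.items()))

# Made-up averaged z-scores for screened compounds and for DMSO spots.
compound_z = [-0.4, 0.2, 0.7, 1.3, 2.8, 11.5]
dmso_z = [-0.2, 0.1, 0.9]

print("compounds:", histogram(compound_z))
print("DMSO:", histogram(dmso_z))
```

Comparing the two sets of counts bin by bin shows where the compound population overlaps the negative control and where it extends beyond it.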

In your laboratory notebook, complete the following:

  • What does it mean if a small molecule compound has a z-score less than or the same as what is measured for DMSO?
  • From this graph, what is a good threshold value that will include more promising hits and exclude less promising hits?

Another method is to group the small molecule compounds based on the averaged z-score. In the graph below, the number of small molecule compounds (y-axis) with a z-score greater than or equal to a particular value (x-axis) is shown in purple. The data for DMSO are shown in red. From this graph you can see that only one small molecule compound has a z-score of 50 or greater and ~20 small molecule compounds have a z-score of 10 or greater.

Cumulative z-scores for each small molecule compound. The y-axis indicates the number of small molecule compounds with a z-score at a particular value or higher, as indicated on the x-axis.
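The cumulative counts in this type of graph can be sketched as follows. The function name `count_at_or_above` and the z-score values are hypothetical and chosen only to show the counting.

```python
def count_at_or_above(z_scores, thresholds):
    """For each threshold, count the z-scores that meet or exceed it."""
    return {t: sum(1 for z in z_scores if z >= t) for t in thresholds}

# Made-up averaged z-scores for five compounds.
compound_z = [0.5, 3.0, 10.0, 12.5, 50.0]
print(count_at_or_above(compound_z, thresholds=(0, 10, 50)))
```

Reading the counts from the highest threshold downward shows how quickly the list of candidate hits grows as the threshold is relaxed.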

In your laboratory notebook, complete the following:

  • From this graph, what is a good threshold value that will include more promising hits and exclude less promising hits?
  • Which graphical representation of the z-score results provides a better tool for defining a threshold value? Why?

Complete 'by-eye' analysis of hits

The final step in the validation process is to actually look at the fluorescence of the hits. The analysis software captures an image for each small molecule compound and groups the images for each compound (remember that replicates for each small molecule compound are included in the slide!). For each small molecule, the information below is available for the researcher to assess.

Sp22 M1D2 by eye analysis.png

In the window above, the small molecule compounds included in the screen are listed in rank order by z-score (panel A). To see the raw data, the researcher selects a compound from the menu. After a small molecule is selected, the images for each replicate are displayed for the 532 nm channel and for the 635 nm channel (panel B). The multiple images indicate the number of replicates that were screened within one slide and across different slides. Each row of the table contains images from a single slide. In this experiment the small molecule was printed on three slides. Remember that each small molecule compound is printed in duplicate on each slide, such that each column of the table contains the images for one of the replicates within the slide. In the table, the first column for the 532 nm channel and the first column for the 635 nm channel are images of the same spot. Lastly, the chemical structure for the small molecule compound is shown (panel C).

When confirming hits in the by-eye analysis, it is important to confirm that signal is only present on the 635 nm channel. In the example above the images for the 635 nm channel indicate strong signal in a central location consistent with a spot on the slide. The images for the 532 nm channel are indicative of background noise. It is also important to confirm that the replicates are consistent.

In your laboratory notebook, complete the following:

  • Why is it important to complete a by-eye inspection of the SMM results?

Part 2: Complete by-eye analysis of hits identified in SMM using TDP43

Using the analysis workflow described in Part 1 and the results of a biological assay completed by 109ers in Sp21 (ask your Instructor if you are interested in the details!), five small molecule compounds were selected for your research this semester. In this exercise you will evaluate the results and structures of the small molecules, then you will select which hits you want to test using functional assays.

Use the information in the table below to complete the exercises in Part 2:

Compound ID | SMM ID | Formula | Molecular weight | Raw data | SMILES
95877382 (2) | 01:KI0000454:E08 | C17H18N4O4S | 374.4 | 01-KI000454-E08 | S(=O)(=O)(c1ccc(cc1)CCNC(=O)c1cc(n[nH]1)c1oc(cc1)C)N
69269200 (3) | 05:KI0001106:N03 | C17H23N3O2 | 301.4 | 05-KI0001106-N03 | c1(N2C[C@H]([C@@](CC2)(O)CC)O)nc2c(cc1C#N)CCCC2
83079118 (4) | 13:KI0001562:P15 | C16H15FN4O | 298.3 | 13-KI0001562-P15 | n1(c(n[nH]c1=O)Cc1ccncc1)[C@@H](c1ccc(cc1)F)C
83023303 (5) | 02:KI0001354:N19 | C16H26N2O2 | 351.3 | 02-KI0001354-N19 | C1(O)(CNCCC1)CNCCc1ccc(cc1)OCC
26408703 (6) | 07:KI0000907:G08 | C18H27N7O2 | 373.5 | 07-KI0000907-G08 | c12n(c(c(c(n1)C)CCC(=O)N1[C@H](C(=O)NC(C)C)C[C@@H](C1)N)C)ncn2
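If you want to check how a molecular weight relates to a formula in the table, the calculation can be sketched in Python. This is a minimal sketch: the function name `molecular_weight` is hypothetical, the parser handles only simple formulas without parentheses, and the atomic weights are rounded standard values for the elements that appear in the table.

```python
import re

# Rounded standard atomic weights (g/mol) for the elements in the table.
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007,
                 "O": 15.999, "S": 32.06, "F": 18.998}

def molecular_weight(formula):
    """Sum the atomic weights for a simple formula such as 'C17H18N4O4S'."""
    total = 0.0
    # Each match is an element symbol followed by an optional count.
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_WEIGHT[symbol] * (int(count) if count else 1)
    return total

for formula in ("C17H18N4O4S", "C17H23N3O2", "C16H15FN4O",
                "C16H26N2O2", "C18H27N7O2"):
    print(f"{formula}: {molecular_weight(formula):.1f}")
```

Note that for C16H26N2O2 this calculation gives approximately 278.4, which differs from the 351.3 listed in the table. The page does not explain the difference; one possibility, not confirmed here, is that the listed value refers to a salt form of the compound, so ask your Instructor if it matters for your analysis.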

Evaluate raw data for hits

Visually validate the raw data images for each of the small molecule compounds in the table above using the guidelines provided in Part 1.

In your laboratory notebook, complete the following:

  • For each small molecule, answer the questions below:
    • Is the z-score above the threshold value you defined in Part 1?
    • Do the images captured on the 532 nm channel appear as expected? Why or why not?
    • Do the images captured on the 635 nm channel appear as expected? Why or why not?
    • Are the images for the replicates consistent?

Identify common features in hits

One method for assessing protein-small molecule binding is to visually inspect known small molecule binders for common features / structures. To do this you will carefully examine the hits and identify any common features / structures. As in the image below, it is possible that multiple features will be present within the same small molecule.

Sp17 20.109 M1D7 chemical structure features.png


Review the hits that were identified in the SMM screen completed for TDP43-RRM12. To see the chemical structures, translate the SMILES strings using one of the online resources listed below. It may be easier to copy / paste the small molecule images into a PowerPoint file so you can readily see all of the structures. Also, it may be helpful to use a color-coding system (like the one in the image provided above) to highlight features / structures that are common to the hits.

These online resources may be helpful to learning more about the hits identified in the SMM:

  • Cloud version of ChemDraw here.
    • Copy and paste the small molecule SMILES into the work space to get a chemical structure.
  • Platform to transform the SMILES information into a PubChem ID here.
    • Copy and paste the SMILES into the input ID search to determine the ID number.
  • PubChem database of chemical information here.
    • Includes small molecule molecular weight and other useful information.

In your laboratory notebook, complete the following:

  • How many features did you identify that are present in two or more of the small molecules that putatively bind TDP43-RRM12? Are there more or fewer than you expected?
  • Is there a feature present in all of the identified small molecules? What might this suggest about the binding site(s) and / or binding ability of TDP43-RRM12?
  • Can you assign the identified small molecules to sub-groups based on the common features that are present?
  • What might the different features represent? More specifically, consider whether each subgroup has a unique binding site on the target protein or if each subgroup represents different solutions for interacting with the same binding site.
  • How might you make modifications to the small molecules / features to probe binding? As a hint, consider how different functional groups could be positioned at a given site without altering qualitative binding in the SMM assay to translate that into some testable ideas.

Select small molecule compounds to test in functional assays

Lastly, using the information you gathered from your analysis, select two small molecules that you will test in functional assays. Complete the appropriate portion of the table on the Class data tab to select your molecules.

In your laboratory notebook, complete the following:

  • Record the chemical ID for the small molecule compounds you selected.
  • Provide some rationale for why these small molecules were selected for further study.

Part 3: Discuss journal article

To further help you in preparing your Data summary, we will discuss how similar data are presented in a publication from the Koehler laboratory.

Chen et al. (2010) "Small molecule microarrays enable the discovery of compounds that bind the Alzheimer's Aβ peptide." J Am Chem Soc 132:17015-17022.

The initial experiment presented by Chen et al. was an SMM that identified ligands that bind the amyloid-β (Aβ) peptide. This first step is very similar to what was done to identify the hits you are testing in this module! To further assess the results of the SMM, the authors completed several follow-up experiments to test the effect of the small molecules on the functionality of the Aβ peptide.

In the context of your research, this article focuses on the next-step experiments that can be performed after a drug candidate is discovered from a screen. Though you can use this article as guidance as you consider the experiments that could follow your screen, remember that the specific next-step experiments should be related to the protein target and drug candidate(s) identified in your project. For this exercise, the focus is on how the data are organized and presented.

From the Introduction

Consider the key components of an introduction:

  • What is the big picture?
  • Is the importance of this research clear?
  • Are you provided with the information you need to understand the research?
  • Do the authors include a preview of the key results?

From the Results

Carefully examine the figures. First, read the captions and use the information to 'interpret' the data presented within the image. Second, read the text within the results section that describes the figure.

  • Do you agree with the conclusion(s) reached by the authors?
  • What controls were included and are they appropriate for the experiment performed?
  • Are you convinced that the data are accurate and/or representative?

From the Discussion

Consider the following components of a discussion:

  • Are the results summarized?
  • Did the authors 'tie' the data together into a cohesive and well-interpreted story?
  • Do the authors overreach when interpreting the data?
  • Are the data linked back to the big picture from the introduction?

In your laboratory notebook, complete the following:

  • Based on your reading and the group discussion of the article, answer the questions above.

Navigation links

Next day: Induce and purify TDP43 protein

Previous day: Review small molecule microarray (SMM) technology