Difference between revisions of "Assignment 10 Overview"

 
__NOTOC__
 
 
=Assignment 10=
 
==Data acquisition==
 
===Identify unknown sample===
 
You will receive 1.5 mL each of four samples. Three of the samples will be identified by their sequence, salt ion concentration, and degree of complementarity (see these [[DNA_Melting:_DNA_Sequences]]). The fourth sample matches one of the three identified samples. You will not be told which one.
 
  
* Acquire melting curves for the known and unknown samples.
+
In the final assignment of the DNA melting lab, you will measure DNA melting curves for three known samples and one unknown. For each sample, you will extract the best-fit thermodynamic parameters and use them to identify your unknown sample. This assignment has three main parts:
** You may want to run some or all of the samples more than once to provide more confidence in your result.
+
# collecting data,
* Identify the unknown sample and report your confidence in the result.
+
# analyzing data,
** See [[Identifying the unknown DNA sample]] for some ideas on identifying the sample.
+
# identifying your unknown sample and discussing your results.
** You may use a different statistical procedure, if you like. Be sure to document the procedure you used.
+
  
{{Template:Assignment Turn In|message=Turn in your <br>
+
==In preparation==
# Procedure:
+
{{Template: Assignment Turn In|message = Compose an entertaining, exhilarating, thought-provoking, or melancholy Haiku on the subject of DNA melting.}}
#* Document procedure used to gather data
+
# Raw data:
+
#* Measure melting curves 3x each of four samples (three known, one unknown). Plot all raw data.
+
# Results:
+
#* Identify your unknown sample (or state that your investigation did not provide a conclusive answer).
+
#* Quantify the confidence you have in your result.
+
# Discussion:
+
#* Discuss the validity of assumptions in the regression model.
+
#* Discuss any atypical results or data you rejected.
+
#* Compare your data to results from other groups and/or instructor data.
+
#* Give a bullet point summary of problems you encountered in the lab during part 2 and changes that you made to your instrument and methodology to address those issues.
+
#* Discuss significant error sources.
+
#** Consider the entire system: the oligos, dye, the experimental method, and analysis methodology, and any other relevant factors.
+
#** Indicate whether each source likely caused a systematic or random distortion in the data.  
+
#** Present error sources, error type and their resultant uncertainty on your data and results in a table, if you like.
+
#* Discuss additional unimplemented changes that might improve your instrument or analysis.
+
}}
+
  
==Analysis==
+
==Collect Data==
We set out to measure the fraction of dsDNA versus temperature. As discussed in lecture, factors like photobleaching, thermal quenching, and the difference between the block and sample temperature distorted the measurement in various ways. For the final report, you will carefully analyze your data using a mechanistic model to factor out the melting dynamics from the shortcomings of the measurement technique.
+
  
In an ideal world, you would have your analysis code written before the data is collected. In the real world, data collection involves a lot of waiting around, and it might be a good use of your time to work on the analysis code and collect data in parallel. In that case, it is still important to ensure that your data makes sense as it comes in. For example, is the trend in the melting temperature what you expected? &hellip; e.g. did longer oligos have a higher melting temperature than shorter ones? The following section provides a quick way to estimate melting temperatures from the raw data.  
+
You may want to check your instrument to make sure it is still working and reliable. Use fluorescein and beater DNA to test until you are satisfied with the results. The DNA samples you will receive for the next part of the lab are prepared with LC green fluorescent dye, as opposed to SYBR green dye which you've been using so far. LC green is somewhat dimmer than SYBR green, so you may need to tweak your system accordingly.  
  
===Estimating melting temperatures from raw data===
+
When you're ready (and if you haven't already done so), choose which of the following axes you'd like to explore:
[[Image:Delta Vf Delta theta versus theta.png|thumb|right| &part;''V<sub>f</sub>'' / &part; ''T'' versus ''T'']]
+
*DNA length
A frequently-used, quick-and-dirty proxy for melting temperature is the peak value of the fluorescence voltage's derivative as a function of temperature. The peak of the derivative is the temperature at which the equilibrium is changing most rapidly. It's not exactly the same as the melting temperature, but it allows for quick comparison of results.
+
*Number of mismatches
 +
*Salt concentration
 +
 +
{{Template:Biohazard warning|message=LC Green in DMSO is readily absorbed through skin. Synthetic oligonucleotides may be harmful by inhalation, ingestion, or skin absorption. Wear gloves when handling samples. Wear safety goggles at all times when pipetting the LC Green/DNA samples. Do not create aerosols. The health effects of LC Green have not been thoroughly investigated. See the LC Green and synthetic oligonucleotide MSDS entries under <code>../EHS Guidelines/MSDS Repository</code> in the course locker for more information.}}
  
You can't just take the derivative of the raw signal. The raw data is not guaranteed to be a function at all. One way to take a good derivative is to combine the fluorescence voltage values that fall into a particular range of temperatures and then take the average of all those values &mdash; a process sometimes called ''binning'' or ''discretization''. Temperature bins of about a quarter of a degree work well. The code below demonstrates how to accomplish this in MATLAB. Because of the difference between the block and sample temperatures, it is best to do this for just the heating or melting portion of the curve.  
+
You will receive 1.5 mL each of four samples. Three of the samples will be identified by their sequence, salt ion concentration, and degree of complementarity (see [[DNA_Melting:_DNA_Sequences]] for sequence details and the sample naming key). The fourth sample matches one of the three identified samples. You will not be told which one.  
  
Assuming the variables <code>temperature</code> and <code>fluorescence</code> are defined, the following MATLAB code fragment computes &Delta;F/&Delta;T for the heating portion of the curve. If your melting curves are very noisy, you may have to adjust the code.
+
{{Template: Assignment Turn In|message = Document your data collection procedure. Report instrument settings for each trial, including control software parameters. Refer to the lab manual wiki pages when appropriate, and describe any changes you made.}}
  
<pre>
+
* Acquire two or (preferably) three melting curves for each known and unknown sample.
discretizationInterval = 0.25;
+
** Running the samples more than once will provide more confidence in your result.
discreteTemperatureAxis = 20: discretizationInterval:100; % center value for each bin
+
** It may be wise to analyze the data as you go. Ask yourself: is the melting temperature approximately what I would expect? Do the trends in melting temperature for the known samples agree with my intuition?
discreteTemperaturBinEdges = [ discreteTemperatureAxis - discretizationInterval / 2, ...
+
      discreteTemperatureAxis(end) + discretizationInterval / 2 ];
+
temperatureBinIndex = discretize( temperature, discreteTemperaturBinEdges );
+
binnedFluorescence = accumarray( temperatureBinIndex', fluorescence', ...
+
      size( discreteTemperatureAxis' ), @mean )';
+
badOnes = binnedFluorescence == 0;
+
discreteTemperatureAxis( badOnes ) = [];
+
binnedFluorescence( badOnes ) = [];
+
  
figure
+
{{Template:Environmental Warning|message=Discard pipette tips with DNA sample residue in the pipette tips or the ''Biohazard Sharps'' container. Do not pour synthetic oligonucleotides with LC Green down the drain. Pour your used samples into the waste container provided in the middle of the wet bench, or aspirate them into the biowaste flask to the left of the sink.}}
subplot(211)
+
plot( discreteTemperatureAxis, binnedFluorescence )
+
title( 'Average Fluorescence Voltage versus Temperature')
+
xlabel( 'T (^{\circ}C)' )
+
ylabel( 'F (AU)' )
+
  
dFluorescenceDTemperature = diff( binnedFluorescence ) ./ diff( discreteTemperatureAxis );
+
{{Template:Assignment Turn In|message =  
derivativeAxis = mean( [ discreteTemperatureAxis(1:(end-1)); discreteTemperatureAxis(2:end) ] );
+
Plot all of your raw data (as fluorescence vs. block temperature) on the smallest number of axes that clearly conveys the dataset. Include only data generated by your own group.  
 +
* Data from the many sample runs overlaps, which makes presenting so much data on a small number of axes a real challenge.
 +
* Devise a combination of line colors, line thicknesses, and marker symbols that produces a clear plot. If two sample types have a great deal of overlap, there may be no choice but to plot them on separate axes.
 +
* One approach that works well for some datasets is to plot a subsampled version of each trial using discrete markers. Vary the color and form to differentiate between sample types and individual trials.
 +
}}
  
subplot(212)
+
==Analyze data==
plot( derivativeAxis, dFluorescenceDTemperature);
+
title( 'Derivative of Fluorescence versus Temperature' )
+
xlabel( 'T (^{\circ}C)' )
+
ylabel( 'dF/dT' )
+
</pre>
+
  
===Using nonlinear regression to estimate parameters===
+
Use the code you developed in Assignment 9 to fit your data to the mechanistic model discussed in lecture (and in [[Assignment 9, Part 1: model function]]).
  
The [[DNA Melting: Model function and parameter estimation by nonlinear regression]] wiki page will guide you in writing the model function used to estimate the relevant DNA melting parameters.
+
{{Template:Assignment Turn In|message=<br>
 +
 
 +
* Document the regression model you used to analyze your data
 +
** See [[Assignment 9, Part 1: model function]]
 +
** Explain the model parameters using bullet points or in a table.
 +
* Plot <math>V_{f,measured}</math> and <math>V_{f,model}</math> versus <math>T_{block}</math> for a typical run of each sample type. Use the smallest number of axes that clearly conveys the data.
 +
* Provide a table of the best-fit model parameters and uncertainties for each experimental run. Also include the estimated melting temperature for each run.
 +
* For a typical curve, plot residuals versus time, temperature, and fluorescence ([http://measurebiology.org/wiki/File:Residual_plot_for_DNA_data.png example plot]).
 +
* For at least one experimental trial, plot <math>\text{DnaFraction}_{inverse-model}</math> versus <math>T_{sample}</math> ([http://measurebiology.org/wiki/File:Inverse_cuvrve.png example plot]). On the same set of axes plot DnaFraction versus <math>T_{sample}</math> using the best-fit values of &Delta;H and &Delta;S. Finally, plot simulated dsDNA fraction vs. temperature using data from DINAmelt or another melting curve simulator.
  
{{Template:Assignment Turn In|message=Turn in the opening of a lab report:<br>
 
# Haiku:
 
#* Compose an entertaining, exhilarating, thought-provoking, or melancholy Haiku on the subject of DNA melting.
 
# Abstract:
 
#* In one paragraph containing six or fewer sentences, summarize the investigation you undertook and key results.
 
# Introduction and Purpose:
 
#* Provide a succinct introduction to the project, including the purpose of the experiment, relevant background material and/or links to such information.
 
#* Summarize the ways in which this part of the lab differs from Part 1 covered in [[Assignment 7 Overview|Assignment 7]].
 
#* Keep the length to one or two short paragraphs, no more than 1/3 of a page.
 
 
}}
 
}}
{{Template:Assignment Turn In|message=also your analysis:
+
 
#* Use bullet points to explain your data analysis methodology.
+
==Results & Discussion==
#* Document the regression model you used to analyze your data
+
{{Template:Assignment Turn In|message= <br>
#** See [[DNA Melting: Model function and parameter estimation by nonlinear regression]]
+
# Results:
#** Explain the model parameters using bullet points or in a table.
+
#* Identify your unknown sample (or state that your investigation did not provide a conclusive answer).
#* Plot <math>V_{f,measured}</math> and <math>V_{f,model}</math> versus <math>T_{block}</math> for a typical run of each sample type. Use the smallest number of axes that clearly conveys the data.
+
#* Quantify the confidence you have in your result.
#* For a typical curve, plot residuals versus time, temperature, and fluorescence ([http://measurebiology.org/wiki/File:Residual_plot_for_DNA_data.png example plot]).
+
#* How do your estimates for <math>\Delta H, \Delta S</math> and ''T<sub>m</sub>'' compare to the predicted values from DINAmelt (or your favorite software)?
#* Provide a table of the best-fit model parameters and confidence intervals for each experimental run. Also include the estimated melting temperature for each run.
+
#* How do your estimates for <math>\Delta H, \Delta S</math> and ''T<sub>m</sub>'' compare to results from other groups and/or instructor data?
#* For at least one experimental trial, plot <math>\text{DnaFraction}_{inverse-model}</math> versus <math>T_{sample}</math> ([http://measurebiology.org/wiki/File:Inverse_cuvrve.png example plot]). On the same set of axes plot DnaFraction versus <math>T_{sample}</math> using the best-fit values of &Delta;H and &Delta;S. Finally, plot simulated dsDNA fraction vs. temperature using data from DINAmelt or another melting curve simulator.
+
# Discussion:
 +
#* LC green fluorescent dye saturates the binding sites of the double-stranded DNA. SYBR green, in turn, is non-saturating. Compare the melting temperatures that you measured for Sample A (20 bp, complete match, 100 mM salt) to a measurement you made previously using the 'beater DNA'. The beater DNA has the same sequence and salt concentration as Sample A, but is prepared with SYBR green. Which dye would you use if you wanted to make the most accurate measurement of ''T<sub>m</sub>''? Why?
 +
#* Discuss the validity of assumptions in the regression model.
 +
#* Discuss any atypical results or data you rejected.
 +
#* Discuss significant error sources.
 +
#** Consider the entire system: the oligos, dye, the experimental method, and analysis methodology, and any other relevant factors.
 +
#** Indicate whether each source likely caused a systematic or random distortion in the data.
 +
#** Present error sources, error type and their resultant uncertainty on your data and results in a table, if you like.
 +
#* Discuss additional unimplemented changes that might improve your instrument or analysis.
 
}}
 
}}
 +
 +
[http://measurebiology.org/wiki/20.309_Main_Page Back to 20.309 main page]
  
 
==Resources==
 
  
  
{{Template:Assignment 8 navigation}}
+
{{Template:Assignment 10 navigation}}

Revision as of 15:01, 20 November 2017

20.309: Biological Instrumentation and Measurement



Assignment 10

In the final assignment of the DNA melting lab, you will measure DNA melting curves for three known samples and one unknown. For each sample, you will extract the best-fit thermodynamic parameters and use them to identify your unknown sample. This assignment has three main parts:

  1. collecting data,
  2. analyzing data,
  3. identifying your unknown sample and discussing your results.

In preparation


Turn in:

Compose an entertaining, exhilarating, thought-provoking, or melancholy Haiku on the subject of DNA melting.


Collect Data

You may want to check your instrument to make sure it is still working and reliable. Use fluorescein and beater DNA to test until you are satisfied with the results. The DNA samples you will receive for the next part of the lab are prepared with LC green fluorescent dye, as opposed to SYBR green dye which you've been using so far. LC green is somewhat dimmer than SYBR green, so you may need to tweak your system accordingly.

When you're ready (and if you haven't already done so), choose which of the following axes you'd like to explore:

  • DNA length
  • Number of mismatches
  • Salt concentration


Biohazard warning: LC Green in DMSO is readily absorbed through skin. Synthetic oligonucleotides may be harmful by inhalation, ingestion, or skin absorption. Wear gloves when handling samples. Wear safety goggles at all times when pipetting the LC Green/DNA samples. Do not create aerosols. The health effects of LC Green have not been thoroughly investigated. See the LC Green and synthetic oligonucleotide MSDS entries under ../EHS Guidelines/MSDS Repository in the course locker for more information.


You will receive 1.5 mL each of four samples. Three of the samples will be identified by their sequence, salt ion concentration, and degree of complementarity (see DNA_Melting:_DNA_Sequences for sequence details and the sample naming key). The fourth sample matches one of the three identified samples. You will not be told which one.


Turn in:

Document your data collection procedure. Report instrument settings for each trial, including control software parameters. Refer to the lab manual wiki pages when appropriate, and describe any changes you made.


  • Acquire two or (preferably) three melting curves for each known and unknown sample.
    • Running the samples more than once will provide more confidence in your result.
    • It may be wise to analyze the data as you go. Ask yourself: is the melting temperature approximately what I would expect? Do the trends in melting temperature for the known samples agree with my intuition?
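
One quick way to run that sanity check is to bin the heating ramp into quarter-degree steps and look for the steepest drop in fluorescence. The sketch below is only an illustration: it uses a noise-free synthetic curve, and the variable names, the 62 °C midpoint, and the bin range are all assumptions. Substitute your own temperature and fluorescence vectors. The peak of the derivative is a rough proxy for the melting temperature, not the best-fit value.

```matlab
% Synthetic heating-ramp data standing in for one melting run (values are made up).
temperature  = linspace( 25, 95, 2000 );                       % degrees C
fluorescence = 1 ./ ( 1 + exp( ( temperature - 62 ) / 3 ) );   % sigmoid centered at 62 C

% Average the samples that fall into each 0.25 degree temperature bin.
binWidth   = 0.25;
binEdges   = 25 : binWidth : 95;
binCenters = binEdges( 1:end-1 ) + binWidth / 2;
binIndex   = min( floor( ( temperature - binEdges(1) ) / binWidth ) + 1, numel( binCenters ) );
binnedFluorescence = accumarray( binIndex(:), fluorescence(:), ...
    [ numel( binCenters ), 1 ], @mean, NaN )';

% Drop empty bins (marked NaN), then differentiate.
keep = ~isnan( binnedFluorescence );
T = binCenters( keep );
F = binnedFluorescence( keep );
dFdT = diff( F ) ./ diff( T );
derivativeAxis = ( T( 1:end-1 ) + T( 2:end ) ) / 2;

% Fluorescence falls as the duplex melts, so look for the most negative slope.
[ ~, peakIndex ] = min( dFdT );
roughMeltingTemperature = derivativeAxis( peakIndex );
fprintf( 'Rough melting temperature: %.1f C\n', roughMeltingTemperature );
```

With real data, restrict the vectors to the heating portion of the run before binning, because the block and sample temperatures differ.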


Environmental warning: Discard pipette tips with DNA sample residue in the pipette tips or the Biohazard Sharps container. Do not pour synthetic oligonucleotides with LC Green down the drain. Pour your used samples into the waste container provided in the middle of the wet bench, or aspirate them into the biowaste flask to the left of the sink.



Turn in:

Plot all of your raw data (as fluorescence vs. block temperature) on the smallest number of axes that clearly conveys the dataset. Include only data generated by your own group.

  • Data from the many sample runs overlaps, which makes presenting so much data on a small number of axes a real challenge.
  • Devise a combination of line colors, line thicknesses, and marker symbols that produces a clear plot. If two sample types have a great deal of overlap, there may be no choice but to plot them on separate axes.
  • One approach that works well for some datasets is to plot a subsampled version of each trial using discrete markers. Vary the color and form to differentiate between sample types and individual trials.
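
The subsampled-marker approach might look something like the following sketch. Everything here is hypothetical: the curves are synthetic sigmoids, and the sample names, colors, markers, and subsampling step are placeholders to adapt to your own dataset.

```matlab
% Hypothetical data: two sample types, three trials each (synthetic curves, not measurements).
blockTemperature    = linspace( 25, 95, 1400 );
meltingTemperatures = [ 58 58.5 57.5; 66 66.5 65.5 ];   % made-up values, one row per sample type
sampleNames   = { 'Sample A', 'Sample B' };
sampleColors  = { 'b', 'r' };         % color encodes sample type
trialMarkers  = { 'o', 's', '^' };    % marker shape encodes trial number
subsampleStep = 50;                   % plot every 50th point

figure
hold on
legendText = {};
for sampleIndex = 1:2
    for trialIndex = 1:3
        midpoint = meltingTemperatures( sampleIndex, trialIndex );
        fluorescence = 1 ./ ( 1 + exp( ( blockTemperature - midpoint ) / 3 ) );
        plotIndex = 1:subsampleStep:numel( blockTemperature );
        plot( blockTemperature( plotIndex ), fluorescence( plotIndex ), ...
            [ sampleColors{ sampleIndex } trialMarkers{ trialIndex } ], 'MarkerSize', 5 );
        legendText{ end + 1 } = sprintf( '%s, trial %d', sampleNames{ sampleIndex }, trialIndex );
    end
end
hold off
xlabel( 'T_{block} (^{\circ}C)' )
ylabel( 'V_f (V)' )
legend( legendText )
```

Markers without connecting lines keep overlapping trials distinguishable. If two sample types still sit on top of each other, split them across separate axes.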


Analyze data

Use the code you developed in Assignment 9 to fit your data to the mechanistic model discussed in lecture (and in Assignment 9, Part 1: model function).


Turn in:


  • Document the regression model you used to analyze your data
  • Plot $ V_{f,measured} $ and $ V_{f,model} $ versus $ T_{block} $ for a typical run of each sample type. Use the smallest number of axes that clearly conveys the data.
  • Provide a table of the best-fit model parameters and uncertainties for each experimental run. Also include the estimated melting temperature for each run.
  • For a typical curve, plot residuals versus time, temperature, and fluorescence (example plot).
  • For at least one experimental trial, plot $ \text{DnaFraction}_{inverse-model} $ versus $ T_{sample} $ (example plot). On the same set of axes plot DnaFraction versus $ T_{sample} $ using the best-fit values of ΔH and ΔS. Finally, plot simulated dsDNA fraction vs. temperature using data from DINAmelt or another melting curve simulator.
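
For the DnaFraction curve computed from best-fit ΔH and ΔS, a sketch along these lines may help. It assumes a two-state model for two distinct complementary strands, with ΔH and ΔS defined for duplex formation (both negative) and C_T the total strand concentration. That sign convention, the numerical values, and the variable names are assumptions, so check them against your own Assignment 9 model function before reusing anything.

```matlab
R = 8.314;                          % gas constant, J/(mol K)
deltaH = -600e3;                    % J/mol, made-up placeholder for a best-fit value
deltaS = -1700;                     % J/(mol K), made-up placeholder for a best-fit value
totalStrandConcentration = 1e-6;    % mol/L, assumed total strand concentration C_T

% Equilibrium constant for duplex formation, K = exp( -deltaG / (R T) ), T in kelvin.
equilibriumConstant = @( T ) exp( deltaS / R - deltaH ./ ( R * T ) );
scaledK = @( T ) totalStrandConcentration * equilibriumConstant( T );

% Fraction of strands in duplex form. This is algebraically the same as
% ( 1 + a - sqrt( 1 + 2a ) ) / a with a = C_T * K, rearranged so that it stays
% accurate when a is very small (high temperature).
dsDnaFraction = @( T ) scaledK( T ) ./ ( 1 + scaledK( T ) + sqrt( 1 + 2 * scaledK( T ) ) );

% The fraction equals 1/2 where C_T * K = 4, which gives the melting temperature.
meltingTemperatureKelvin = deltaH / ( deltaS + R * log( totalStrandConcentration / 4 ) );

sampleTemperatureCelsius = 25:0.5:95;
fraction = dsDnaFraction( sampleTemperatureCelsius + 273.15 );

figure
plot( sampleTemperatureCelsius, fraction )
xlabel( 'T_{sample} (^{\circ}C)' )
ylabel( 'dsDNA fraction' )
fprintf( 'Melting temperature: %.1f C\n', meltingTemperatureKelvin - 273.15 );
```

Overlaying the inverse-model DnaFraction and a simulator curve on the same axes then gives the three-way comparison the assignment asks for.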


Results & Discussion


Turn in:


  1. Results:
    • Identify your unknown sample (or state that your investigation did not provide a conclusive answer).
    • Quantify the confidence you have in your result.
    • How do your estimates for $ \Delta H, \Delta S $ and Tm compare to the predicted values from DINAmelt (or your favorite software)?
    • How do your estimates for $ \Delta H, \Delta S $ and Tm compare to results from other groups and/or instructor data?
  2. Discussion:
    • LC green fluorescent dye saturates the binding sites of the double-stranded DNA. SYBR green, in turn, is non-saturating. Compare the melting temperatures that you measured for Sample A (20 bp, complete match, 100 mM salt) to a measurement you made previously using the 'beater DNA'. The beater DNA has the same sequence and salt concentration as Sample A, but is prepared with SYBR green. Which dye would you use if you wanted to make the most accurate measurement of Tm? Why?
    • Discuss the validity of assumptions in the regression model.
    • Discuss any atypical results or data you rejected.
    • Discuss significant error sources.
      • Consider the entire system: the oligos, dye, the experimental method, and analysis methodology, and any other relevant factors.
      • Indicate whether each source likely caused a systematic or random distortion in the data.
      • Present error sources, error type and their resultant uncertainty on your data and results in a table, if you like.
    • Discuss additional unimplemented changes that might improve your instrument or analysis.
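
One possible way to quantify your confidence in the identification is to compare the unknown's melting temperatures against each known sample with Welch's unequal-variance t-test. The sketch below is not the required procedure, just one option. The melting temperatures are invented for illustration, and the p-value is computed with betainc so that no toolbox is needed.

```matlab
% Made-up melting temperature estimates (degrees C), three runs per sample.
knownTm = [ 55.1 55.6 54.9;     % sample A
            61.8 62.4 62.1;     % sample B
            48.7 49.3 48.9 ];   % sample C
unknownTm   = [ 61.9 62.6 62.0 ];
sampleNames = { 'A', 'B', 'C' };

pValues = zeros( 1, 3 );
for k = 1:3
    x = knownTm( k, : );
    y = unknownTm;
    varianceTermX = var( x ) / numel( x );
    varianceTermY = var( y ) / numel( y );
    tStatistic = ( mean( x ) - mean( y ) ) / sqrt( varianceTermX + varianceTermY );
    % Welch-Satterthwaite approximation for the degrees of freedom.
    degreesOfFreedom = ( varianceTermX + varianceTermY ) ^ 2 / ...
        ( varianceTermX ^ 2 / ( numel( x ) - 1 ) + varianceTermY ^ 2 / ( numel( y ) - 1 ) );
    % Two-sided p-value of Student's t via the regularized incomplete beta function.
    pValues( k ) = betainc( degreesOfFreedom / ( degreesOfFreedom + tStatistic ^ 2 ), ...
        degreesOfFreedom / 2, 0.5 );
    fprintf( 'Unknown vs. sample %s: t = %.2f, p = %.3g\n', sampleNames{ k }, tStatistic, pValues( k ) );
end

% The known sample that the data fail to distinguish from the unknown is the best candidate.
[ ~, bestMatch ] = max( pValues );
fprintf( 'Most consistent with sample %s\n', sampleNames{ bestMatch } );
```

A large p-value only means the data cannot tell two samples apart; it does not prove they are identical. With three runs per sample the test has little power, and three comparisons are being made at once, so state both limitations when you report your confidence.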


Back to 20.309 main page

Resources

Subset of datasheets

(Many more can be found online or on the course share)

  1. National Instruments USB-6212 user manual
  2. National Instruments USB-6341 user manual
  3. LF411 Op-amp datasheet
  4. LM741 Op-amp datasheet

