Difference between revisions of "DNA Melting: Simulating DNA Melting - Basics"

From Course Wiki
Jump to: navigation, search
(Plot an ideal melting curve)
 
(31 intermediate revisions by 7 users not shown)
Line 1: Line 1:
[[Category:20.309]]
+
[[Category:Matlab]]
 
[[Category:DNA Melting Lab]]
 
[[Category:DNA Melting Lab]]
[[Category:Matlab]]
+
<div style="padding: 10px; width: 774px; border: 3px solid #000000;">
{{Template:20.309}}
+
 
+
 
==Introduction==
 
==Introduction==
 +
This tutorial will show you how to simulate the DNA melting experiment with Matlab. The DNA melting apparatus you will build produces two output voltages related to the sample temperature and fluorescence intensity. During an experimental run, these two voltages will be recorded periodically. At the end of this tutorial, you should be able to produce mock datasets that are similar to what you will record in the lab. You can use simulated datasets to help develop and debug your data analysis scripts.
  
<blockquote>
+
It will be essential to have working data analysis scripts before you begin running experiments in the lab. When you have completed this tutorial, refer to the lab manual for data analysis instructions.
<div>
+
''Chance favors the prepared mind.''
+
 
+
''&mdash;Louis Pasteur (abridged)''
+
</div>
+
</blockquote>
+
 
+
This tutorial will show you how to simulate the DNA melting experiment with Matlab. The DNA melting apparatus you will build produces two output voltages related to the sample temperature and fluorescence intensity. During an experimental run, these two voltages will be recorded periodically. At the end of this tutorial, you will be able to produce mock datasets that are similar to what you will record in the lab. You can use simulated datasets to help develop and debug your data analysis scripts.
+
 
+
It is essential to have working data analysis scripts before you begin running experiments in the lab.
+
  
 
The procedure for creating the simulation is:
 
The procedure for creating the simulation is:
# Write a function to compute the equilibrium dsDNA concentration vs. temperature
+
# Write a function to compute the equilibrium dsDNA contration vs. temperature
 
# Model sample temperature versus time during an experimental run
 
# Model sample temperature versus time during an experimental run
 
# Compute the RTD voltage from temperature vs. time data
 
# Compute the RTD voltage from temperature vs. time data
# Compute relative fluorescence intensity vs. time data from temperature vs. time data
+
# Add noise to the RTD signal
 +
# Compute relative fluorescence intensity from temperature vs. time data
 +
# Model photobleaching (optional)
 +
# Scale the fluorescence intensity signal and add noise
 +
# Reformat the data to match what is produced by the DNA melting software
  
Run Matlab and follow along as you read. Code from this page can be copied and pasted into the Matlab command window.
+
You will not have to write any new code to complete this tutorial; however, you should open a Matlab window and follow along as you read. Code from this page can be copied and pasted into the Matlab command window. M-files are available in the course locker.
  
 
If you prefer to work in Python, please see [[Python:Simulating DNA Melting|this page]].
 
If you prefer to work in Python, please see [[Python:Simulating DNA Melting|this page]].
  
 
===Before you begin===
 
===Before you begin===
 +
*Read the [[Lab Manual: Measuring DNA Melting Curves|Measuring DNA Melting Curves Lab Manual]].
 
*Review the [[DNA Melting Thermodynamics|DNA Melting Thermodynamics Notes]].
 
*Review the [[DNA Melting Thermodynamics|DNA Melting Thermodynamics Notes]].
 
*If you need a review of Matlab fundamentals, see [[Matlab:Matlab Fundamentals|Matlab Fundamentals]] tutorial.
 
*If you need a review of Matlab fundamentals, see [[Matlab:Matlab Fundamentals|Matlab Fundamentals]] tutorial.
  
 
===Matlab skills===
 
===Matlab skills===
* Array arithmetic (element-by-element) operators (<tt>+</tt>, <tt>-</tt>, <tt>.*</tt>, <tt>./</tt>)
+
* Element-by-element operations on arrays (<tt>+</tt>, <tt>-</tt>, <tt>.*</tt>, <tt>./</tt>)
 
* Functions declared in an m-file
 
* Functions declared in an m-file
 
* Conditionals
 
* Conditionals
Line 40: Line 34:
 
* Plotting data
 
* Plotting data
  
==Modeling dsDNA equilibrium fraction==
+
==Modeling dsDNA equilibrium concentration==
We derived an expression for dsDNA fraction at a given temperature [[DNA Melting Thermodynamics#Simulating DNA Melting|in class]]:  
+
We derived an expression for dsDNA concentration at a given temperature [[DNA Melting Thermodynamics#Simulating DNA Melting|in class]]:  
 
:<math>
 
:<math>
 
K_{eq} = e^\left [\frac{\Delta S^{\circ}}{R} - \frac{\Delta H^{\circ}}{R T} \right ]
 
K_{eq} = e^\left [\frac{\Delta S^{\circ}}{R} - \frac{\Delta H^{\circ}}{R T} \right ]
Line 49: Line 43:
 
</math>
 
</math>
  
where ''C_T'' is the total concentration of DNA strands.
+
Since this is a very complicated equation and we will use it many times, it makes sense to write a Matlab function.
 
+
Since this is a complicated equation that we will evaluate many times, it makes sense to write a Matlab function.
+
  
 
===dsDNA fraction function===
 
===dsDNA fraction function===
The fraction of dsDNA is a function of 4 variables: ''&Delta;S&deg;'', ''&Delta;H&deg;'', temperature, and concentration. The <tt>DnaFraction</tt> function below also takes an optional fifth argument for the gas constant to enable using different unit systems. Using the element-by-element divide operator in the body of the function makes it possible for <tt>DnaFraction</tt> to accept either a single temperature argument or a vector of temperature values. (As a general rule, use the <tt>./</tt> operator instead of <tt>/</tt> unless you intend to do a matrix divide.)
+
The fraction of dsDNA is a function of 4 variables: ''&Delta;S&deg;'', ''&Delta;H&deg;'', temperature, and concentration. The <tt>DnaFraction</tt> function below takes four parameters to represent these quantities and an optional fifth parameter that causes the function to use a different value of the gas constant. Changing the fifth parameter allows the same function to operate with different unit systems.  
 +
 
 +
A built-in MATLAB function in called <tt>nargin</tt> allows functions to accept a variable number of arguments. <tt>nargin</tt> returns the number of arguments that the function was called with. The <tt>DnaFraction</tt> function sets <tt>R</tt> to the default value of 1.987 if <tt>nargin</tt> returns a value of 4 or less. Otherwise, the function sets <tt>R</tt> to the value supplied by the caller.
  
The Matlab function <tt>nargin</tt> returns the number of arguments that the function was called with. Many Matlab programs use this function to handle variable numbers of arguments. If <tt>nargin</tt> returns 4 or less, <tt>R</tt> is set to the default value of 1.987. Otherwise, the value of <tt>R</tt> is set to the value supplied by the caller.
+
The element-by-element divide operator in the body of the function makes it possible for <tt>DnaFraction</tt> to accept either a single temperature argument or a vector of temperature values. (As a general rule, use the <tt>./</tt> operator instead of <tt>/</tt> unless you intend to do a matrix divide.)
  
{{Matlab Notes}} Matlab functions must be saved in the current directory or another directory in the Matlab path. The name of the file must be the function name appended with ".m". Function help should be placed in comments at the top of the file. Type <tt>help DnaFraction</tt> to see how this works.
+
To use this function, save it in a file called "DnaFraction.m" (MATLAB requires that all functions be saved in a file that has the same name as the function appended with the extension ".m".) The file must reside in the current directory or another directory that is on your Matlab path. Function help should be placed in comments at the top of the file. Type <tt>help DnaFraction</tt> to see how this works.
  
 
<pre>
 
<pre>
Line 98: Line 92:
  
 
<pre>
 
<pre>
close all;
 
clc;
 
 
 
ideal = struct();                          % create an empty structure to hold the results
 
ideal = struct();                          % create an empty structure to hold the results
 
ideal.temperature = (0:99) + 273.15;      % temperature axis in Kelvin: 100 points(0-99C)
 
ideal.temperature = (0:99) + 273.15;      % temperature axis in Kelvin: 100 points(0-99C)
ideal.deltaS = -184;                      % cal/mol
+
ideal.deltaS = -500;                      % cal/(mol K)
ideal.deltaH = -71E3;                     % cal/(mol K)
+
ideal.deltaH = -180E3;                     % cal/mol  
ideal.DnaConcentration = 30E-6;           % M
+
ideal.concentration = 1E-6;               % mol/L
  
 
% call DnaFraction function
 
% call DnaFraction function
Line 111: Line 102:
 
% (optional R argument not supplied)
 
% (optional R argument not supplied)
  
ideal.dsDnaFraction = DnaFraction(ideal.DnaConcentration, ...
+
ideal.dsDnaFraction = DnaFraction(ideal.concentration, ideal.temperature, ...
    ideal.temperature, ideal.deltaS, ideal.deltaH);
+
                                  ideal.deltaS, ideal.deltaH);
  
figure(1)
 
 
plot(ideal.temperature, ideal.dsDnaFraction);
 
plot(ideal.temperature, ideal.dsDnaFraction);
 
title('Ideal DNA Melting Curve');
 
title('Ideal DNA Melting Curve');
 
xlabel('Temperature (K)');
 
xlabel('Temperature (K)');
 
ylabel('dsDNA Fraction');
 
ylabel('dsDNA Fraction');
 
 
</pre>
 
</pre>
 
[[Image:Ideal DNA Melting Curve.jpg|350 px|Ideal DNA melting curve]]
 
[[Image:Ideal DNA Melting Curve.jpg|350 px|Ideal DNA melting curve]]
  
Experiment with different values of ''&Delta;S&deg;'' and ''&Delta;H&deg;'' to see how the curve changes.
+
Experiment with different values of ''&Delta;H&deg;'' and ''&Delta;S&deg;'' to see how the curve changes. You can use [http://www.basic.northwestern.edu/biotools/oligocalc.html OligoCalc] or another tool to estimate values for different DNA sequences.
 
+
Also several web tools are available to predict the melting temperature. See, for example, [http://mfold.rna.albany.edu/?q=DINAMelt/Hybrid2 DINA Melt] or [http://www.basic.northwestern.edu/biotools/oligocalc.html Oligocalc]. You can use one of these or another tool to estimate values for different DNA sequences.
+
  
One method of estimating ''T<sub>m</sub>'' is to find the temperature where the dsDNA fraction equals 1/2. Another method is to differentiate the melting curve. ''T<sub>m</sub>'' occurs at the peak of the derivative &mdash; where the dsDNA fraction is changing the fastest. (Why might you prefer one method over the other when estimating ''T<sub>m</sub>'' from experimental data?)
+
One method of estimating ''T<sub>m</sub>'' is to find the temperature where the dsDNA fraction equals 1/2. Another method is to differentiate the melting curve. ''T<sub>m</sub>'' occurs at the peak of the derivative &mdash; where the dsDNA fraction is changing the fastest. (Why might you prefer one method to the other when estimating ''T<sub>m</sub>'' from experimental data?)
  
 
===Numerical differentiation===
 
===Numerical differentiation===
Line 133: Line 120:
 
:<math>
 
:<math>
 
{f'}_N = \frac{f_{N+1}-f_{N}}{\Delta T}</math>.  
 
{f'}_N = \frac{f_{N+1}-f_{N}}{\Delta T}</math>.  
''df/dT'' is guaranteed to take on the value given by ''f'<sub>N</sub>'' at some point between ''N&Delta;T'' and ''(N+1)&Delta;T''; however, this happy coincidence can occur at any place in the interval. A simple but effective technique is to guess that equality occurs in the middle of the interval. This guess is close enough in most cases.
+
''dF/dT'' is guaranteed to take on the value given by ''f'<sub>N</sub>'' at some point between ''N&Delta;T'' and ''(N+1)&Delta;T''; however, this happy coincidence can occur at any place in the interval. A simple but effective technique is to guess that equality occurs in the middle of the interval. This guess is close enough in most cases.
  
 
The code below computes the finite difference and stores it in the <tt>dDsDnaFraction</tt> field of <tt>ideal</tt>. It also generates an associated temperature vector (<tt>ideal.dTemperature</tt>) with values centered between the original temperature samples. Note that both vectors are one element shorter than the original data.
 
The code below computes the finite difference and stores it in the <tt>dDsDnaFraction</tt> field of <tt>ideal</tt>. It also generates an associated temperature vector (<tt>ideal.dTemperature</tt>) with values centered between the original temperature samples. Note that both vectors are one element shorter than the original data.
Line 139: Line 126:
 
<pre>
 
<pre>
 
ideal.dTemperature = ideal.temperature(1:99) + 0.5; % create new temperature vector
 
ideal.dTemperature = ideal.temperature(1:99) + 0.5; % create new temperature vector
ideal.dDsDnaFraction = diff(ideal.dsDnaFraction);  % take adjacent element differences
+
ideal.dDsDnaFraction = diff(ideal.dsDnaFraction);  % take the difference
  
figure(2)
 
 
plot(ideal.dTemperature, -1 * ideal.dDsDnaFraction);
 
plot(ideal.dTemperature, -1 * ideal.dDsDnaFraction);
 
title('Derivative of Ideal DNA Melting Curve');
 
title('Derivative of Ideal DNA Melting Curve');
Line 155: Line 141:
  
 
<pre>
 
<pre>
sim1 = struct();                                          % create empty struct for sim results
+
simulationLength = 900;                              % 15 minute simulation
sim1.DnaConcentration = 30E-6;                            % 30 micromolar concentration
+
sampleRate = 1;                                       % 1 Hz sample rate
sim1.deltaS = ideal.deltaS;                              % cal / (mole-K)
+
sim1.deltaH = ideal.deltaH;                              % cal / mole
+
sim1.finalTemperature = 293;                              % approx. room temperature
+
sim1.sampleRate = 1;                                     % 1 Hz sample rate
+
sim1.durationHT = 900;                                    % heating time, s
+
sim1.durationCL = 600;                                    % cooling time, s
+
sim1.timeHT = (0:(1/sim1.sampleRate):sim1.durationHT)';  % heating time vector, s
+
sim1.timeCL = (2:(1/sim1.sampleRate):sim1.durationCL)';  % cooling time vector, s
+
sim1.time = [sim1.timeHT; sim1.timeHT(end)+sim1.timeCL];  % complete time vector, s
+
  
 +
sim1 = struct();                                      % create empty struct for sim results
 +
sim1.concentration = 1E-6;                            % 1 micromolar concentration
 +
sim1.deltaS = -184;                                  % cal / (mole-K)
 +
sim1.deltaH = -71E3;                                  % cal / mole
 +
sim1.initialTemperature = 363;                        % 363 K = 90 C
 +
sim1.finalTemperature = 293;                          % approx. room temperature
 +
sim1.sampleRate = 1;                                  % 1 Hz sample rate
 +
sim1.duration = 900;                                  % seconds -- 15 minute experiment
  
 +
sim1.time = 0:(1./sampleRate):simulationLength;      % create time vector (units: seconds)
 
</pre>
 
</pre>
  
 
==Simulate sample temperature during an experimental run==
 
==Simulate sample temperature during an experimental run==
The goal of the simulation is to produce two simulated voltage signals that resemble what you will measure in the lab. Since both of the voltages are functions of temperature, it makes sense to begin by modeling how the sample temperature behaves during an experimental run.
+
The goal of the simulation is to produce two simulated voltage signals that resemble what you will measure in the lab. Since both of the voltages are functions of temperature, it makes sense to begin by modeling how the sample cools during an experimental run.
  
===Sample heating and cooling===
+
===Sample cooling===
During each experimental run, the range of temperatures will be provided by forced heating by a TEC, followed by natural cooling of the sample after the heat source is switched off. At the beginning of an experimental run the sample will be at room temperature, then the TEC is switched on and the DNA + fluorescent dye sample is heated to 95&deg;C in about 5 minutes. Upon reaching 95&deg;C, the TEC power is switched off. Then over a period of about 15 minutes, the sample cools to room temperature (less than 30&deg;C). The rate of cooling is proportional to the difference between the sample's temperature and the room temperature, so the sample cools more quickly at the start of the experiment than near the end. (It can be frustrating to wait for the final few degrees.)
+
During each experimental run, the range of temperatures will be provided by natural cooling of the sample after it has been heated in a bath. At the beginning of an experimental run, the DNA + fluorescent dye sample is heated in a bath to around 90&deg;C. Over a period of about 15 minutes, the sample cools to room temperature (about 20&deg;C). The rate of cooling is proportional to the difference between the sample's temperature and the room temperature, so the sample cools more quickly at the start of the experiment than near the end. (It can be frustrating to wait for the final few degrees.)
  
Both heating and cooling can be approximated by an exponential function. Heating is modeled with a time constant of about 1 minute, while cooling is modeled with a time constant of about 3 minutes. The equation for temperature (in K) versus time is:
+
Cooling can be approximated by an exponential function with a time constant of about 3 minutes. The equation for temperature (in &deg;K) versus time is:
  
 
:<math>
 
:<math>
T_{heating}(t) = (T_{max} - T_i) (1 - e ^ {-\frac{t_{HT}}{\tau_{HT}}}) + T_i
+
T(t) = (T_i - T_f) e ^ {-\frac{t}{\tau}} + T_f
</math>
+
:<math>
+
T_{cooling}(t) = (T_{max} - T_f) e ^ {-\frac{t_{CL}}{\tau_{CL}}} + T_f
+
 
</math>
 
</math>
 
where,
 
where,
 
:''T(t)'' is the temperature as a function of time,
 
:''T(t)'' is the temperature as a function of time,
:''T<sub>i</sub>'' is the initial temperature, likely room temperature,
+
:''T<sub>i</sub>'' is the initial temperature (after heating),
:''T<sub>max</sub>'' is the max temperature reached during the heating phase,
+
:''T<sub>f</sub>'' is the final (ambient) temperature,
:''T<sub>f</sub>'' is the final temperature, likely not quite room temperature,
+
:''&tau;'' is the time constant.
:''&tau;'' is the time constant for heating (HT) and cooling (CL).
+
  
 
In Matlab:
 
In Matlab:
  
 
<pre>
 
<pre>
sim1.heatingConstant = 180;                              % heating rate constant, 1/s
+
sim1.coolingConstant = 180;
sim1.coolingConstant = 180;                               % cooling rate constant, 1/s
+
sim1.maxTemperature = 95 + 273;                          % target maximum sample temp, K
+
sim1.initialTemperature = 20 + 273;                      % starting room temp, K
+
sim1.finalTemperature = 35 + 273;                        % final cool-down temp, K
+
  
% use exp function to model exponential heating and cooling
+
% use exp function to model exponential cooling
 
sim1.temperature = ...
 
sim1.temperature = ...
     [ (sim1.maxTemperature - sim1.initialTemperature)   ... % (T_max - T_i) *
+
     (sim1.initialTemperature - sim1.finalTemperature) .*   ...   % T_i - T_f *
    .* (1 - exp(-sim1.timeHT ./ sim1.heatingConstant))  ... % (1 + e^(-t/tauHT)) +
+
     exp(-sim1.time ./ sim1.coolingConstant) ...             % e^(-t/tau) +
    + sim1.initialTemperature; ...                          % T_i
+
     sim1.finalTemperature;                                       % T_f
    (sim1.maxTemperature - sim1.finalTemperature)  ...    % (T_max - T_f) *
+
     .* exp(-sim1.timeCL ./ sim1.coolingConstant) ...       % e^(-t/tauCL) +
+
     + sim1.finalTemperature ];                             % T_f
+
  
figure(3)
+
plot(sim1.time, sim1.temperature, 'r');
plot(sim1.time, sim1.temperature-273, 'r');
+
 
xlabel('Time (sec)');
 
xlabel('Time (sec)');
ylabel('Temperature (C)');
+
ylabel('Temperature (K)');
 
title('Simulated Temperature vs. Time');
 
title('Simulated Temperature vs. Time');
 
 
</pre>
 
</pre>
[[Image:DNAHeatCoolCurve.png|350 px|Simulated sample heating and cooling curve.]]
+
[[Image:DNA Sample Cooling.jpg|350 px|Simulated sample cooling curve]]
  
If you have any questions about the element-by-element division operator in Matlab (./), or the use of a semicolon for concatenation, see:[[Matlab:Division operators|this page]] or ask an Instructor or TA.
+
If you have any questions about the element-by-element division operator in Matlab (./) see:[[Matlab:Division operators|this page]].
  
 
==RTD temperature sensor model==
 
==RTD temperature sensor model==
  
 
===RTD and voltage divider circuit===
 
===RTD and voltage divider circuit===
The resistance of the RTD is given by the equation <math> R_{RTD} = R_o*(1 + \alpha \Delta T) </math>, where "&alpha;" is the temperature coefficient of resistance (TCR) for the RTD and <math>R_o</math> is the resistance at the reference temperature for the RTD. In the experimental apparatus, the RTD is hooked up as one element of a voltage divider with a pull-up resistor, ''R''. The power supply input is 15 volts. Thus, the output voltage is: <math>V_{RTD} = 15 * R_{RTD}/(R + R_{RTD})</math>. The following code calculates RTD voltage measurements from <tt>sim1.temperature</tt> and stores the result in <tt>sim1.V_RTD</tt>, assuming <math>R</math> = 15k&Omega;, <math>R_o</math> = 1000&Omega;, <math>\alpha</math> = 3850 ppm/&deg;C, and <math>\Delta T</math> is the difference, in Kelvin or Celsius, between the RTD reference temperature and the RTD temperature.
+
The resistance of the RTD is given by the equation ''R<sub>RTD</sub> = 1000 + 3.85(T - 273)''. In the experimental apparatus, the RTD is hooked up as one element of a voltage divider with another resistor, ''R''. The power supply is 15 volts. Thus, the output voltage is: ''V<sub>RTD</sub> = 15 * R<sub>RTD</sub>/(R + R<sub>RTD</sub>)''. The following code calculates RTD voltage measurements from <tt>sim1.temperature</tt> and stores the result in <tt>sim1.V_RTD</tt>, assuming ''R'' = 20K&Omega;.
  
 
<pre>
 
<pre>
sim1.R_RTD_atZeroC = 1000;                                % Datasheet value of reference RTD resistance, &Omega;
+
sim1.R_RTD = 1000 + 3.85 * (sim1.temperature - 273);     % Compute RTD resistance in Ohms
sim1.RTDalpha = 3850;                                    % Datasheet value of RTD TCR, ppm/C
+
sim1.V_RTD = 15 * sim1.R_RTD ./ (20E3 + sim1.R_RTD);     % Compute V_RTD in Volts
sim1.R_RTD = sim1.R_RTD_atZeroC ...
+
    * (1 + sim1.RTDalpha*1e-6 * (sim1.temperature-273)); % Compute RTD resistance, Ohms
+
sim1.R_pullup = 15e3;                                    % Pull-up resistor in voltage divider, Ohms
+
sim1.V_input = 15;                                        % Input voltage of divider, V
+
sim1.V_RTD = sim1.V_input * sim1.R_RTD ...
+
    ./ (sim1.R_pullup + sim1.R_RTD);                     % RTD votlage drop, V
+
 
+
figure(4)
+
 
plot(sim1.time, sim1.V_RTD, 'r');
 
plot(sim1.time, sim1.V_RTD, 'r');
 
xlabel('Time (sec)');
 
xlabel('Time (sec)');
 
ylabel('RTD Voltage (V)');
 
ylabel('RTD Voltage (V)');
 
title('Simulated RTD Voltage vs. Time');
 
title('Simulated RTD Voltage vs. Time');
 
 
</pre>
 
</pre>
  
[[Image:RTDHeatCoolCurve.png|350px|Simulated RTD voltage plot]]
+
[[Image:RTD Voltage vs Time.jpg|350px|Simulated RTD voltage plot]]
 
+
===Simulate DNA fraction for this temperature profile===
+
Use the DnaFraction function given above to calculate the simulated fluorescence using the combined heating and cooling temperature profile that you just calculated.
+
 
+
<pre>
+
sim1.dsDnaFraction = DnaFraction(sim1.DnaConcentration, ...
+
    sim1.temperature, sim1.deltaS, sim1.deltaH);
+
 
+
figure(5)
+
plot(sim1.time, sim1.dsDnaFraction);
+
title('Simulated DNA Melting Curve');
+
xlabel('Time (s)');
+
ylabel('dsDNA Fraction');
+
 
+
figure(6)
+
plot(sim1.temperature-273, sim1.dsDnaFraction);
+
title('Simulated DNA Melting Curve');
+
xlabel('Temperature (C)');
+
ylabel('dsDNA Fraction');
+
</pre>
+
 
+
[[Image:SimDNAFractionVsTime.png|350px|Simulated DNA Fraction vs Time]]
+
 
+
[[Image:SimDNAFractionCurve.png|350px|Simulated DNA Melting Curve]]
+
 
+
==Analyze simulated data==
+
 
+
===Fitting the model===
+
To perform the fit, use the matlab function from above, DnaFraction. You can use any of Matlab's least squares curve fitters to estimate best-fit values for &Delta;H° and &Delta;S°. We prefer (<code>nlinfit</code>) because it allows you to calculate confidence intervals.
+
 
+
<code>nlinfit</code> requires the fitting function to be implemented a particular way. You may find the following code excerpt, which declares a suitable function called <code>myFunction</code>, useful.
+
 
+
<pre>
+
Code to be written by student.
+
</pre>
+
 
+
[[Image:FittedDnaFractionCurveOverlay.png|350px|Simulated DNA Melting Curve]]
+
 
+
At first glance there appears to be only one curve. This is what we expect for simulated data with no noise or other non-idealities. Close inspection of your plotted curves will show that there are distinct simulated data and fitted curves. The more interesting case is to add some simulated noise, bleaching, fluorescence variation with temperature, temperature lag, reaction rates, etc. Some of these will be covered in Intermediate Topics, and you can add any that are not covered there. Others will need to be included by you when you analyze your data.
+
 
+
===Plot legends in Matlab===
+
You can create fancier plots with a variety of lines colors and styles and legends. Annoyingly, Matlab handles legends differently than other plot commands. The <code>hold</code> command does not apply to the <code>legend</code> command. (Direct complaints to http://www.mathworks.com/support.) However, the following code excerpt is a work-around and adds a plot legend. This code runs after a main loop which would cycle the above fit routine through your multiple data sets (more on that later). In such an implementation, the cellular array <code>out</code> contains the results from processing. The code below also places an "X" at the melting temperature.
+
 
+
<pre>
+
% Compute text cell array for plot legends and plot an
+
% "X" at the estimated melting temperature
+
 
+
for ii=1:length(out)
+
    legendText{2 * ii - 1} = sampleInfo{ii}.SampleName;
+
    legendText{2 * ii} = [sampleInfo{ii}.SampleName ' Best Fit'];
+
    figure(1)
+
    plot(out{ii}.maxDerivativeTemperature, out{ii}.fluorescence(out{ii}.maxDerivativeIndex),...
+
        'linewidth',2,'marker','x','markersize',18,'color',out{ii}.plotColor);
+
    plot(out{ii}.maxModelDerivativeTemperature, out{ii}.fitFluorescence(...
+
        out{ii}.maxModelDerivativeIndex), 'linewidth',2,'marker','x','markersize',18,'color',...
+
        out{ii}.modelPlotColor);
+
    figure(2)
+
    plot(out{ii}.maxDerivativeTemperature, out{ii}.dFdT(out{ii}.maxDerivativeIndex),...
+
        'linewidth',2,'marker','x','markersize',18,'color',out{ii}.plotColor);
+
    plot(out{ii}.maxModelDerivativeTemperature, out{ii}.dModelFdT(...
+
        out{ii}.maxModelDerivativeIndex), 'linewidth',2,'marker','x','markersize',18,'color',...
+
        out{ii}.modelPlotColor);
+
end
+
 
+
% Add the legends
+
figure
+
legend(legendText, 'location', 'southwest');
+
hold off;
+
figure
+
legend(legendText, 'location', 'southwest');
+
hold off;
+
 
+
</pre>
+
 
+
==Issues experienced in practice==
+
Now that you are confident that your fitting routine functions properly, you are ready to add sources of noise and other issues that you will experience in practice. It will be instructive for you to again try fitting a model to this "noisy" simulated data and discover how much the noise and bleaching affect your ability to estimate the true DNA melting curve.
+
 
+
===Add some noise to the RTD signal===
+
You will certainly notice noise on the signals you capture in the lab. Noise will be an ''extremely'' important (and annoying) factor in this experiment because the melting curves will be differentiated. Differentiation emphasizes noise. In order to ensure that your data analysis script effectively handles the noise you are likely to observe, simulated random and impulse noise will be added to the signals.
+
 
+
To simulate noise, we can use Matlab's <tt>randn</tt> function, which generates normally distributed random numbers with unity standard deviation. The two arguments tell <tt>randn</tt> to create a vector of random samples with size 1xlength(V_RTD).
+
 
+
<pre>
+
noise = 0.001 * randn(1,length(sim1.V_RTD))';    % Create noise vector, standard deviation=1mV
+
sim1.V_RTD = sim1.V_RTD + noise;                % Add simulated noise to the voltage signal
+
 
+
figure(8)
+
plot(sim1.time, sim1.V_RTD, 'r');
+
xlabel('Time (sec)');
+
ylabel('RTD Voltage (V)');
+
title('Simulated RTD Voltage with Noise vs. Time');
+
</pre>
+
 
+
[[Image:RTDNoisyHeatCoolCurve.png|350px|Simulated RTD voltage plot]]
+
 
+
Revisit your code from [[DNA Melting: Simulating DNA Melting - Basics]] and using the approach above add noise to your RTD voltage signal, then explore the behavior of your code for different levels of noise. In particular, you can add differentiation of the curve from Figure 6 and discover the issues with attempts to differentiate that signal without additional processing. You will likely find that you first need to either sort or bin your temperature data so that the f vs T function can be differentiated. Then also run this data through your fit function and observe any difference in the fit statistics.
+
 
+
You are also encouraged to add noise at additional frequencies that you think might be present in the lab, then attempt to filter that data in Matlab to test the effectiveness of the RTD voltage filter design that you built in the lab. To do this, generate the noise (perhaps 60 Hz, pink noise, possibly high frequency noise from switching in the overhead lights, or even noise on the earth ground that could be coming from equipment elsewhere in the building) and add it to the signal from Figure 8. Remember it is a long way to the earth from our lab. Next filter that signal using your transfer function and the matlab command lsim. Then again run this data through your fit function.
+
 
+
===Add some noise to the sample fluorescence signal===
+
'''Maybe too much impulse noise at present'''
+
 
+
Now that we have added noise to the RTD voltage signal, the next step is to do the same for the fluorescence signal.
+
 
+
The output range of the transimpedance amplifier depends on many factors, including: dsDNA quantity, illumination intensity, fluorophore efficiency, optical system gain, photodiode efficiency, and electronic amplifier gain. All of the factors except dsDNA concentration essentialy remain constant. Thus, the output of the transimpedance amplifier can be modeled by a single overall system gain factor times the dsDNA fraction. In addition, the amplifier output may be offset by a constant DC value, which is simple to model. (If the output of your apparatus drifts significantly over time, you will need to improve your implementation and methods to minimize this.)
+
 
+
The output range of the transimpedance amplifier over the course of an experimental run can vary depending on several factors, including the amplifier gain, optical system gain, etc... A typical setup might produce values between, say, 0.5 and 7.2 Volts. The simulation code generates two random numbers to set the range and minimum value of the data. Your data analysis script will have to normalize the amplifier output voltage to a fractional value between 0 and 1. (What assumptions will you make about dsDNA concentration to do the normalization?)
+
 
+
In addition to white Gaussian noise, shot noise from the photo diode and Johnson noise in your feedback resistors, the high gain amplifier sometimes produces impulse noise. This can be caused by small static discharges, for example when a charged person sits down in a chair in the lab, or by machinery switching on or off somewhere else in the building. Each of these latter noise sources would generally show up on your ground reference, even when connected to the building earth ground, an often imperfect grounding source. Your goal is first to provide as much filtering as possible in your circuitry. However, these noise sources are difficult to completely eliminate, so it is a good idea to include some filtering in your data analysis script.
+
 
+
<pre>
+
Range = 5 + 4 * rand();                              % Range between 5 and 9
+
Minimum = 2 * rand() - 1;                            % Minimum between +/- 1
+
sim1.V_f = Range * sim1.dsDnaFraction + Minimum;    % Scale output
+
noise = 0.01 * randn(1,length(sim1.V_f))';          % Random noise, standard deviation=10mV
+
sim1.V_f = sim1.V_f + noise;                        % Add random noise
+
noise = rand(1,length(sim1.V_f))';                  % Uniformly distributed  random numbers
+
ImpulseNoiseProbability = 1 ./ 75;                  % Probability of impulse noise
+
noise = 20 * (noise < ImpulseNoiseProbability) - 10; % Quantize impulse noise
+
sim1.V_fNoisy = max(sim1.V_f, noise);                    % Add in impulse noise
+
 
+
figure(9)
+
plot(sim1.time, sim1.V_fNoisy)
+
title('Fluorescence Voltage vs. Time');
+
xlabel('Time (s)');
+
ylabel('Fluorescence Voltage (V)');
+
</pre>
+
 
+
[[Image:SimNoisyDNAFractionVsTime.png|350px|Simulated RTD voltage plot]]
+
 
+
As in the last section, revisit your code from DNA Melting: Simulating DNA Melting - Basics but this time add noise to your fluorescence voltage signal, then explore the behavior of your code for different levels of noise. Then again run this data through your fit function.
+
 
+
Here too you are encouraged to add noise at additional frequencies that you think might be present in the lab. What could you do to minimize the effects of these noise sources? Think of solutions both at the hardware level and at the software level and model both of them.
+
 
+
===Add photobleaching effects===
+
Each time a SYBR Green molecule is excited by a photon, there is a small chance that it will be destroyed and permanently lose its ability to fluoresce. The effect of photobleaching on the measured fluorescence intensity can be significant. To demonstrate this, a model for photobleaching may be added to the simulation. In the model, a small, fixed percentage of the fluorophores are mathematically destroyed each time sample. This can be implemented with a simple integration. The derivation of this model can be found [http://dl.dropbox.com/u/12957607/Bleaching%20Correction%20Derivation.pdf here].
+
 
+
The model does not completely describe observations, so in the code below, we have also added an additional time-linear bleaching relationship that has helped to explain the observed behavior. The fraction of fluorophores destroyed per time sample is set by the variable <tt>bleachingCoefficient</tt>, while the coefficient for the linear case is <tt>bleachingCoonstant</tt>.
+
 
+
<pre>
+
sim1.filtV_f = medfilt1(sim1.V_fNoisy,15); % temporarily remove shot noise
+
sim1.fluorescenceNorm = sim1.filtV_f / max(sim1.filtV_f); % normalize photo diode signal to dsDNA fraction
+
sim1.bleachingCoefficient = 3e-5;              % 1/s, estimated from previous experiments
+
sim1.bleachingConstant = 3e-4;              % 1/s, estimated from previous experiments
+
sim1.bleachingCorrection = (1 - sim1.bleachingCoefficient ...
+
    .* cumsum(1 .* sim1.fluorescenceNorm)) ...
+
    .* (1 - sim1.bleachingConstant .* (0:(length(sim1.fluorescenceNorm)...
+
    - 1))');    % calclation of bleaching correction by two methods, exponetial and linear
+
sim1.fluorescenceBleached = sim1.fluorescenceNorm ...
+
    .* sim1.bleachingCorrection;
+
Range = 5 + 4 * rand();                              % Range between 5 and 9
+
Minimum = 2 * rand() - 1;                            % Minimum between +/- 1
+
sim1.V_fScaled = Range * sim1.fluorescenceBleached + Minimum;    % Scale output
+
noise = rand(1,length(sim1.V_fScaled))';                  % Uniformly distributed  random numbers
+
ImpulseNoiseProbability = 1 ./ 75;                  % Probability of impulse noise
+
noise = 20 * (noise < ImpulseNoiseProbability) - 10; % Quantize impulse noise
+
sim1.V_fBleached = max(sim1.V_fScaled, noise);                    % Add in impulse noise
+
 
+
figure(10)
+
plot(sim1.time, sim1.V_fBleached)
+
title('Relative Fluorescence vs. Time with Bleaching');
+
xlabel('Time (s)');
+
ylabel('Relative Fluorescence');
+
</pre>
+
 
+
[[Image:SimNoisyBleachedDNAFractionVsTime.png|350px|Fluorescence vs. time with photobleaching]]
+
 
+
===Add temperature lag effects===
+
Etc etc
+
 
+
===Apply your fitting routine to this more realistic simulated data===
+
Etc etc
+
 
+
==Lab manual sections==
+
 
+
*[[Lab Manual:Measuring DNA Melting Curves]]
+
*[[DNA Melting: Simulating DNA Melting - Basics]]
+
*[[DNA Melting Part 1: Measuring Temperature and Fluorescence]]
+
*[[DNA Melting Report Requirements for Part 1]]
+
*[[DNA Melting Part 2: Lock-in Amplifier and Temperature Control]]
+
*[[DNA Melting Report Requirements for Part 2]]
+
 
+
{{Template:20.309 bottom}}
+

Latest revision as of 22:46, 15 April 2018

Introduction

This tutorial will show you how to simulate the DNA melting experiment with Matlab. The DNA melting apparatus you will build produces two output voltages related to the sample temperature and fluorescence intensity. During an experimental run, these two voltages will be recorded periodically. At the end of this tutorial, you should be able to produce mock datasets that are similar to what you will record in the lab. You can use simulated datasets to help develop and debug your data analysis scripts.

It will be essential to have working data analysis scripts before you begin running experiments in the lab. When you have completed this tutorial, refer to the lab manual for data analysis instructions.

The procedure for creating the simulation is:

  1. Write a function to compute the equilibrium dsDNA contration vs. temperature
  2. Model sample temperature versus time during an experimental run
  3. Compute the RTD voltage from temperature vs. time data
  4. Add noise to the RTD signal
  5. Compute relative fluorescence intensity from temperature vs. time data
  6. Model photobleaching (optional)
  7. Scale the fluorescence intensity signal and add noise
  8. Reformat the data to match what is produced by the DNA melting software

You will not have to write any new code to complete this tutorial; however, you should open a Matlab window and follow along as you read. Code from this page can be copied and pasted into the Matlab command window. M-files are available in the course locker.

If you prefer to work in Python, please see this page.

Before you begin

Matlab skills

  • Element-by-element operations on arrays (+, -, .*, ./)
  • Functions declared in an m-file
  • Conditionals
  • Functions that take a variable number of arguments
  • Numerical differentiation
  • Plotting data

Modeling dsDNA equilibrium concentration

We derived an expression for dsDNA concentration at a given temperature in class:

$ K_{eq} = e^\left [\frac{\Delta S^{\circ}}{R} - \frac{\Delta H^{\circ}}{R T} \right ] $
$ f = \frac{1 + C_T K_{eq} - \sqrt{1 + 2 C_T K_{eq}}}{C_T K_{eq}} $

Since this is a very complicated equation and we will use it many times, it makes sense to write a Matlab function.

dsDNA fraction function

The fraction of dsDNA is a function of 4 variables: ΔS°, ΔH°, temperature, and concentration. The DnaFraction function below takes four parameters to represent these quantities and an optional fifth parameter that causes the function to use a different value of the gas constant. Changing the fifth parameter allows the same function to operate with different unit systems.

A built-in MATLAB function in called nargin allows functions to accept a variable number of arguments. nargin returns the number of arguments that the function was called with. The DnaFraction function sets R to the default value of 1.987 if nargin returns a value of 4 or less. Otherwise, the function sets R to the value supplied by the caller.

The element-by-element divide operator in the body of the function makes it possible for DnaFraction to accept either a single temperature argument or a vector of temperature values. (As a general rule, use the ./ operator instead of / unless you intend to do a matrix divide.)

To use this function, save it in a file called "DnaFraction.m" (MATLAB requires that all functions be saved in a file that has the same name as the function appended with the extension ".m".) The file must reside in the current directory or another directory that is on your Matlab path. Function help should be placed in comments at the top of the file. Type help DnaFraction to see how this works.

% Returns the fraction of dsDNA in a solution containing equal concentrations
% of two complementary DNA oligos as a functino of total DNA concentration,  
% temperature, entropy change, and enthalpy change. 
% Gas constant is an optional parameter.
%
% USAGE: f = DnaFraction(Ct, T, DeltaS, DeltaH, <GasConstant>)
% RETURN VALUE: fraction of dsDNA
%
% Default units are molar, Kelvin, cal/mole, and cal/(mole-Kelvin). For
% other unit systems, supply the appropriate value of R in the optional
% fifth parameter. The default value is 1.987 cal/(degree K * mole).

function f = DnaFraction(Ct, T, DeltaS, DeltaH, GasConstant)
% Gas constant
if(nargin < 5)              % determine if caller supplied R value or not
    R = 1.987;              % set to default: cal/(degree K * mole)
else
    R = GasConstant;        % otherwise use caller's value
end

% Compute Ct * Keq
CtKeq = Ct * exp(DeltaS / R - DeltaH ./ (R * T));

%now compute f
f = (1 + CtKeq - sqrt(1 + 2 * CtKeq)) ./ CtKeq;

% Written 4/9/2008 by SCW

Plot an ideal melting curve

Does the DnaFraction function work as expected? A good way to test the function is to plot a melting curve generated by DnaFraction.

Matlab notes: Over the course of an interactive Matlab session, you may run several simulations with different parameters. Using a structure to keep the results of each simulation together is a good way to prevent the workspace from becoming disorganized. The struct function creates an empty structure. In the code below, a structure called ideal is created to hold the ideal melting curve data. (Since variables are dynamically allocated and typed in Matlab, the struct command is optional; however, explicitly creating the structure can make your code easier to read and better performing.)

Since DnaFraction accepts a vector temperature argument, an entire melting curve can be computed with a single call.

ideal = struct();                          % create an empty structure to hold the results
ideal.temperature = (0:99) + 273.15;       % temperature axis in Kelvin: 100 points(0-99C)
ideal.deltaS = -500;                       % cal/(mol K) 
ideal.deltaH = -180E3;                     % cal/mol 
ideal.concentration = 1E-6;                % mol/L

% call DnaFraction function
% arguments: concentration, temperature vector, delta S, delta H 
% (optional R argument not supplied)

ideal.dsDnaFraction = DnaFraction(ideal.concentration, ideal.temperature, ...
                                  ideal.deltaS, ideal.deltaH);

plot(ideal.temperature, ideal.dsDnaFraction);
title('Ideal DNA Melting Curve');
xlabel('Temperature (K)');
ylabel('dsDNA Fraction');

Ideal DNA melting curve

Experiment with different values of ΔH° and ΔS° to see how the curve changes. You can use OligoCalc or another tool to estimate values for different DNA sequences.

One method of estimating Tm is to find the temperature where the dsDNA fraction equals 1/2. Another method is to differentiate the melting curve. Tm occurs at the peak of the derivative — where the dsDNA fraction is changing the fastest. (Why might you prefer one method to the other when estimating Tm from experimental data?)

Numerical differentiation

The fluorescence function fN is a sampled version of the continuous function f(T) such that fN = f(NΔT), where ΔT is the sampling interval (1° C, in this instance). One method for computing a numerical approximation to df/dT is the finite difference:

$ {f'}_N = \frac{f_{N+1}-f_{N}}{\Delta T} $.

dF/dT is guaranteed to take on the value given by f'N at some point between NΔT and (N+1)ΔT; however, this happy coincidence can occur at any place in the interval. A simple but effective technique is to guess that equality occurs in the middle of the interval. This guess is close enough in most cases.

The code below computes the finite difference and stores it in the dDsDnaFraction field of ideal. It also generates an associated temperature vector (ideal.dTemperature) with values centered between the original temperature samples. Note that both vectors are one element shorter than the original data.

ideal.dTemperature = ideal.temperature(1:99) + 0.5; % create new temperature vector
ideal.dDsDnaFraction = diff(ideal.dsDnaFraction);   % take the difference

plot(ideal.dTemperature, -1 * ideal.dDsDnaFraction);
title('Derivative of Ideal DNA Melting Curve');
xlabel('Temperature (K)');
ylabel('Delta dsDNA Fraction');

Derivative of ideal DNA melting curve

In the previous section, the plot shows that the dsDNA fraction equals 0.5 at about 332.5 degrees. The peak of the derivative also occurs at the same temperature. (The curve is plotted inverted.)

Set up simulation conditions and data structure

It will be convenient to set up a data structure with the simulation parameters and a time axis.

simulationLength = 900;                               % 15 minute simulation
sampleRate = 1;                                       % 1 Hz sample rate

sim1 = struct();                                      % create empty struct for sim results
sim1.concentration = 1E-6;                            % 1 micromolar concentration
sim1.deltaS = -184;                                   % cal / (mole-K)
sim1.deltaH = -71E3;                                  % cal / mole
sim1.initialTemperature = 363;                        % 363 K = 90 C
sim1.finalTemperature = 293;                          % approx. room temperature
sim1.sampleRate = 1;                                  % 1 Hz sample rate
sim1.duration = 900;                                  % seconds -- 15 minute experiment

sim1.time = 0:(1./sampleRate):simulationLength;       % create time vector (units: seconds)

Simulate sample temperature during an experimental run

The goal of the simulation is to produce two simulated voltage signals that resemble what you will measure in the lab. Since both of the voltages are functions of temperature, it makes sense to begin by modeling how the sample cools during an experimental run.

Sample cooling

During each experimental run, the range of temperatures will be provided by natural cooling of the sample after it has been heated in a bath. At the beginning of an experimental run, the DNA + fluorescent dye sample is heated in a bath to around 90°C. Over a period of about 15 minutes, the sample cools to room temperature (about 20°C). The rate of cooling is proportional to the difference between the sample's temperature and the room temperature, so the sample cools more quickly at the start of the experiment than near the end. (It can be frustrating to wait for the final few degrees.)

Cooling can be approximated by an exponential function with a time constant of about 3 minutes. The equation for temperature (in °K) versus time is:

$ T(t) = (T_i - T_f) e ^ {-\frac{t}{\tau}} + T_f $

where,

T(t) is the temperature as a function of time,
Ti is the initial temperature (after heating),
Tf is the final (ambient) temperature,
τ is the time constant.

In Matlab:

sim1.coolingConstant = 180;

% use exp function to model exponential cooling
sim1.temperature = ...
    (sim1.initialTemperature - sim1.finalTemperature) .*   ...   % T_i - T_f *
    exp(-sim1.time ./ sim1.coolingConstant) +   ...              % e^(-t/tau) +
    sim1.finalTemperature;                                       % T_f

plot(sim1.time, sim1.temperature, 'r');
xlabel('Time (sec)');
ylabel('Temperature (K)');
title('Simulated Temperature vs. Time');

Simulated sample cooling curve

If you have any questions about the element-by-element division operator in Matlab (./) see:this page.

RTD temperature sensor model

RTD and voltage divider circuit

The resistance of the RTD is given by the equation RRTD = 1000 + 3.85(T - 273). In the experimental apparatus, the RTD is hooked up as one element of a voltage divider with another resistor, R. The power supply is 15 volts. Thus, the output voltage is: VRTD = 15 * RRTD/(R + RRTD). The following code calculates RTD voltage measurements from sim1.temperature and stores the result in sim1.V_RTD, assuming R = 20KΩ.

sim1.R_RTD = 1000 + 3.85 * (sim1.temperature - 273);      % Compute RTD resistance in Ohms
sim1.V_RTD = 15 * sim1.R_RTD ./ (20E3 + sim1.R_RTD);      % Compute V_RTD in Volts
plot(sim1.time, sim1.V_RTD, 'r');
xlabel('Time (sec)');
ylabel('RTD Voltage (V)');
title('Simulated RTD Voltage vs. Time');
Simulated RTD voltage plot