Difference between revisions of "Assignment 9, Part 2: Simulating DNA melting data and testing the model function"

From Course Wiki
Jump to: navigation, search
(Testing your model function)
Line 10: Line 10:
 
Now it's time to test your code. Just like in the particle tracking assignment, we will create some simulated data with a known functional form, run the code, then see whether the outputs match what you put in.  
 
Now it's time to test your code. Just like in the particle tracking assignment, we will create some simulated data with a known functional form, run the code, then see whether the outputs match what you put in.  
  
The input to the model function, ''V<sub>f,model</sub>'', is the block temperature (from which sample temperature is derived) and the model parameters (to be determined). In the next assignment, you will upgrade your DNA melting instrument to control the block temperature (rather than just having it turn on and off). You will produce a temperature ramp up from room temperature to 95 degrees, hold it there for a bit, then cool back down to room temperature. Let's simulate this in MATLAB:
+
The input to the model function, ''V<sub>f,model</sub>'', is the block temperature (from which sample temperature is derived) and the model parameters (to be determined). In assignment 8, you upgraded your DNA melting instrument to control the block temperature (rather than just having it turn on and off). You produced a temperature ramp from room temperature up to 95 degrees, held it there for a bit, then cooled it back down to room temperature. Let's simulate this in MATLAB:
  
 
[[Image:Model block temperature.png|thumb|right| Model block temperature.]]
 
[[Image:Model block temperature.png|thumb|right| Model block temperature.]]
Line 17: Line 17:
 
Tmin = 25;
 
Tmin = 25;
 
Tmax = 95;
 
Tmax = 95;
Temp = [ (0:0.1:900)*((Tmax-Tmin)/900) + Tmin, Tmax*ones(1,1200), ...
+
blockTemperature = [ (0:0.1:900)*((Tmax-Tmin)/900) + Tmin, Tmax*ones(1,1200), ...
 
     (0.1:0.1:900)*((Tmin-Tmax)/900) + Tmax, Tmin*ones(1,1200) ]';
 
     (0.1:0.1:900)*((Tmin-Tmax)/900) + Tmax, Tmin*ones(1,1200) ]';
time = (1:length(Temp))*0.1;
+
time = (1:length(blockTemperature))*0.1;
plot(time,Temp)
+
plot(time, blockTemperature)
 
xlabel('Time, seconds')
 
xlabel('Time, seconds')
 
ylabel('Temperature, deg C')
 
ylabel('Temperature, deg C')
Line 26: Line 26:
 
</pre>
 
</pre>
  
Next, create a synthetic data set to mimic the DNA fluorescence. We can start out simply, by omitting the effects of photobleaching and thermal quenching, but including the temperature delay between the block and sample.  
+
Next, create a synthetic data set to mimic the DNA fluorescence. You'll need to choose some values for your initial parameters. It may be helpful to start with idealized parameters (for example, Kbleach = 0, Kgain = 1, etc.), and then change them individually until you obtain a realistic curve.
  
 
<pre>
 
<pre>
dnaConcentration = 30E-6;
+
dnaConcentration = 60E-6;
 
deltaS = -500;
 
deltaS = -500;
 
deltaH = -180E3;
 
deltaH = -180E3;
tauThermal = % choose some reasonable value
 
noise = 0.05;  % Roughly 5% noise
 
  
sampleTemp = SimulateLowPass(tauThermal, Temp);
+
% choose some reasonable values for your input parameters:
F = DnaFraction( dnaConcentration, sampleTemp + 273, deltaS, deltaH );
+
parameters = [...];
 +
 
 +
F = Vfmodel( parameters, blockTemperature );
 
</pre>
 
</pre>
  
Line 42: Line 42:
 
[[Image:Noisy Simulated DNA Melting Curve.png|thumb|right| Noisy simulated DNA melting curve.]]
 
[[Image:Noisy Simulated DNA Melting Curve.png|thumb|right| Noisy simulated DNA melting curve.]]
 
<pre>
 
<pre>
 +
noise = 0.05;  % Roughly 5% noise
 
F = F + noise*randn(size(F));
 
F = F + noise*randn(size(F));
plot(time,F)
+
plot(time, F)
 
xlabel('Temperature, deg C');
 
xlabel('Temperature, deg C');
 
ylabel('Fluorescence Signal');
 
ylabel('Fluorescence Signal');
Line 49: Line 50:
 
</pre>
 
</pre>
  
Now that we have some synthetic data, we can test our fitting function and algorithm. The model function you wrote should be suitable for use with the Matlab <tt>nlinfit</tt> function (or other fitting routines). The input to nlinfit will then be the block temperature, fluorescence signal, the model function, and initial values of the model parameters. Additionally, there are many options you can set for the nlinfit algorithm [[https://www.mathworks.com/help/stats/nlinfit.html#inputarg_options| described here]]. The output of nlinfit are the fitted parameter values.
+
Now that we have some synthetic data, we can test our fitting function and algorithm. The model function you wrote should be suitable for use with the Matlab <tt>nlinfit</tt> function (or other fitting routines). The input to nlinfit will then be the block temperature, the model function, and initial values of the model parameters. Additionally, there are many options you can set for the nlinfit algorithm [[https://www.mathworks.com/help/stats/nlinfit.html#inputarg_options| described here]]. The output of nlinfit are the fitted parameter values.
  
 
<pre>
 
<pre>
fitValues = nlinfit(Temp, F, fitFunc, beta0, fitOptions);
+
fitValues = nlinfit(blockTemperature, F, @Vfmodel, beta0, fitOptions);
 
</pre>
 
</pre>
  
The initial parameters you choose for nlinfit (<tt>beta0</tt>) are essential for getting a reasonable output. In this particular case, we know exactly what the parameters should be. Try altering the parameters slightly from their known values to get a feel for what it means for you guess to be 'reasonable'. Think about how you would guess the parameters for your future real data. Use these slightly altered parameters in the following plot.
+
The initial parameters you choose (<tt>beta0</tt>) are essential for getting a reasonable output from nlinfit. In this particular case, we know exactly what the parameters should be. Try altering the parameters slightly from their known values to get a feel for what it means for your initial guess to be 'reasonable'. Think about how you would guess the parameters for your real data. Use these slightly altered parameters in the following plot.
  
 
[[Image:Noisy Simulated DNA Melting Fit.png|thumb|right| Noisy simulated DNA melting curve, and model function with initial guesses and best fit parameters.]]
 
[[Image:Noisy Simulated DNA Melting Fit.png|thumb|right| Noisy simulated DNA melting curve, and model function with initial guesses and best fit parameters.]]
{{Template:Assignment Turn In|message=a single plot showing:
+
{{Template:Assignment Turn In|message= Plot the following items on the same set of axes. Don't forget to include a legend!
# your simulated data
+
# your simulated fluorescence data as a function of block temperature
# your model function with initial best-guess parameters
+
# your model function with initial best-guess parameters as a function of block temperature
# your model function with fitted parameters (obtained by nlinfit)}}
+
# your model function with fitted parameters (obtained by nlinfit) as a function of block temperature}}
  
 
==The effects of noise on fit parameters==
 
==The effects of noise on fit parameters==

Revision as of 15:53, 2 November 2017

20.309: Biological Instrumentation and Measurement

ImageBar 774.jpg


This is Part 2 of Assignment 9.

Testing your model function

Now it's time to test your code. Just like in the particle tracking assignment, we will create some simulated data with a known functional form, run the code, then see whether the outputs match what you put in.

The input to the model function, Vf,model, is the block temperature (from which sample temperature is derived) and the model parameters (to be determined). In assignment 8, you upgraded your DNA melting instrument to control the block temperature (rather than just having it turn on and off). You produced a temperature ramp from room temperature up to 95 degrees, held it there for a bit, then cooled it back down to room temperature. Let's simulate this in MATLAB:

Model block temperature.
% generate temperature ramp
Tmin = 25;
Tmax = 95;
blockTemperature = [ (0:0.1:900)*((Tmax-Tmin)/900) + Tmin, Tmax*ones(1,1200), ...
    (0.1:0.1:900)*((Tmin-Tmax)/900) + Tmax, Tmin*ones(1,1200) ]';
time = (1:length(blockTemperature))*0.1;
plot(time, blockTemperature)
xlabel('Time, seconds')
ylabel('Temperature, deg C')
title('Model block temperature')

Next, create a synthetic data set to mimic the DNA fluorescence. You'll need to choose some values for your initial parameters. It may be helpful to start with idealized parameters (for example, Kbleach = 0, Kgain = 1, etc.), and then change them individually until you obtain a realistic curve.

dnaConcentration = 60E-6;
deltaS = -500;
deltaH = -180E3;

% choose some reasonable values for your input parameters:
parameters = [...];

F = Vfmodel( parameters, blockTemperature );

Finally, and some random noise:

Noisy simulated DNA melting curve.
noise = 0.05;  % Roughly 5% noise
F = F + noise*randn(size(F));
plot(time, F)
xlabel('Temperature, deg C');
ylabel('Fluorescence Signal');
title('Simulated DNA Melting Curve');

Now that we have some synthetic data, we can test our fitting function and algorithm. The model function you wrote should be suitable for use with the Matlab nlinfit function (or other fitting routines). The input to nlinfit will then be the block temperature, the model function, and initial values of the model parameters. Additionally, there are many options you can set for the nlinfit algorithm [described here]. The output of nlinfit are the fitted parameter values.

fitValues = nlinfit(blockTemperature, F, @Vfmodel, beta0, fitOptions);

The initial parameters you choose (beta0) are essential for getting a reasonable output from nlinfit. In this particular case, we know exactly what the parameters should be. Try altering the parameters slightly from their known values to get a feel for what it means for your initial guess to be 'reasonable'. Think about how you would guess the parameters for your real data. Use these slightly altered parameters in the following plot.

Noisy simulated DNA melting curve, and model function with initial guesses and best fit parameters.


Pencil.png

Plot the following items on the same set of axes. Don't forget to include a legend!

  1. your simulated fluorescence data as a function of block temperature
  2. your model function with initial best-guess parameters as a function of block temperature
  3. your model function with fitted parameters (obtained by nlinfit) as a function of block temperature


The effects of noise on fit parameters

Even at a relatively large amplitude, random noise by itself should not appreciably affect the fitted parameter values; however, noise will generally contribute to uncertainty in the fit values. This uncertainty is seen in the parameter confidence intervals. Use the MATLAB function nlparci to obtain the 95% confidence intervals from the results returned by nlinfit, e.g.,

    [fitValues, residual, ~, COVB, ~] = nlinfit(Temp, F, fitFunc, beta0, fitOptions);
    CI = nlparci(fitValues, residual, 'covar',COVB);

where CI will be an N×2 array of values where N is the length of fitValues, and the two columns represent the upper and lower bound of the confidence interval. For example, if one of your parameters from fitValues is $ \alpha = 30.0903 $, and the corresponding row in CI is: [29.8243, 30.3563] what this really means is that if you were to repeat your experiment over and over, 95% of the time you will find a value for $ \alpha $ within the range of 29.8 to 30.4. A compact way to report this result is as follows: $ \alpha = 30.1 \pm 0.3 $ (in whatever units). (Note the confidence intervals are not necessarily symmetric for all types of statistics. But if the statistics are gaussian, (which we're already assuming) then it should be equally likely to measure a parameter above or below the mean.)

In your code, vary the noise in your simulated data from 0.01 to 0.05 in steps of 0.01. For each noise value, record the best-fit parameters and their uncertainties in a table.


Pencil.png

a table of your best fit parameters and their uncertainties for each noise level. Don't forget to include units and appropriate significant figures!


The ultimate goal of the DNA melting lab assignments is to identify an unknown sample of DNA by comparing it to three other samples. One thing you might notice while fitting your data is the values of $ \Delta H $ and $ \Delta S $ are rather unconstrained. The melting temperature, Tm, however, is much better defined, and will be a more robust metric to use to compare your data to one-another.

Using the expressions we derived in class for dsDNA concentration at a given temperature:

$ K_{eq} = e^\left [\frac{\Delta S^{\circ}}{R} - \frac{\Delta H^{\circ}}{R T} \right ] $
$ f = \frac{1 + C_T K_{eq} - \sqrt{1 + 2 C_T K_{eq}}}{C_T K_{eq}} $

derive an expression for the melting temperature, $ T_m $, which is defined as the temperature for which the fraction of double stranded DNA equals 0.5, in terms of $ \Delta H $ and $ \Delta S $. Calculate the melting temperature for each noise level from 0.01 to 0.05.


Pencil.png

an expression for the melting temperature as a function of $ \Delta H $ and $ \Delta S $. For each noise level from 0.01 to 0.05 for your simulated data, calculate Tm, turning in a plot of Tm vs. noise.



Pencil.png

an appendix with your code at the end of your assignment. Include all scripts and functions that you wrote, but no need to include unaltered code from the wiki.


Navigation

Back to 20.309 Main Page