Assignment 9, Part 2: Simulating DNA melting data and testing the model function

From Course Wiki
Revision as of 20:55, 2 November 2017 by Juliesutton (Talk | contribs)

20.309: Biological Instrumentation and Measurement



This is Part 2 of Assignment 9.

Testing your model function

Now it's time to test your code. Just like in the particle tracking assignment, we will create some simulated data with a known functional form, run the code, then see whether the outputs match what you put in.

The inputs to the model function, $ V_{f,model} $, are the block temperature (from which the sample temperature is derived) and the model parameters (to be determined). In Assignment 8, you upgraded your DNA melting instrument to control the block temperature (rather than just having it turn on and off). You produced a temperature ramp from room temperature up to 95 °C, held it there for a bit, then cooled it back down to room temperature. Let's simulate this in MATLAB:

Model block temperature.
% generate temperature ramp
Tmin = 25;
Tmax = 95;
blockTemperature = [ (0:0.1:900)*((Tmax-Tmin)/900) + Tmin, Tmax*ones(1,1200), ...
    (0.1:0.1:900)*((Tmin-Tmax)/900) + Tmax, Tmin*ones(1,1200) ]';
time = (1:length(blockTemperature))*0.1;
plot(time, blockTemperature)
xlabel('Time, seconds')
ylabel('Temperature, deg C')
title('Model block temperature')

Next, create a synthetic data set to mimic the DNA fluorescence. You'll need to choose some values for your initial parameters. It may be helpful to start with idealized parameters (for example, $ K_{bleach} = 0 $, $ K_{gain} = 1 $, etc.), and then change them individually until you obtain a realistic curve.

dnaConcentration = 60E-6;
deltaS = -500;
deltaH = -180E3;

% choose some reasonable values for your input parameters:
parameters = [...];

F = Vfmodel( parameters, blockTemperature );

Finally, add some random noise:

Noisy simulated DNA melting curve.
noise = 0.05;  % Roughly 5% noise
F = F + noise*randn(size(F));
plot(blockTemperature, F)  % x data must match the temperature axis label
xlabel('Temperature, deg C');
ylabel('Fluorescence Signal');
title('Simulated DNA Melting Curve');

Now that we have some synthetic data, we can test our fitting function and algorithm. The model function you wrote should be suitable for use with the MATLAB nlinfit function (or other fitting routines). The inputs to nlinfit are the block temperature, the simulated fluorescence, the model function, and initial values of the model parameters. Additionally, there are many options you can set for the nlinfit algorithm [described here]. The output of nlinfit is the set of fitted parameter values.

fitValues = nlinfit(blockTemperature, F, @Vfmodel, beta0, fitOptions);

The initial parameters you choose (beta0) are essential for getting a reasonable output from nlinfit. In this particular case, we know exactly what the parameters should be. Try altering the parameters slightly from their known values to get a feel for what it means for your initial guess to be 'reasonable'. Think about how you would guess the parameters for your real data. Use these slightly altered parameters in the following plot.

Noisy simulated DNA melting curve, and model function with initial guesses and best fit parameters.



Plot the following items on the same set of axes. Don't forget to include a legend!

  1. your simulated fluorescence data as a function of block temperature
  2. your model function with initial best-guess parameters as a function of block temperature
  3. your model function with fitted parameters (obtained by nlinfit) as a function of block temperature
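
The fit-and-plot workflow above can be sketched as follows. This is only an illustration under stated assumptions: toyModel is a hypothetical sigmoid used so that the example runs on its own, it is not the Vfmodel function from Part 1, and the parameter values are arbitrary. Substitute @Vfmodel, your own parameters, and the simulated blockTemperature. nlinfit and statset require the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical stand-in for Vfmodel (NOT the model from Part 1):
% p(1) = amplitude, p(2) = midpoint (deg C), p(3) = width (deg C)
toyModel = @(p, T) p(1) ./ (1 + exp((T - p(2)) / p(3)));

rng(1);                                  % make the noise repeatable
temperature = (25:0.5:95)';              % simplified temperature axis
trueParameters = [1, 70, 3];             % arbitrary illustrative values
F = toyModel(trueParameters, temperature);
F = F + 0.05 * randn(size(F));           % same noise level as in the text

% Initial guess: the known values, deliberately altered by 10%
beta0 = trueParameters .* [1.1, 0.9, 1.1];

fitOptions = statset('MaxIter', 200);    % one of several nlinfit options
fitValues = nlinfit(temperature, F, toyModel, beta0, fitOptions);

% Data, initial guess, and best fit on the same axes
plot(temperature, F, '.', ...
     temperature, toyModel(beta0, temperature), '--', ...
     temperature, toyModel(fitValues, temperature), '-')
xlabel('Block temperature, deg C')
ylabel('Fluorescence signal')
legend('Simulated data', 'Model, initial guess', 'Model, best fit')
```

Increasing the perturbation in beta0 until the fit stops converging is one way to get a feel for how close a 'reasonable' initial guess needs to be.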


The effects of noise on fit parameters

Even at a relatively large amplitude, random noise by itself should not appreciably affect the fitted parameter values; however, noise will generally contribute to uncertainty in the fit values. This uncertainty is seen in the parameter confidence intervals. Use the MATLAB function nlparci to obtain the 95% confidence intervals from the results returned by nlinfit, e.g.,

    [fitValues, residual, ~, COVB] = nlinfit(blockTemperature, F, @Vfmodel, beta0, fitOptions);
    CI = nlparci(fitValues, residual, 'covar', COVB);

where CI is an N×2 array, N is the length of fitValues, and the two columns hold the lower and upper bounds of each confidence interval. For example, suppose one of your parameters from fitValues is $ \alpha = 30.0903 $ and the corresponding row in CI is [29.8243, 30.3563]. Roughly speaking, this means that if you were to repeat your experiment over and over and compute the interval each time, about 95% of those intervals would contain the true value of $ \alpha $. A compact way to report this result is $ \alpha = 30.1 \pm 0.3 $ (in whatever units $ \alpha $ has). Note that confidence intervals are not necessarily symmetric for all types of statistics. If the statistics are Gaussian (which we're already assuming), however, it is equally likely to measure a parameter above or below the mean. Uncertainty is typically reported with one or two significant figures. Round the uncertain quantity to the same decimal place as the uncertainty.
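
As a worked example of the rounding rule, the sketch below takes the numbers quoted above and reports them with one significant figure of uncertainty. The variable names are illustrative, and the interval is assumed to be symmetric about the fitted value.

```matlab
% Values taken from the example in the text
value = 30.0903;
ci = [29.8243, 30.3563];              % [lower bound, upper bound]

halfWidth = (ci(2) - ci(1)) / 2;      % assumes a symmetric interval

% Decimal place of the first significant figure of the uncertainty
decimals = -floor(log10(halfWidth));
scale = 10^decimals;

% Round the uncertainty, then round the value to the same decimal place
roundedUncertainty = round(halfWidth * scale) / scale;
roundedValue = round(value * scale) / scale;

% Build a format such as '%.1f'; negative decimals are clamped to zero
numberFormat = ['%.', num2str(max(decimals, 0)), 'f'];
fprintf(['alpha = ', numberFormat, ' +/- ', numberFormat, '\n'], ...
    roundedValue, roundedUncertainty);
```

For these inputs the half-width is 0.266, so the script prints alpha = 30.1 +/- 0.3.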



Fill in the following table, making sure to include appropriate units and significant figures.


Parameter name | Input parameter value | Best fit parameter value | Best fit parameter uncertainty
...



Simulate a DNA melting curve and vary the noise magnitude from 0 to 0.05. At each noise level, fit the curve, and compute the confidence intervals for each parameter. Finally, plot the normalized confidence intervals as a function of the noise magnitude.
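
One possible structure for this noise sweep is sketched below, again with assumptions. toyModel is a hypothetical stand-in for Vfmodel with arbitrary parameters, and 'normalized' is taken here to mean the confidence interval half-width divided by the magnitude of the fitted value, which is only one reasonable definition. The sweep starts at 0.01; a zero-noise point can be added to the list of noise levels.

```matlab
% Hypothetical stand-in for Vfmodel (NOT the model from Part 1)
toyModel = @(p, T) p(1) ./ (1 + exp((T - p(2)) / p(3)));

rng(2);                                  % make the noise repeatable
temperature = (25:0.5:95)';
trueParameters = [1, 70, 3];             % arbitrary illustrative values
cleanF = toyModel(trueParameters, temperature);

noiseLevels = 0.01:0.01:0.05;
normalizedHalfWidth = zeros(numel(noiseLevels), numel(trueParameters));

for k = 1:numel(noiseLevels)
    % New noisy data set at this noise level
    F = cleanF + noiseLevels(k) * randn(size(cleanF));

    % Fit, then convert the covariance estimate into 95% intervals
    [fitValues, residual, ~, covB] = ...
        nlinfit(temperature, F, toyModel, trueParameters);
    ci = nlparci(fitValues, residual, 'covar', covB);

    % Assumed normalization: half-width divided by |fitted value|
    halfWidth = (ci(:, 2) - ci(:, 1))' / 2;
    normalizedHalfWidth(k, :) = halfWidth ./ abs(fitValues);
end

plot(noiseLevels, normalizedHalfWidth, 'o-')
xlabel('Noise magnitude')
ylabel('CI half-width / fitted value')
legend('p(1)', 'p(2)', 'p(3)', 'Location', 'northwest')
```

Each row of normalizedHalfWidth corresponds to one noise level and each column to one parameter, so the plot shows one curve per parameter.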


The ultimate goal of the DNA melting lab assignments is to identify an unknown sample of DNA by comparing it to three other samples. One thing you might notice while fitting your data is that the values of $ \Delta H $ and $ \Delta S $ are rather poorly constrained. The melting temperature, $ T_m $, however, is much better defined, and will be a more robust metric for comparing your samples to one another.



Using the expressions we derived in class for dsDNA concentration at a given temperature:

$ K_{eq} = \exp\left[\frac{\Delta S^{\circ}}{R} - \frac{\Delta H^{\circ}}{R T}\right] $
$ f = \frac{1 + C_T K_{eq} - \sqrt{1 + 2 C_T K_{eq}}}{C_T K_{eq}} $

Derive an expression for the melting temperature, $ T_m $, in terms of $ \Delta H $ and $ \Delta S $. The melting temperature is defined as the temperature at which the fraction of double-stranded DNA equals 0.5. Calculate the melting temperature for each noise level from 0.01 to 0.05.
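
A numerical cross-check can help confirm a derived expression. The sketch below evaluates the two equations above and uses fzero to find the temperature at which $ f = 0.5 $. It assumes that $ \Delta S $ is in J/(mol K), that $ \Delta H $ is in J/mol, that $ C_T $ is the dnaConcentration value used in the simulation, and that $ T $ is in kelvin; none of these units are stated explicitly on this page. The function handle names are illustrative.

```matlab
R = 8.314;                    % gas constant, J/(mol K)
dnaConcentration = 60E-6;     % C_T, value used in the simulation
deltaS = -500;                % assumed units: J/(mol K)
deltaH = -180E3;              % assumed units: J/mol

% Equations from the text; T is absolute temperature in kelvin
Keq = @(T) exp(deltaS / R - deltaH ./ (R * T));
dsFraction = @(T) (1 + dnaConcentration * Keq(T) ...
    - sqrt(1 + 2 * dnaConcentration * Keq(T))) ...
    ./ (dnaConcentration * Keq(T));

% Solve f(T) = 0.5, bracketing the root between 25 and 95 deg C
bracketKelvin = [25, 95] + 273.15;
TmKelvin = fzero(@(T) dsFraction(T) - 0.5, bracketKelvin);
TmCelsius = TmKelvin - 273.15;

fprintf('Numerical Tm = %.2f deg C\n', TmCelsius);
```

Replacing deltaS and deltaH with fitted values gives a number to compare against your analytical result. fzero requires the fraction to cross 0.5 inside the bracket, so widen the bracket if it reports no sign change.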




Include an appendix with your code at the end of your assignment. It should contain all scripts and functions that you wrote; there is no need to include unaltered code from the wiki.

