Assignment 4 part 2: Measure resolution

From Course Wiki
Revision as of 20:37, 25 February 2020 by Juliesutton (Talk | contribs)

20.309: Biological Instrumentation and Measurement


This is part 2 of Assignment 4.

Measuring resolution

Optical resolution overview

Synthetic image of tiny microspheres used for measuring resolution.

One of the most commonly used definitions of the resolution limit $ R $ of an optical system is the distance between two point sources in the sample plane such that the peak of one source’s image falls on the first zero of the other source’s image. This particular definition is called the Rayleigh resolution.

The theoretical value of $ R $ is given by the formula

$ R=\frac{0.61 \lambda}{ \text{NA}} $,

where $ \lambda $ is the wavelength of light that forms the image, and NA is the numerical aperture of the optical system. The definition suggests a procedure for measuring resolution: make an image of a point source, measure the peak-to-trough distance in the image plane, and divide by the magnification. In this part of the lab, you will use a procedure inspired by this simple idea to estimate the resolution of your microscope. Instead of measuring the spot sizes with a ruler, you will use nonlinear regression to find the parameters of a two-dimensional Gaussian function that best approximates the digital images of (near) point sources that you will make. You will use the best-fit parameters from the regression to compute the resolution measurement.
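As a quick numerical illustration of the formula above, the sketch below evaluates the theoretical Rayleigh resolution for a few numerical apertures. The wavelength of 600 nm and the list of NA values are assumed example numbers, not properties of the lab microscope; substitute the values that apply to your setup.

```matlab
% Theoretical Rayleigh resolution, R = 0.61 * lambda / NA.
% ASSUMPTION: lambda = 600 nm and the NA list are illustrative values only.
lambda = 600e-9;                  % wavelength in meters (assumed)
NA = [0.1 0.25 0.65 1.0];         % example numerical apertures (assumed)
R = 0.61 * lambda ./ NA;          % resolution in meters, one entry per NA

for k = 1:numel(NA)
    fprintf('NA = %.2f  ->  R = %.0f nm\n', NA(k), R(k) * 1e9);
end
```

Because R is inversely proportional to NA, doubling the numerical aperture halves the smallest resolvable distance.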

One practical problem with this method is that true point sources are difficult to come by. An astronomer testing a telescope has stars readily available in the night sky, and they are very good approximations of point sources. Since there is no natural microscopic sample that is equivalent to the night sky, microscopists have to prepare a synthetic sample suitable for measuring resolution. Perhaps the most common method is to use a microscope slide sprinkled with tiny, fluorescent beads that have diameters in the range of 100-190 nm. These beads are small enough to be considered point sources. Unfortunately, beads small enough for this purpose are not very bright, so imaging them can be challenging. Your microscope must be very well aligned to get good results. (The images you make for this part of the lab will probably remind you of telescope images. If they don't, have an instructor take a look at your setup.)

Why fit a Gaussian instead of the Bessel-function profile of an Airy disk? Gaussians are more amenable to nonlinear regression because they are smoother and faster to evaluate than Bessel functions. In addition, the Gaussian is a very good approximation to the central lobe of the Airy pattern. It is straightforward to convert the Gaussian parameters to Rayleigh resolution. See Converting Gaussian fit to Rayleigh resolution for a discussion of the conversion.

In outline, the sequence of steps for the resolution measurement is:

  1. get a (real or synthetic) image of point sources
  2. find the pixels that correspond to each microsphere and associate them into connected regions
  3. compute useful properties of the connected regions
  4. eliminate regions that are likely not images of single microspheres
  5. use nonlinear regression to fit a Gaussian model function to the image of each microsphere
  6. compute summary statistics
  7. convert the Gaussian parameters to Rayleigh resolution.

Identifying bright regions in an image and computing their properties

The first thing we want to do is get a (real or synthetic) image of point sources. Luckily, we've already generated a synthetic image of 90 nm radius spheres in part 1 of this assignment.

The next step is to identify pixels of interest. Fortunately, this is a very simple matter because we went to great lengths to make an image that has high contrast, little background, and high SNR. The interesting pixels are the bright ones. A simple, global threshold works well — all the pixels brighter than a certain threshold are (probably) interesting. You might want to think a bit about how to choose the best threshold value. The MATLAB Image Processing Toolbox includes a global threshold function. im2bw( I, level ) applies a global threshold to image I and returns a binary image (also called a bilevel image or a mask). The binary image is a matrix the same size as I that contains only ones and zeros. You guessed it — there are ones in locations where the pixel value was greater than level and zeroes everywhere else.
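To make the thresholding step concrete, here is a minimal sketch on a tiny made-up matrix. For an image of class double scaled to the range 0 to 1, the comparison I > level produces the same kind of mask that the text describes. The pixel values and the midpoint rule for choosing level are assumptions for illustration, not a recommended threshold.

```matlab
% Tiny synthetic "image" of class double with values in [0, 1].
% The numbers are invented for illustration.
I = [0.05 0.10 0.08 0.04; ...
     0.07 0.90 0.85 0.06; ...
     0.05 0.80 0.95 0.09; ...
     0.03 0.06 0.07 0.05];

% One simple (hypothetical) threshold choice: halfway between the
% darkest and the brightest pixel.
level = ( min( I(:) ) + max( I(:) ) ) / 2;

mask = I > level;     % logical matrix, same size as I, true where bright
disp( mask )
fprintf( '%d of %d pixels are above the threshold\n', nnz( mask ), numel( mask ) );
```

Displaying the mask next to the original matrix is a quick way to see whether a candidate threshold keeps the bead pixels and rejects the background.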

The function regionprops operates on binary images. It can identify connected regions and compute properties of those regions. The function FindBrightObjectsInImage below uses im2bw and regionprops to segment an image and compute properties of each connected region of pixels. Region properties to compute (e.g. area, eccentricity) are specified in a cell array of strings. Here is a complete list of the properties regionprops can compute.

function RegionProperties = FindBrightObjectsInImage( ...
        InputImage, GlobalThreshold, DilationRadius, PropertyList )

    mask = im2bw(InputImage, GlobalThreshold);
    mask = imclearborder(mask);               % eliminates connected regions that touch the edge
    mask = imdilate(mask, strel( 'disk', DilationRadius ));  % open up the mask a little
    
    RegionProperties = regionprops( mask, InputImage, PropertyList );

end

FindBrightObjectsInImage returns a struct array with all of the computed properties, that is, an array where each element is a structure. The structure will contain one field for each of the properties in the PropertyList argument. The fields have the same name as the property. For example, if you included 'Centroid' in the property list and there were ten objects in the image, RegionProperties(3).Centroid would return a 1 x 2 matrix with the x and y coordinates of the centroid (regionprops lists the horizontal coordinate first). If you wanted to create an N x 2 matrix of all of the centroids, you can use MATLAB's : indexing syntax and a concatenation function: AllCentroids = vertcat( RegionProperties(:).Centroid );.
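The struct-array indexing described above can be tried without an image by building a small stand-in by hand. In the sketch below, the field values are invented and the Area field is added only to show a second way of collecting one property from every element.

```matlab
% Hand-built stand-in for the struct array that regionprops returns.
% All numbers are invented for illustration.
RegionProperties = struct( ...
    'Centroid', { [10.2 5.1], [33.0 41.7], [58.4 12.9] }, ...
    'Area',     { 21, 19, 47 } );

thirdCentroid = RegionProperties(3).Centroid;             % 1 x 2 vector
AllCentroids  = vertcat( RegionProperties(:).Centroid );  % 3 x 2 matrix
AllAreas      = [ RegionProperties.Area ];                % 1 x 3 vector

disp( AllCentroids )
```

vertcat stacks the 1 x 2 centroids into rows, while square brackets concatenate scalar fields such as Area into a row vector.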

Try running FindBrightObjectsInImage on your synthetic image and examine the results. Put a breakpoint on the first line of FindBrightObjectsInImage. Use imshow to see how the mask is affected by imclearborder and imdilate. What is a good value for DilationRadius?

Using nonlinear regression to measure resolution

The task of measuring resolution would be super-simple if regionprops had a built-in property called RayleighResolution. Unfortunately, it doesn't, so we will have to write our own function. Nonlinear regression is a good tool for this sort of thing. As long as we have a mathematical model of what the images of beads look like, we can use nonlinear regression to find best-fit parameters for each of the PSF beads in an image.

Review

Regression is a method for finding a relationship between a dependent quantity $ O_n $ and one or more independent variables, in this case the spatial coordinates of the image $ x_n $ and $ y_n $. The relationship is described by a model function $ f(\beta, x_n, y_n) $, where $ \beta $ is a vector of model parameters. The dependent variable $ O_n $ is measured in the presence of random noise, which is represented mathematically by a random variable $ \epsilon_n $. In equation form:

$ O_n=f(\beta, x_n, y_n)+\epsilon_n $.

The goal of regression is to determine a set of best-fit model parameters $ \hat{\beta} $ so that $ f(\hat{\beta}, x_n, y_n) $ is as close as possible to your data $ O_n $. Because the dependent variable includes noise, $ \hat{\beta} $ cannot be determined exactly from the data. Increasing the number of observations or decreasing the magnitude of the noise tends to produce a more reliable estimate of $ \hat{\beta} $.

Linear and nonlinear regression are similar in some aspects, but the two techniques have a fundamental difference. Linear regression applies to model functions that are linear in their parameters; polynomials are the most familiar example. Nonlinear regression applies to essentially everything else. Nonlinear regression is obviously more flexible (since you can fit functions like exponentials and Gaussians whose rate or width is a fit parameter), but it unfortunately cannot be reduced to a single, deterministic formula like linear regression can. Finding the optimal solution to a nonlinear regression is an iterative process. Starting with an initial guess, each following iteration produces a more refined estimate of $ \beta $. The process stops when no better estimate can be found (or when something bad happens ... such as the solution not converging).
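For contrast with the iterative nonlinear case, the sketch below shows how a linear regression is solved in a single step. It fits a quadratic to noise-free, made-up data using MATLAB's backslash operator; the parameter values and variable names are assumptions chosen for illustration.

```matlab
% Linear least squares: the model y = b1 + b2*x + b3*x^2 is linear in the
% parameters b, so the best fit comes from one matrix solve.
x = (0:0.5:5)';                        % independent variable (column vector)
trueBeta = [1; -2; 0.5];               % made-up "true" parameters
y = trueBeta(1) + trueBeta(2) * x + trueBeta(3) * x.^2;   % noise-free data

A = [ ones( size(x) ), x, x.^2 ];      % design matrix, one column per parameter
betaHat = A \ y;                       % least-squares solution, no iteration

disp( betaHat' )
```

No initial guess is needed here. A Gaussian with unknown width cannot be written as a design matrix times a parameter vector, which is why the rest of this page relies on an iterative solver.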

Ordinary nonlinear least squares regression assumes that:

  • the independent variables are known exactly, with zero noise,
  • the error values are independent and identically distributed,
  • the distribution of the error terms has a mean value of zero,
  • the independent variable covers a range adequate to define all the model parameters, and
  • the model function exactly relates $ O $ to $ x $ and $ y $.

These assumptions are almost never perfectly met in practice. It is important to consider how badly the regression assumptions have been violated when assessing the results of a regression.

The four things you need for nonlinear regression

As shown in the diagram below, you need four things to run a regression:

  1. a matrix containing the values of the independent variable(s);
  2. a vector containing the corresponding observed values of the dependent variable;
  3. a model function; and
  4. a vector of initial guesses for the model parameters.
Block diagram of nonlinear regression.

Here is a MATLAB function that puts all of the steps for nonlinear regression together:

function BestFitParameters = Fit2dGaussian( Values, Coordinates, PlotEnable )
    if( nargin < 3 )
        PlotEnable = false;
    end

    pixelCountAboveHalf = sum( Values > ( ( min( Values ) + max( Values ) ) / 2 ) );
    sigmaInitialGuess = 0.8 * sqrt( pixelCountAboveHalf / 2 / pi / log(2) );

    initialGuesses = [ ...
        mean( Coordinates(:, 1) ), ... % yCenter
        mean( Coordinates(:, 2) ), ... % xCenter
        range( Values ), ... % amplitude
        sigmaInitialGuess, ... % sigma
        min( Values ) ]; % offset

    if( PlotEnable )
        plot3( Coordinates(:,1), Coordinates(:,2), Gaussian2DFitFunction( initialGuesses, Coordinates ), 'x' )
        hold on
        plot3( Coordinates(:,1), Coordinates(:,2), Values, 'x' )
        drawnow;
    end

    BestFitParameters = nlinfit( Coordinates, Values, @Gaussian2DFitFunction, initialGuesses );
end

This code relies on the following model function:

function out = Gaussian2DFitFunction( Parameters, Coordinates )
    yCenter = Parameters(1);
    xCenter = Parameters(2);
    amplitude = Parameters(3);
    sigma = Parameters(4);
    offset = Parameters(5);
    
    out = amplitude * ...
        exp( -(( Coordinates(:, 1) - yCenter ).^2 + ( Coordinates(:, 2) - xCenter ).^2 ) ...
        ./ (2 * sigma .^ 2 )) + offset;
    
end

Let's spend some time to go through and understand the elements of this function step-by-step.

3. The model function

nlinfit requires that the regression model be expressed as a function that takes two arguments and returns a single vector of predicted values. The model function must have the form:

[ PredictedValues ] = ModelFunction( Beta, X )

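The first argument, Beta, is a vector of model parameters. The second argument, X, is a matrix of independent variable values with one row per observation. The return value, PredictedValues, must be a vector with one element for each row of X, so that it has the same size as the vector of observations.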

The MATLAB function Gaussian2DFitFunction defined above computes the two dimensional function that we will use to model the image of a PSF bead. Parameters is a 1x5 vector that contains the model parameters in this order: Y center, X center, amplitude, sigma, and offset.

It's a good idea to test the model function out before you use it. The plot below shows four sets of curves generated by Gaussian2DFitFunction with different parameters. It's comforting to see that the curves have the expected shape.

Model function evaluated with different parameters.

As an example of how to use the model function, let's make a grid from -10 to 10 in x and y, and pick some arbitrary parameters to plot:

figure;
x = -10:10;
y = -10:10;
[X,Y] = meshgrid(x,y);
parametersToTry = [0, 0, 2, 5, 1.3];
Z = Gaussian2DFitFunction(parametersToTry, [X(:),Y(:)]);
plot3(X(:),Y(:),Z,'x')

Try changing the parameters and seeing how the plotted Gaussian changes. Does it behave like you expect?

1. and 2. Independent variable and observations

Surface plot of dataset for one PSF microsphere. The x and y coordinates are the independent variables and the pixel value is the dependent variable.

In the regression we are about to do, the related quantities are x and y, the coordinates of each pixel in the image of a microsphere, and the corresponding pixel values (intensities) of each pixel. One important consideration is that the regression algorithm assumes that the independent variable has no (or very little) noise. Happily, the pixel coordinates are known with essentially zero error (the layout of pixels on the camera is extremely accurate), which means our regression will not violate this assumption. The pixel values, however, are subject to imaging noise. Considering all that, x and y are a good choice to use as independent variables and the pixel value will be the dependent variable.

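The regionprops function can compute two properties that will be useful for the regression: PixelList and PixelValues. These properties exactly correspond to the regression variables. PixelList is an N x 2 matrix that contains the coordinates of each pixel in the region. regionprops lists the x (column) coordinate first and the y (row) coordinate second. The variable names in Gaussian2DFitFunction label the first column as y, which is harmless here because the model Gaussian is circularly symmetric, but keep the order in mind if you use the fitted centers. PixelValues is a vector of all the corresponding pixel values (listed in the same order as the coordinates).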

4. Initial guesses

nlinfit requires an initial value for each of the five model parameters, contained in a 1x5 vector. (nlinfit infers the number of model parameters from the size of the initial guess vector.) regionprops can calculate several properties that are useful for coming up with initial guesses. For example, Centroid is a good starting point for the center of the particle. nlinfit will refine the value as it works. The minimum pixel value is a good guess for offset, and the difference between the maximum and minimum pixel values (the range) is a good guess for amplitude. The number of pixels brighter than the midpoint between the minimum and maximum values is the basis for a guess at sigma.

Notice how the initial parameters are estimated in Fit2dGaussian (repeated below):

    pixelCountAboveHalf = sum( Values > ( ( min( Values ) + max( Values ) ) / 2 ) );
    sigmaInitialGuess = 0.8 * sqrt( pixelCountAboveHalf / 2 / pi / log(2) );

    initialGuesses = [ ...
        mean( Coordinates(:, 1) ), ... % yCenter
        mean( Coordinates(:, 2) ), ... % xCenter
        range( Values ), ... % amplitude
        sigmaInitialGuess, ... % sigma
        min( Values ) ]; % offset
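The sigma guess follows from a short geometric argument. A Gaussian falls to half of its peak at radius $ r = \sigma \sqrt{2 \ln 2} $, so the pixels above half maximum cover an area of about $ \pi r^2 = 2 \pi \ln(2) \, \sigma^2 $. Setting that area equal to the pixel count and solving for $ \sigma $ gives the square-root expression in the code. The extra factor of 0.8 is taken from the code as written; the page does not state its rationale. The sketch below checks the area formula on a noise-free synthetic Gaussian whose width (3 pixels) and grid size are made-up test values.

```matlab
% Check the sigma initial-guess formula on a noise-free synthetic Gaussian.
sigmaTrue = 3;                                  % made-up width, in pixels
[X, Y] = meshgrid( -15:15, -15:15 );            % pixel coordinate grid
Values = exp( -( X(:).^2 + Y(:).^2 ) / ( 2 * sigmaTrue^2 ) );

% Same counting rule as in Fit2dGaussian
pixelCountAboveHalf = sum( Values > ( ( min( Values ) + max( Values ) ) / 2 ) );

sigmaFromArea     = sqrt( pixelCountAboveHalf / 2 / pi / log(2) );
sigmaInitialGuess = 0.8 * sigmaFromArea;        % factor of 0.8 as in the page's code

fprintf( 'true sigma = %.2f, from area = %.2f, initial guess = %.2f\n', ...
    sigmaTrue, sigmaFromArea, sigmaInitialGuess );
```

The area-based estimate lands close to the true width, which is all an initial guess needs to do; nlinfit refines it from there.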

The first step of all regressions is to plot the observations and the model function evaluated with the initial guesses versus the independent variable on a single set of axes. Don't attempt to run nlinfit until you've done this plot. It is much easier to ensure that the arguments to nlinfit are plausible before you invoke it than to debug a screen full of cryptic, red text afterwards. Side effects of premature regression include confusion, waste of time, fatigue, irritability, alopecia, and feelings of frustration. Contact your professor if your regression lasts more than four hours. There is no chance that nlinfit will succeed if there is a problem with one of its arguments.

Go ahead ... do the plot.

Pre-regression plot with observed values and model function evaluated with initial-guess parameters.

It looks like the initial guesses are good. Now we can proceed with confidence that the arguments to nlinfit are credible.

Now that you deeply understand the process of nonlinear regression, repeat the process for each bright region in your image:

function [ Resolution, StandardError, BestFitData ] = MeasureResolutionFromPsfImage( ImageData )    
    % TODO list:
    % 1. think of a good way to pick the threshold
    % 2. figure out how to eliminate images that are not single beads
    
    objectProperties = FindBrightObjectsInImage( ImageData, 0.5, 2, { 'Centroid', 'PixelList', 'PixelValues' } );
        
    figure(1);
    imshow( ImageData / max( ImageData(:) ) );  % scale to [0, 1] for display
    LabelObjectsInImage( objectProperties );
    
    % INSERT CODE TO ELIMINATE BAD OBJECTS HERE

    BestFitData = zeros( numel(objectProperties), 5);
    
    figure(2);
    
    % use nlinfit to fit a Gaussian to each object
    for ii = 1:length(objectProperties)
        
        BestFitData(ii, :) = Fit2dGaussian( objectProperties(ii).PixelValues, objectProperties(ii).PixelList );
        
        % plot data, initial guess, and fit for each peak
        figure(2)
        clf
        
        % generate a triangle mesh from the best fit solution found by 
        % nlinfit and plot it
        gd = delaunay( objectProperties(ii).PixelList(:,1), ...
            objectProperties(ii).PixelList(:,2) );
        trimesh( gd, objectProperties(ii).PixelList(:,1), ...
            objectProperties(ii).PixelList(:,2), ...
            Gaussian2DFitFunction(BestFitData(ii, :), ...
            objectProperties(ii).PixelList ) )
        hold on
        
        % plot image data
        plot3( objectProperties(ii).PixelList(:,1), ...
            objectProperties(ii).PixelList(:,2), ...
            objectProperties(ii).PixelValues, 'gx', 'LineWidth', 3)
        title(['Image data vs. Best Fit for Object Number ' num2str(ii)]);
        drawnow
    end
    
    Resolution = mean( BestFitData(:,4) ) ./ .336;
    StandardError = std( BestFitData(:,4) ./ .336 ) ./ sqrt( size( BestFitData, 1 ) );
end

function out = Gaussian2DFitFunction( Parameters, Coordinates )
    yCenter = Parameters(1);
    xCenter = Parameters(2);
    amplitude = Parameters(3);
    sigma = Parameters(4);
    offset = Parameters(5);
    
    out = amplitude * ...
        exp( -(( Coordinates(:, 1) - yCenter ).^2 + ( Coordinates(:, 2) - xCenter ).^2 ) ...
        ./ (2 * sigma .^ 2 )) + offset;
    
end

function LabelObjectsInImage( objectProperties )
    labelShift = -9;
    fontSize = 10;
    
    for ii = 1:length(objectProperties)
        unweightedCentroid = objectProperties(ii).Centroid;
        text(unweightedCentroid(1) + labelShift, unweightedCentroid(2), ...
            num2str(ii), 'FontSize', fontSize, 'HorizontalAlignment', ...
            'Right', 'Color', [0 1 0]);
    end

end

Use the above code to iterate through each bright object in your image and estimate the resolution of each bright bead.
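The fitted sigma values are in units of pixels, so the resolution returned by the code above is also in pixels. A sketch of one way to convert to a distance in the sample plane follows. The list of sigmas, the camera pixel pitch of 7.4 µm, and the magnification of 40 are all assumed example values; the 0.336 factor is the one used in MeasureResolutionFromPsfImage.

```matlab
% Convert Gaussian widths (pixels) to a Rayleigh resolution in meters.
% ASSUMPTIONS: the sigma values, pixel pitch, and magnification are
% invented example numbers; replace them with values for your instrument.
sigmaPixels      = [1.0; 1.1; 0.9; 1.0];   % hypothetical best-fit sigmas
sigmaToRayleigh  = 0.336;                  % factor used in the code above
cameraPixelPitch = 7.4e-6;                 % meters per camera pixel (assumed)
magnification    = 40;                     % objective magnification (assumed)

resolutionPixels    = mean( sigmaPixels ) / sigmaToRayleigh;
standardErrorPixels = std( sigmaPixels / sigmaToRayleigh ) / sqrt( numel( sigmaPixels ) );

% One camera pixel spans (pixel pitch / magnification) in the sample plane.
samplePlanePixelSize = cameraPixelPitch / magnification;
resolutionMeters     = resolutionPixels * samplePlanePixelSize;

fprintf( 'R = %.2f pixels = %.0f nm (standard error %.2f pixels)\n', ...
    resolutionPixels, resolutionMeters * 1e9, standardErrorPixels );
```

Dividing the pixel pitch by the magnification is the same "divide by the magnification" step described in the overview, applied to one pixel.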

It's frequently the case that a few of the beads in a real PSF image are not good candidates for measuring resolution. For example, there are sometimes two beads that are too close together to separate. Sometimes, there are also aggregates of multiple beads in the picture. Identify some useful properties that regionprops computes for sorting out the bad regions and write code to eliminate them.

A bit of MATLAB syntax that you might find useful is this: you can remove an element from an array by assigning it to be the empty value []. Some examples:

RegionProperties(3) = []; % removes the third element of the struct array and reduces its size by 1
RegionProperties( logical( [ 0 0 0 1 0 1 0 0 1 0 ] ) ) = []; % logical mask removes the 4th, 6th, and 9th elements and reduces size by 3
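The logical-mask deletion shown above combines naturally with per-region properties. The sketch below uses a hand-built struct array with invented Area and Eccentricity values, and the rejection criteria (more than twice the median area, or eccentricity above 0.5) are hypothetical starting points rather than recommended settings.

```matlab
% Hand-built stand-in for regionprops output; all numbers are invented.
RegionProperties = struct( ...
    'Area',         { 20, 22, 75, 19, 21 }, ...
    'Eccentricity', { 0.10, 0.15, 0.20, 0.85, 0.12 } );

areas          = [ RegionProperties.Area ];
eccentricities = [ RegionProperties.Eccentricity ];

% ASSUMED rejection criteria, for illustration only.
isTooBig    = areas > 2 * median( areas );   % likely an aggregate of beads
isElongated = eccentricities > 0.5;          % likely two beads side by side
isBad       = isTooBig | isElongated;        % logical mask, one entry per region

RegionProperties( isBad ) = [];              % delete the flagged regions
fprintf( '%d regions kept, %d removed\n', numel( RegionProperties ), nnz( isBad ) );
```

Because the comparisons already return logical arrays, no call to logical is needed here.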

Testing the code

Example image processing on PSF beads to determine microscope resolution.
Example Gaussian fit of a PSF bead fluorescence emission profile to estimate microscope resolution.



Use the synthetic image code you developed in part 1 of this assignment to test the MeasureResolutionFromPsfImage function using synthetic images of fluorescent microspheres with a diameter of 180 nm over a range of numerical apertures from 0.1 to 1.0. Plot the results, measured resolution versus predicted resolution. Turn in your code and the plot.


Measure the resolution of your microscope

  1. Make an image of a sample of 170 nm fluorescent beads with the 40X objective. (Several dozen to several hundred PSF spheres should be captured in your image.)
    • Use 12-bit mode on the camera and make sure to save the image in a format that preserves all 12 bits.
    • Ensure that the image is exposed properly.
      • Over-exposed images will give inaccurate results.
      • Under-exposed images will be difficult to process and yield noisy results.
    • This procedure is extremely sensitive to the focus adjustment.
    • To minimize photobleaching, do not expose the beads to the light source any longer than necessary.
    • Be sure to save the image and the histogram for your lab report.
  2. Use image processing functions to locate non-overlapping, single beads in the image.
  3. Use nonlinear regression to fit a Gaussian to each bead image.
  4. Convert the Gaussian parameters to resolution.



Report the resolution you measured and discuss sources of error in the measurement.

