Difference between revisions of "Assignment 9, Part 1: Analyze two-color yeast images"

From Course Wiki
Jump to: navigation, search
(Written questions)
(Analyze a frame)
Line 38: Line 38:
 
{{Template:Assignment Turn In|message = Choose a single frame of the movie. Make a four panel figure and display: the blue image, the green image (both with a scale bar), and their corresponding image histograms. Include an appropriate title for each panel. }}
 
{{Template:Assignment Turn In|message = Choose a single frame of the movie. Make a four panel figure and display: the blue image, the green image (both with a scale bar), and their corresponding image histograms. Include an appropriate title for each panel. }}
  
==Analyze a frame==
+
===Estimating background levels===
  
The first step in our analysis code is to write a function to extract relevant data from each movie frame. Our goal is to locate and identify individual cells and corresponding nuclei and then calculate the average Hog1 intensity in either region.
 
[[Image:FindingCellsAndNucleiExample.png|center|thumb|400px]]
 
 
Start with the following basic layout of the function:
 
 
<pre>
 
function hog1Response = AnalyzeFrame(TwoColorFrame)
 
 
    % 1. estimate the background levels for each image
 
    % 2. find nuclei regions using green images
 
    % 3. find cell regions from blue images
 
    % 4. calculate the average GFP intensity (which is proportional to Hog1 concentration) inside
 
    %    and outside the nucleus for each individual cell
 
 
end
 
</pre>
 
 
As you work through this part of the Assignment, fill out the <tt>AnalyzeFrame</tt> function to incorporate each of the outlined steps.
 
 
===Estimating background levels===
 
 
[[File:BlueImageHistogram.png|thumb|right]]
 
[[File:BlueImageHistogram.png|thumb|right]]
  
It's tricky to get good images of proteins expressed by living yeast cells. Not only do we need to use very high exposure times (on the order of 5 s) to capture a dim signal, but we want to image cells over long times (minutes to hours). These two factors work against us to produce high background levels caused by camera noise (especially dark noise), light leakage from the room, or autofluorescence of the surrounding medium. Any drift in the background level over time can obscure our results. One way to get around this is to use the information contained within each image to estimate the background level.  
+
It's tricky to get good images of fluorescent proteins expressed by living yeast cells. Not only do we need to use very high exposure times (on the order of 5 s) to capture a dim signal, but we want to image cells over long times (minutes to hours). These factors work against us to produce high background levels caused by autofluorescence of the surrounding medium, camera noise (especially dark noise), and potentially light leakage from the room. Any drift in the background level over time can obscure our results. Additionally, the background level may not be uniform over the entire image. For each frame of the movie, we will subtract off a background image (taken with no cells present) which has been scaled appropriately to account for changing levels over time.  
  
In this particular movie, the yeast cells only occupy a small proportion of the image pixels. This can be seen in the image histogram, which contains a large low-intensity peak in both blue and green images. We will estimate the distribution of background pixels by fitting the histogram with a Gaussian function: <math>p_1 \exp\left( -\frac{(x-p_2)^2}{2 p_3^2}\right)</math>.
+
In the example data provided, you should find a background image taken with no yeast present - only background coming from the medium autofluorescence and other background sources. To determine how much we need to scale the image, we will compare the background image intensity to an empty region of our cell images.  
 
+
The parameter <math>p_2</math> represents the mean background level, while <math>p_3</math> represents the width, or standard deviation of the background. Note that this method is based on the assumption that the intensity of the illumination is uniform (or, similarly that that the background distribution is the same everywhere in your image). This may not be the best assumption. Feel free to explore other ways of background correction in your analysis.
+
 
+
The following function can be used to estimate the mean and width of the pixel intensity distribution, assuming the background signal dominates:
+
  
 +
Use this function to locate an empty region of the first frame in your movie:
 
<pre>
 
<pre>
function [backgroundMean, backgroundStd] = FitImageHistogram(frame)
+
function BacgroundRectangle = LocateBacgroundRegion( InitialYeastImage )
 
      
 
      
    [counts, bins] = imhist(frame);
 
 
     figure
 
     figure
     plot(bins,counts)
+
     imshow( InitialYeastImage(:,:,1) )
     hold on
+
    title('Drag a rectangle around a region of the image containing only background')
 +
    BacgroundRectangle = getrect( gcf );
 +
     close
 
      
 
      
    gaussianFitFunction = @(p,x) p(1) * exp(-(x-p(2)).^2./(2 * p(3).^2));
+
end   
    p1Guess = max(counts);
+
</pre>
    p2Guess = bins(counts == p1Guess);
+
 
     [~, peakwidth] = min(abs(counts-p1Guess(1)/2));
+
Then for each frame, you can calculate the scale factor needed for your background image:
     p3Guess = abs(bins(peakwidth) - p2Guess(1));
+
<pre>
     beta0 = [p1Guess p2Guess p3Guess];
+
function bacgroundScaleFactors = FindBacgroundScaleFactor( InitialImage, BacgroundImage, BacgroundRectangle )
   
+
     bacgroundScaleFactors = nan(1,2);
    fitParameters = nlinfit(bins,counts,gaussianFitFunction,beta0);
+
     % Scale background
   
+
     for j = 1:2
    plot(bins, gaussianFitFunction(fitParameters,bins));
+
        backgroundImageCropped = imcrop(BacgroundImage(:,:,j),BacgroundRectangle);
    backgroundMean = fitParameters(2);
+
        initialImageCropped = imcrop(InitialImage(:,:,j),BacgroundRectangle);
    backgroundStd = fitParameters(3);
+
        bacgroundScaleFactors(j) = mean(initialImageCropped(:))/ mean(backgroundImageCropped(:));
      
+
     end
    legend('image pixel data','best fit')
+
 
      
 
      
 
end
 
end
 
</pre>
 
</pre>
  
{{Template:Assignment Turn In|message = Estimate the background mean and std for the blue and green color images of a single frame of the test movie. }}
+
Test the code by finding the background scale factors for the first frame of the GFP and RFP images. Try using several different rectangular regions and notice how much the output varies.
 +
{{Template:Assignment Turn In|message = Display an example of a background-subtracted and contrast-stretched image for a GFP and RFP image.}}
  
===Finding nuclei===
+
==Analyze a frame==
  
We'll use a familiar method of locating the nuclei in our green image by setting a threshold to find the bright regions. In addition, we'll use a morphological opening operation (an erosion followed by a dilation) to remove hot pixels or other noise.  
+
In each frame of our movie, we need to locate and identify individual cells, and find the correlation coefficient between the cell image and the nuclei image. If the Hog1 signal is localized in the nucleus, the correlation should be close to 1.  
 +
 
 +
[[Image:FindingCellsAndNucleiExample.png|center|thumb|400px]]
 +
 
 +
Start with the following basic layout of the function:
  
 
<pre>
 
<pre>
    nucleiMask = imbinarize(nucleiFrame,nucleiThreshold);
+
function NucleusHog1Correlation = AnalyzeFrame( TwoColorFrame, BackgroundImage )
     nucleiMask = imopen(nucleiMask,strel('disk',2));
+
 
 +
     % 1. estimate the background level
 +
    % 2. identify individual yeast cells in the GFP image
 +
    % 3. calculate the correlation coefficient between the GFP intensities an RFP intensities within each cell region
 +
    % 4. output the mean correlation coefficient for all cells in the frame
 +
 
 +
end
 
</pre>
 
</pre>
  
How should we choose the threshold? Since we just estimated the background level, one option is to use this information to pick a threshold well above the background level. Choosing a threshold greater than <math>3\sigma</math> of the mean should result in eliminating 99% of the background pixels.  
+
As you work through this part of the Assignment, fill out the <tt>AnalyzeFrame</tt> function to incorporate each of the outlined steps.  
  
 
===Finding cells===
 
===Finding cells===
  
We could similarly use a threshold to identify cells in a blue image. The problem here, is that thresholding doesn't allow us to identify single cells within a clump of cells. Since we know that yeast cells are roughly spherical, we can use <tt>imfindcircles</tt>, which implements the [https://www.mathworks.com/help/images/ref/imfindcircles.html| Hough transform].  
+
We could consider using a threshold to identify cells in an image, similar to how we found fluorescent particles in Assignments 4 and 5. The problem here, is that thresholding doesn't allow us separate a single cells from a clump of cells. Since we know that yeast cells are roughly spherical, we can use <tt>imfindcircles</tt> to locate them. <tt>imfindcircles</tt> is an implemenattion of the [https://www.mathworks.com/help/images/ref/imfindcircles.html| Hough transform].  
 
+
Here's a typical implementation:
+
  
 
<pre>
 
<pre>
Line 119: Line 105:
 
</pre>
 
</pre>
  
where <tt>Rmin</tt> and <tt>Rmax</tt> are the smallest and largest expected radii. How can you estimate <tt>Rmin</tt> and <tt>Rmax</tt>?
+
Here, <tt>Rmin</tt> and <tt>Rmax</tt> are the smallest and largest expected 'radii'. How can you estimate <tt>Rmin</tt> and <tt>Rmax</tt>?
 +
 
  
  
 
=== Loop through cells and calculate <GFP_cell> and <GFP_nucleus> ===
 
=== Loop through cells and calculate <GFP_cell> and <GFP_nucleus> ===
 +
 +
\*****WARNING: there is an alternate way to do this section that is more reliable. The wiki will be updated to reflect the new method on Monday, 4/21/2019*******
  
 
Now that we've found our cells and nuclei we need to loop through each cell, and calculate the GPF intensity inside and out of the nucleus.
 
Now that we've found our cells and nuclei we need to loop through each cell, and calculate the GPF intensity inside and out of the nucleus.

Revision as of 19:53, 21 April 2019

20.309: Biological Instrumentation and Measurement

ImageBar 774.jpg


Written questions


Pencil.png

Read the paper and answer the written questions in Assignment 9 Overview: Analyzing yeast images.


Yeast image analysis code

Start by downloading Assignment9TestData.mat. This is an example of the data you will be collecting in assignment 10 with an 8 minute valve oscillation period. The movie was collected using the same 20.309 microscope that you built in Assignment 8. During an oscillation experiment, the microscope first recorded an image with blue illumination followed by one with green illumination. It had a 40x objective and a 125 mm tube lens - remember that results in a 25X magnification.

The sample is similar to the one in Mettetal et al.: we are using S. cerevisiae with the protein Hog1 fused to GFP (which is excited by blue light) and an mRNA binding protein (we'll call MCP) fused to tagRFP (which is excited by green light) to locate the nucleus. Our goal is to calculate the average Hog1 intensity in the nucleus (<GFP_nucleus>) vs. the whole cell (<GFP_cell>) as a function of the high/low valve state.

In MATLAB, display the contents of the structure in the data file:

twoColorYeastTest = 

  struct with fields:

                         Movie: [544×728×2×16 double]
                          Time: [16×2 double]
        ValveOscillationPeriod: 480
     BlueCameraGainAndExposure: [5 5000000]
    GreenCameraGainAndExposure: [15 5000000]
                    ValveState: [16×1 logical]

The data structure contains all the important information from the experiment:

  • A two-color movie, where the 3d dimension is the color (1 = Blue, 2 = Green), and the fourth is the frame number. Notice that twoColorYeastTest.Movie has already converted to a double, but it is not yet normalized by 4095.
  • The time stamp of each frame, recorded in seconds from the start of the experiment. Column 1 contains the timestamps of the Blue images, column 2 is for the green images.
  • The oscillation valve period, in seconds.
  • The blue and green image camera settings in the format [gain, exposure], where exposure is in microseconds.
  • The valve state (0 for low salt, 1 for high salt) at the time of each image. If you need to know the valve state more precisely, it can be calculated with the following conditional statement: $ sin(2 \pi t/T) \ge 0 $, where T is the valve oscillation period, and t is the time since the start of the experiment (this is how the state is determined in the software).


Pencil.png

Choose a single frame of the movie. Make a four panel figure and display: the blue image, the green image (both with a scale bar), and their corresponding image histograms. Include an appropriate title for each panel.


Estimating background levels

BlueImageHistogram.png

It's tricky to get good images of fluorescent proteins expressed by living yeast cells. Not only do we need to use very high exposure times (on the order of 5 s) to capture a dim signal, but we want to image cells over long times (minutes to hours). These factors work against us to produce high background levels caused by autofluorescence of the surrounding medium, camera noise (especially dark noise), and potentially light leakage from the room. Any drift in the background level over time can obscure our results. Additionally, the background level may not be uniform over the entire image. For each frame of the movie, we will subtract off a background image (taken with no cells present) which has been scaled appropriately to account for changing levels over time.

In the example data provided, you should find a background image taken with no yeast present - only background coming from the medium autofluorescence and other background sources. To determine how much we need to scale the image, we will compare the background image intensity to an empty region of our cell images.

Use this function to locate an empty region of the first frame in your movie:

function BacgroundRectangle = LocateBacgroundRegion( InitialYeastImage )
    
    figure
    imshow( InitialYeastImage(:,:,1) )
    title('Drag a rectangle around a region of the image containing only background')
    BacgroundRectangle = getrect( gcf );
    close
    
end    

Then for each frame, you can calculate the scale factor needed for your background image:

function bacgroundScaleFactors = FindBacgroundScaleFactor( InitialImage, BacgroundImage, BacgroundRectangle )
    bacgroundScaleFactors = nan(1,2);
    % Scale background
    for j = 1:2
        backgroundImageCropped = imcrop(BacgroundImage(:,:,j),BacgroundRectangle);
        initialImageCropped = imcrop(InitialImage(:,:,j),BacgroundRectangle);
        bacgroundScaleFactors(j) = mean(initialImageCropped(:))/ mean(backgroundImageCropped(:));
    end
    
end

Test the code by finding the background scale factors for the first frame of the GFP and RFP images. Try using several different rectangular regions and notice how much the output varies.

Pencil.png

Display an example of a background-subtracted and contrast-stretched image for a GFP and RFP image.


Analyze a frame

In each frame of our movie, we need to locate and identify individual cells, and find the correlation coefficient between the cell image and the nuclei image. If the Hog1 signal is localized in the nucleus, the correlation should be close to 1.

FindingCellsAndNucleiExample.png

Start with the following basic layout of the function:

function NucleusHog1Correlation = AnalyzeFrame( TwoColorFrame, BackgroundImage )

     % 1. estimate the background level
     % 2. identify individual yeast cells in the GFP image
     % 3. calculate the correlation coefficient between the GFP intensities an RFP intensities within each cell region
     % 4. output the mean correlation coefficient for all cells in the frame

end

As you work through this part of the Assignment, fill out the AnalyzeFrame function to incorporate each of the outlined steps.

Finding cells

We could consider using a threshold to identify cells in an image, similar to how we found fluorescent particles in Assignments 4 and 5. The problem here, is that thresholding doesn't allow us separate a single cells from a clump of cells. Since we know that yeast cells are roughly spherical, we can use imfindcircles to locate them. imfindcircles is an implemenattion of the Hough transform.

     [cellCenters, cellRadii] = imfindcircles(cellFrame,[Rmin Rmax],'ObjectPolarity','bright','Sensitivity',0.95);

Here, Rmin and Rmax are the smallest and largest expected 'radii'. How can you estimate Rmin and Rmax?


Loop through cells and calculate <GFP_cell> and <GFP_nucleus>

\*****WARNING: there is an alternate way to do this section that is more reliable. The wiki will be updated to reflect the new method on Monday, 4/21/2019*******

Now that we've found our cells and nuclei we need to loop through each cell, and calculate the GPF intensity inside and out of the nucleus.

Here's a function to get you started:

function allCellProperties = FindCellularAndNuclearHog1Intensities(CellFrame, CellCenters, CellRadii, NucleiMask, enablePlotting)

if nargin<5
    enablePlotting = true;
end

if enablePlotting
    figure
    imshow(CellFrame)
    hold on
    visboundaries(createMaskFromCircles(CellCenters,CellRadii,size(CellFrame)),'Color','r');
    visboundaries(NucleiMask,'Color','w');
    title('found cells and nuclei')
end 

numCellsInFrame = length(CellRadii);
maskImageSize = size(NucleiMask);
allCellProperties = cell(0);

% loop through each individual cell
for k = 1:numCellsInFrame
    % create a mask for a single cell. 
    cellMask =  

    % create a mask of that cell's nucleus
    nucleusMask = 


    % is this a good cell?
    % choose a criteria to filter out bad cells. some cases to consider
    % are: does the cell have a nucleus? is the nucleus too big or too
    % small? 
    isGoodCell = 


    if isGoodCell
        
        if enablePlotting   
            visboundaries(cellMask,'Color','b')
            visboundaries(nucleusMask,'Color','g')
        end
        
        % use regionprops to to calculate the mean intensity of the nucleus
        % and of the whole cell
        



        % store the properties in a struct...
        singleCellProperties = struct();




        % ... and append the struct to a cell array to store all your data.
        allCellProperties{end+1} = singleCellProperties;
    end
    
end

end


Pencil.png

Write the rest of FindCellularAndNuclearHog1Intensities so that it outputs a cell array containing the average GFP intensities inside the nucleus, and for the whole yeast cell.


You may use the following helper function:

function Im = createMaskFromCircles(centers,radii,ImageSize)
    Im = zeros(ImageSize);
    [X, Y] = meshgrid(1:ImageSize(2), 1:1:ImageSize(1));

    for ii = 1:length(radii)
        isBright = (X-centers(ii,1)).^2+(Y-centers(ii,2)).^2 <= radii(ii,1)^2;
        isAlreadyFound = (Im == 1);
        Im = Im + (isBright & ~isAlreadyFound);
    end
    Im( Im > 1 ) = 1;
end

Calculating the response of Hog1

In Mettetal et al., they define the Hog 1 response as the average fluorescence of Hog1 inside the nucleus normalized by the total cell fluorescence. We also need to ensure to remove our background:

$ R_{Hog1} = \frac{<GFP_{nucleus}> - <GFP_{background}>}{<GFP_{cell}> - <GFP_{background}>} $


Pencil.png

Write a function called CalculateHog1Response that takes in the cell array of yeast properties, and outputs an array of cell responses, and the average response of all cells in one frame. You may also consider eliminating outliers from the array.


Hint: You may find the cellfun function useful for accessing an element from a cell array of structs:

outputArray = cellfun(@(c) c.structElement, yeastProperties);

Analyze the movie

Once you've set up your analysis for a single frame, you'll now want to loop through the movie and collect the responses as a function of time.


Pencil.png
  1. Analyze each frame of the movie and plot the average cellular response as a function of time.
    • In a single figure, plot the data in one subplot, and the pinch valve state (1 for high salt, 0 for low salt) in a second subplot.
  2. Write a function to fit the response to a sinusoid and extract the predicted amplitude and phase shift of the response signal. Plot the best fit function on the same plot as your data.
  3. Report the resulting best fit parameters.




Pencil.png

Turn in all your MATLAB code in pdf format. No need to include functions that you used but did not modify.


Navigation

  • Overview
  • Part 1: Analyze a two-color yeast movie frame

Back to 20.309 Main Page