Difference between revisions of "20.109(F22):M1D7"

(Part 2: Complete data analysis)
(Part 1: Practice statistical analysis)
 
(8 intermediate revisions by 2 users not shown)
Line 27: Line 27:
 
*With infinite data, the standard deviation (''s'') approaches the true standard deviation (σ).
 
*With infinite data, the standard deviation (''s'') approaches the true standard deviation (σ).
  
Because standard deviation is only justified when sufficient data have been collected to generate a normal curve, you will use confidence intervals to report the likelihood that your results predict the true mean. A confidence interval is a defined interval that is calculated to define the true mean to a specified level of confidence.  Simply, it is possible to define a range in your data set that likely contains the true mean based on the calculated mean.
+
An assumption is made when using standard deviation to report the variation in a data set.  It is assumed that sufficient data have been collected to generate a normal curve.
*Confidence interval (CI) is defined as:
+
  
 
+
So, what does this all mean in regard to the data you will report?  As an example, if the calculated <math> \overline{\chi }</math> of a data set equals 80 au and its 95% confidence interval spans 50 au to 110 au, you can be 95% confident that this interval contains the &mu;, where au = arbitrary units.  And how does this relate to ''s''?  If you know the &mu;, the range &mu; &plusmn; &sigma; contains about 68% of normally distributed values, so the &sigma; represents a 68% interval.
<center>
+
''CI ='' <math>\overline{\chi } \pm \frac{ts}{\sqrt{n}}</math>, ''where t = value from t table (dependent on specified confidence level and n)''
+
</center>
+
 
+
 
+
In your data, you should use the CI to generate error bars due to the low ''n''.  Be sure to report which confidence level was used to calculate the intervals reported.  So, what does this all mean in regard to the data you will report?  As an example, if the calculated <math> \overline{\chi }</math> of a data set equals 80 au and its 95% confidence interval spans 50 au to 110 au, you can be 95% confident that this interval contains the &mu;, where au = arbitrary units.  And how does this relate to ''s''?  If you know the &mu;, the range &mu; &plusmn; &sigma; contains about 68% of normally distributed values, so the &sigma; represents a 68% interval.
+
  
 
When interpreting data, the error bars are representative of the noise in the data or how different the data points are for each of the replicates. Replicates come in two types: technical and biological.  Technical replicates indicate that the same sample was tested multiple times and are a measure of experimenter error (for example, pipetting errors between aliquots).  Biological replicates indicate that different preparations of the same sample were tested and are a measure of the difference in a response to a variable (for example, response to a treatment between separate cultures of the same cell line).  Though both types have value in data analysis, the interpretation of the error represented in each case is different.  Because of this, it is important to indicate if the replicates used in the data analysis are technical or biological.  For your data, what type of replicates did you analyze for the &gamma;H2AX experiment?  For the CometChip experiment?
 
When interpreting data, the error bars are representative of the noise in the data or how different the data points are for each of the replicates. Replicates come in two types: technical and biological.  Technical replicates indicate that the same sample was tested multiple times and are a measure of experimenter error (for example, pipetting errors between aliquots).  Biological replicates indicate that different preparations of the same sample were tested and are a measure of the difference in a response to a variable (for example, response to a treatment between separate cultures of the same cell line).  Though both types have value in data analysis, the interpretation of the error represented in each case is different.  Because of this, it is important to indicate if the replicates used in the data analysis are technical or biological.  For your data, what type of replicates did you analyze for the &gamma;H2AX experiment?  For the CometChip experiment?
Line 54: Line 47:
  
 
===Part 1: Practice statistical analysis===
 
===Part 1: Practice statistical analysis===
 +
If you would like additional practice in completing statistical analysis, please complete Part 1.  If you are confident in your understanding, please proceed to Part 2. 
 +
 
Review data from an experiment where cells were exposed to increasing amounts of radiation (linked [[Media: CometAssay_M1D6stats_F14.xlsx |here]]). Your goal is to determine if a statistically significant amount of DNA damage was induced. For the purpose of this exercise, the values in the spreadsheet are in arbitrary units of 'DNA damage', where the higher numbers indicate more damage.   
 
Review data from an experiment where cells were exposed to increasing amounts of radiation (linked [[Media: CometAssay_M1D6stats_F14.xlsx |here]]). Your goal is to determine if a statistically significant amount of DNA damage was induced. For the purpose of this exercise, the values in the spreadsheet are in arbitrary units of 'DNA damage', where the higher numbers indicate more damage.   
  
Line 60: Line 55:
 
<font color =  #4a9152 >'''In your laboratory notebook,'''</font color> complete the following:
 
<font color =  #4a9152 >'''In your laboratory notebook,'''</font color> complete the following:
 
*Attach the completed spreadsheet.
 
*Attach the completed spreadsheet.
**Include a bar graph of the data with 95% confidence intervals.
+
**Include a bar graph of the data with standard deviations.
 
**Indicate if there is a statistically significant difference (''i.e.'' provide a ''p''-value) between the conditions tested.
 
**Indicate if there is a statistically significant difference (''i.e.'' provide a ''p''-value) between the conditions tested.
  
 
===Part 2: Complete data analysis===
 
===Part 2: Complete data analysis===
  
Use the tools above to analyze the data for your &gamma;H2AX and CometChip experiments.  The figures / analyses in your Data summary should include measures of variability (i.e. confidence intervals) and significance (i.e. ''p''-values).
+
Use the tools above to analyze the data for your &gamma;H2AX and CometChip experiments.  The figures / analyses in your Data summary should include measures of variability (i.e. standard deviation) and significance (i.e. ''p''-values).
  
 
'''For the &gamma;H2AX data:'''
 
'''For the &gamma;H2AX data:'''
  
In the analysis that you completed, you averaged the data from eight images for each condition. In addition, you used two different methods to analyze the raw data.  For the figure that you will include in the Data summary, plot the averaged values then perform the statistical analysis on values calculated for the averaged datasets.
+
In the analysis that you completed, you averaged the data from three images for each condition. Remember that you analyzed the data for your experiment and the data for a pilot experiment that was completed by the Instructors.  Your data will be Panel A and the Instructor data will be Panel B of the final figure.  For the figure that you will include in the Data summary, plot the averaged values for each experiment then perform statistical analysis on the calculated values for the averaged datasets.
  
 
'''For the CometChip data:'''
 
'''For the CometChip data:'''
  
In the analysis that you completed, you averaged three technical replicate samples.  For a more robust data set, you will now include the results from a second experiment that was completed according to the exact same protocol.  These data will serve as biological replicates for the conditions tested (dataset #2 linked [[Media:Fa20 M1D7 CometChip data dataset2.xlsx |here]]).
+
In the analysis that you completed for your experiment, you averaged the values of three technical replicates.  For the figure that you will include in the Data summary, plot the averaged values for each condition then perform statistical analysis on the calculated values for the data set.
 
+
The values provided in dataset #2 are presented in a layout similar to the data you analyzed on M1D6 (dataset #1); however, in dataset #2 the triplicate samples are averaged for you.  For the figure that you will include in the Data summary, average the values from the two datasets and plot using a line graph.  Then perform the statistical analysis on the values calculated for the combined dataset.
+
 
+
<font color =  #4a9152 >'''In your laboratory notebook,'''</font color> complete the following:
+
*Review the values presented in each of the datasets.
+
**Are the data in agreement (''i.e.'' do the different datasets look similar as far as the overall results / trends are concerned)?
+
**Are there any discrepancies between the datasets?
+
 
+
 
+
FROM FA20 - ANALYSIS OF TIMECOURSE DATA SET:
+
 
+
 
+
'''Analyze CometChip data'''
+
 
+
The data from the CometChip experiment were compiled into an Excel spreadsheet (linked [[Media:Fa20 M1D6 CometChip data.xlsx |here]]).  Use this spreadsheet to analyze the data for each of the treatments tested.
+
 
+
#First, orient yourself to the data that are provided in the spreadsheet.
+
#*The values grouped together in the table are for the samples that were treated with 20 &mu;M H<sub>2</sub>O<sub>2</sub>.  The values at the bottom of the sheet are for the no H<sub>2</sub>O<sub>2</sub> samples.
+
#*The information for the timepoints (0 min, 15 min, 30 min, and 60 min) is across the top.  Each condition was tested in triplicate so wells A1, A2, and A3 are replicates.
+
#*The information for the As concentration exposure (0 &mu;M, 2 &mu;M, and 10 &mu;M) is down the left side.
+
#Average the replicate samples for each condition and plot the data using a line graph.
+
#*To represent the data as a timecourse, use t = -15m for the no H<sub>2</sub>O<sub>2</sub> value.  For the remaining data, use t = 0, t = 15, t = 30, and t = 60 to indicate the length of time that repair was allowed to occur following H<sub>2</sub>O<sub>2</sub> treatment.  Time should be represented on the x-axis.
+
#*On the y-axis, plot the averaged % Tail DNA value for each condition.
+
 
+
<font color =  #4a9152 >'''In your laboratory notebook,'''</font color> complete the following:
+
  
*Attach the spreadsheet that contains your analyzed CometChip data.
+
In the analysis that you completed for the timecourse experiment, you averaged the values of three technical replicates. For the figure that you will include in the Data summary, plot the averaged values according to the guidelines provided on M1D6 then perform the statistical analysis on the calculated values for the data set.
*Compare the analyzed data to the raw images that were acquired for each of the treatment conditions.
+
**Review 2-3 of the .tif files for each condition tested (these are the image files that you copied into a folder on your computer). 
+
**Do the images appear consistent with the values in the analyzed data?
+
**Do you have confidence in these data?  Why or why not?
+
**Select 1-2 comets for each condition to serve as representative images in your Data summary.  Representative images are often included in figures with the analyzed data to show the reader the raw form of the data.  It also provides context for how the experiment was performed.
+
  
 
==Navigation links==
 
==Navigation links==
Next day: [[20.109(F22):M2D1 | Placeholder]]<br>
+
Next day: [[20.109(F22):M2D1 | Complete in-silico cloning of protein expression plasmid]]<br>
 
Previous day: [[20.109(F22):M1D6 | Image and analyze data for CometChip assay]]<br>
 
Previous day: [[20.109(F22):M1D6 | Image and analyze data for CometChip assay]]<br>

Latest revision as of 16:46, 6 October 2022

20.109(F22): Laboratory Fundamentals of Biological Engineering




Introduction

Today is the final laboratory session for Module 1! You have completed all of the bench work for your research; however, there is still data analysis to complete for your experiments. In addition to plotting the data, you will complete statistical analysis to determine the significance of your results.

Statistics are mathematical tools used to analyze, interpret, and organize data. The specific tools that you will use are confidence intervals (CI) and the Student's t-test. To begin, review the following definitions:

  • Mean (or average) is defined as:


$ \overline{\chi } = \frac{\sum_{i}^{n}\chi _{i}}{n} $, where $ \chi _{i} $ = individual value and n = number of samples


  • With infinite data, the mean ($ \overline{\chi } $) approaches the true mean (μ).
  • Standard deviation measures the variation in the data and is defined as:


$ s = \sqrt{\frac{\sum_{i}^{n}(\chi _{i}-\overline{\chi })^{2}}{n - 1}} $, where n - 1 = degrees of freedom


  • With infinite data, the standard deviation (s) approaches the true standard deviation (σ).
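
As a minimal sketch of the two definitions above, the following Python snippet computes the mean and the sample standard deviation (with n - 1 degrees of freedom) by hand and cross-checks them against the standard library. The values are hypothetical numbers in arbitrary units chosen for illustration; they are not course data.

```python
import math
import statistics

# Hypothetical measurements in arbitrary units (illustration only, not course data).
values = [78.0, 82.0, 80.0, 85.0, 75.0]

n = len(values)

# Mean: sum of the individual values divided by the number of samples.
mean = sum(values) / n

# Sample standard deviation: squared deviations from the mean,
# summed, divided by n - 1 (degrees of freedom), then square-rooted.
sum_of_squares = sum((x - mean) ** 2 for x in values)
s = math.sqrt(sum_of_squares / (n - 1))

# Cross-check against the standard library implementations.
assert math.isclose(mean, statistics.mean(values))
assert math.isclose(s, statistics.stdev(values))

print(f"n = {n}, mean = {mean:.2f}, s = {s:.2f}")
```

Note that `statistics.stdev` uses n - 1 in the denominator, whereas `statistics.pstdev` divides by n and is only appropriate when the data represent the whole population.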

An assumption is made when using standard deviation to report the variation in a data set. It is assumed that sufficient data have been collected to generate a normal curve.

So, what does this all mean in regard to the data you will report? As an example, if the calculated $ \overline{\chi } $ of a data set equals 80 au and its 95% confidence interval spans 50 au to 110 au, you can be 95% confident that this interval contains the μ, where au = arbitrary units. And how does this relate to s? If you know the μ, the range μ ± σ contains about 68% of normally distributed values, so the σ represents a 68% interval.
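
To make the idea of a confidence interval concrete, here is a hedged Python sketch that applies the standard t-based formula, CI = mean ± t·s/√n. The replicate values and the function name `confidence_interval_95` are hypothetical, and the small lookup table is only an excerpt of a standard two-tailed t table at the 95% level, keyed by n - 1 degrees of freedom.

```python
import math
import statistics

# Excerpt of a standard two-tailed t table at the 95% confidence level,
# keyed by degrees of freedom (n - 1). Extend from a full table as needed.
T_95 = {2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571, 10: 2.228}


def confidence_interval_95(values):
    """Return (low, high) bounds of the 95% CI for the mean of `values`."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)      # sample standard deviation (n - 1)
    t = T_95[n - 1]                   # tabulated t for n - 1 degrees of freedom
    half_width = t * s / math.sqrt(n)
    return mean - half_width, mean + half_width


# Hypothetical triplicate measurements in arbitrary units (not course data).
replicates = [78.0, 82.0, 80.0]
low, high = confidence_interval_95(replicates)
print(f"95% CI: {low:.2f} au to {high:.2f} au")
```

With only three replicates the tabulated t value is large (4.303), so the interval is much wider than the standard deviation alone would suggest; this is why the confidence level and n should always be reported alongside the error bars.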

When interpreting data, the error bars are representative of the noise in the data or how different the data points are for each of the replicates. Replicates come in two types: technical and biological. Technical replicates indicate that the same sample was tested multiple times and are a measure of experimenter error (for example, pipetting errors between aliquots). Biological replicates indicate that different preparations of the same sample were tested and are a measure of the difference in a response to a variable (for example, response to a treatment between separate cultures of the same cell line). Though both types have value in data analysis, the interpretation of the error represented in each case is different. Because of this, it is important to indicate if the replicates used in the data analysis are technical or biological. For your data, what type of replicates did you analyze for the γH2AX experiment? For the CometChip experiment?

Lastly, you will use Student's t-test to report if your data are statistically different between treatments.

  • Student's t-test is defined as:


$ t = \frac{\left | \overline{\chi_{1}} - \overline{\chi_{2}} \right |}{s_{pooled}}\sqrt{\frac{n_{1} n_{2}}{n_{1}+n_{2}}} $, where $ s_{pooled} = \sqrt{\frac{s_{1}^{2} (n_{1} - 1) + s_{2}^{2} (n_{2} - 1)}{n_{1} + n_{2} - 2}} $


The value you calculate with the Student's t-test equation is referred to as $ t_{calculated} $. This $ t_{calculated} $ value is compared to the $ t_{tabulated} $ value in the t table, according to the appropriate degrees of freedom ($ n_{1} + n_{2} - 2 $ for this pooled two-sample test, matching the denominator of $ s_{pooled} $) using the p-value for the two-tailed distribution (which assumes that you do not know how the data will shift). If the $ t_{calculated} $ value is greater than the $ t_{tabulated} $, then the data sets are significantly different at the specific p-value. So, what does this all mean in regard to the data you will report? As an example, if the $ t_{calculated} $ for a comparison with 10 degrees of freedom is 3 (given that the $ t_{tabulated} $ is 2.228), then the data sets are different with a p-value ≤ 0.05. This means that there is less than a 5% chance of observing a difference this large if the data sets were actually the same.
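
The comparison described above can be sketched in Python as follows. This is an illustrative implementation of the pooled Student's t-test equation given earlier, not the course spreadsheet; the function name `pooled_t`, the sample values, and the two-entry t table excerpt are assumptions made for the example. The degrees of freedom are taken as n1 + n2 - 2, the standard choice for the pooled two-sample test.

```python
import math
import statistics


def pooled_t(sample1, sample2):
    """Return (t_calculated, degrees_of_freedom) for a pooled two-sample t-test."""
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = statistics.mean(sample1), statistics.mean(sample2)
    # statistics.variance returns s^2 computed with n - 1 in the denominator.
    var1, var2 = statistics.variance(sample1), statistics.variance(sample2)

    df = n1 + n2 - 2
    s_pooled = math.sqrt((var1 * (n1 - 1) + var2 * (n2 - 1)) / df)
    t_calculated = abs(mean1 - mean2) / s_pooled * math.sqrt(n1 * n2 / (n1 + n2))
    return t_calculated, df


# Excerpt of a standard two-tailed t table at p = 0.05, keyed by degrees of freedom.
T_TABULATED_05 = {4: 2.776, 10: 2.228}

# Hypothetical triplicate measurements in arbitrary units (not course data).
control = [10.0, 12.0, 11.0]
treated = [20.0, 22.0, 21.0]

t_calc, df = pooled_t(control, treated)
significant = t_calc > T_TABULATED_05[df]
print(f"t_calculated = {t_calc:.3f}, df = {df}, significant at p <= 0.05: {significant}")
```

If SciPy is available, `scipy.stats.ttest_ind(control, treated)` performs the same equal-variance test and returns the p-value directly, which avoids the table lookup.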

Protocols

Part 1: Practice statistical analysis

If you would like additional practice in completing statistical analysis, please complete Part 1. If you are confident in your understanding, please proceed to Part 2.

Review data from an experiment where cells were exposed to increasing amounts of radiation (linked here). Your goal is to determine if a statistically significant amount of DNA damage was induced. For the purpose of this exercise, the values in the spreadsheet are in arbitrary units of 'DNA damage', where the higher numbers indicate more damage.

When interpreting the statistics, consider how you may use the information to convince someone that the DNA damage was significant. You may find the spreadsheet that was originally created by Prof. Bevin Engelward and modified for the 20.109 laboratory helpful for this exercise (linked here).

In your laboratory notebook, complete the following:

  • Attach the completed spreadsheet.
    • Include a bar graph of the data with standard deviations.
    • Indicate if there is a statistically significant difference (i.e. provide a p-value) between the conditions tested.

Part 2: Complete data analysis

Use the tools above to analyze the data for your γH2AX and CometChip experiments. The figures / analyses in your Data summary should include measures of variability (i.e. standard deviation) and significance (i.e. p-values).

For the γH2AX data:

In the analysis that you completed, you averaged the data from three images for each condition. Remember that you analyzed the data for your experiment and the data for a pilot experiment that was completed by the Instructors. Your data will be Panel A and the Instructor data will be Panel B of the final figure. For the figure that you will include in the Data summary, plot the averaged values for each experiment then perform statistical analysis on the calculated values for the averaged datasets.

For the CometChip data:

In the analysis that you completed for your experiment, you averaged the values of three technical replicates. For the figure that you will include in the Data summary, plot the averaged values for each condition then perform statistical analysis on the calculated values for the data set.

In the analysis that you completed for the timecourse experiment, you averaged the values of three technical replicates. For the figure that you will include in the Data summary, plot the averaged values according to the guidelines provided on M1D6 then perform the statistical analysis on the calculated values for the data set.

Navigation links

Next day: Complete in-silico cloning of protein expression plasmid

Previous day: Image and analyze data for CometChip assay