Difference between revisions of "20.109(S07): Microarray data analysis"

From Course Wiki
m (8 revisions: Test transfer 20.109(S07) to HostGator)
 
(5 intermediate revisions by one user not shown)
Line 2: Line 2:
  
 
==Introduction==
 
==Introduction==
 +
Sony Playstation or Microsoft X-box? Boxers or briefs? Coke or Pepsi? We all know that taste can’t be mandated, but then how do standards arise? Standards are a fundamental and required aspect of engineering. Without them, machines can’t talk to each other, hardware is difficult to repair, and profits disappear (try to estimate the recent earnings of Betamax). In many cases, standards are government mandated, e.g. the US public school curriculum, cell phone technology in Europe, internet protocols worldwide. On other occasions, external events or pressures influence standards. Sweet N’ Low was essentially the only artificial sweetener on the market until the saccharin it contained was “shown” to cause cancer in lab rats. On rare occasions, standards arise through extreme behavior. In 1888, Thomas Edison wanted to demonstrate the superior safety of direct current (the technology his company marketed) so he publicly electrocuted dogs with 1000 volts of alternating current, the technology his competitor, Westinghouse, was marketing for use in homes. [[Image:Macintosh HD-Users-nkuldell-Desktop-metric-english.jpg|thumb]]
 +
 +
How do standards arise when there is no traditional financial market for them? In the case of [http://bbf.openwetware.org/ BioBricks], the [http://parts.mit.edu/registry/index.php/Main_Page Registry of Standard Biological Parts] is relying on the goodwill of the community to contribute standard parts that conform to the Registry’s rules. The payoff isn’t market share of the biological parts market, but rather the establishment of a shared resource that is reliable, reusable and useful. Community compliance with standards for microarray experiments and data analysis is similarly driven. Despite disagreement within the scientific community about how to collect meaningful microarray data, a “Minimum Information About a Microarray Experiment” ([http://www.mged.org/Workgroups/MIAME/miame.html MIAME]) checklist has been generated and is largely adhered to. “Minimum information” means only that the microarray data can be examined and interpreted by others…not a high bar for publication standards but one that is difficult to achieve since the arrays themselves are provided by different commercial vendors who disclose different amounts of information about their arrays. Moreover, the effort required to annotate MIAME data is significant and authors vary in their compliance.
 +
 +
Corroboration of published microarray data is further complicated by a lack of standards surrounding the data analysis itself. Processing the raw data mixes art and science. Algorithms used vary dramatically, and a single data set can appear compelling or noisy, depending on the analysis choices made by the investigator. For example, Cy3 and Cy5 are commonly used fluorescent probes but other dyes can be used and may be processed with different background correction and normalization factors. Not surprisingly, experiment protocols make a difference too. Researchers who indirectly label may find different outcomes than researchers who perform the same experiment but directly incorporate fluorescent dyes into their RNA. Also worth noting is human error, since microarray experiments require many steps over many days. There are even stories of people scanning their slides backwards and consequently mis-identifying every spot on the array.
 +
 +
This lack of consensus should be both liberating and burdensome for you today. You will have great freedom in how you analyze and interpret your data. Some initial steps are suggested, but then you’re free to try different approaches that interest you and make sense to you. You will need to carefully annotate and justify the choices you make so that others can understand and critique your approach. Good luck and have fun!
 +
 
==Protocols==
 
==Protocols==
 
Here is a rough outline of the steps you can take to examine your microarray data. There are many variations on this that are acceptable and that may be more interesting or appropriate for you. You should explore the data as you see fit.  
 
Here is a rough outline of the steps you can take to examine your microarray data. There are many variations on this that are acceptable and that may be more interesting or appropriate for you. You should explore the data as you see fit.  
 
#open txt file in xls (tab delimited)
 
#open txt file in xls (tab delimited)
#delete top 21 rows
+
#delete top 9 rows
 
#label a new worksheet for working with your data
 
#label a new worksheet for working with your data
 
#copy columns for: GeneName, SystematicName, Description, gMeanSignal, rMeanSignal, gMedianSignal, rMedianSignal, gBGMeanSignal, rBGMeanSignal, gBGMedianSignal, rBGMedianSignal
 
#copy columns for: GeneName, SystematicName, Description, gMeanSignal, rMeanSignal, gMedianSignal, rMedianSignal, gBGMeanSignal, rBGMeanSignal, gBGMedianSignal, rBGMedianSignal
#format the numberical cells as numbers with no decimal place
+
#format the numerical cells as numbers with no decimal place
 
#consider mean and median variations and background, to correct as you see fit. Be sure you keep track in your notebook or in the xls file of your analytical decisions.   
 
#consider mean and median variations and background, to correct as you see fit. Be sure you keep track in your notebook or in the xls file of your analytical decisions.   
 
#start new column with ratio of green signal/red signal.  
 
#start new column with ratio of green signal/red signal.  
Line 14: Line 22:
 
#Select entire sheet by clicking on diamond in corner then sort by log2 (green/red).
 
#Select entire sheet by clicking on diamond in corner then sort by log2 (green/red).
 
#Sort cells in descending order according to log2green/red
 
#Sort cells in descending order according to log2green/red
#What do you see? Are the duplicates in agreement? Are there particular genes you expect to see up or down regulated in the two samples (e.g. URA3)? What happens to the SAGA-subunits? Are there particular kinds of genes (e.g. mating type genes, gal regualated genes...) that are up or down regulated by the deletion? Ask the questions you want about this data...
+
#What do you see? Are the duplicates in agreement? Are there particular genes you expect to see up or down regulated in the two samples (e.g. URA3)? What happens to the SAGA-subunits? Are there particular kinds of genes (e.g. mating type genes, gal regulated genes...) that are up or down regulated by the deletion? Ask the questions you want about this data...
 
#save as XLS worksheet or workbook  
 
#save as XLS worksheet or workbook  
 
DONE!
 
DONE!
==For next time==
 
Your first draft of your Mod 3 lab report is due next time. Remind yourself of the class expectations for [http://openwetware.org/wiki/20.109%28S07%29%3AGuidelines_for_writing_a_lab_report your report]. Some extra information to guide you when you prepare your Mod3 lab report is included here. 
 
===Abstract===
 
*Please keep the number of words under 250.
 
*Do not include references in the abstract.
 
*Try drafting this section after you’ve written the rest of the report.
 
*If you’re truly stuck, start by modifying one crystallizing sentence from each of the sections of your report.
 
*Please do not plagiarize (accidentally or otherwise) the class wiki. This applies to your entire report.
 
  
===Introduction===
+
==For next time==
The homework you wrote after the first day of this new module will serve as the heart of your introduction. You should add (at least) one final paragraph to narrow the information “funnel,” ending your introduction with a clear description of the problem you’re studying and the method you are using. If you would like to preview for the reader your key results and conclusions in the last sentence of your introduction, you may.
+
Your first draft of your Mod 3 lab report is due next time. Remind yourself of the class expectations for [http://openwetware.org/wiki/20.109%28S07%29%3AGuidelines_for_writing_a_lab_report your report]. Some extra information to guide you when you prepare your Mod3 lab report is included [[20.109(S07): Expression engineering report| here]]. Before arriving in lab next time, email your report to nkuldell AT mit DOT edu and breindel AT mit DOT edu.
 
+
===Materials and Methods===
+
If you used any kits for any of the manipulations, it is sufficient to cite the manufacturer’s directions, e.g. “yeast were transformed according to the Q-biogene transformation kit protocol.”
+
Subdivide this section into the following
+
#Yeast strains and plasmids
+
#*list genotypes and plasmid names when known
+
#PCR
+
#*include primer design info here
+
#*include primer sequences, for knockout and for candidate verification
+
#*include PCR cycling conditions
+
#Yeast transformation
+
#* include how you selected for transformants
+
#* include what you did to verify that URA3 was integrated where you thought.
+
#Yeast Microarray
+
#*mention kits as relevant, including any deviation from published protocol if any
+
#*mention how many yeast and how much RNA was used
+
#*describe array analytical methods in results section rather than in Materials and Methods
+
 
+
===Results===
+
====figures====
+
You should include but are not limited to the following figures and tables
+
#Figure 1
+
#*panel A: table describing transformation results
+
#*panel B: agarose gel verifying URA3 insertion
+
#Figure 2
+
#*Spot test images
+
#Figure 3
+
#*microarray analytics
+
#Figure 4
+
#*microarray conclusions
+
Each figure should be numbered, and should have a title and legend
+
 
+
====text====
+
*In paragraph form, describe each figure and the observations you made.
+
*As much as possible, reserve conclusions about your data for the discussion section. Clearly an exception to this will be which of your deletion candidates was correct, as this information is critical for the next steps in the experiments.
+
 
+
===Discussion===
+
You should include but are not limited to
+
*conclusions you can draw from your work, including any uncertainties
+
*other data (published or personal communications) that support or contradict your conclusions
+
*limitations of your work, e.g. what kinds of experiments/controls/samples would have been great to include
+
*next experiments you would like to try to extend your findings and strengthen your conclusions
+

Latest revision as of 15:32, 15 June 2015


20.109: Laboratory Fundamentals of Biological Engineering


Introduction

Sony Playstation or Microsoft X-box? Boxers or briefs? Coke or Pepsi? We all know that taste can’t be mandated, but then how do standards arise? Standards are a fundamental and required aspect of engineering. Without them, machines can’t talk to each other, hardware is difficult to repair, and profits disappear (try to estimate the recent earnings of Betamax). In many cases, standards are government mandated, e.g. the US public school curriculum, cell phone technology in Europe, internet protocols worldwide. On other occasions, external events or pressures influence standards. Sweet N’ Low was essentially the only artificial sweetener on the market until the saccharin it contained was “shown” to cause cancer in lab rats. On rare occasions, standards arise through extreme behavior. In 1888, Thomas Edison wanted to demonstrate the superior safety of direct current (the technology his company marketed) so he publicly electrocuted dogs with 1000 volts of alternating current, the technology his competitor, Westinghouse, was marketing for use in homes.

How do standards arise when there is no traditional financial market for them? In the case of BioBricks, the Registry of Standard Biological Parts is relying on the goodwill of the community to contribute standard parts that conform to the Registry’s rules. The payoff isn’t market share of the biological parts market, but rather the establishment of a shared resource that is reliable, reusable and useful. Community compliance with standards for microarray experiments and data analysis is similarly driven. Despite disagreement within the scientific community about how to collect meaningful microarray data, a “Minimum Information About a Microarray Experiment” (MIAME) checklist has been generated and is largely adhered to. “Minimum information” means only that the microarray data can be examined and interpreted by others…not a high bar for publication standards but one that is difficult to achieve since the arrays themselves are provided by different commercial vendors who disclose different amounts of information about their arrays. Moreover, the effort required to annotate MIAME data is significant and authors vary in their compliance.

Corroboration of published microarray data is further complicated by a lack of standards surrounding the data analysis itself. Processing the raw data mixes art and science. Algorithms used vary dramatically, and a single data set can appear compelling or noisy, depending on the analysis choices made by the investigator. For example, Cy3 and Cy5 are commonly used fluorescent probes but other dyes can be used and may be processed with different background correction and normalization factors. Not surprisingly, experiment protocols make a difference too. Researchers who indirectly label may find different outcomes than researchers who perform the same experiment but directly incorporate fluorescent dyes into their RNA. Also worth noting is human error, since microarray experiments require many steps over many days. There are even stories of people scanning their slides backwards and consequently mis-identifying every spot on the array.

This lack of consensus should be both liberating and burdensome for you today. You will have great freedom in how you analyze and interpret your data. Some initial steps are suggested, but then you’re free to try different approaches that interest you and make sense to you. You will need to carefully annotate and justify the choices you make so that others can understand and critique your approach. Good luck and have fun!

Protocols

Here is a rough outline of the steps you can take to examine your microarray data. There are many variations on this that are acceptable and that may be more interesting or appropriate for you. You should explore the data as you see fit.

  1. open txt file in xls (tab delimited)
  2. delete top 9 rows
  3. label a new worksheet for working with your data
  4. copy columns for: GeneName, SystematicName, Description, gMeanSignal, rMeanSignal, gMedianSignal, rMedianSignal, gBGMeanSignal, rBGMeanSignal, gBGMedianSignal, rBGMedianSignal
  5. format the numerical cells as numbers with no decimal place
  6. consider mean and median variations and background, to correct as you see fit. Be sure you keep track in your notebook or in the xls file of your analytical decisions.
  7. start new column with ratio of green signal/red signal.
  8. start new column called log2green/red and use data in green/red column as =LOG(cell#,base), for example =LOG(D3,2) and drag corner to apply formula to all 11K cells. Again format to whole numbers if this does not happen automatically.
  9. Select entire sheet by clicking on diamond in corner then sort by log2 (green/red).
  10. Sort cells in descending order according to log2green/red
  11. What do you see? Are the duplicates in agreement? Are there particular genes you expect to see up or down regulated in the two samples (e.g. URA3)? What happens to the SAGA-subunits? Are there particular kinds of genes (e.g. mating type genes, gal regulated genes...) that are up or down regulated by the deletion? Ask the questions you want about this data...
  12. save as XLS worksheet or workbook

DONE!
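
For anyone who would rather script these steps than click through a spreadsheet, the outline above can be sketched in Python using only the standard library. This is a rough sketch under stated assumptions, not the course's method: the function names are made up for illustration, the column-header row is assumed to be the first line left after the top 9 rows are dropped, and subtracting the median background (with a floor of 1.0 so the logarithm is always defined) is just one of the many defensible correction choices discussed in the introduction.

```python
import csv
import io
import math

# Columns the protocol asks you to copy (names as listed in step 4).
KEEP = ["GeneName", "SystematicName", "Description",
        "gMeanSignal", "rMeanSignal", "gMedianSignal", "rMedianSignal",
        "gBGMeanSignal", "rBGMeanSignal", "gBGMedianSignal", "rBGMedianSignal"]


def load_spots(handle, skip_rows=9):
    """Read a tab-delimited export, discarding the first `skip_rows` lines.

    Assumption: the line right after the skipped rows holds the column names.
    """
    for _ in range(skip_rows):
        next(handle)
    reader = csv.DictReader(handle, delimiter="\t")
    return [{name: row[name] for name in KEEP} for row in reader]


def log2_ratio(spot, floor=1.0):
    """Return log2(green/red) after median background subtraction.

    Assumption: median signal minus median background is the chosen
    correction; `floor` keeps corrected signals positive before the log.
    """
    green = max(float(spot["gMedianSignal"]) - float(spot["gBGMedianSignal"]), floor)
    red = max(float(spot["rMedianSignal"]) - float(spot["rBGMedianSignal"]), floor)
    return math.log2(green / red)


def rank_spots(spots):
    """Add a log2green/red value to every spot and sort in descending order."""
    for spot in spots:
        spot["log2green/red"] = log2_ratio(spot)
    return sorted(spots, key=lambda s: s["log2green/red"], reverse=True)
```

With a hypothetical file name, `rank_spots(load_spots(open("my_array.txt")))` returns the spots ordered from the most green-enriched to the most red-enriched. Whatever correction you settle on, record it in your notebook just as you would for the spreadsheet version.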

For next time

Your first draft of your Mod 3 lab report is due next time. Remind yourself of the class expectations for your report. Some extra information to guide you when you prepare your Mod3 lab report is included here. Before arriving in lab next time, email your report to nkuldell AT mit DOT edu and breindel AT mit DOT edu.