In late May, Lior Pachter published a blog post entitled Pachter’s P-value prize, offering a cash prize for a probability calculation (P value), based on a “justifiable” null model, for the claim of Kellis, Birren and Lander, 2004 (hereafter KBL) that some results from an analysis of yeast duplicate genes “strikingly” favored the classic neo-functionalization model of Ohno over the contemporary DDC (duplication-degeneration-complementation) model.
The post attracted not just a huge number of views (for a geeky science blog) but also an extensive online discussion among readers, carried out at a very high intellectual level. Dozens of scientists commented on the blog, including Manolis Kellis, the first author of KBL, and other scientists well known in the field of comparative genome analysis. Mostly they discussed which statistical test was appropriate, but they also discussed the nature of the Ohno and DDC models, the responsibilities of authors, the flaws of the peer-review process, the appropriateness of blogging and tweeting about science, and so on.
Here, I’m going to use graphics and simulated data to illustrate null models in relation to the duplicate gene data from KBL. I also want to make a comment about the expectations of the DDC model. All of the plots and calculations are available as embedded R code in the R-Markdown file kbl_stats.Rmd.