Reviewer report 1
Dr. Eugene V. Koonin
National Center for Biotechnology Information
National Library of Medicine
National Institutes of Health
This is a critical overview of the 2R hypothesis (two rounds of whole genome duplication) on the origin of vertebrates. The conclusion that, to a large extent, is based on the unexpected genomic complexity of organizationally simple animals, such as sea anemone, and on the modest number of 2-fold and, particularly, 4-fold paralogons in vertebrate genomes, is that there is currently no basis to accept the 2R hypothesis. Instead, it is proposed that the vertebrate genome evolved by relatively small, regional duplications.
This is an old controversy to which this paper does not add any new analysis, only discussion, and that, in my opinion, somewhat perfunctory. As I see the situation, the jury is still out with regard to the 2R. It is important to realize that the 2R hypothesis has gone a long way since the days it was first proposed by Ohno. The 2R hypothesis now claims support from the comparisons of the gene order in the vertebrate paralogons and in Amphioxus, i.e., the duplications in vertebrates appear to be synteny-preserving rather than synteny-disrupting. Even more importantly, perhaps, a global analysis of the distribution of old paralogs in vertebrate genomes has been claimed to support 2R [5, 16]. Thus, 2R is not sheer speculation, considerable effort has been undertaken to test this hypothesis, and there are strong claims of evidence that consistently supports it.
Personally, I have a certain epistemological sympathy for the position taken in this paper in the sense that I believe that, as a matter of principle, the piecemeal duplication model should be the null hypothesis of (in this case, vertebrate) genome evolution that has to be falsified in favor of WGD scenarios. I doubt that the current statistical argument for such falsification is overwhelming so that 2R is to be accepted as the final verdict. However, I also think that the evidence in support of 2R is rather diverse and rather substantial, so it needs to be addressed seriously rather than summarily dismissed, primarily, on the basis of the high complexity of primitive metazoan genomes which is not a logically consistent argument against 2R. I believe that, for a really critical assessment of the 2R hypothesis, the evolutionary genomics literature, and in particular, the evidence claimed in support of 2R should be examined in considerably greater detail and more carefully, with special attention to the underlying assumptions of the statistical models employed in the respective studies.
I am thankful to Dr. Koonin for his comments on this manuscript.
1. It is advisable to recognize that statistical support for the spread of old paralogs or anciently conserved (vertebrate-invertebrate) syntenic fragments among multiple vertebrate chromosomes [5, 6] does not constitute evidence for the mechanism by which vertebrate paralogy regions originated. I therefore propose that special care should be taken in interpreting the mere map distribution of a subset of ancient vertebrate genes as strong support for polyploidization in early vertebrate evolutionary history.
2. In this article I do not intend to challenge the statistical models describing vertebrate genome evolutionary events. My purpose here is to highlight the fact that the well-preserved genetic architecture of basal metazoans and the comparative analysis of primate genomes (or any other group of vertebrates with comparable relatedness) cast serious doubt on the plausibility of the 2R hypothesis. Instead, the recently sequenced genomes of animals from interspersed time points clearly show that much of the genomic complexity seen in modern vertebrates is far more ancient than was previously anticipated (by 2R proponents). It appears that vertebrates acquired this genomic complexity through piecemeal duplications at widely different times over the evolution of life.
Statistical testing of vertebrate genome evolutionary scenarios is often based on comparative observations from a few vertebrate and highly derived invertebrate genomes, and thus could inadvertently lead to unfounded conclusions. Therefore, I recommend that future statistical approaches to testing hypotheses concerning vertebrate genome evolution should take into account the newly sequenced genomes of basal metazoan animals and of recently diverged vertebrate species (for instance, primates).
3. In the light of your comments I have considerably expanded the survey of evolutionary genomics literature.
Reviewer report 2
Dr. Jerzy Jurka
President & Director
Genetic Information Research Institute
Mountain View, USA.
The author presents a critical review of the so-called "Ohno's hypothesis" or "2R hypothesis" postulating that the early vertebrate lineage underwent one or more complete genome duplications. The author argues that the genome sequence data do not support the 2R hypothesis. While I am not sure if the 2R hypothesis is falsifiable based on genomic data, I would support this publication if the author could include discussion of a recent paper in favor of the 2R hypothesis.
Masanori Kasahara, "The 2R hypothesis: an update", "Current Opinion in Immunology" (2007), doi:10.1016/j.coi.2007.07.009
I am thankful to Dr. Jurka for reviewing this manuscript.
In the revised manuscript, following Dr. Jurka's suggestion, I have included the recent paper by Kasahara M. (2007) and other articles favoring 2R.
Reviewer report 3
Dr. Joshua L. Cherry
National Center for Biotechnology Information
National Library of Medicine
National Institutes of Health
Nominated by David J Lipman, National Center for Biotechnology Information, NIH, Bethesda, USA.
This review article assesses the hypothesis of two whole-genome duplications in vertebrate evolution in light of recent sequence data. I agree with the article's conclusion that there is no good reason to believe this hypothesis. I have a few comments about some of the arguments presented and the implications of some of the language used.
I found the role of the SP gene family in resolving the history of HOX clusters to be unclear. In fact the argument is more tenuous than the discussion would suggest. How can phylogenetic analysis of SP reveal both that the SP genes "share their evolutionary history with HOX clusters" and that HOX genes have "arisen through three separate segmental duplication steps"? Knowing that the evolutionary histories are the same would presumably entail knowing the phylogeny of HOX, so that the SP phylogeny would provide no additional information about HOX. It is in fact simply assumed that the candidate HOX phylogeny that agrees with the SP phylogeny is the correct one. This is possible, but if other linked paralogs have different histories, as suggested by the cited references, it is far from certain.
I would add that it is too strong to say that the alternative rooted topology, ((HOXC HOXD) (HOXA HOXB)), "favors two rounds of whole genome duplication events". This topology is consistent with 2R, but also with three local duplication events. In the absence of whole-genome duplications it would not be surprising to find some sets of paralogs with this type of topology.
I am uncertain of the meaning, in paragraph 3 of Paralogy Regions in the Human Genome, of "those vertebrate species that have recent evolutionary origin." Because the species analyzed are primates, this might be taken to imply, incorrectly, that humans and our closest relatives are more recently evolved than other organisms and that evolution is a thing of the more distant past for other groups. Comparative analysis of primates can of course yield valuable information, but the same role could be played here by any other group of vertebrate species with comparable relatedness. Other expressions in the manuscript also suggest a ladder-like view of evolution, even if that is not the author's intent: "genetic architecture of deepest as well as most recent branches of animals" (Abstract); "evolutionary basal invertebrate" (paragraph 1 of Rapid Paralogous Gene Increase); "an ideal chordate ancestral genome" (same paragraph); "the deepest branches of life" (final paragraph).
I am grateful to Dr. Cherry for valuable comments and useful suggestions on this manuscript.
1. The most parsimonious explanation of the order of branching in the HOX cluster and closely linked SP phylogenies is that both of these gene families arose simultaneously (as a co-duplicated group) through three independent duplication steps (Figure 2). Other gene families in the HOX cluster paralogons (having three or four members linked to HOX clusters), e.g. ERBB, COL, GLI, HH, SLC4A and others, have recently been resolved into four discrete co-duplicated groups. It has been shown that genes within each of these co-duplicated groups (of HOX cluster paralogons) duplicated in concert with each other, whereas the constituent genes of two different co-duplicated groups may not have duplicated simultaneously. This observation is contrary to the 2R scenario, which assumes that the constituent gene families of HOX cluster paralogons arose simultaneously through two rounds of WGD.
2. I must agree that the SP phylogeny helps in understanding the phylogeny (duplication history) of HOX and would provide no additional information about HOX evolutionary history. Therefore, I replaced the term "share their evolutionary history" with "share their duplication history" (HOX cluster duplication and the history of vertebrate genome evolution).
3. I must also agree with Dr. Cherry's argument that in the absence of whole-genome duplications it would not be surprising to find some sets of paralogs with an (AB)(CD)-type topology.
4. In the revised manuscript I have tried to remove any suggestion of a ladder-like view of evolution and have used precise terms to explain the evolutionary relatedness among animals.
5. Minor issues were also addressed in the revised manuscript.
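The point raised by Dr. Cherry and accepted in response 3, that an (AB)(CD)-type topology is unsurprising without whole-genome duplication, can be illustrated with a minimal sketch. It assumes a deliberately simple model that is not taken from the manuscript: four paralogs arise by three sequential single-copy duplications, and at each step every existing copy is equally likely to be the one duplicated. The function name and the model itself are illustrative assumptions only.

```python
from fractions import Fraction


def balanced_topology_probability() -> Fraction:
    """Exact chance that three sequential local duplications give a balanced
    ((x,y),(z,w)) tree shape, ASSUMING each existing copy is equally likely
    to be duplicated at every step (hypothetical model, for illustration)."""
    total = Fraction(0)
    # After the first duplication there is one copy on each side of the root.
    # Second duplication: either side is hit with probability 1/2.
    for side2 in ("L", "R"):
        p2 = Fraction(1, 2)
        left, right = (2, 1) if side2 == "L" else (1, 2)
        # Third duplication: a side is hit in proportion to its copy number.
        for side3, p3 in (("L", Fraction(left, left + right)),
                          ("R", Fraction(right, left + right))):
            final_left = left + (1 if side3 == "L" else 0)
            final_right = right + (1 if side3 == "R" else 0)
            # A 2|2 split at the root is the balanced (AB)(CD)-type shape;
            # a 3|1 split is the ladder-like (caterpillar) shape.
            if (final_left, final_right) == (2, 2):
                total += p2 * p3
    return total


if __name__ == "__main__":
    print(balanced_topology_probability())
```

Under these assumptions the function returns Fraction(1, 3): roughly one in three four-member families would show the balanced shape, of which ((HOXC HOXD) (HOXA HOXB)) is one labeling, without any whole-genome duplication. Real duplication rates need not be uniform across copies, so the figure is only indicative.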