Reviewer 1: Rob Knight (University of Colorado)
In this intriguing manuscript, Wolf & Koonin combine comparative genomics with Eigen's (1978) concept of the error threshold to provide a new, comprehensive model for the origins of translation. Specifically, they build on Szathmary's (1993) model of amino acids as coenzymes in an RNA metabolism as a starting point for the genetic code. As pointed out by Knight & Landweber (2000), there are three pathways to a protein-based genetic code from the RNA world that preserve continuity of features of the genetic code: the RNAs that bind amino acids directly could have played the roles of tRNAs, mRNAs, or aminoacyl-tRNA synthetases. Wolf & Koonin favor a model along the lines of the last of these roles, suggesting that cofactor-enhanced catalysis, and then nonribosomal synthesis of short peptides, were the original driving force for RNA-catalyzed translation. They present an intriguing new overall model of the evolution of the translation system, and highlight aspects of this model that could be tested in the laboratory. The main weakness of the manuscript in its current form is its endorsement of the frozen accident model (FAM) of the genetic code's evolution without the presentation of alternative explanations of the evidence in favor of the optimality of the genetic code relative to random codes, and the coding triplet/binding site associations that have been observed through SELEX and in the Group I intron. However, as the authors themselves point out, the resurrection of the frozen accident model is not an important feature of their overall model for the emergence of translation, and this discussion could be omitted without diminishing the manuscript's contribution.
The manuscript presents some interesting ideas that I have not seen elsewhere and that appear to shed substantial new light on the difficult problem of the origin of translation.
For example, the discussion on p. 13 that shows that the domains in the aaRS are highly derived relative to domains in other proteins is extremely interesting, because we might have expected the aaRS to be among the earliest proteins. If they are not, the likelihood that they displaced some other system for coded translation increases dramatically (Theobald & Wuttke's 2005 study of OB-fold superfamily relationships also supports this idea). One point that should be specifically noted in this context is that not only do these relationships imply that the aaRS are relatively late arrivals, but also that coded translation must have predated the aaRS so that the sequence information that allows us to determine the phylogenetic relationships among these folds could be transmitted to the present. In other words, if comparable folds were once produced by a different synthesis mechanism, either we would need a system of reverse translation to copy the sequence information into nucleic acids, or all of the proteins produced by that mechanism would have been lost when coded translation took over.
Similarly, the scenario for the evolution of the modern translation system discussed on pp. 33–39 seems plausible and is more detailed than most such scenarios to be found in the literature.
A couple of areas of the manuscript could potentially be supported by drawing on additional literature. For example, on p. 8, Dennett has an excellent discussion in "Darwin's Dangerous Idea" (Simon & Schuster, 1995) of the production of apparently irreducibly complex phenomena through simplification of an even more complex system, e.g. building an arch by taking away stones from a pile of rubble. The complexity of the system of peptide-specific synthetases that would be required for the model proposed here might make this an appropriate metaphor. Similarly, Yarus's (2001) article "On translation by RNAs alone", and Yarus & Welch's (2000) article "Peptidyl transferase: ancient and exiguous" contain some thoughts that would be relevant here and later in the manuscript.
Author response: Dennett's metaphor of the Roman arch is, indeed, excellent and might be relevant, even if not directly, because, here, we are talking more of stepwise displacement than selective elimination, and do not really postulate an initial state that was more complex than the final one. In any case, one of the strengths of the Biology Direct model is that the review is published, so the reader can read about this metaphor here. Ditto for the reviews by Yarus: the reader now knows of them and may turn to them if desirable (other work from Yarus' laboratory is cited extensively).
The discussion of ribozymes on p. 18 could possibly benefit from a discussion of riboswitches and their implications for control mechanisms in the cell, and/or for the other roles of RNA that suggest the RNA World (use in cofactors, role in nucleotide metabolism, use of RNA as a primer in DNA synthesis, etc.). However, the manuscript is fairly long as it is, and most of these points have been raised many times in the cited literature already.
Author response: Yes, the paper is fairly long, and we believe that riboswitches are of no direct relevance.
Finally, some of the specific contentions could benefit from more elaboration. For example, on pp. 11–12, we find the statement:
"Put another way, the conservation of the core of the translation machinery is the strongest available evidence that some form of LUCA actually existed (it is, in principle, conceivable that life started off as a multitude of distinct forms but a single variant of the translation system subsequently took over as a result of a sweeping horizontal gene transfer; however, this is a decidedly non-parsimonious scenario)."
Given that the present manuscript already proposes the evolution of an entire suite of RNA-based aminoacyl-tRNA synthetases that no longer exist, and given that some authors such as Carl Woese propose that the division of life into distinct phylogenetic lineages was a relatively late event (e.g. Woese 2002), it is unclear why horizontal gene transfer should be dismissed in this context.
Author response: Upon more careful consideration (also considering Mushegian's comments below), we have deleted this whole claim. Suffice it to say, in this context, that the conservation of the translation machinery is evidence of some form of LUCA.
Similarly, on p. 20, the authors seem to be strongly in favor of the hydrothermal vent scenario for the origin of life. A few words of caution to the effect that this is one of many hypotheses for life's origin, and that data are still far from conclusive, might be in order.
Author response: We have included a few words to that effect but also cite new references that, we believe, add credibility to the hydrothermal vent scenario (refs. 75, 76).
The discussion of the current evidence relating to the hypothesis that the genetic code arose through direct interactions between RNA and amino acids on p. 23 is good, but on p. 41 we read that "these affinities are weak, only manifest as a statistical trend, and worst of all, are seen, mostly, for chemically complex amino acids like arginine or histidine, rather than simple ones, such as glycine or alanine, that would be readily produced abiogenically." This statement requires some elaboration. First, many of the potentially prebiotic amino acids, such as glycine, are difficult to evaluate with the affinity chromatography paradigm for technical reasons. It is possible that other methodologies, such as the allosteric selections pioneered by Tang & Breaker (1997), will allow us to see interactions in these cases, but for now absence of evidence should not be taken as evidence of absence. It is also far from certain that the biosynthesis of complex amino acids such as arginine would have been beyond the capabilities of RNA World organisms, so the primordial genetic code need not have been confined to simple amino acids. Second, the physical interactions involved are often far from weak: some amino acid aptamers, such as the best of Famulok's (1996) arginine aptamers, have sub-micromolar dissociation constants. It is true that the inconsistency between codon and anticodon modes of recognition remains to be resolved, but I do not agree with the assertion that "objectively, we should accept FAM as the most likely model for the emergence and evolution of translation". To accept FAM given what we know now about the optimality of the genetic code relative to random genetic codes, and the relationships between amino acid binding sites and cognate triplets, requires an alternative explanation for the strong statistical evidence that supports these hypotheses.
In the absence of such an alternative explanation for why we see these patterns, which would be extremely unlikely under the FAM, I would recommend that the discussion be confined to pointing out where these processes would most likely be able to act in the model (for example, everyone agrees that direct interactions between coding triplets and amino acids are not relevant to the modern genetic code). It is possible that FAM is not an optimal description of what is actually meant in the discussion in the text – really, the claim seems to be that there is no necessary relationship between triplets of RNA and amino acids, rather than that there is in fact no pattern. However, in my opinion, the discussion of FAM vs. ARM vs. CRM as presented is likely to be a distraction from the overall value of the new ideas presented in the manuscript.
Author response: We cannot agree that this description is a distraction; we think it is part and parcel of the paper, even if the choice between ARM, CRM, and FAM has a limited effect on the actual model considered here. However, this discussion has been shortened and modified to make it more neutral with regard to the choice between the models of amino acid-T RNA recognition. The statement regarding weak interactions between amino acids and aptamers has been dropped along with the over-assertive statement regarding FAM as "the most likely model". It seems to us that the text clearly explains what we mean by FAM – indeed, it is about a lack of any direct connection between amino acids and cognate triplets. Also, we consider the amended version of FAM where subsequent adaptation of the code is deemed likely.
Finally, the description of experimental tests on p. 44 could benefit from more detail. Which properties of the postulated T RNAs are in doubt, and which steps would, if experimentally confirmed, best support the model? More specific guidance might increase the probability that supporting laboratory work would be carried out.
Author response: A brief discussion has been added.
Reviewer 2: Doron Lancet (Weizmann Institute of Science)
This reviewer made no comments.
Reviewer 3: Alexander Mankin, University of Illinois at Chicago (nominated by Arcady Mushegian)
It is a fairly straightforward task to evaluate an experimental paper driven by the data. It is a much more fuzzy assignment to evaluate a theoretical paper discussing a possible evolutionary scenario of the origin of protein synthesis. It is very tempting to buy into all of the authors' arguments. It is equally tempting to criticize them all.
The main postulate of Wolf and Koonin is that they are trying to build a model based on the Continuity Principle. In lay language, this means they are trying to put little solid rocks into the vast swamp that separates the evolutionary island of the RNA World, where most of the biochemical reactions are catalyzed by ribozymes, from the island of the modern nucleic acid-protein world, where biochemistry is carried out primarily by protein enzymes whilst nucleic acids are involved mostly in storage and expression of genetic information. Trying to bridge this gap, the authors envision the intermediate steps on the evolutionary path to the genetic code and coded protein synthesis, where innovations that arose at each of the steps could be selected for. In this approach, Wolf and Koonin strive to allow for the fewest number of evolutionary gaps that would require a significant leap rather than a small jump. Not that this is a new approach – most of the previous attempts to delineate the origin of protein synthesis were based on a generally similar idea. However, in the prior works, it was probably more of an intuitive attempt to build a plausible scenario than a formulated goal as in the essay of Wolf and Koonin.
The question is how closely those rocks of Wolf and Koonin are spaced and how solid they are. Some of them appear to be nicely positioned and are fairly solid, whereas the others, in my view, are either shaky or missing.
It seems to be a very reasonable idea that some of the RNA World ribozymes could benefit from a bound amino acid cofactor or even cofactors. It appears to be a much more far-fetched speculation that two or even more of these cofactors would bind in such close proximity of each other that the formation of a peptide bond between them would be possible and beneficial. Furthermore, it is not entirely clear from where a hypothetical peptide ligase would derive the energy that is required for peptide bond formation. In the modern ribosome, the energy that powers peptide bond formation is conserved in the high-energy ester bond that links the C-terminal amino acid of a nascent peptide to tRNA. The energy of this ester bond is derived from ATP consumed by an aminoacyl-tRNA synthetase – a source hardly available in the RNA world.
Author response: Yes, the issue of the energy source is important. One would have to propose that one of the substrates of the primordial peptide ligase was an activated amino acid, perhaps, even an aminoacyl adenylate. In the RNA world, such derivatives would have to be produced by other ribozymes, and ribozymes with such an activity, indeed, have been described (see Table 1). Alternatively, the original ribozyme R might have been an ATPase such that the emerging peptide ligase would couple ATP hydrolysis with peptide synthesis. The text was amended to address these issues.
Though the proposed route that leads to the origin of the original peptide ligase/aminoacyl polymerase is questionable, the resulting entity – a ribozyme capable of polymerizing amino acids into peptides in an unprogrammed fashion – seems highly plausible. As early experiments of Monro have shown, the large ribosomal subunit of the modern ribosome, a ribozyme in its own right, is still capable of carrying out such a reaction if provided with properly activated amino acids. So, if one is to accept Wolf and Koonin's idea of a peptide ligase derived from a ribozyme that is able to connect its amino acid cofactors into a single peptide, then the next few steps in their scenario are rather convincing. The use of the resulting peptides by other ribozymes, a subfunctionalization of the original peptide-ligating ribozyme into a specialized peptide ligase or amino acid polymerase, and the general benefit of having such a peptide ligase ribozyme in the assembly of selfish cooperatives appear to pave a rather smooth path for the ancestor of the large ribosomal subunit.
Having 'prepared' the key catalyst of protein synthesis, Wolf and Koonin then address the problem of a tRNA adaptor. An elegant idea they propose to justify the evolutionary necessity for establishing a link between pre-tRNAs and amino acids is that this would limit the diffusibility of small amino acids and would help to increase their local concentration. Given that ribozymes with tRNA aminoacylating activities have been identified in SELEX experiments, it is easy to imagine that ribozymes with similar activities could have been selected through natural evolution in the RNA World. When considering the correspondence between the tRNA anticodon and the amino acid, Wolf and Koonin chose not to take sides in the discussion of whether the origin of the genetic code is based on a chemical complementarity between an amino acid and a codon or anticodon or is a result of a frozen evolutionary accident. Though the all-inclusive approach inevitably makes the description of this step somewhat fuzzy, any of several scenarios mentioned in this section are pleasantly consistent and provide good food for thought.
The next step is equally convincing: the invention of aminoacyl-tRNA organically leads to its use by the prototype peptide ligating/aminoacyl polymerizing ribozyme and thus completes the route to the large ribosomal subunit ancestor.
The origin of the coded protein synthesis is based on availability of three main players: the adaptor aminoacyl-RNA molecules with a strict amino acid-anticodon correlation, an enzyme that can polymerize the activated amino acids (the large ribosomal subunit precursor), and a precursor of the small ribosomal subunit, a "reading head" that selects the adaptor aminoacyl-RNA according to the input genetic text. Wolf and Koonin derive the origin of the ancestor of the small ribosomal subunit not from a pre-existing ribozyme but from a segment of the large subunit precursor. In this 'Adam's rib' scenario, an accessory RNA subunit RS evolves as a tool to enhance binding and positioning of aminoacyl-tRNA on the catalytic subunit, then acquires the "burden of specific recognition," and later on, one of its own parts assumes the role of a diffusible template. I am not sure whether this rather sketchy scenario satisfies the acclaimed Continuity Principle. Furthermore, it is poorly supported by the fact that the modern large ribosomal subunit can rather efficiently catalyze peptide bond formation using tRNA substrates even in the absence of the small subunit (Wohlgemuth, Beringer, Rodnina, (2006) EMBO Rep., 7, 699–703). From the point of view of this reviewer, it is more reasonable to root the origin of the small subunit in one of the pre-existing ribozymes that could operate with RNA templates. The extant activities of the modern small ribosomal subunit, including its interaction with an RNA template (mRNA) and ability to assemble on it the complementary sequences of the tRNA anticodons, bear the features expected from the ancestral RNA replicase/RNA ligase. Such a ribozyme could be viewed as an ancestor of the ribosome decoding center.
The suspected ability of the modern 30S subunit to cleave mRNA during ribosome stalling or under the influence of specific protein factors argues that the putative ancient catalytic center capable of breaking (and thus forming) phosphodiester bonds may still exist in the ribosome.
Author response: The possibility that the small subunit of the ribosome evolved from an RNA replicase/triplicase is an interesting one, and we have considered a version of it when working on the current model. This could directly connect the model discussed here with the triplicase model of Poole-Jeffares-Penny. However...direct evidence is missing, so we decided to avoid "overfitting" the model. Let the reader learn about this idea from Mankin's comment. However, it is completely unclear to us why the work of Wohlgemuth et al. is construed as evidence against the model presented in the paper. We believe that, on the contrary, it is readily compatible with this model, and we cite it in the revision.
In conclusion, the essay of Wolf and Koonin is an interesting and highly stimulating work. Inadvertently, my review sounds more critical than was intended. The reason is simple: the ideas we disagree with are more interesting for us than the points we easily accept. The majority of the points in the paper are of this latter category; the points my comments mostly focus on are of the former.
Other points of critique and comments:
1. The discussion of the model per se starts on p. 28. It seems that an almost 30-page introduction is excessive and often repetitive. The work would strongly benefit if the first 28 pages were expressed more succinctly, possibly as bulleted points in 2 pages.
Author response: We appreciate the virtues of brevity but this paper was conceived as a specific model for the origin of translation placed against the critically examined background of the relevant general evolutionary principles and previous research in the area. We feel that it has to stay that way.
Reviewer 4: Arcady Mushegian
The most significant contribution of this study is in decomposing the tantalizingly complex problem of the origin of genetic code, translation, and RNA replication into a series of proposed small evolutionary transitions, each associated with its own contribution to the fitness of the genetic system that experiences these transitions. I whole-heartedly recommend this manuscript for publication and expect that this series of transitions will be further scrutinized, perhaps along the lines of necessity and sufficiency.
My only scientific complaint is about the half-haphazard conclusion that the frozen-accident model of adaptor recognition by amino acids is the most likely one. It might be, or it might not be: the fact that current direct experiments fail to establish specific recognition of cognate (anti)codons for evolutionarily more primitive amino acids does not make a "frozen accident" mechanistically attractive. Moreover, if, for example, primitive nucleobases were abiotically derivatized (see the work from S. Benner's lab that seems to point in this direction), then the experiments with the present-day codons or anticodons are not even answering the right question. The authors should mention that work or at least stay even more agnostic about the recognition model.
Author response: We infused considerable extra agnosticism, also, in response to Knight's comments (see above).
Other, minor, comments:
"The Continuity Principle" has connections with Anton Dohrn's change-of-function principle (Ursprung der Wirbeltiere und das Prinzip des Funktionswechsels, Leipzig, 1875) – perhaps this is worth acknowledging.
Author response: In truth, the principle really goes back to Darwin; the rest are reformulations and explanations. We jump to a modern version immediately, leaving Dohrn out.
As discussed by the authors, should the Darwin-Eigen cycle be renamed the Darwin-Eigen-Lynch-Conery cycle?
Author response: If one wants to be really fair, then, maybe, Darwin-Eigen-Penny-Lynch-Conery-(Wolf-Koonin)? For the time being, we are sticking with the original name, after Penny.
The study is well-written, but perhaps it can be edited a bit more. For example, the notion that "evolution has no foresight", however important, is seen at least five times, including two times within one bulleted list on p. 29.