The origin of life is a spatially localized stochastic transition
© Wu and Higgs; licensee BioMed Central Ltd. 2012
Received: 16 July 2012
Accepted: 12 November 2012
Published: 24 November 2012
Skip to main content
© Wu and Higgs; licensee BioMed Central Ltd. 2012
Received: 16 July 2012
Accepted: 12 November 2012
Published: 24 November 2012
Life depends on biopolymer sequences as catalysts and as genetic material. A key step in the Origin of Life is the emergence of an autocatalytic system of biopolymers. Here we study computational models that address the way a living autocatalytic system could have emerged from a non-living chemical system, as envisaged in the RNA World hypothesis.
We consider (i) a chemical reaction system describing RNA polymerization, and (ii) a simple model of catalytic replicators that we call the Two’s Company model. Both systems have two stable states: a non-living state, characterized by a slow spontaneous rate of RNA synthesis, and a living state, characterized by rapid autocatalytic RNA synthesis. The origin of life is a transition between these two stable states. The transition is driven by stochastic concentration fluctuations involving relatively small numbers of molecules in a localized region of space. These models are simulated on a two-dimensional lattice in which reactions occur locally on single sites and diffusion occurs by hopping of molecules to neighbouring sites.
If diffusion is very rapid, the system is well-mixed. The transition to life becomes increasingly difficult as the lattice size is increased because the concentration fluctuations that drive the transition become relatively smaller when larger numbers of molecules are involved. In contrast, when diffusion occurs at a finite rate, concentration fluctuations are local. The transition to life occurs in one local region and then spreads across the rest of the surface. The transition becomes easier with larger lattice sizes because there are more independent regions in which it could occur. The key observations that apply to our models and to the real world are that the origin of life is a rare stochastic event that is localized in one region of space due to the limited rate of diffusion of the molecules involved and that the subsequent spread across the surface is deterministic. It is likely that the time required for the deterministic spread is much shorter than the waiting time for the origin, in which case life evolves only once on a planet, and then rapidly occupies the whole surface.
Reviewed by Omer Markovitch (nominated by Doron Lancet), Claus Wilke, and Nobuto Takeuchi (nominated by Eugene Koonin).
When we consider the network of biochemical reactions that exists inside a living organism, it is important to realize that these reactions are in a dynamical steady state, rather than at thermodynamic equilibrium. The reaction system is maintained out of equilibrium because there is a continual input of energy and material (food) and because the rates of reactions are controlled by enzyme catalysts that permit desired reactions to occur much more rapidly than they would in a non-living mixture of molecules. The presence of biopolymer catalysts is an essential feature that distinguishes living systems from non-living chemical systems. Another key feature of living systems is that they are autocatalytic, i.e. they can reproduce and make more of their own components. Once again, it is biopolymers that allow organisms to do this. Cells use biopolymers to store the genetic information that allows them to produce the required catalysts for metabolism and to enable replication of the genetic polymers themselves. A commonly used definition of life is that it is a self-sustained system capable of undergoing Darwinian evolution . As it is biopolymers that allow life to be self-sustained and enable heredity and evolution, we take the view that it is the emergence of autocatalytic biopolymer systems that is the key step in the origin of life. Thus, our aim is to understand how an autocatalytic biopolymer system could have emerged from a non-living chemical system.
It is usually thought that the interdependent system of DNA, RNA and proteins that sustains today’s cells is too complex to have arisen de novo. A prime candidate for a simpler biopolymer system that could have existed in early organisms is the RNA world hypothesis [2–5], which envisages that RNA sequences played both the genetic and catalytic roles. This is supported by the fact that RNA is the key component of the ribosome and by a large number of experimental studies of ribozymes in vitro[6–14]. In stronger forms of the RNA world hypothesis, it is argued that RNA sequences were the first autocatalytic biopolymers, and hence the first living system, rather than just an intermediate step between the origin of life and current life. It is then necessary to demonstrate that formation of functional RNAs was possible by non-living chemistry. Progress continues to be made on mechanisms of prebiotic synthesis of nucleotides and RNA oligomers [15–22]. In our view, the case that the origin of life occurred via an RNA world is quite strong, but it is still far from proven. Several authors have emphasized the difficulties associated with the RNA world and have considered alternative scenarios [23, 24]. A review discussing why the RNA world is currently the best theory we have for the origin of life, despite acknowledged limitations, appeared very recently .
Although we will consider RNA world scenarios in this paper, the two most important points that we emphasize here are sufficiently general to apply if some other kind of replicating molecular system arose prior to the RNA world or instead of it. The first point is that the non-living and living states are two alternative dynamically stable states of the same chemical system. The second point is that the origin of life is a stochastic transition between these two states that is initiated by concentration fluctuations involving relatively small numbers of molecules in a localized spatial region.
We previously showed that two alternative dynamically stable states exist in a chemical reaction system that models RNA polymerization . The model calculates the concentrations of monomers and polymers of different lengths. Monomer synthesis and polymerization reactions can occur at a small spontaneous rate in absence of ribozymes. Polymerization can also occur at a rapid rate due to catalysis by ribozymes, if these are present. In one stable state, which we call dead or non-living, the ribozyme concentration is negligible and the reactions proceed at their spontaneous rates only. In the other stable state, which we call living, there is a substantial ribozyme concentration and the polymerization rate is high, which causes autocatalytic synthesis of more ribozymes. We calculated the phase diagram as a function of the rate parameters for spontaneous and catalyzed reactions. We showed that there is a region where the reaction system is bistable (i.e. both solutions exist simultaneously) and also regions where only one or other of the two solutions is stable. We subsequently showed that a similar phase diagram exists for RNA systems with nucleotide synthase and polymerase ribozymes instead of polymerases  and in asymmetrical autocatalytic systems describing the origin of homochiral biopolymers . In this paper, we will argue that the form of the phase diagram is a generic feature of models describing autocatalysis and the origin of life, and we will present a very straightforward model that shows this in its simplest and clearest form.
It is important that the real world must be in the bistable region of the phase diagram. The living state must exist in the real world, but it cannot be the only solution, because the origin of life is a difficult and rare event. We do not see spontaneously replicating systems popping up easily in every test tube or puddle. The existence of two stable states is also demonstrated by the fact that it is easy to kill an organism by depriving it of some required substance, but the organism is not easily brought back to life if the substance is resupplied (e.g. an organism that died of suffocation is not brought back to life by giving it oxygen). We presume that the world began in the non-living state, and that it was possible, but difficult, for a transition to occur to the living state. The question is how the transition could have occurred.
We previously argued that stochastic concentration fluctuations are essential for the origin of life . In a large, well-mixed chemical system, we can write down deterministic ordinary differential equations to describe the concentrations of different types of molecule. The non-living system is dynamically stable for ever in such a deterministic case. In a system of finite volume, with finite numbers of molecules, stochastic fluctuations can occur that allow the system to jump from the non-living to the living state. In the RNA polymerization model, it is presumed that only fairly long RNA sequences have any chance of being functional ribozymes. Such sequences are very rare, and may only be present in a handful of copies, even though monomers and short oligomers may be common. Stochastic fluctuations in the ribozyme concentration are thus important, even if the monomer concentration is effectively deterministic.
In this paper, we want to investigate more carefully the factors that determine the time required for the stochastic transition to occur. We observed previously that the transition to life occurs most easily in systems of intermediate size . If the system is too small, it is very difficult to create sequences long enough to be ribozymes. If the system is too large, the concentration fluctuations are too small to permit the transition to occur. It is clear, however, that there is something missing from this story. In reality, molecules have a finite diffusion rate. Molecules have a finite lifetime before they are broken down or destroyed. Only molecules relatively close to one another on the surface of a planet have any chance of interacting with one another. There must be a length scale controlled by diffusion above which different regions of space are effectively independent of one another on the time scale of the molecular lifetime. The assumption that the system is well mixed breaks down at larger lengths scales than this. Our aim in this paper is therefore to study two-dimensional spatial models (representing the surface of a planet) in which reactions occur locally and there is a finite rate of molecular diffusion. We will show that the concentration fluctuations that lead to the origin of life can occur locally in any one region of space. Once the stochastic transition has occurred in one place, the living state can then spread deterministically across the rest of the surface.
In this section, we briefly summarize our RNA polymerization model  and show how the phase diagram is obtained in this case. We then introduce a very simple generic model for replicating molecules and show that the phase diagram is equivalent.
where r 0 is a small spontaneous polymerization rate and kP m is the term due to ribozyme catalysts. The constant k is the catalytic efficiency of the long polymers, which may be written as k = k 0 f, where f is the fraction of the long polymers that function as catalysts and k 0 is the catalytic efficiency of a functional catalyst.
Having seen that the kind of phase diagram in Figure 1 occurs in many different models, we now wish to introduce the simplest model we can that has the same phase diagram. The simplified model will then be useful to investigate the dynamic properties of the stochastic transition to life in the following sections of this paper.
The main aim of this paper is to consider the influence of spatial concentration fluctuations and limited diffusion rate on the stochastic transition that leads to life. Therefore, we will now define a spatial lattice model, which we call the Two’s Company Model, that is a spatial version of Eqn. 5, and reduces to Eqn. 5 when the system is well mixed.
Hopping of molecules between sites is implemented using either local or a global hopping rules in the following way. Each molecule attempts to hop at rate h. A destination site is chosen for the molecule. In the case of local hopping rules, the destination is chosen at random from the 8 neighbouring sites of the original site. In the case of global hopping rules, the destination site is chosen at random from all the other sites on the lattice. The molecule hops successfully to the destination site if it finds a vacancy there, i.e. the hop is successful with probability 1-n/3 (as shown in Figure 3). If the hop is unsuccessful, the molecule remains on its original site. Further details of the stochastic dynamics used to implement the model are given in the Methods section.
The local hopping rules simulate local molecular diffusion, which is an important part of this model. The global hopping rules are intended as a comparison that helps to illustrate the importance of local hopping. The introduction of local hopping rules means that correlations exist between the numbers of molecules on neighbouring sites. However, if h is sufficiently large, the system reaches a well-mixed limit where molecules are randomly positioned over the whole lattice and there is no remaining correlation between sites. In the case of global hopping, there is no correlation between sites, even when h is small. Below, we will discuss the mean field approximation, which is exact for the global hopping dynamics but it is only an approximation for the model with local hopping dynamics. In the Methods section, we show that when h is very large, both global and local hopping models reach the same well-mixed limit, which corresponds to the generic replicator model in Equation 5.
Simulations are initiated in the dead state and followed until a transition to the living state occurs. The concentration in the dead state should be close to the stable solution ϕ 1 of Equation 5. In order to initiate the stochastic simulation in the dead state, each lattice site is seeded with molecular numbers sampled from a binomial distribution with average concentration ϕ 1 . In a typical simulation, the system remains in the dead state for a long time until a localized patch of high concentration arises that is sufficiently large to be stable in the living state. The origin of this living patch is a rare stochastic event. However, once it is formed, the living patch then spreads deterministically across the whole lattice.
We performed many simulations of this model in order to measure the way the time taken for the transition depends on the parameters. We define T sys = T reg + T spread , where T reg is the time until a local transition occurs in any one region of the lattice, and T spread is the time for the living state to spread from one region across the rest of the lattice, and T sys is the time required for the full lattice to reach the living state. These times were measured as described in the Methods section.
As shown in left part of Figure 6, when the system is small, it is more or less well mixed. Hence the transition will happen globally across the whole system at the same time. When the system is larger, it is no longer homogeneous. The transition will happen at a local region first, then it will spread out to the whole system. The number of independent local regions should be proportional to L 2 . The average time taken for the transition to occur in any one of these regions should vary inversely with the number of regions, i.e. T reg ~ 1/L 2 . If the spread of the living state is rapid, the spreading time, T spread is small compared with the time taken for the transition to occur, T reg . In this case the time taken for the whole system to go through the transition is close to T reg . It can be seen that there is an intermediate range of L in Figure 6, where T reg and T sys are very close to one another and where both scale as 1/L 2 , as expected from this argument. Comparison of the results with h = 24 and h = 45 shows that when diffusion is slower, the transition from homogeneous system to heterogeneous system happens for a smaller system size.
If the system size is extremely large, there comes a point where T spread is comparable to T reg , and the curves for T sys and T reg separate in Figure 6. At this point it is possible for the transition to life to occur in a second region independently before the living state from the first transition has spread across the lattice. Beyond this point, T reg continues to decrease, but T sys reaches a constant value, which is what we expect if there are multiple origins happening in different places.
A useful way to understand the effects of the limited diffusion rate in this model is to calculate the mean field solution. In the Methods section, we calculate the probability P n that there are n molecules on a site, using a mean field approximation that ignores correlations between neighbouring sites. The mean field solution depends on h and it is not equivalent to the well-mixed case, except for the limit of large h, where both the lattice model and the mean field approximation converge to the well-mixed case. We expect that the mean field approximation will be fairly good for the local hopping model if h is large, because correlations between neighbouring sites should be small in this limit. For the global hopping model, the mean field solution is exact, even for small h.
Figure 7 illustrates that reduction in the local diffusion rate has both positive and negative effects on the replication process. As h is reduced, the concentration of the living state is reduced because there is more local interference between molecules on fully occupied sites. Also the minimum value of k required for the living state to be stable increases as h decreases. Thus, rapid diffusion is favourable for the living state, if the living state is already established. However, we have already seen in Figure 5 that high diffusion rates cause the system to be well mixed, in which case the transition time is extremely long. It is therefore important to have a diffusion rate that is not too large in order for the transition to the living state to occur at an appreciable rate in the first place.
The Two’s Company model is the simplest model that can be used to study the stochastic transition to life in a spatially distributed system. However, we wish to show that the same behaviour occurs in other models. We will return to the model of RNA polymerization described in Equations 1–1 in order to study the effects of diffusion and spatial concentration fluctuations in this case.
The simulation is carried out on 2D square lattice composed of L × L sites, with each site defined to have a size l = 1. We keep track of how many molecules of each kind there are on each site. The reactions of monomer synthesis, activation, and polymerization all occur locally on individual lattice sites. The implementation of stochastic dynamics for this model using the Gillespie algorithm is described in our previous paper  for a non-spatial model. In the spatial case, the same method is used to define reaction rates on each site as a function of the numbers of molecules of each type on each site. For simplicity, molecules of all kinds are assumed to hop with the same rate h. Hopping may either be local or global, as with the Two’s Company model. However, there is no need to impose a maximum number of molecules per site, because concentration is limited by food input rather than by a carrying capacity. Therefore every attempted hop is successful, and hops are not blocked by the molecules on the destination site, as they are for the Two’s Company model.
When the transition occurs, the increase in the catalyst concentration is accompanied by a sharp decrease in the activated monomer concentration. As there are many more activated monomers than catalysts, the activated monomer concentration varies more smoothly than the catalyst concentration. Therefore, we used the decrease in the activated monomer concentration as a marker for the transition time. This concentration was averaged over a sliding time interval, and this was used to define regional and system transition times, in the same way as for the Two’s Company model (see Methods section).
The models we have discussed here have features in common with various theoretical models discussed by other authors. There has been a lot of work on maintenance of replicators in the face of accumulation of mutations. The standard error threshold problem  deals with linear replicators in a non-spatial model. A spatial lattice version of this problem has been studied  and this has been compared to the case of catalytic replicators in a spatial system  using a model similar to the Two’s Company model. The fundamental difference between linear and catalytic replicators is that when a replicator acts on another sequence as template it is being altruistic. Evolution of catalytic replicators is therefore related to the evolution of cooperation. A large number of studies have shown that cooperation can evolve in spatial models when the equivalent non-spatial models would be invaded by parasites [35–43]. These models make it clear that formation of clusters of cooperating molecules and formation of spatial patterns is generally very important for evolution and survival of catalytic replicators. However, none of these papers dealt with the origin of the replicating system. Our work also shows that incorporation of local diffusion and spatial concentration fluctuations is essential to enable the origin of life to occur with an appreciable probability. Other models [44–46] that deal with prebiotic replicators and consider the transition to a replicating state are closer to our work. Spatial effects are also apparent in these models. Another approach to the origin of the RNA world  uses very complex sets of reaction equations for the emergence of RNA replication and also makes a link with nucleation phenomena. Models for the origin of replication [29–31] have been studied previously that are very similar to ours, however the stochastic dynamics and spatial effects were not looked at by these authors.
Despite important differences in their details, the Two’s Company model and the RNA polymerization models exhibit surprisingly similar behaviour, both for the phase diagram (Figures 1 and 2) and the dynamics of the transition to life (Figures 6 and 9). The similarity arises because the models share the features of nonlinear feedback and limited resources. For the Two’s Company model, the replication rate of the molecule depends nonlinearly on the concentration of the molecule because of the kϕ 2 term, and limited resources are modelled by the rule that there can be no more than 3 molecules on the same site. For the RNA polymerization model, the nonlinearity arises because of the feedback in polymerization (Eqn. 4) because of the way the ribozyme concentration depends on the polymerization rate (see ). The limited supply of food molecules from the environment is included in the model, and this prevents the ribozyme concentration increasing indefinitely in the living state. Given these two features, both models have two stable states and a bistable region in the phase diagram.
The two models exhibit essentially the same stochastic transition arising as a result of concentration fluctuations. When the system is well mixed, fluctuation will decrease with increasing system size due to increasing number of molecules in the system. Hence the transition becomes increasingly difficult for larger system sizes. With limited diffusion, the system is no longer well mixed and breaks down into many weakly interacting regions. The larger the system, the more such independent regions there are in which the transition can occur. Therefore the transition becomes faster for larger systems in a 2D model with limited diffusion.
Both models illustrate that multiple origins of life could occur if the system is large enough. As far as we know, however, all current life on Earth is descended from a single origin. The size of the Earth is very much larger than the distance over which a molecule could easily diffuse. However, the window of time during which life arose is narrowed to a few hundred million years by geological evidence . If the origin of life is a rare event taking hundreds of millions of years, it is still perfectly possible that it occurs only once (or not at all) on a large planet. Once the living state has arisen, the time for multiplication and spread across the planet is determined by molecular time scales (replication and diffusion) which are likely to be very much faster than the waiting time for the rare event that creates life in the first place. In other words, it is likely that the real world falls in the intermediate part of Figure 9, where the transition time varies like 1/L 2 and there is a single origin. However, the scenario involving multiple origins could still be compatible with the view of single tree of life if there were competition between different forms of life and only one remained, or if there were symbiosis and merging of features that derived from different origins.
Two parameters that are important in the RNA polymerization model are m, the minimum sequence length required to be a catalyst, and f, the probability that a sequence of length n ≥ m is a catalyst. In this paper, we ran many simulations for each set of parameters in order to study the way the mean transition time depended on h and L. For this reason, it was necessary to choose parameters for which the transition occurred fairly easily. Therefore, we made the length short (m = 5) and chose the easiest case where f = 1. In our first paper, we looked at much longer ribozymes (m = 50) with f much less than 1, which seems more realistic if we are looking for ribozymes that would have specific sequences and structures, and the essential behaviour is very similar. In reality, we do not know how long the first catalytic sequences might have been. We are used to thinking of enzymes hundreds of amino acids long in current organisms, and currently known polymerase ribozymes are all well over 100 nucleotides long [7, 9, 14]. Hence, we tend to think of biopolymers as rather long and very specific in their sequence. It seems reasonable to suppose that the earliest catalysts must have been much shorter and less sequence specific if they were to have had any chance of arising in prebiotic conditions. Short sequences would be relative common, and relatively easy to replicate even if the fidelity was quite low. These are important advantages of short sequences, even if they have only very limited catalytic ability compared to much longer, well-adapted ribozymes.
In all our models of RNA polymerization, we have made the simplification of ignoring nucleotide sequences and keeping track only of polymer length. It is supposed that polymerase ribozymes catalyze all polymerization reactions equally. Hence, they will catalyze the formation of random non-functional sequences in addition to increasing their own formation rate. This would be extremely inefficient, and it seems likely that real ribozymes could have done better than this by being more specific. Nevertheless, an autocatalytic living state arises even in this simple non-specific case, and there is a clear difference between the living and the dead state, which is the aspect of the model we wish to emphasize. The case of non-specific polymerases has sometimes been called “pre-life” [29–31], and the term “life” has been reserved for a specific polymerase that catalyzes only the steps that lead to formation of its own sequence. Such a specific polymerase would have been much more efficient, as it would not waste its time making non-functional polymers. However, it is difficult to see how a specific polymerase would distinguish between the polymerization steps that contribute to its own formation and those that do not. In fact, polymerases in modern organisms are general, not specific. They can synthesize any general sequence, but they make specific sequences because they make use of templates, not because they distinguish between the reaction steps that link different kinds of monomers.
Of course, not all sequences are equivalent in the real world, and whether a molecule can function as a catalyst depends on its sequence and structure. It would be possible to consider a model for polymerization of four nucleotides in which the base sequence of each polymer in the mixture is stored in memory. One could then determine whether a sequence was a catalyst using criteria based on secondary structure or whether it contained a specific sequence motif within it. This would be more realistic than simply giving each sequence a probability f of being a catalyst, as in our current model, but the essential point that the formation of catalysts causes a feedback in the polymerization rate would remain unchanged in a more complex model that considered RNA sequence and structure in more detail. However complex and realistic we try to make a model, there will always be factors that must be left out. In some cases, simplicity is a virtue in computational models if this helps to show the essential points clearly. We believe that the simple models studied here illustrate several essential features of the origin of life problem as it occurred in the real world, i.e. the origin of life is a transition between two alternative steady states of a chemical reaction system that is driven by stochastic concentration fluctuations involving relatively small numbers of molecules in a localized region of space.
We simulated the stochastic dynamics of the Two’s Company Model using the Gillespie algorithm. At each time step, one elementary reaction was chosen at random with a probability proportional to its rate. The duration of the time step was a random time τ, chosen from an exponential distribution P(τ) = Re− Rτ , where R is the sum of the rates of all the elementary reactions that could occur at that point in time.
Simulations were performed to measure the mean time until the transition to the living state occurred. The simulations were stopped when the global concentration became close to ϕ 2 , the solution of Equation 5 corresponding to the living state. In order to insure that the system was stable in the living state, ϕ was averaged over a sliding time interval of width 20, which is 20 times the average life time of a molecule. The simulation was stopped when the sliding average density reached 90% of ϕ 2 . This time was then recorded as the system transition time of one trial. The average system transition time, T sys, was obtained from 1000 trials for each set of parameters. When the local hopping rule is used, the transition happens at one place then spreads across the whole system. We may write T sys = T reg + T spread , where T reg is the time until a local transition occurs in any one region, and T spread is the time for the living state to spread from one region across the rest of the lattice. The system was divided into multiple regions of size 10×10 lattice sites. The concentration was calculated within each region and averaged over a time window in the same way as the global concentration. T reg was defined as the point at which the sliding window concentration of any one region reached 90% of ϕ 2 .
where ϕ n = (q n1 + 2q n2 + 3q n3)/3 is the mean density on sites that are neighbours of sites with n particles. Note that there are 8 neighbours from which a molecule could come, but only 1/8 of the hops from any one neighbour move to the site in question. Therefore these two factors of 8 cancel out.
The mean field solutions shown in Figure 7 are the stationary states of these equations. The simulation with global hopping rules follows the mean field solution exactly because there are no correlations between sites when hopping is random. The mean field solution is an approximate solution for the case with local hopping rules.
Reviewer 1: Omer Markovitch, Weizmann Institute of Science (nominated by Doron Lancet).
The authors argue that a phase diagram, exhibiting living and non-living states (when the templating rate is respectively bigger or equal to the spontaneous polymerization rate), is a generic feature of replicator-first origin of life models. The authors then develop a simple lattice model in which a site can contain up to 3 molecules, templating can occur when exactly 2 monomers exist in a site and diffusion between adjacent sites is allowed. Their spatial model exhibits patches of replicating molecules, some of which can eventually grow to dominate the system akin to a percolation transition, in analogy to a bistable diagram. They find that rapid diffusion in a large lattice inhibits the non-life to life transition because fluctuations are smaller. The manuscript deals with spatial effects in replicating polymers, which did not receive full attention before, and presents the results in a clear manner, thus deserves publication.
The discussion of 3-molecules-limit vs. no-limit is suggested to be extended. The authors should consider performing an intermediate test in order to verify that carrying capacity is qualitatively similar to concentration limit.
Response: We would not call this a percolation transition. The analogy we are making is with a nucleation phenomenon, such as nucleation of solid crystals in a liquid followed by subsequent rapid growth of the crystals.
In the RNA polymerization model, resources are explicitly modelled, and therefore no carrying capacity is necessary. In the Two’s Company it is necessary to impose a carrying capacity in order to avoid unbounded growth. It would also be possible to add a resource molecule explicitly to the Two’s Company model, but this would go against our intention of making this model as simple as possible.
1) The abstract will benefit from being toned down, example given from “The key step is…” to “A key step is…”.
Response: We changed this to A in the abstract.
2) Providing information on the existence of alternative views to the RNA world is suggested.
3) Section 2 (A Generic Phase Diagram for Replicating Molecules) is too long. It is suggested to provide fewer details of a previously published work and stress the stress the conclusion right at the beginning.
Response: We have made only very slight reductions in this section. All the equations are necessary for what follows in this paper, and the textual discussion is important. The first two sentences already state the purpose and conclusion of this section.
4) The reader could benefit from an illustration / diagram of the “Two is a company” model.
Response: A new Figure 3 has been added showing the rules for this model.
Reviewer 2: Claus Wilke, University of Texas, Austin.
Comment: This article uses mathematical modeling and computer simulations to study a simple origin-of-life model. The two main contributions this article makes are (i) a simplified model of the origin of life that maps in a straightforward fashion to a more complex model and (ii) a detailed analysis of the origin-of-life model in a spatial setting. The authors address the conditions under which a living state can emerge from a non-living state, and find that a spatially extended system with moderate diffusion can greatly facilitate the emergence of the living state. Most importantly, in this regime, the probability of emergence scales with the size of the system, so that the emergence of life somewhere is virtually guaranteed for sufficiently large systems. Overall, this is a very nice contribution. The modeling is sound, and the authors use multiple avenues (different models, different modeling approaches) to support their theory.
In the expression k = k 0 f, wouldn’t f be vanishingly small for standard RNA sequences, possibly so small that even a large k 0 would not be sufficient to yield a reasonable k? The authors give some qualitative discussion of this topic towards the end of their paper, and I’d like to see the arguments fleshed out a bit more and possibly made quantitative. Do we have a sense of what a reasonable k 0 would look like, and what k needs to be to make the origin of life happen within, say, a billion years? What would the corresponding f be, and is it reasonable?
Response: We agree that if only a single sequence is functional and if the sequence is very long, then f would be very small. We also agree that if f is very small then k 0 must be large. However, it is probable that many sequences folding to the same secondary structure would have had similar functions, which makes f larger. This argument also suggests that it may be easier to begin with rather short ribozymes with a rather low catalytic ability than with longer, but rarer sequences that are better catalysts. These questions are clearly important, but we do not know how to put concrete numbers on these quantities. To give a quantitative answer to the time scale and required reaction rates, we would need to the concentrations of reagents on the primitive Earth, the reaction pathways by which nucleotides and RNAs were synthesized, and the temperature, pressure and pH at which these reactions were occurring. Current knowledge of these things is very incomplete.
Comment: I think the manuscript would improve if it had a separate Methods section containing some of the more technical details of the simulations and derivations. Section 4 in particular mixes results with highly specific implementation details that don’t move the story forward. Similarly, other sections contain material that doesn’t directly impact the overall story, such as the proof that the Two’s Company model agrees with Eq. (5). I’m not asking the authors to delete this material, just reorganize it such that the a reader can get the most important results without also having to read all the supporting evidence and materials.
Response: A methods section has been added at the end of the paper and the technical sections have been moved to the methods. The separate sections on the Two’s Company have now been merged into one main section.
Reviewer 3: Nobuto Takeuchi, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, USA. (Nominated by Eugene Koonin).
Comment: In their manuscript, Wu and Higgs present a minimal replicator model that exhibits a stochastic transition from a “non-living” to a “living” state. Such a transition was originally proposed and investigated by Dyson in a different model [32, 49]. Using this minimal model, the authors investigate the factors determining the waiting time for such a transition. In particular, the authors calculate the waiting time as a function of the diffusion rate of replicators. The calculation reveals that the waiting time is smallest when the diffusion rate assumes an intermediate value. This result is intuitively explained by the authors as follows: Too strong diffusion diminishes the stochasticity of dynamics and so impedes the transition, whereas too weak diffusion prevents replicators from encountering each other and, thus, from catalyzing each other’s replication. Moreover, the authors examine the waiting time as a function of the system size. The calculation shows that the waiting time decreases as the system size increases, provided that the diffusion is finite. This result sharply contrasts with that obtained for infinite diffusion. The authors again give an intuitive explanation: If diffusion is finite, a non-living-to-living transition first occurs in a local region and then propagates through the entire system; thus, the larger the system, the greater the number of local regions (assuming finite diffusion), and so is the probability of a transition per unit time. Also, the authors check this result in a more complex model that considers RNA polymerization.
The manuscript describes its content in a clear manner. I obtained only one specific question regarding the results (more general comments to follow). The transition time is shortest when the diffusion rate takes an intermediate value in the minimal replicator model. Does this result also hold in the RNA polymerization model in general?
Response: Yes, although we have not fully investigated the regime of very low diffusion rate in the RNA polymerization model. Very low diffusion rate should be unfavourable because the catalysts can only be effective if monomers and growing chains encounter the catalysts and if newly created catalysts can spread beyond the site in which they were created.
Comment: I have three general comments (and a question). First, the non-living-to-living transition exhibited by the models rests on the combination of the two processes: the stochastic occurrence of rare events and the subsequent deterministic amplification thereof. Interestingly, such a combination also plays an important role for the stability of a spatially extended RNA-like replicator system . Namely, the stable coexistence between catalysts and parasites is enabled by the formation of traveling wave patterns, whose emergence rests on the combination of the two processes: rare stochastic events in which a few catalysts are spatially segregated from parasites and the subsequent expansion of these catalysts through replication (followed by infection with parasites). This kind of dynamics, in which rare stochastic events are deterministically amplified, might be a generic feature of (life-like) systems that have a finite system size and positive feedback (see also the last paragraph).
Response: We agree that it is interesting to note that stochastic effects and limited spatial diffusion favour both the origin of life and the stable coexistence of replicators and parasites.
Comment: My second comment is concerned with the implication of the current study in relation to Dyson’s study [32, 49]. Dyson’s model is conceived under the metabolism-first scenario for the origin of life: it is created as a simplest model of metabolic systems and assumes no replicators. By contrast, the minimal replicator model investigated in the current study, of course, assumes replicators. The RNA polymerization model of Wu and Higgs is conceived under the RNA world hypothesis, but does not actually assume replicators (i.e., it does not assume template-directed polymerization). Despite these differences, all these models exhibit essentially the same general result: two stable states that are regarded as non-living or living and stochastic transitions between them. Therefore, the general result seems to be independent of whether the model assumes metabolism, replicators, or the RNA world. This independence is also implied by the reason why these models produce the same result. Wu and Higgs identify two features shared by their models that are essential for the general result: resource limitation and nonlinear (positive) feedback—the latter does not necessarily imply replication. These features are clearly shared by Dyson’s model as well. Taken together, the results of the current study seems to imply that Dyson’s proposal of identifying the origin of life as a stochastic transition between two stable states is orthogonal to whether one considers the metabolism-first scenario, the replicator-first scenario, or the RNA world hypothesis.
Response: Although we like the idea of the stochastic transition to life that was introduced first by Dyson, we feel that Dyson was not clear on what was meant by the ‘active’ and ‘inactive’ states and why the active state corresponds to life. In both our RNA polymerization model and the Two’s Company model, the high concentration of catalysts in the living state is maintained by the reactions carried out by the catalysts. The definition of the living state in our models is that it has autocatalytic biopolymers. Although we could presumably write down a metabolism-only model without biopolymers and without replication that would have two stable states and a possibility of a stochastic transition, this would be a purely chemical system and it would not constitute a transition to life, in our view, because the high-concentration state would not have replication, heredity and evolutionary potential.
Comment: My last comment is concerned with the authors’ treatment of the linear growth term in the minimal replicator model. In investigating this model, the authors regard this term as less important than the other terms. However, I find three arguments to challenge this view. First, the linear growth term can play an important, though negative, role for the reported results of the model. Namely, if the linear growth rate r is greater than the decay rate u, the model has only one stable state (Figure 2 shows results for r < u). Second, replicators whose growth is described by a linear term (exponentially growing replicators, for short) are theoretically the simplest possible replicators . Moreover, they have been experimentally synthesized .
The last argument concerns the very core of the current study. The non-living-to-living transition conceived by the authors entails rare stochastic events and the deterministic amplification thereof. Such a transition might be exhibited by exponentially growing replicators as well if one abandons the presupposition that the non-living state must be dynamically stable, as follows. Let us suppose that RNA molecules are randomly synthesized as assumed in the RNA polymerization model. Moreover, some RNA sequences are capable of self-replication; however, these sequences form but a tiny subset of the sequence space. Now, the non-living state corresponds to the state in which the system hardly contains any self-replicating molecules; the living state, the state in which the system contains self-replicating molecules in abundance. The rare stochastic event corresponds to the emergence of a (few) self-replicating molecule(s) through random synthesis; the deterministic amplification, the expansion of the nascent population of these molecules through self-replication. This argument suggests a topic for further discussion: Must “non-living” states be dynamically stable?
Response: If r > u in our Two’s Company model, there will be no non-living state and there will be spontaneous multiplication of replicators without any waiting. This does not seem realistic to us. It is important to have non-linearity in some way in order to have two stable states. In the Two’s Company model there is only one way to do this, which is the quadratic term representing a two-molecule replication process. However, in more complex models there are many ways to do it. In the RNA polymerization models, the nonlinearity can arise either by feedback in the polymerization rate or in the monomer synthesis rate []. It is true that the ligase reaction in [] corresponds to an exponentially growing replicator, but this system only works if it is supplied with very specific complementary oligomer strands. This example does not seem very close to the r reaction in our model. For the r reaction, we envisage a template directed process (e.g. on the surface of a clay catalyst) in which a single strand acts rather passively as a template. Although the template ability of different RNAs might depend to some extent on the sequence, it seems likely that most sequences could act as a passive template to some degree. If the mineral catalyzed r process was already good enough to support exponential growth (r > u), then there would be immediate multiplication of large numbers of RNAs with no sequence specificity, which seems unreasonable. The reviewer’s suggestion that the non-living state need not be dynamically stable makes sense as a logical possibility, but it seems unlikely to us, because the origin of life would just be too easy in this case.
The possible interplay between template directed processes and ribozyme-catalyzed processes is interesting and is not fully captured in the models we studied so far. In our view, a non-specific r process could be important in generating diversity of RNA strands prior to life, but life itself would only arise when specific ribozymes came along to carry out the k process. Note that the ligase replicator in  is not acting as a template to specify the sequence of the new strand, because the sequence is already specified in the oligomer strands that are supplied to it. For an RNA strand to act as a catalyst it needs to fold to a specific structure. It is difficult to see how any strand could be a folded catalyst and an unfolded template at the same time. Therefore we are led almost inevitably to consider two-molecule replication processes.
Reply to the authors’ reply: Given the persistent differences in the definitions of life, it is all the more interesting that these models—conceived under different views of life—produce the same general result based on the same principle. The authors consider that there is unlikely to be a replicator that grows exponentially and requires specific sequence patterns to act both as a catalyst and as a template (I assumed such replicators when I suggested that the non-living state need not be dynamically stable). The ligase replicator of Lincoln and Joyce (2009) is an example of such replicators as elaborated below. Although it does not answer the question I raised (its growth requires substrates prepared by humans), it suggests great possibilities of what RNA molecules can do (it was generated by (only) six rounds of in vitro selection).
Contrary to the authors’ interpretation, the ligase replicator of Lincoln and Joyce (2009) does act as a template to specify the sequence of the new strand as follows. Their replicator consists of two ligases (denoted by E and E’), each composed of two substrates (denoted by A and B, and A’ and B’, respectively). The substrates A and B’ contain sequences complementary to each other, and so do the substrates A’ and B. Based on this complementarity, each ligase (say E) forms a complex with two substrates (A’ and B’) and catalyzes the ligation between them, synthesizing the other ligase (E’). In this way, a ligase acts both as a template and as a catalyst for the synthesis of the other ligase. More important, Lincoln and Joyce (2009) generated 12 sets of distinct substrates (denoted by An, Bn, A’n, and B’n where n ranges from 1 to 12). During a serial transfer experiment, base pair mismatches generate “recombinants” such as a ligase composed of A5 and B3 (denoted by E5,3). Such a recombinant can replicate; for example, an E5,3 forms a complex with A’3 and B’5 and catalyzes the synthesis of an E’3,5, which, in turn, catalyzes the synthesis of an E5,3. In this way, the replicator can transmit “genetic information”, thanks to the ability to act as a template.
This work was supported by the Natural Sciences and Engineering Research Council of Canada. We are grateful for computational facilities made available by the SHARCNET research computing consortium.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.