The results clearly show that an RNA chromosome can spread when it adopts a circular form and its “sense strand” is readily broken at the sites between genes (Figures 2, 3 and Additional file 1: Figure S1). Both the circularity and the inter-gene breaking are important for the spread of the chromosome (Figure 4). Therefore, the computer simulation study supports our hypothesis that circularity plus self-cleavage may have been used as a strategy for the emergence of a chromosome in RNA-based protocells (Figure 1).
In our model, we assumed that there was end-degradation for a linear RNA chain. The results show that a circular RNA chromosome, which can avoid end-degradation, would spread, whereas a “fictive” linear RNA chromosome with all the other properties which have been assumed for the circular chromosome (the high probability of turning into template and the self-cleaving feature) cannot spread (Figure 4A, top-left panel). This finding means that such an assumption is important for the scenario described here. However, is such an assumption conceivable?
In modern prokaryotes, circularity of the DNA chromosome is believed to be a strategy for resisting end-degradation, which is caused mainly by exonuclease cleavage of terminal phosphodiester bonds (circularity is also believed to prevent chromosome shortening during replication, which arises because DNA polymerization proceeds in the 5′-3′ direction and requires RNA primers). RNA degradation in modern cells has been studied in detail, and exonuclease activities were shown to be apparently more prevalent than endonuclease activities. These clues imply that there may be chemical reasons that render terminal phosphodiester bonds easier to break. Chemical RNA degradation in the absence of proteins was explored very early; however, as far as we know, there is to date no direct evidence that terminal phosphodiester bonds break more easily than those in the middle of a chain. Nevertheless, the assertion seems reasonable given current knowledge in this area. The stability of a phosphodiester bond may be affected by the geometry of the linkage, where the position of the attacking 2′-oxygen nucleophile relative to the 5′-oxyanion leaving group is important. Base stacking within a double-stranded RNA can stabilize phosphodiester bonds by preventing the linkage from adopting the conformation that favors hydrolysis. Likewise, base stacking within a single-stranded RNA would also contribute to chemical stability. Terminal base stacking may be expected to be less stable than base stacking within the chain, which would render the terminal phosphodiester bonds easier to break. Alternatively, one may speculate that exonucleases are prevalent in modern cells because terminal phosphodiester bonds are more exposed than internal ones to enzymes in solution. If so, they would likewise be more exposed to other possible catalysts.
Indeed, it has been reported that some putative prebiotic oligopeptides (those that existed before the emergence of the translation mechanism) could catalyze the cleavage of RNA chains, although in these cases it was internal bonds that were cleaved. Therefore, in the RNA world, the breaking (or cleavage) of terminal phosphodiester bonds may also have been a significant issue.
Another kind of end-degradation of RNA may result from the spontaneous decay of nucleotide residues at the ends into their precursors. In modern cells, this kind of RNA degradation may be negligible because of the efficient nuclease activities that cleave phosphodiester bonds. However, under prebiotic conditions this effect may not have been negligible. In the scenario described in our model, nucleotides may decay to their precursors (with the probability P
). It is unreasonable to assume that RNA cannot decay into nucleotide precursors until every phosphodiester bond has been hydrolyzed. At the least, end-residues, which are more exposed to the solution, may be subject to decay like mononucleotides, although to a lesser extent. Therefore, while we assumed that residues within an RNA chain cannot decay, we also assumed that end-residues may decay to nucleotide precursors, but with a smaller probability than that of free nucleotides (i.e., P
should be smaller than P
). The spontaneous decay of end-residues may result in RNA end-degradation. One conceivable mechanism is that the glycosidic bond between the ribose and the base of an end-residue becomes exposed to the solution and breaks, so that the base is lost; then, without the protection of base stacking, the end phosphodiester bond between the remaining ribose and the second nucleotide residue may break more easily, resulting in the loss of the ribose. Interestingly, an experimental study suggested that under some possible prebiotic conditions (e.g., in solution with relatively low pH), the glycosidic bond would be much more unstable than the phosphoester bond in a nucleotide. In such conditions, the spontaneous decay of end-residues clearly cannot be neglected, and it may even be significantly more intensive than the breaking of internal phosphodiester bonds of RNA chains (i.e., P
would be greater than P
In the model, we only assumed the spontaneous decay of end-residues, but did not assume the possibly easier breaking of the terminal phosphodiester bonds of RNA chains. This is a conservative consideration. If both kinds of end-degradation exist, the benefit of circularity to prevent end-degradation can be expected to be more apparent.
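The contrast between a linear and a circular chain under end-residue decay can be illustrated with a minimal sketch. It is not the simulation code used in this study: the function names and the parameter `p_end_decay` are hypothetical, each end-residue is assumed to decay independently once per step, and breaking of internal phosphodiester bonds (which could linearize a circular chain) is deliberately left out.

```python
import random

def end_decay_step(chain, circular, p_end_decay, rng):
    """Apply one step of end-residue decay to an RNA chain given as a string.

    Assumption: only the two end-residues of a linear chain can decay,
    each independently with probability p_end_decay (a hypothetical name).
    """
    if circular or not chain:
        return chain  # a circular chain has no ends to decay
    if rng.random() < p_end_decay:
        chain = chain[1:]   # the 5' end-residue decays
    if chain and rng.random() < p_end_decay:
        chain = chain[:-1]  # the 3' end-residue decays
    return chain

def surviving_length(chain, circular, p_end_decay, steps, seed=0):
    """Return the chain length remaining after a number of decay steps."""
    rng = random.Random(seed)
    for _ in range(steps):
        chain = end_decay_step(chain, circular, p_end_decay, rng)
    return len(chain)
```

For example, `surviving_length("AUGC" * 5, circular=False, p_end_decay=0.05, steps=200)` gives the remaining length of a 20-nt linear chain in one stochastic run, whereas the same call with `circular=True` always returns 20 under these assumptions.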
The simulations per se do not demonstrate the de novo emergence of a chromosome from unlinked genes. The appearance of the first chromosome molecule would have been a rare chance event, involving random chain ligation/recombination plus cyclization. Indeed, this event may have occurred and subsequently been “abandoned” more than once, considering the high chance of RNA degradation. The really important issue is how the initial chromosome molecules could have had any chance of spreading, which is what was explored here. The study also offers some hints about the history of this transition. When ribozymes were simple in structure, and thus not very efficient in catalysis, they may have acted as good templates themselves (with higher P
, e.g., 0.9 or 0.5, as shown in Figure 4B, bottom-left panel), and there would have been a world of these unlinked genes (see also our previous work, in which cooperation as well as competition among these ribozymes as unlinked genes was discussed). When the ribozymes evolved into a more efficient form with a more complicated structure, they may no longer have been able to act as good templates (with lower P
, e.g., from 0.2 to 0.01, as shown in Figure 4B, bottom-left panel), then the chromosome would have an opportunity to emerge (provided that it adopted a strategy of circularity and self-cleavage).
The simulation reported here was based on a model with a resolution at the monomer level, that is, individual nucleotides (A, U, G and C) and amphiphiles, and is therefore very computationally intensive. For simplification, the model adopts a two-dimensional grid system, like the traditional stochastic cellular automaton used by the replicator models [3–6, 9]. Here, however, a grid room can accommodate a number of molecules that are deemed close enough to interact with each other. This differs from the traditional stochastic cellular automaton, in which one molecule occupies one grid room and molecular interactions occur between neighboring grid rooms. This treatment, somewhat similar to the approach used in a recent simulation study on prebiotic sequence evolution, saves computational cost and favors simulations involving complicated interactions at the monomer level. Also for simplification, the characteristic domains of the ribozymes assumed in the model are shorter (8 nt in the cases shown here, and 10 nt in some other cases) than in reality, and no structural features are considered. However, the principle that function is determined ultimately by sequence should be sufficiently represented. Additionally, a self-cleaving site, which in reality would be marked by a hammerhead ribozyme subsequence, is represented here by only two residues (i.e., “U-G” in the cases shown here, see the legend to Figure 2). However, the mechanism of self-cleavage between genes should be sufficiently represented. Certainly, increasing the length of the ribozymes and of the self-cleaving label sites would bring our simulations closer to reality, but the system scale (represented, for example, by T
) would increase correspondingly, and computation would become more cumbersome, even unmanageable.
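The difference between the two grid treatments can be sketched as a data structure. The following is a minimal illustration, not the code used in this study: the class and method names are hypothetical, and the wrap-around (toroidal) boundary and the four-neighbor movement rule are assumptions made only to keep the example self-contained.

```python
import random
from collections import defaultdict

class Molecule:
    """A hypothetical container for one RNA chain in the system."""
    def __init__(self, sequence):
        self.sequence = sequence

class RoomGrid:
    """An n x n grid in which each room holds any number of molecules."""
    def __init__(self, n):
        self.n = n
        self.rooms = defaultdict(list)  # (x, y) -> list of molecules

    def add(self, x, y, molecule):
        self.rooms[(x % self.n, y % self.n)].append(molecule)

    def partners(self, x, y, molecule):
        """Interaction candidates: the other molecules in the same room."""
        return [m for m in self.rooms[(x, y)] if m is not molecule]

    def move(self, x, y, molecule, rng):
        """Move a molecule to a random adjacent room (assumed toroidal)."""
        dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        self.rooms[(x, y)].remove(molecule)
        nx, ny = (x + dx) % self.n, (y + dy) % self.n
        self.rooms[(nx, ny)].append(molecule)
        return nx, ny
```

In a one-molecule-per-room automaton, `partners` would instead have to look into neighboring rooms. Keeping interaction candidates inside a single room means that movement between rooms is the only cross-room operation.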
In the simulation, it can be observed that the spread of the chromosome depends on the function of the genes that it carries. The control, similar to the chromosome but with a different sequence without any genes, cannot spread (white triangles in Figure 2 and in the Additional file 1: Figure S1; white bars in Figure 3-top-row; grey bars in Figure 4B and in the Additional file 1: Figures S2-S5). When the function of the Rep becomes less efficient (in the Additional file 1: Figure S2, P
decreases), or relatively less efficient compared with the non-enzymatic reaction (in the Additional file 1: Figure S2, P
increases), the spread of the chromosome is disfavored. Similar results are also shown for Nsr and Npsr (in the Additional file 1: Figure S2, P
). These results emphasize that the spread of the chromosome depends on the ribozymes it encodes. However, for Asr, the result is somewhat different: the spread of the chromosome is disfavored when P
decreases from 0.9 to 0.2 (in the Additional file 1: Figure S2), similar to the results for the other ribozymes; but in contrast, the spread is favored when P
decreases from 0.2 to 0.01. Additionally, the spread is favored when Asr becomes less efficient in comparison with the non-enzymatic reaction (in the Additional file 1: Figure S2, P
increases). This difference should be caused by another factor that affects the spread of the chromosome. That is, a more efficient Asr would result in a faster membrane growth, and lead to protocell division, which at this early stage would be caused by random physical forces in the environment as the protocells increased in size. As a result, the ribozymes (Rep, Nsr, Npsr and Asr) are more likely to separate from the chromosome accompanying the protocell division, and will not “serve” the chromosome any more. The result that a higher probability of protocell division disfavors the spread of the chromosome (in the Additional file 1: Figure S3, P
) supports this argument.
This result, concerning P
, is a little surprising per se. In our previous study on the cooperation of different ribozymes without the existence of a chromosome, we showed that the co-spread of the ribozymes was disfavored when P
was higher. The reason is that a higher rate of protocell division may result in more intensive gene loss during the division. Before the present simulation was conducted, we had expected that a chromosome (with linked genes) would clearly be able to withstand faster cell division.
Notably, the numbers of ribozymes in the system are quite low compared with the number of chromosomes (Figure 2 and Figure 3-top-right; see also Figure 4B and in the Additional file 1: Figures S2-S5), except for the cases in which the chromosome cannot spread and ribozymes flourish because the probability of ribozymes acting as templates themselves rises (Figure 4B, P
). In the primordial strategy suggested here, the ribozymes are only byproducts of the chromosome replication (via self-cleaving, see Figure 1). As a result, the numbers of ribozymes are retained only at a low level. In this situation, fast cell division would be clearly deleterious. Ribozymes produced from the chromosome are likely to “serve” the chromosome only for a short time, and protocells with the chromosome might lack ribozymes of one kind or another. This phenomenon might be called “ribozyme loss”.
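Why low ribozyme copy numbers make division costly can be seen from a back-of-the-envelope calculation. The sketch below is an illustration under stated assumptions rather than part of the model: at a single division, every ribozyme molecule is assumed to end up in the daughter protocell that carries the chromosome independently with probability `p_same_side` (0.5 for an even split), and the function name is hypothetical.

```python
def p_daughter_keeps_all(copies_per_kind, p_same_side=0.5):
    """Probability that the chromosome-carrying daughter protocell retains
    at least one copy of every ribozyme kind after one division.

    copies_per_kind: copy numbers before division, one entry per kind.
    Assumption: molecules are partitioned independently of each other.
    """
    prob = 1.0
    for n in copies_per_kind:
        # (1 - p_same_side) ** n is the chance that all n copies of this
        # kind end up in the other daughter.
        prob *= 1.0 - (1.0 - p_same_side) ** n
    return prob
```

With four ribozyme kinds present in a single copy each, `p_daughter_keeps_all([1, 1, 1, 1])` evaluates to 0.0625, so under these assumptions most divisions leave the chromosome without at least one kind of ribozyme until new copies are produced by self-cleavage.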
The emergence of the chromosome in RNA-based protocells would favor the appearance of more genes and the corresponding ribozymes, and the problem of gene loss accompanying protocell division would no longer arise. The problem of “ribozyme loss” is not as serious as that of gene loss, because the genes are always preserved in the chromosome and the ribozymes would be produced continuously from it. Subsequently, of course, a mechanism of transcription that could use tags like the promoters in modern cells may have emerged to produce more copies of ribozymes, thereby alleviating the problem of ribozyme loss. Further study to model this possible subsequent stage will be important and interesting, particularly to show what level of complexity the RNA world may have reached before the advent of DNA and proteins.