Neutral genetic drift can alter promiscuous protein functions, potentially aiding functional evolution

Background Many of the mutations accumulated by naturally evolving proteins are neutral in the sense that they do not significantly alter a protein's ability to perform its primary biological function. However, new protein functions evolve when selection begins to favor other, "promiscuous" functions that are incidental to a protein's original biological role. If mutations that are neutral with respect to a protein's primary biological function cause substantial changes in promiscuous functions, these mutations could enable future functional evolution. Results Here we investigate this possibility experimentally by examining how cytochrome P450 enzymes that have evolved neutrally with respect to activity on a single substrate have changed in their abilities to catalyze reactions on five other substrates. We find that the enzymes have sometimes changed as much as four-fold in the promiscuous activities. The changes in promiscuous activities tend to increase with the number of mutations, and can be largely rationalized in terms of the chemical structures of the substrates. The activities on chemically similar substrates tend to change in a coordinated fashion, potentially providing a route for systematically predicting the change in one activity based on the measurement of several others. Conclusion Our work suggests that initially neutral genetic drift can lead to substantial changes in protein functions that are not currently under selection, in effect poising the proteins to more readily undergo functional evolution should selection favor new functions in the future. Reviewers This article was reviewed by Martijn Huynen, Fyodor Kondrashov, and Dan Tawfik (nominated by Christoph Adami).


Background
Nature employs proteins for a vast range of tasks, and their capacity to evolve to perform diverse functions is one of the marvels of biology. Recently, it has become possible to reconstruct convincing scenarios for how new protein functions evolve. One of the most important conclusions of this work is that the initial steps may occur even before the new functions come under selection [1][2][3][4][5][6]. The reason is that in addition to their primary biological functions, most proteins are at least modestly effective at performing a range of other "promiscuous" functions [1,2,[7][8][9][10]. In laboratory experiments, selection can rapidly increase these promiscuous functions, often without much immediate cost to a protein's original function [2]. In a particularly compelling set of experiments, Tawfik and coworkers have shown that selection for a promiscuous activity likely explains the origin and evolution of a bacterial enzyme that hydrolyzes a synthetic compound only recently introduced into the environment [2,11,12]. Mounting evidence therefore supports the idea that new protein functions evolve when selection favors mutations that increase an existing weak promiscuous function.
But for as long as 50 years, since Linus Pauling and Emile Zuckerkandl published their seminal analysis of molecular change in proteins [13], it has been clear that only a fraction of the mutations that accumulate in most naturally evolving proteins are driven by selection for an entirely new function. Instead, many of the mutations responsible for natural sequence divergence do not change a protein's primary biological function, but rather are due to either neutral genetic drift [14] or pressure for a subtle recalibration of protein properties unrelated to the acquisition of an entirely new function [15]. However, mutations that accumulate under the constraint that they not interfere with a protein's primary function could still substantially alter other, promiscuous functions. Such alterations could then aid in the subsequent evolution of new functions.
Here we have experimentally investigated this possibility using a set of enzymes that have undergone genetic drift that is neutral with respect to a well-defined laboratory selection criterion for enzymatic activity on a single substrate [16]. We have examined how these enzymes have changed in their promiscuous activities on five other substrates. As described below, we find that the enzymes have often undergone substantial changes in their promiscuous activities, suggesting that neutral genetic drift could play an important role in enabling future functional evolution.

A set of neutrally evolved cytochrome P450 enzymes
We focused our analysis on cytochrome P450 proteins. P450s are excellent examples of enzymes that can evolve to catalyze new reactions, since they are involved in a wide range of important functions such as drug metabolism and steroid biosynthesis [17,18]. We worked with P450 BM3, a cytosolic bacterial enzyme that can perform a variety of hydroxylation and epoxidation reactions, and is thought to have a natural function that includes the hydroxylation of fatty acids [19,20]. We have previously described a set of P450 BM3 heme domain variants that were created by laboratory neutral evolution from a common parent sequence [16]. Here we briefly recap the procedure used to create these P450s in order to explain their origin and why they can properly be viewed as the product of neutral genetic drift, albeit under a selection criterion that represents a severe simplification of the pressures likely faced by P450s evolving in nature.
The essential difference between neutral genetic drift and adaptive evolution is that in the former case mutations that have no substantial effect on fitness spread stochastically in a population, while in the latter case mutations spread because they are beneficial and so favored by selection. Of course, it may be difficult to discern whether a specific mutation in a natural population has spread neutrally or due to favorable selection. But in the laboratory it is possible to define an arbitrary selection criterion to ensure that all mutations spread due to neutral genetic drift. Specifically, we imposed the requirement that the P450s had to hydroxylate the substrate 12-p-nitrophenoxydodecanoic acid  with an activity exceeding a specific threshold [16]. All mutant P450s were therefore straightforwardly classified as either functional (if they exceeded the threshold) or nonfunctional (if they did not). We readily acknowledge that the situation is almost surely more complicated for natural P450 BM3 enzymes, which are probably selected for several different enzymatic functions (the natural substrates of these enzymes are not definitively known [19,20]). In addition, we make no attempt to define how changes in a P450's enzymatic functions might map to changes in the overall fitness of its host organism, since there is not sufficient data on the metabolic and ecological role of P450 BM3 to convincingly reconstruct such a mapping. Instead, our selection criterion is simply an abstraction of the evolutionary requirement that an enzyme's primary activity must exceed some critical level in order to allow its host organism to robustly survive and reproduce. To implement laboratory neutral evolution using this selection criterion, we began with a single parent P450 BM3 heme domain variant (called R1-11) and used error-prone PCR to create random mutants of this parent [16]. Mutants that failed to yield sufficient active protein to hydroxylate at least 75% of the 12-pNCA of the R1-11 parent when expressed in Escherichia coli were immediately eliminated, while all other mutants were carried over to the next generation with equal probability. Any mutations that spread among the offspring sequences were therefore by definition due to neutral genetic drift, since there was no opportunity for any functional mutant to be favored over any other.
Throughout the remainder of this work, we will refer to mutations that spread by neutral genetic drift during our experiments as "neutral" -the reader should keep in mind that these mutations might not be neutral in all other evolutionary scenarios. In addition, in many natural settings a mutation may be slightly deleterious or slightly advantageous. A more nuanced description would therefore characterize a full fitness landscape containing different gradations of fitness [21,22], with evolution moving along ridges of high fitness [23]. Because the binary functional/nonfunctional classification made in our experiments ignores such gradations, these ridges in the fitness landscape are reduced to a neutral network, with all mutants either on the network (if they are functional) or off the network (if they are nonfunctional). We also emphasize that the fact that the mutations spread due to neutral genetic drift does not mean that they have no effect on the protein's properties. Indeed, one of the growing realizations about protein evolution is that mutations that spread by neutral genetic drift may still have an impact on future evolution [24,25]. One mechanism for this impact is that neutral genetic drift can change a protein's stability and so alter its tolerance to future mutations [26][27][28]. As will be demonstrated below, another mechanism is that neutral genetic drift can alter a protein's promiscuous functions.
As described previously [16], the end result of the neutral evolution was 44 different P450 variants, each of which satisfied the selection criterion for activity on 12-pNCA (these are the combined final sequences from the monomorphic and polymorphic populations in [16]). For the current study, we analyzed the promiscuous activities of 34 of these neutrally evolved P450 variants. The sequence diversity of these P450s is shown in the phylogenetic tree of Figure 1 and the full genotypes are given in Additional file 1; they have accumulated an average of four nonsynonymous mutations each. Most of the mutations in the P450 variants are unique -out of a total of 105 different amino acid substitutions, only twelve occur in more than variant (Additional file 1).

Activities of the neutrally evolved P450 enzymes
All of the P450 variants had evolved under selection solely for their ability to hydroxylate 12-pNCA. We examined their promiscuous hydroxylation activities on the five other substrates shown at the top of Figure 2. Two of these promiscuous substrates, propranolol (PROP) and 2amino-5-chlorobenzoxazole (2A5C, also known as zoxazolamine), are drugs that are metabolized by human P450s [29,30]. The other three promiscuous substrates, 11-phenoxyundecanoic acid (11PA), 2-phenoxyethanol (2PE), and 1,2-methylenedioxybenzene (MDOB), are organic compounds of increasing structural dissimilarity to 12-pNCA. The parent P450 possessed at least some hydroxylation activity on all of these substrates (throughout the remainder of this work, "activity" refers to total substrate turnovers per enzyme).
We measured the activities of all 34 neutrally evolved P450s on the five promiscuous substrates as well as 12-pNCA. Figure 2 shows the fold change in activity of each of the variants relative to the parent P450 on all six sub-Phylogenetic tree of the neutrally evolved P450s Figure 1 Phylogenetic tree of the neutrally evolved P450s. The tree shows the relationship among the 34 neutrally evolved P450 variants examined in this study. All of the P450s neutrally evolved from the same R1-11 parent P450. The horizontal lengths of the branches are proportional to the number of nonsynonymous mutations, as indicated by the scale bar. The vertical arrangement of the branches is arbitrary.

M1
M2 one mutation strates, and Figure 3 shows the same data with standard errors (raw data are in Additional files 2 and 3). As is apparent from these figures, many of the neutrally evolved P450s have undergone changes in their activities that substantially exceeded the standard errors of the measurements (for a more detailed analysis of measurement variability see the Methods and Additional file 4). Even on 12-pNCA, some of the variants have undergone modest increases or very mild decreases in activity. The modest increases in 12-pNCA activity were unsurprising, since the parent P450 only hydroxylates 12-pNCA with about a quarter of the activity reported for a P450 engineered for maximal 12-pNCA activity [31]. Likewise, the mild decreases in 12-pNCA activity were due to the fact that during neutral evolution the P450s were only required to maintain this activity above a minimal threshold (75% of the total 12-pNCA conversion of the parent protein when expressed in E. coli [16]). The changes in the promiscuous activities were often much larger than those on 12-pNCA. For example, several of the neutrally evolved variants have undergone nearly four-fold increases in activity on one or more of 2PE, 2A5C, and MDOB. Other variants have experienced equally large decreases in one or more of the promiscuous activities. The changes in promiscuous activities are therefore quite substantial, although still much less than the 10-to 40fold increases reported in directed evolution experiments that explicitly selected for improvement in a previously unselected P450 BM3 activities [31][32][33].
Fold changes in P450 activities with standard errors Figure 3 Fold changes in P450 activities with standard errors. The bar graphs show the fold change in activity of all 34 neutrally evolved P450 variants on all six substrates. This is the same data as in Figure 2, except these graphs also give error bars showing the standard errors in two separate measurements of the activities. In most cases the standard errors are much smaller than the activity changes themselves.  1  2  3  7  8  1  2  3  4  5  9  0  1  2  3  4  1  3  4  6  8  9  0  1  3  5  6  7  8  9  0  2  3  Each row shows the data for a different P450 variant, while each column shows the activity on a different substrate. The fold change in activity is the ratio of the variant's activity to that of the neutral evolution parent. Both the substrates and the P450 variants are hierarchically clustered according to the activity profiles, as shown by the dendrograms at top and left. Substrate abbreviations are the same as those used in the text: PROP -propranolol, 2A5C -2-amino-5-chlorobenzoxazole, MDOB -1,2-methylenedioxybenzene, 2PE -2phenoxyethanol, 11PA -11-phenoxyundecanoic acid. The standard errors for the changes in activity displayed in the heat map tend to be much smaller than the changes themselves; these errors are shown explicitly in Figure 3.

Broad patterns of change in activity can be rationalized in terms of substrate properties
The data in Figures 2 and 3 clearly indicate that some of the P450s have undergone substantial changes in their activities. In an effort to understand the nature of these changes, we sought to determine whether there were any clear patterns in the activities. In Figure 2, the substrates have been hierarchically clustered so that each successive cluster contains substrates on which the P450s have increasingly similar activities (the clustering is illustrated by the tree-like dendrogram at the top of the figure, with similar substrates in adjacent columns). This clustering of the substrates based on P450 activities is also readily rationalized in terms of their chemical structures. For example, 2A5C and MDOB cluster, meaning that P450s with high activity on one of these substrates also tend to have high activity on the other. Presumably, they cluster because the similarity of their structures (both are fusions of six and five membered rings) means that they have similar modes of docking in the substrate binding pocket. Likewise, 12-pNCA and 11PA are phenoxycarboxylic acids of similar chain length, and are in the same cluster. To a lesser extent, 2PE resembles 12-pNCA and 11PA in its phenolic ether structure, and it falls into a higher level cluster with these two substrates. PROP shares a fused ring structure with 2A5C and MDOB, and these three substrates share a common higher level cluster. Overall, the hierarchical clustering indicates that substrates that appear similar to the human eye are also "seen" this way by the P450s, since the P450s tend to increase or decrease their activities on these substrates in a coordinated fashion. Figure 2 also shows the P450 variants arranged in hierarchical clusters. A visual inspection immediately indicates that there is an overall association among all of the activities. Some of the P450 variants (redder rows) tend to show improved activity on most substrates, while others (bluer rows) tend to show decreased activity on most substrates. Taken together with the clustering of the similar substrates, this overall association suggests that there are two main trends in the activity changes. First, the P450s appear to have undergone general changes in their catalytic abilities that are manifested by broad increases or decreases in activity on all substrates. Second, the P450s appear to have experienced shifts in specificity to favor either the fused ring or the phenolic ether substrates.
To test whether these two apparent trends in activity changes are supported by a quantitative examination of the data, we performed principal component analysis. Principal component analysis is a well-established mathematical technique for finding the dominant components of variation in a data set, essentially by diagonalizing the covariance matrix. As suggested by the foregoing visual inspection, principal component analysis revealed that two components explained most of the changes in P450 activities ( Table 1). The first component contained positive contributions from all six substrates, and so represents a general improvement in catalytic ability. The second component contained positive contributions from the fused ring substrates and negative contributions from the phenolic ether substrates, and so represents an increased preference for the former class of substrates over the latter. Together, these two components explain 82% of the variance in activities among the 34 P450 variants. The remaining 18% of the variance is explained by the four remaining components, which represent more subtle shifts in activity that are less easily rationalized by intuitive chemical arguments. Note that the major trends in activity changes (overall increase/decrease in catalytic ability, and increased preference for either phenolic ether or fused ring substrates) are unlikely to be caused by the occurrence of the same mutation in several P450 variants, since most of the variants share no common mutations (Additional file 1). Instead, there must be a number of different mutations that can contribute to each of these trends.

Overall distributions of change in the activities
The preceding sections have demonstrated that neutral genetic drift can lead to substantial changes in P450 activities, and that many of these changes can be understood as resulting from either fairly general increases/decreases The first two principal components explain 82% of the variance in P450 activity profiles. The table shows the composition of these two components and the variance explained by each. The first component contains positive contributions from all substrates and can be thought of as representing a general high catalytic ability. The second component can be thought of as representing discrimination between fused ring substrates (PROP, 2A5C, and MDOB) and phenolic ether substrates (12-pNCA, 11PA, 2PE). Substrate abbreviations are as defined in the legend to Figure 2. The analysis was performed on the logarithms of the fold changes in activity.
in catalytic ability or shifts in preference for different broad classes of substrate structures. In this section, we examine whether there are any pervasive trends in the distributions of activity changes -for example, did most of the promiscuous activities tend to increase or decrease? If a property is not under any evolutionary constraint, then during neutral genetic drift its values might be expected to be distributed in a roughly Gaussian fashion, as the neutrally evolving proteins freely sample from the presumably normal underlying distribution. On the other hand, if a property is constrained by selection to remain above a certain threshold, then during neutral genetic drift its values should display a truncated distribution since selection culls proteins with values that fall below the threshold (such a distribution has been predicted for protein stability by simulations [34] and theory [25]). In the case of our experiments, selection acted explicitly only on 12-pNCA activity, and changes in the other promiscuous activities were limited only by the extent that these activities were affected indirectly by this constraint (an effect termed "apparent selection" [35]). Figure 4 shows the distribution of changes in activity for each of the six substrates. The distribution for 12-pNCA appears to be truncated on the left, as expected since the P450s neutrally evolved under a requirement to maintain the ability to hydroxylate 12-pNCA. Some of the P450s have undergone a mild decrease in 12-pNCA activity, reflective of the fact that the neutral evolution selection criterion provided a small amount of latitude by allowing the total amount of hydroxylated 12-pNCA to drop to 75% of the parental value [16]. A number of P450s have neutrally evolved 12-pNCA activity that modestly exceeds that of the parent -again unsurprising, because the parental 12-pNCA activity falls well below the maximal value achievable for this type of protein [31]. The distribution for 11PA resembles that for 12-pNCA, probably because the activities on these two chemically similar substrates are highly linked, as discussed in the previous section. This is an example of apparent selection on one activity due to actual selection on a closely linked activity.
The other four promiscuous activities are less linked to 12-pNCA activity, and their distributions are much more symmetric. The symmetric shapes of these distributions suggest that neutral genetic drift has sampled from a roughly Gaussian distribution for these four promiscuous activities. For three of the substrates (PROP, 2A5C, and MDOB), the distributions of activities are approximately centered around the parental activity. This centering indicates that the promiscuous activities of the parent on these three substrates are typical of what would be expected of a neutrally evolved P450. The distribution for 2PE, on the other hand, is shifted towards activities higher than that of the parent. This shift indicates that the parent is less active on 2PE than a typical neutrally evolved P450, since most neutral mutations tend to increase this activity.
If the activity distributions of Figure 4 truly reflect what would be expected after a very long period of neutral genetic drift (i.e., if they are "equilibrium" distributions), then each variant represents a random sample from the underlying distribution of activities among all P450s that can neutrally evolve under this selection criterion. In this case, there should be no correlation between the extent of change in activity and the number of accumulated mutations, since the P450s should have lost all "memory" of the parent's activity. On the other hand, if there has only been a small amount of genetic drift, then the neutrally evolved P450s may maintain activity profiles that are similar to the parent simply because there have not been enough mutations to alter unselected properties of the enzyme (such as hydrophobicity properties of the active site that are not essential for the selected activity, but influence the promiscuous activities). In this sense, neutral genetic drift can be thought of as analogous to the equilibration of a physical system, since after a long period of time an equilibrated physical system will explore all microstates compatible with external constraints regardless of the initial condition, but after a short period of time the observed state may still resemble the initial condition.
In order to determine whether the neutral genetic drift had proceeded for enough mutations to "equilibrate" the activity profiles, we examined whether variants with fewer mutations more closely resembled the parent's activity profile. Such a resemblance would indicate that the neutral genetic drift had not yet completely eliminated residual memory of the parental activity. We computed the correlation between the magnitude of each variant's change in activity and the number of nonsynonymous mutations it possessed relative to the parent. Table 2 shows that the magnitude of activity change is positively Distributions of activity changes on each of the six substrates Figure 4 Distributions of activity changes on each of the six substrates. The histograms show the distributions of fold changes in activity for all 34 neutrally evolved P450 variants on each of the six substrates, with a value of one indicating that the activity is the same as the neutral evolution parent. MDOB fold change in activity relative to parent correlated with the number of mutations for all six substrates. Although the correlations for the individual substrates are mostly not statistically significant due to the small number of samples, the overall correlation for all six substrates is highly significant (P = 10 -3 ). Therefore, the P450 activities are still in the process of diverging from the parental values by neutral genetic drift. If the variants were to undergo further neutral genetic drift, one would expect to see even larger changes in their promiscuous activities. We note that the correlation between activity changes and number of mutations can be explained either by a model in which each successive mutation causes a progressively larger change in promiscuous activity, or by a model in which variants with more mutations are simply more likely to have experienced one of a relatively small number of mutations that lead to large changes in promiscuous activities.
We also examined whether P450 variants with mutations near the substrate binding pocket were more likely to have undergone large changes in their activities. Five of the P450 variants had a mutation to a residue that was within 5 Å of the surrogate substrate in the P450 BM3 crystal structure [36]: variant M2 had A74V, M8 had A330V, M13 had M354I, M15 had A74P, and M24 had I263V [16]. Two of these mutated residues are of clear importance, since mutating residue 74 has previously been shown to shift substrate specificity [29,37,38] and residue 263 plays a role in the substrate-induced conformational shift [39]. We compared the activity changes for the five variants with mutations near the binding pocket to those for the 29 variants without any such mutations, computing the magnitude of activity change as the absolute value of the logarithm (base two) of the fold change in activity averaged over all six substrates. The average magnitude of activity change for the five variants with mutations near the active site was 0.88, while the average for the other 29 variants was 0.47. These averages are significantly different, with an unequal variance T-test P-value of 10 -2 . Therefore, variants with mutations near the substrate binding pocket have a higher probability of undergoing large changes in activity than all other variants, although many variants without mutations near the pocket also underwent substantial activity changes.

Conclusion
We have shown that neutral genetic drift can lead to changes of as much as four-fold in the promiscuous activities of P450 proteins. The ubiquity of these changes is striking -even though many of the neutrally evolved P450s had only a handful of mutations, most of them had experienced at least some change in their promiscuous activities. P450s may be especially prone to this type of change, since their catalytic mechanism involves large substrate-induced conformational shifts [40] that can be modulated by mutations distant from the active site [32,37,41]. In addition, P450s have a tendency to eventually undergo irreversible inactivation that can be promoted by reduced coupling between substrate binding and conformational shifts, as well as by other poorly understood determinants of catalytic stability [19,42,43]. Therefore, the changes in P450 activities that we observed can be caused by factors that alter the catalytic lifetime of the enzyme in addition to factors that alter the maximum catalytic rate. There are thus ample opportunities for mutations that spread by neutral genetic drift to cause subtle alterations in a P450's promiscuous activities. But we believe that neutral genetic drift is also likely to cause substantial changes in the promiscuous activities of enzymes with other catalytic mechanisms. In support of this idea, a recent study by Tawfik and coworkers [44] indicates that mutations with little effect on the native lactonase activity of serum paraoxonase can alter this enzyme's promiscuous activities. Taken together, this study and our work suggest that promiscuous protein functions can change during neutral genetic drift. These changes could in turn have important implications for future functional evolution. For example, one can easily imagine a scenario in which neutral genetic drift enhances a promiscuous protein function, and then a subsequent gene duplication allows natural selection to transform one of the genes into the template for a protein with a fullfledged new functional role [1][2][3][4][5][6].
One of the most attractive aspects of our study is the degree to which the changes in P450 activities during neutral genetic drift could be understood in terms of the chemical structures of the substrates. Neutral genetic drift did not simply cause unpredictable shifts in activities. The extent of change in activity is positively correlated with the number of nonsynonymous mutations the P450 has undergone relative to the neutral evolution parent. Each column shows the Pearson correlation between the number of nonsynonymous mutations and the absolute value of the logarithm of the fold change in activity for a different substrate, computed over all 34 P450 variants. The final column (ALL) is the correlation among the 6 × 34 pooled data points for all six substrates. The P-values are shown in parentheses; none of the correlations for the individual substrates are significant at a 1% level (due to the small number of data points), but the overall correlation for all substrates is highly significant. Substrate abbreviations are as defined in the legend to Figure 2.
Instead, most of the variation was explained by two eminently intuitive components: an overall increase or decrease in catalytic ability, and a preference for either fused ring or phenolic ether substrates. We have suggested that neutral genetic drift under a fixed selection criterion can be viewed as sampling underlying "equilibrium" distributions of activities. The distributions for different activities are linked, since we have shown that P450s with good activity on one substrate will frequently also be highly active on chemically similar substrates (similar linkages have been observed in proteins created by recombination [45,46]). So while it may be impossible to know exactly how any specific mutation will affect a given activity, measuring a handful of activities may allow one to make relatively accurate predictions about other closely linked activities [47]. The prerequisite for making such predictions is an understanding of the linkages among activities in the set of sequences explored by neutral genetic drift. We have made the first steps in elucidating these linkages for P450s that have neutrally evolved under one specific selection regime. The linkages are very similar to those that would have been made by an organic chemist grouping the substrates on the basis of their chemical structures. Knowledge of these linkages is of use in understanding the origins of enzyme specificity [10,48] -if an enzyme displays high activity on one substrate but low activity on another, then either these two activities are negatively linked during neutral genetic drift or selection has explicitly disfavored one of them.
Our work also has implications for the general relationship between neutral genetic drift and adaptive evolution [49][50][51][52][53][54]. A number of studies focused on RNA [50][51][52] or computational systems [53,54] have suggested that genetic drift might aid in adaptive evolution. Our study and that of Tawfik and coworkers [44] support this notion for the evolution of new protein functions. However, the way that drift in promiscuous functions promotes adaptive evolution is slightly different than the paradigm proposed for RNA [50][51][52] and computational systems [53,54]. In those systems, neutral genetic drift is envisioned as allowing a sequence to move along its neutral network until it reaches a position where it can jump to a new higher-fitness and non-overlapping neutral network. In contrast, promiscuous protein functions change even as a protein drifts along a single neutral network. The adaptive benefits of this drift come when new selective pressures suddenly favor a previously irrelevant promiscuous function, in effect creating a new neutral network that overlaps with parts of the old one.
Overall, experiments have now demonstrated two clear mechanisms by which neutral genetic drift can aid in the evolution of protein functions. In the first mechanism, neutral genetic drift fixes a mutation that increases a pro-tein's stability [24,25,55], thereby improving the protein's tolerance for subsequent mutations [26][27][28], some of which may confer new or improved functions [28]. In the second mechanism, which was the focus of this work and the recent study by Tawfik and coworkers [44], neutral genetic drift enhances a promiscuous protein function. This enhancement poises the protein to undergo adaptive evolution should a change in selection pressures make the promiscuous function beneficial at some point in the future.

Determination of P450 activities
We attempted to determine the activities of all 44 neutrally evolved P450 variants described in [16] (22 from the final monomorphic populations and 22 from the final polymorphic population). Ten of these variants expressed poorly in the procedure used here (as described in more detail below), and so were eliminated from further analysis since their low expression led to large errors in the activity measurements. That left activity data for the 34 neutrally evolved P450 variants listed in Figures 2 and 3, as well as for the R1-11 neutral evolution parent. The activities for each of these P450 variants were measured on all six substrates (12-pNCA, 2PE, PROP, 11PA, 2A5C, and MDOB). In all cases, the activities represent the total amount of product produced after two hours, and so are proportional to total substrate turnovers per enzyme. P450 BM3 enzymes typically catalyze only a finite number of reaction cycles before becoming irreversibly inactivated, and we believe that all reactions were essentially complete after two hours, so these activities should represent the total turnovers of the enzymes during their catalytic lifetimes.
To obtain P450 protein for the activity measurements, we expressed the protein using catalase-free Escherichia coli [56] containing the encoding gene on the isopropyl β-D- The P450 concentration was determined using the carbon monoxide (CO) difference spectrum assay [57]. Immediately before use, we prepared a 5× stock solution of 50 mM sodium hydrosulfite in 1.3 M potassium phosphate, pH 8.0. A multichannel pipette was used to add 20 µl of this stock solution to each well of the microtiter plates (which contained 80 µl of a dilution of clarified lysate), so that the final sodium hydrosulfite concentration was 10 mM in each well. The plates were briefly mixed and the absorbances were read at 450 and 490 nm. The plates were then incubated in a CO binding oven [57] for 10 minutes to bind CO to the iron. The absorbance was then again read at 450 and 490 nm. The amount of P450 is proportional to the increase in the magnitude of the absorbance at 450 nm minus the absorbance at 490 nm. At each dilution along the standard curve, the reading for the null control (lysate dilutions without P450) was subtracted from the reading for each P450 variant to control for clarified lysate background. Additional file 2 shows the standard curves for all P450 variants. Ten P450 variants had standard curve slopes less than or equal to 0.020, indicating a low P450 concentration. These were the ten P450 variants that we discarded from further analysis, since the low P450 concentration decreased the accuracy of the measurements.
To determine the activity on 12-pNCA, we monitored the formation of the yellow 4-nitrophenolate compound that is released upon hydroxylation of the twelfth carbon in the 12-pNCA molecule [31,58]. Immediately before use, we prepared a 6× stock solution of 12-pNCA by adding 3.6 parts of 4.17 mM 12-pNCA in DMSO to 6 2) and mixed. We incubated the plates on the benchtop for two hours. To detect the formation of phenolic products, a pipetting robot was used to add and mix 120 µl of quench buffer (4 M urea in 100 mM sodium hydroxide) to each well. We then used the robot to add and mix 36 µl per well of 0.6% (w/v) of 4-aminoantipyrene in distilled water, and immediately read the absorbance at 500 nm. To catalyze formation of the red compound produced by coupling a phenolic compound to 4-aminoantipyrene [59,60], we then used the pipetting robot to add and mix 36 µl per well of 0.6% (w/v) of potassium peroxodisulfate in distilled water. The plates were incubated on the benchtop for 30 minutes, and the amount of product was quantified by the gain in absorbance at 500 nm. At each dilution along the standard curve, the corresponding null control lysate dilution was subtracted from the reading to control for lysate background. Additional file 2 show the standard curves for all P450 variants.
In order to extract enzymatic activities from the standard curves, we fit lines to the data points. For some of the substrates (most notably 12-pNCA and 2PE), many of the P450 variants were sufficiently active to either saturate the substrate or exceed the linear range of absorbance readings. Therefore, we examined each standard curve by eye to determine which points remained in the linear range. Lines were then fit to the points in the linear range. These fits are shown in Additional file 2. In the plots in this file, all points that were deemed to fall in the linear range (and so were used for the fits) are shown as filled shapes, while all points that were deemed to fall outside the linear range (and so were not used in the fits) are shown as empty shapes. The figures show the slopes of the lines for all replicates (two replicates for all P450 variants except for R1-11, which had four replicates). These slopes are averaged for a best estimate of the slope, and the standard error computed over these two measurements is also reported.
To compare the activities (total substrate turnovers per enzyme) among the different P450 variants, it is first necessary to normalize to the enzyme concentration. To do this, we took the ratio of the slope for each substrate divided by the slope of the CO different spectrum, propagating the errors. These normalized slopes are proportional to the activity on each substrate. The normalized slopes are given in Additional file 3. This file also lists the number of nonsynonymous mutations that each P450 variant possesses relative to the R1-11 parent sequence, as originally reported in [16]. These normalized slopes allow for accurate comparisons among the P450 variants, and were used in the analyses in this paper. To convert these normalized slopes into total substrate turnovers per enzyme, it is necessary to multiply them by the ratio of extinction coefficients. The extinction coefficient for the CO difference spectrum reading (the absorbance at 450 nm minus that at 490 nm) is 91 mM -1 cm -1 [57], and we calculated the extinction coefficient at 398 nm for the 4nitrophenolate group in our buffer to be 12,000 M -1 cm -1 . Therefore, for 12-pNCA, the total number of substrate turnovers per P450 enzyme is 7.58 times the ratio of the 12-pNCA standard curve slope to the CO difference spectrum slope. This indicates that our parent protein had about 250 12-pNCA turnovers per enzyme, compared to the 1,000 reported for a variant engineered for maximal 12-pNCA activity [31]. For the other substrates assayed with the 4-AAP assay, the extinction coefficient at 500 nm for the 4-AAP/phenol complex has been reported to be 4,800 [60]. However, we believe that this extinction coefficient could be of dubious accuracy for our data. Depending on the exact type of phenolic compound created by P450 hydroxylation, the extinction coefficient for the 4-AAP/phenol complex may vary. Assuming the extinction coefficient of 4,800 M -1 cm -1 is accurate, then the total number of substrate turnovers per P450 enzyme is 19.0 times the ratio of the substrate standard curve slope to the CO difference spectrum slope. Using this coefficient, the parent P450 had roughly 1,000 turnovers on 2PE, 30 turnovers on PROP, 400 turnovers on 11PA, 50 turnovers on 2A5C, and 80 turnovers on MDOB. The high activities on 2PE and 11PA are presumably due to the fact that lack of polar substituents on the aromatic ring allows these compounds to enter the hydrophobic P450 BM3 binding pocket [36] more easily than 12-pNCA. However, we emphasize that the exact numerical values for the turnovers for these five substrates are questionable. Definitive determination of the extinction coefficients would require analytical analysis of the enzymatic products for each P450 variant on each substrate, which is beyond the scope of this study.

Analysis of activity data
The raw activity values computed for the P450 variants are listed in Additional file 3. To analyze and display this data, we computed the fold change in activity of each variant relative to the R1-11 parent P450. The fold change is simply the variant activity divided by the parent activity on each substrate, with the standard errors propagated to give an error on the fold change. In Figures 2 and 3, these fold changes are displayed on a logarithmic scale so that each unit corresponds to a two-fold increase or decrease in activity. In Figure 2, the substrates and the P450 variants have both been clustered, as shown by dendrograms on the side of the heat map. The clustering was performed using the standard hierarchical clustering function of the R statistical package. This is complete linkage hierarchical clustering, with the distances computed as the Euclidian distance between the logarithms of the fold changes in activity. The standard errors on the fold changes in activity are not incorporated into Figure 2 or any of the related analysis. However, these standard errors are shown in Figure 3; it is apparent from this figure that the errors tend to be much less than the fold changes in activity themselves.
In Figure 4, the histogram bins are logarithmically spaced so that each bin contains a 2 0.5 -fold range of activities. For example, the histogram bin centered at one contains all variants with between 2 -0. 25  The principal component analysis shown in Table 1 was performed using the R statistical package, with inputs being the logarithms of the fold changes in activity. Since these log fold changes in activity contained no arbitrary units (they were already normalized to the parent), the data was neither scaled nor zeroed before performing the analysis. Table 1 shows the composition and the percent of variance explained (the eigenvalue for that component divided by the sum of all eigenvalues) for the first two components. The remaining four components were relatively unimportant, explaining 7%, 5%, 4%, and 2% of the total variance. Figures 2 and 3 show the changes in activity for each of the 34 neutrally evolved P450s. The data in this figure are the mean of two separate measurements performed on protein samples from the same preparation of expressed and lysed protein. Each of the two measurements involves independent determination of the P450 concentration and the enzymatic activity via the standard curves shown in Additional file 2. The two measurements are not fully independent biological replicates, since they come from the same preparation of protein. However, expression and lysis typically introduces variability in measurements due to differing concentrations of protein, and since each of the two measurements independently determines P450 concentration, they should correct for most of the variation induced by expression and lysis. To confirm this fact, and to obtain a more detailed characterization of the variability in the readings, we performed measurements on different preparations of the R1-11 parent protein. Overall, we used protein from four separate expression/lysis preparations of this protein. For each of these preparations, we performed two separate measurements of P450 concentration and activity. The resulting data are shown in the plots of Additional file 4, which is scaled so that it can be directly compared to Figure 3. The plots of Additional file 4 demonstrate that the variability in measurements on samples from different expression/lysis preparations is no larger than the variability between measurements from the same preparation. This confirms that the independent determination of P450 concentration for each measurement adequately controls for any differences between expression/lysis preparations. Additional file 4 also demonstrates that the overall variability in repeated measurements on the parent sample is small compared to the difference between samples, and is adequately captured by the error bars shown in Figure 3. So overall, while repeated measurements on the same sample do show a small amount of variability, this variability is smaller than the typical differences between different P450 variants, and is adequately captured by the duplicate measurements from the same preparation of protein that were used in this study.

Phylogenetic tree
The phylogenetic tree shown in Figure 1 is based on the number of nonsynonymous mutations the P450 variants have relative to the R-11 neutral evolution parent. Each of the P450s that evolved in a monomorphic population (prefix of M) are known to have diverged independently, and so are drawn on their own branch regardless of any sequence identity to other variants. The exact phylogenetic relationship of the P450s that evolved in the polymorphic population (prefix of P) is not known, so the portion of the tree for these mutants was reconstructed by maximum parsimony. The tree is based only on the nonsynonymous mutations, and all mutations weighted equally. Full nucleotide and amino acid sequences of the P450s are given in Additional file 1.

Authors' contributions
JDB, PR, and FHA designed the study. JDB, PR, and ZL performed the experiments. JDB and PR analyzed the data. JDB and FHA wrote the paper.

Reviewer 1: Dr. Martijin Huynen, Nijmegen Center for Molecular Life Sciences and Center for Molecular and Biomolecular Informatics
The manuscript by Bloom et al presents experimental data to show how neutral evolution can explore phenotype space and help in adaptation, adding a new answer to Sewall Wright's question to Kimura about the potential "future" benefit of neutral evolution [61]. They show that the proteins that are more or less equivalent with respect to their primary function, and that are connected through a neutral path, vary with respect to their ability to perform "other", promiscuous functions for which there has been no selection. The manuscript is well written, the results speak for themselves, the arguments are sound, and I do not have much to add. I should comment that I am not an experimentalist, and can therefore not judge the experimental results in detail.

Authors' response
We thank the reviewer for the paragraph putting the manuscript in this context of Wright's question to Kimura.

Specific questions:
Can the "four-fold" change in the promiscuous activities be put into perspective? How does this e.g. compare to the activity of a protein that has specifically been selected for any of the tested functions?

Authors' response
This is an excellent question. The short answer (which is now briefly mentioned in the main text) is that studies selecting for improvement P450 BM3 activities that have not previously been the target of selection tend to report increases of 10-to 40fold. As a caveat, we note that none of these studies exactly parallel the experimental methodology used here. One study that selected for high P450 peroxygenase activity on the substrate 12-pNCA increased the total substrate turnovers 11-fold with 9 amino acid mutations (not all 9 mutations necessarily contributed to the improved activity; some may have "hitchhiked" along) [31]. A study beginning with the P450 BM3 holoenzyme increased the maximal rate of octane hydroxylation 38fold with 9 amino acid mutations [32]. Another study beginning with a P450 holoenzyme variant increased the number of turnovers of propane hydroxylation 12-fold with a dozen amino acid mutations [33].
Can the variation in catalytic activity also be explained as a variation in enzyme catalytic lifetimes?
Authors' response Yes! By measuring total substrate turnovers, we are essentially measuring the product of the catalytic rate times the catalytic lifetime. P450s are known to have finite catalytic lifetimes that depend on the degree of coupling between substrate binding and the conformational shifts associated with the enzyme catalytic cycle. So mutations could affect the catalytic rate, the overall catalytic lifetime of the protein, or the catalytic lifetime of the protein on a particular substrate due to the degree of coupling between binding to that substrate and conformational changes. The fact that the total substrate turnovers depend on this entire set of protein properties is probably the reason why so many different mutations influence activity.
Is it possible to separate the two effects of the mutations on the activity (substrate specificity versus "overall" catalytic activity) by where those mutations occurred in the protein, like you do in the last paragraph of the Results section by averaging the effects of the mutations for all 6 substrates?
Authors' response Unfortunately, we do not believe that it is possible to convincingly make such a separation. The reason is that P450 is a very dynamic enzyme that undergoes large conformational changes during its catalytic cycles -indeed, the exact structure of the catalytically active form is not even known (the substrate position in the crystal structure conformation is thought to place the substrate several angstroms too far from the iron for catalysis). Therefore, mutations that are distant from the active site have on occasion been shown to affect substrate specificity. So there is no clear basis for predicting which mutations contribute more to overall activity versus specificity.
Fourth paragraph of the "Overall distributions of change in the activities" in Results: "On the other hand if there has not been enough neutral genetic drift etc." Could you change that sentence in one way or another such that it becomes clearer what is what relative to what? e.g. "On the other hand if there has not been enough neutral genetic drift after a few mutations to completely eliminate etc."

Authors' response
We have clarified this point. There are some properties of the protein that are not essential for the selected activity but that influence the promiscuous activities. Neutral genetic drift will tend to change these properties, and after some number of mutations the properties will have changed enough that they are no longer correlated with the properties of the evolutionary parent more than would be expected for any two random proteins that satisfied the selection criterion. Our original meaning was that there had been "enough" mutations that this correlation was gone. In analogy with physics, this would occur when the autocorrelation function had decayed to zero. In any case, we have rewritten the relevant sentences to make clear that "enough" means that the correlation between the parental and neutrally evolved proteins is no larger than between a random pair of proteins that satisfy the selection criterion.

Reviewer 2: Dr. Fyodor Kondrashov, University of California San Diego
Investigations and descriptions of fitness ridges (a fitness ridge is a function relating fitness and genotype) are necessarily difficult and confusing due to the complex and often multidimensional nature of such ridges. Thus, it is truly remarkable that in the recent past several studies have attempted to describe the shapes of fitness ridges one gene at a time. The present study is one of them. The authors study the effects of mutations on several promiscuous functions that have been selected for not ruining the activity of a related but distinct function. Indeed, under laboratory conditions the mutations that are allowed to propagate are neutral by definition, and the authors in essence studied the effect of mutations accumulated by drift. However, I do not think that the main claim of this paper, that "drift can aid functional protein evolution" is appropriate and I believe that the results are much more important to the study of fitness ridges than to the study of the impact of neutral drift.
Assuming a starting condition of such neutral evolution when the activity of promiscuous functions truly bears no fitness effects, consider the impact of the past several mutations right before this activity became a target of selection. In an equilibrium setting, those mutations are equally likely to have decreased the level of the activity as they are to have lead to an increase. Therefore, the paper could have been titled "drift can inhibit functional pro-tein evolution" just as well. Regardless, I think that the impact of such an artificially neutral mutation on future adaptations is not the issue here. Just as with the original work by Aharoni et al. [2], the issue is the degree of pleiotropic interactions of different functions that can be performed by a single protein because understanding the nature of such interactions is crucial for our understanding of fitness ridges.

Authors' response
The reviewer raises two main points.

Authors' response
We thank the reviewer for bringing this reference, which is based on computational lattice models of proteins, to our attention. It is now cited in our work.
First paragraph of "Overall distributions of change in the activities" section of Results: Another factor that may greatly affect whether or not mutations that change a promiscuous function are at equilibrium is whether or not there is selection against such function. Perhaps such selection is expected, because a promiscuous function may act as a competitive inhibitor of the original one. The possibility of such selection is another reason why I think that although in the laboratory these mutations are by definition neutral, we must first study the fitness function of these proteins before investigating the importance of drift.

Authors' response
We agree with the reviewer that in a natural setting there may be selection against some promiscuous functions, and we now mention this possibility in the text. However, due to the limitations of what is realistically possible to accomplish in the laboratory, we are not in a position to determine the natural fitness function for P450 enzymes. The reason is that while it is possible to accurately measure enzymatic activities as we have done in this paper, we do not have the capability of directly measuring how these changes in activity would affect the fitness of an organism in a natural setting.
First and second paragraphs of "Overall distributions of change in the activities" section of Results: When selection on one phenotype causes a change in another, this is called "apparent selection."

Authors' response
We thank the reviewer for pointing out the correct term for this phenomenon; we now use this term at the relevant point in the text.
Third paragraph of Conclusions: "The adaptive benefits of this drift come when new selective pressures suddenly favor a previously irrelevant promiscuous function, in effect creating a new neutral network that overlaps with parts of the old one." If a new selective pressure appeared, it changed the neutral network to a ridge. In general I think that the term "neutral network" should be avoided because it describes a specific model where each mutation is either lethal or neutral. This is exactly replicated in this experimental setup, however, as soon as adaptive selection comes into play the "neutral network" model becomes insufficient to describe either the fitness states of each genotype or the evolutionary processes that follow. In general, the term "fitness ridge" or "fitness landscape" is preferable for this reason. I would be interested to see the correlations of the change of each promiscuous function to that of the native one. Just by looking at Fig 2, it seems that every single function is actually correlated, but it would be interesting to see how much.

Authors' response
The reviewer is correct that the promiscuous activities are correlated to the native one. However, rather than computing correlation coefficients we have chosen to perform the principal component analysis shown in

Reviewer 3: Dr. Dan S. Tawfik (nominated by Christoph Adami), Weizmann Institute of Science
Major comments: 1. Outside the context of organismal fitness, neutrality is obviously hard to define, and even the former is highly dependent on growth conditions. The limitations of determining neutrality in an in vitro system should be discussed. These limitations should also be addressed in view of the fact that P450s are broad range, rather than a 'one substrate' enzyme. Evidence in favour of 12-pCNA being a reasonable surrogate of the primary substrate(s) of the P450 studied here should also be outlined.

Authors' response
This is an excellent comment by the reviewer (some similar points were also raised by Reviewer 2) that is now addressed in more detail in the main text. Quite simply, a limitation of our study is that we lack a solid basis for connecting the changes we observe in P450 activity to organismal fitness. We do not know exactly how a modest change in the activity profile of its P450 enzymes would affect the metabolism of a bacterium living in an environment with fluctuating concentrations of various substrates. As the reviewer points out, P450s tend to be broad range, and catalyze multiple reactions in biological settings. In fact, the primary natural substrates of P450 BM3 are not definitively known, although they are thought to include fatty acids [19,20]. There is thus no strong basis for believing that 12-pNCA is a close surrogate of a primary natural P450 BM3 substrate, and the reviewer is completely correct that natural P450s are probably not one-substrate enzymes. We therefore make no claims that the evolutionary changes that we observe in our studies parallel any specific changes that would be of biological relevance in natural P450 BM3 evolution. Instead, our study relates to the more general issue of how different enzyme specificities can evolve -certainly, such evolution is of relevance in biology, even if we cannot tie our exact choice of substrates to those that occur in nature. Our choice to define mutations as "neutral" if they maintain 12-pNCA activity above some threshold is based on the abstraction that selection in nature requires specific P450 activities to exceed some minimal value for viable metabolism. Given this abstraction, we believe our study sheds valuable light on how promiscuous activities can change when enzymes are constrained by limits on another primary activity.

2.
Being an integral part of the Results section, the sequence of the studied variants should be provided for this paper (notwithstanding [16], e.g., in the form of a table listing all mutations.

Authors' response
We have added a text file as Additional file 1that lists the mutations of all the neutrally evolved P450 variants as well as the full sequence of the parent P450.
3. Given the nature of the drift that led to the studied variants, the phrase "any mutations that spread among offspring sequences were therefore by definition due to neutral drift..." (second paragraph of Results section), should be supported by showing that the mutations in the studies variants all largely polymorphic, and none was enriched/fixated (see also point 2 above).

Authors' response
As mentioned above, we have added a list of all mutations (Additional file 1). In a few cases, the same mutation occurs in more than one P450 variant. However, we see no strong evidence of any specific mutation being unusually enriched among any of the variants. Overall, of the 105 different amino acid mutations occurring in the set of P450 variants, only twelve occur in more than one variant. Of these twelve mutations that occur multiple times, six are found mostly in variants from the polymorphic population that are evolutionarily related on the phylogenetic tree of Figure 1(for example, P9, P13, and P18 share a common mutation, but are evolutionarily related), meaning that the multiple occurrences can be explained by neutral spread of a single mutation to several members of the polymorphic population, rather than by repeated occurrence and fixation of the same mutation. So a visual inspection of the mutation data does not indicate any glaring trends for the enrichment of specific mutations. We have not attempted a formal population genetics analysis to see if the frequency distribution of mutations is compatible with the neutral theory (such an analysis would be complicated by the need to correct for phylogenetic relationships and the fact that error-prone PCR is biased towards creating certain types of mutations). However, our inclusion of the full mutational data should allow the reader to attempt such a population genetic analysis if he/she desires to do so. In any case, our statement that any mutations are "by definition due to neutral genetic drift" is simply based on the fact that the experimental procedure cannot directly favor any mutation, since all mutants have either equally high or zero fitness.
4. The standard errors refer to two separate measurements of the activities (legend of Figure 3). However, the Methods section describes a protocol by which a single lysate was aliquoted to two plates and measured separately (third paragraph of Methods section). This procedure hardly reflects the major sources of experimental variability, namely, differences in growth densities and lysis efficiency. Thus, separate measurements should involve independent growth of two colonies, lysis, and activity measurements.

Authors' response
This comment is addressed along with comment 5 in the next response.
5. There is no proper assessment of the likelihood that the observed variations in the measured activity are simply due to experimental variations. In other words, data should be provided to indicate the level of variation observed in an appropriate set of independent repeats of the parental protein ("wild-type") grown, lysed, and assayed for the various activities.

Authors' response
This response addresses the last two comments from the reviewer (4 and 5). New data and discussion to address these comments has been added in the form of Additional file 4and the new Methods section "Variability in experimental measurements." Here we briefly recap this new data and discussion. The reviewer correctly notes that the duplicate measurements of activity were performed on P450 from the same expression/lysis preparation of protein, and so do not constitute full biological