Automated mass action model space generation and analysis methods for two-reactant combinatorially complex equilibriums: An analysis of ATP-induced ribonucleotide reductase R1 hexamerization data
© Radivoyevitch; licensee BioMed Central Ltd. 2009
Received: 2 December 2009
Accepted: 9 December 2009
Published: 9 December 2009
Ribonucleotide reductase is the main control point of dNTP production. It has two subunits, R1, and R2 or p53R2. R1 has 5 possible catalytic site states (empty or filled with 1 of 4 NDPs), 5 possible s-site states (empty or filled with ATP, dATP, dTTP or dGTP), 3 possible a-site states (empty or filled with ATP or dATP), perhaps two possible h-site states (empty or filled with ATP), and all of this is folded into an R1 monomer-dimer-tetramer-hexamer equilibrium where R1 j-mers can be bound by variable numbers of R2 or p53R2 dimers. Trillions of RNR complexes are possible as a result. The problem is to determine which are needed in models to explain available data. This problem is intractable for 10 reactants, but it can be solved for 2 and is here for R1 and ATP.
Thousands of ATP-induced R1 hexamerization models with up to three (s, a and h) ATP binding sites per R1 subunit were automatically generated via hypotheses that complete dissociation constants are infinite and/or that binary dissociation constants are equal. To limit the model space size, it was assumed that s-sites are always filled in oligomers and never filled in monomers, and to interpret model terms it was assumed that a-sites fill before h-sites. The models were fitted to published dynamic light scattering data. As the lowest Akaike Information Criterion (AIC) of the 3-parameter models was greater than the lowest of the 2-parameter models, only models with up to 3 parameters were fitted. Models with sums of squared errors less than twice the minimum were then partitioned into two groups: those that contained no occupied h-site terms (508 models) and those that contained at least one (1580 models). Normalized AIC densities of these two groups of models differed significantly in favor of models that did not include an h-site term (Kolmogorov-Smirnov p < 1 × 10^-15); consistent with this, 28 of the top 30 models (ranked by AICs) did not include an h-site term and 28/30 > 508/2088 with p < 2 × 10^-15. Finally, 99 of the 2088 models did not have any terms with ATP/R1 ratios >1.5, but of the top 30, there were 14 such models (14/30 > 99/2088 with p < 3 × 10^-16), i.e. the existence of R1 hexamers with >3 a-sites occupied by ATP is also not supported by this dataset.
The analysis presented suggests that three a-sites may not be occupied by ATP in R1 hexamers under the conditions of the data analyzed. If a-sites fill before h-sites, this implies that the dataset analyzed can be explained without the existence of an h-site.
This article was reviewed by Ossama Kashlan (nominated by Philip Hahnfeldt), Bin Hu (nominated by William Hlavacek) and Rainer Sachs.
The dNTP supply system is ideal for cancer systems biology research because, among cancer relevant processes, it is perhaps the best understood. This is important because, intuitively, the more understanding a mathematical model captures, the more likely it is to be more useful than a conceptual model. Thus, the dNTP supply system is well poised to be successfully controlled better with mathematical modeling than without, and because of this, this system could become a standard of success in systems biology; the basis of this argument is prior success in the use of mathematical models to improve the control of well understood systems such as power plants and airplanes.
RNR (NDP → dNDP) has two subunits, R1, and R2 or p53R2 [17, 18]. On short time scales of seconds to minutes, RNR is controlled through two R1 regulatory sites, a selectivity (s-) site that is somewhat analogous to a radio tuning control knob, and an activity (a-) site that can be thought of as a volume control knob. Complicated positive and negative dNTP-mediated feedback loops (Fig. 1) impinge upon these two sites to implement a sophisticated solution to a challenging dNTP pool balance regulation problem; if pool balance regulation performance varies across individuals and MMR performance also varies, individuals compromised in both systems may be predisposed to cancer. RNR functional complexity is mirrored by the combinatorial complexity of its R1 subunit: its catalytic site can be empty or filled with 1 of 4 NDP substrates, its s-site can be empty or filled with ATP, dATP, dTTP or dGTP, its a-site can be empty or bound by ATP for activation or dATP for inactivation, it may have an h-site that can be empty or bound by ATP, and all of this is folded by an R1 monomer-dimer-tetramer-hexamer equilibrium where R1 j-mers may also be bound by variable numbers of R2 (or p53R2) dimers. As a result, trillions of R1 complexes are possible if R1, R2, UDP, CDP, GDP, ADP, ATP, dATP, dTTP and dGTP are all present (in this case ~10^2 R1 monomers implies ~10^12 R1 hexamers) and the problem then is to determine which are needed in models to explain the data at hand. To appreciate the magnitude of the problem, if 10^12 complexes are possible, the number of possible complete dissociation constant models is 2 raised to the 10^12 (i.e. 1 followed by ~300 billion zeros), since each complex, and its corresponding complete dissociation constant K, can either be in the model (estimated) or out (set to infinity if the model hypothesizes that the concentration of the complex is approximately zero across all of the experimental conditions of the dataset).
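The "~300 billion zeros" figure can be checked with a short calculation. The following is only an illustrative sketch in Python (the variable names are not from the original analysis); it counts decimal digits with logarithms because the number itself is far too large to construct.

```python
import math

# Number of possible R1 complexes assumed in the text (~10^12).
n_complexes = 10**12

# Each complex is either in a model or out of it, giving 2**n_complexes
# models. The digit count of 2**n is floor(n * log10(2)) + 1, so the
# huge integer never has to be built.
n_digits = math.floor(n_complexes * math.log10(2)) + 1

print(f"2^(10^12) has about {n_digits:.2e} decimal digits")
```

The digit count comes out at roughly 3 × 10^11, in line with the statement above.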
This huge number of models is even greater if, in addition to hypotheses that complete dissociation constants are infinite, hypotheses that binary dissociation constants equal each other are also considered. Though this problem is intractable for 10-reactants, 2-reactant solutions are feasible and may yield insights needed to enable 3-reactant solutions, and so on.
The R package Combinatorially Complex Equilibrium Model Selection (ccems) is used here to automatically generate thousands of ATP induced R1 hexamerization models partitioned into two classes: those that include model terms of complexes with ATP occupied h-sites (i.e. models that would support the existence of h-sites if selected) and those that do not. Comparisons of the Akaike Information Criterion (AIC) [23, 24] of these two classes of models were then used to assess the extent of h-site evidence strictly in the ATP-induced R1 hexamerization dynamic light scattering (DLS) data in figure 1 of Kashlan et al. No evidence was found. As discussed in Kashlan's review below, evidence for an h-site may, however, exist under different experimental conditions.
A Simple Example
To introduce concepts of model space generation, standard models of competitive and non-competitive inhibition are derived below as instances of models in two systematically defined model spaces, one of spur graphs which focus on complete dissociation constants and hypotheses that they are infinite, and one of grid graphs which focus on binary dissociation constants and hypotheses that they are equal.
Note that this model corresponds to the full spur graph less the ESI node/edge and that it can also be viewed as the pair (KESI = ∞, Eq. 1).
where underscores in subscripts indicate specific binary reactions. In grid graphs, because the dissociation constants are binary, equation terms that represent complexes of n reactants have n - 1 K parameters in their denominators.
There is a one-to-one mapping between the K's in grid graph system (2) and those in spur graph system (1), namely, KES = KE_S, KEI = KE_I, and KESI = KE_I × KEI_S or KEI_S = KESI/KEI. It follows then that (1) and (2) are data-fitting equivalents. The added value of full grid graph systems such as system (2) is their ability to spawn new hypotheses that cannot be generated by corresponding full spur graph systems such as (1). The additional hypotheses are binary K equality hypotheses. In the example here there is only one such hypothesis/model, the non-competitive inhibition model KE_S = KEI_S where inhibitor binding has no detectable effect on substrate binding.
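The spur-to-grid mapping can be illustrated with a small sketch. The Python function names and numerical values below are hypothetical and chosen only for illustration; the mass action expressions follow the relations KESI = KE_I × KEI_S and KE_S = KEI_S stated above.

```python
import math

def spur_to_grid(K_ES, K_EI, K_ESI):
    """Map complete (spur) constants to binary (grid) constants."""
    return {
        "K_E_S": K_ES,           # E + S <-> ES
        "K_E_I": K_EI,           # E + I <-> EI
        "K_EI_S": K_ESI / K_EI,  # EI + S <-> ESI
    }

def esi_spur(E, S, I, K_ESI):
    """[ESI] from free concentrations and one complete constant."""
    return E * S * I / K_ESI

def esi_grid(E, S, I, K_E_I, K_EI_S):
    """[ESI] from free concentrations and two binary constants."""
    return E * S * I / (K_E_I * K_EI_S)

# Illustrative values only (arbitrary concentration units).
grid = spur_to_grid(K_ES=10.0, K_EI=5.0, K_ESI=50.0)

# Non-competitive inhibition is the grid hypothesis K_E_S == K_EI_S.
is_noncompetitive = math.isclose(grid["K_E_S"], grid["K_EI_S"])

# Competitive inhibition is the spur hypothesis K_ESI = infinity,
# which forces the ESI concentration to zero.
esi_competitive = esi_spur(1.0, 2.0, 3.0, math.inf)
```

With these values the two parameterizations give the same [ESI], which is what "data-fitting equivalents" means in practice; only the grid form can express the equality hypothesis as a constraint on its parameters.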
to large τ where τ has nothing to do with real time and the state trajectory is merely algorithmic and thus not a biophysical path. Free concentration solutions then map to complex concentrations via mass action laws and these are then mapped to expected measurements, e.g. see Eq. (6) below and Eqs. (16-18) in .
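As a rough illustration of this algorithmic use of ODEs, the sketch below solves the total concentration constraints of a single hypothetical complex RX by integrating in pseudo-time τ until the derivatives vanish. The form of the ODEs used here (rate of change equals total minus free minus bound), the fixed-step Euler scheme, and the parameter values are all assumptions made for the example; they are not taken from ccems.

```python
import math

def solve_free_concentrations(RT, XT, K, dtau=0.05, n_steps=4000):
    """Integrate d[R]/dtau = RT - ([R] + [RX]) and the analogous
    equation for [X] to steady state (tau is not real time)."""
    R, X = RT, XT  # start with everything free
    for _ in range(n_steps):
        RX = R * X / K          # mass action: [RX] = [R][X]/K
        dR = RT - (R + RX)      # residual of the R constraint
        dX = XT - (X + RX)      # residual of the X constraint
        R += dtau * dR
        X += dtau * dX
    return R, X

# Illustrative totals (uM) and an assumed K of 100 uM.
RT, XT, K = 7.0, 46.0, 100.0
R, X = solve_free_concentrations(RT, XT, K)
RX = R * X / K

# For this one-complex case the constraints reduce to a quadratic,
# which gives an independent check of the steady state.
b = RT + XT + K
RX_exact = (b - math.sqrt(b * b - 4.0 * RT * XT)) / 2.0
```

The steady state of the integration satisfies both constraints, so the free concentrations can be passed to the mass action laws to obtain complex concentrations.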
The methods presented here are currently limited to biochemical systems where one central hub protein mediates all of the interactions and total concentrations of the reactants are known to a good approximation. It is assumed that the latter condition is adequately met in analyses of data derived from systems that were reconstituted from purified reactants. If the hub protein has more than one binding site for the same ligand, as R1 does for ATP, to interpret model terms, a specified sequence of site filling must be assumed. This assumption, made due to lack of a better option, may not hold. Automated model space generation is currently limited to two-reactant systems.
ATP-induced R1 Hexamerization Models
To generate a space of ATP induced R1 hexamerization models, the first step is to pick a full model and the second step is to apply K hypotheses to it. Full models that include s, a and h ATP binding sites generate two classes of models: those that include at least one occupied h-site complex and that thus support the existence of an h-site, if selected, and those that do not, i.e. those that allege that all occupied h-site complex concentrations are approximately zero and that thus support claims that the h-site is not needed to explain the data, if selected. The full model below generates both of these model types.
where, in each equation, first sum limits assume s-sites cannot be bound in monomers and other sum limits assume s-sites must be bound in oligomers; here X = ATP is used because a = dATP and A = ADP are being reserved for subsequent RNR models, and superscripted R^jX^i is used instead of subscripted R_jX_i to stress connections to polynomials.
The hub protein monomer complex RX can be interpreted as X bound to either the a- or h-site. Because the a-site is known to exist, a-site binding will be assumed. RXX is then a monomer with both the a- and h-sites occupied. For oligomers, in addition to all of the s-sites being prefilled, it is assumed that: R2X2 through R2X4, R4X4 through R4X8, and R6X6 through R6X12, have zero to full a-site occupancies and no h-site occupancies, and that R2X5 and R2X6, R4X9 through R4X12, and R6X13 through R6X18, have partial to full h-site occupancies in addition to completely filled a-sites (and s-sites). Model inferences will be based on these interpretations.
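These interpretation rules can be written down compactly. The following Python sketch (the function name is hypothetical) returns the assumed numbers of occupied s-, a- and h-sites for a complex of j R1 subunits and i ATP molecules, under the stated assumptions that s-sites are empty in monomers, full in oligomers, and that a-sites fill before h-sites.

```python
def interpret_term(j, i):
    """Return (s, a, h) site occupancies assumed for a complex of
    j R1 subunits and i bound ATP molecules."""
    if j == 1:
        # Monomers: s-site never filled; a-site fills before h-site.
        if not 0 <= i <= 2:
            raise ValueError("monomers carry at most 2 ATP here")
        s = 0
        a = min(i, 1)
    else:
        # Oligomers: all j s-sites are prefilled, then a-sites fill.
        if not j <= i <= 3 * j:
            raise ValueError("oligomer ATP count must be in j..3j")
        s = j
        a = min(i - j, j)
    h = i - s - a  # whatever remains is assigned to h-sites
    return s, a, h

# Examples: hexamer with 8 ATP, hexamer with 13 ATP, and RXX.
examples = {
    "R6X8": interpret_term(6, 8),
    "R6X13": interpret_term(6, 13),
    "RXX": interpret_term(1, 2),
}
```

Under this reading, a term has an occupied h-site only when i exceeds 2j in oligomers or 1 in monomers, which is how models are later sorted into h-site and non-h-site classes.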
is the total hub protein concentration, M1 is the mass of a monomer (90 kDa for the R1 subunit of RNR), ε is noise with constant variance and zero mean, and the j^2 in the numerator includes one factor of j because the mass of a j-mer is j times that of a monomer, and another factor of j because light scattering is proportional to mass; ligand masses are treated as negligible relative to protein masses.
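A minimal sketch of this expected measurement, written in Python under the assumption that the noise-free part of the average mass is M1 times the sum of j^2 times the j-mer concentrations, divided by the total hub protein concentration (the function name and the input format are hypothetical):

```python
M1_KDA = 90.0  # mass of an R1 monomer in kDa, as stated in the text

def expected_average_mass(jmer_conc, RT, M1=M1_KDA):
    """Weight-averaged mass in kDa.

    jmer_conc maps oligomer size j to the summed concentration of all
    j-mer complexes; RT is the total R1 concentration in the same
    units. Ligand masses are neglected.
    """
    return M1 * sum(j**2 * c for j, c in jmer_conc.items()) / RT

RT = 7.0  # uM
all_monomer = expected_average_mass({1: RT}, RT)       # only monomers
all_hexamer = expected_average_mass({6: RT / 6}, RT)   # only hexamers
```

The two limiting cases recover the monomer mass and six times the monomer mass, which is a quick way to confirm that the two factors of j are in the right place.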
Nonlinear least squares was used to fit the models. The fitted models were then rank ordered by their AICs. Because the number of data points N is small at 15, the small sample size corrected version of the AIC was used: AIC = 2*P + 2*P(P+1)/(N-P-1) + N*log (2π) + N*log (SSE/N) + N where P is the number of estimated parameters (including the variance) and SSE is the sum of squared errors. In parameter optimizations (i.e. SSE minimizations, see Methods) the initial complete dissociation constant values were 100 μM raised to the sum of the powers of the numerators in Eq. (4) minus one, i.e. j + i - 1. This was critical, as it increased the number of models that converged from roughly 10% (when 1 μM was used uniformly) to nearly 100%. Models were fitted in parallel in a load balanced manner using R; the R package ccems uses the R package snow (simple network of workstations) to accomplish this. R scripts that were used to produce the results in this and the accompanying paper are available as examples in the papers directory of ccems.
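For readers who want to reproduce the criterion and the starting values outside of ccems, the formula above can be transcribed directly. This is a Python sketch with hypothetical function names, not the ccems implementation.

```python
import math

def aicc(sse, n, p):
    """Small-sample corrected AIC as written in the text.

    sse: sum of squared errors; n: number of data points;
    p: number of estimated parameters, including the variance.
    """
    return (2 * p
            + 2 * p * (p + 1) / (n - p - 1)
            + n * math.log(2 * math.pi)
            + n * math.log(sse / n)
            + n)

def initial_complete_K(j, i, k0=100.0):
    """Starting value for the complete dissociation constant of a
    complex of j R1 and i ATP: k0 (in uM) raised to j + i - 1."""
    return k0 ** (j + i - 1)

# Example: starting value for a hexamer with 8 ATP, in uM^13.
k_start_R6X8 = initial_complete_K(6, 8)
```

Scaling the starting value with the order of the term keeps all terms at comparable magnitudes when concentrations are near 100 μM, which is a plausible reason why convergence improved so markedly.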
The number of complexes represented in Eq. (5) is 2 + 5 + 9 + 13 = 29 and this implies that the number of spur models is 2^29, i.e. ~500 million. Relevant here, however, is the number of 1-, 2- and 3-parameter models. There are 29 single-edge models, 406 (29 choose 2) two-edge models, and 3654 (29 choose 3) three-edge models. The lowest AIC of the 3-parameter models (144.4) was higher than the lowest AIC of the 2-parameter models (142.7), so 4-parameter models were not fitted; as parameter numbers increase AICs typically first decrease as SSEs decrease, but then rise and continue to rise due to over-parameterization.
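These counts follow from the site assumptions alone and can be reproduced in a few lines. The Python below is an illustrative sketch with hypothetical variable names.

```python
import math

# Allowed numbers of bound ATP (i) for each oligomer size (j):
# monomers carry 1 or 2 ATP (a- and h-site); j-mers have all j
# s-sites filled plus 0..2j further ATP on a- and h-sites.
atp_range = {1: range(1, 3)}
for j in (2, 4, 6):
    atp_range[j] = range(j, 3 * j + 1)

n_by_size = {j: len(r) for j, r in atp_range.items()}
n_complexes = sum(n_by_size.values())

# Every complex is either in a spur model or out of it.
n_spur_models = 2 ** n_complexes

# Spur models with exactly k estimated K parameters.
n_k_param = {k: math.comb(n_complexes, k) for k in (1, 2, 3)}

print(n_by_size, n_complexes, n_spur_models, n_k_param)
```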
Similar arguments apply to h-sites, with X^(i+j) replacing X^i in j-mers.
The first two columns of K equality models/hypotheses in Fig. 3 are plausible, as a protein could be so rigid that a ligand binding site is unchanged with respect to binding affinity regardless of other bindings (first column) and this could be true within j-mers but not between them (second column). The third column hypotheses are less likely, however, as they claim that binding of R (to R), which is massive relative to ligand and thus more likely to cause alterations upon binding, causes no change in ligand site affinities, yet, binding of the first small ligand to the dimer alters the dimer structure enough to change ligand binding at the second site. Continuing, fourth column hypotheses are even less likely, as they claim that the first ligand binds dimer differently than monomer, yet, after it binds, by chance, the second ligand binding energy exactly equals the amount needed for the product of these K's to equal the square of the monomer ligand K. Equivalent E-shaped graphs (same column) support this claim of unlikelihood, as they claim that although dimerization energies are different between R + R and Rt + R, they somehow return to the R + R value when both reactants are Rt. Finally, in the fifth column, it would be remarkable if ligand binding to free dimer differs from ligand binding to free monomer, yet somehow, binding of the first ligand returns the unoccupied dimer subunit to a state indistinguishable from that of the free monomer (with respect to its ligand binding affinity). The third and fifth columns can be interpreted in terms of rigid asymmetric dimers where one subunit holds its monomer shape and the other has a different shape with either tighter (fifth column) or weaker (third column) ligand binding. From this perspective, it is very unlikely that all of the dimerization induced shape changes (deformation energies) would fall strictly onto one of two identical subunits. 
Thus, the grid model space used here will only include K equality hypotheses that are analogous to the first 2 columns in Fig. 3, and it suffices to consider n-shaped graphs.
Within a site type, binary K's of j-mers can be depicted as threads hanging from a curtain rod as shown in Fig. 4. In the accompanying paper, binary K equalities in contiguous stretches within threads are considered. Here, each thread is homogeneous in its binary K value (i.e. only full thread length contiguous stretches are considered) and K equality models are instead generated by considering thread K values as independent of other threads (top 2 rows in Fig. 4), infinite (graphs with missing threads), or equal to those of other threads of the same site type (bottom 3 rows in Fig. 4) within contiguous stretches of threads, the idea being that if one protein oligomerization step alters a ligand K, it is unlikely that an additional step would return it to one of its previous values.
Since R binds R to form R2 via one protein surface, and since it is likely that R2 binds R2 and R4 using two different protein surfaces (or patches thereof), no hypotheses of K equivalence will be considered between the saturated s-site complexes R2X2, R4X4 and R6X6 (i.e. complexes in the curtain rods in Fig. 4). Thus, all of the K equality hypotheses explored will be with respect to ligand binding site constants in threads.
The binary K equality model space of interest here is the product of a space of a-site models and a completely analogous space of h-site models. The 32 models shown in Fig. 4 thus imply a K equality space of 1024 models. If thread head nodes within curtain rods are allowed to remain in models where all other nodes in the same thread have infinite K, the number of models increases: models missing hexamer threads split into two models (there are 8 of these in Fig. 4) and models missing both tetramer and hexamer threads (there are 3 of these in Fig. 4) split into four models. The total number of grid models is then (32 + 8 + 9)^2 = 49^2 = 2401.
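The bookkeeping behind these totals can be spelled out as follows. This Python sketch uses hypothetical variable names and takes the counts 32, 8 and 3 from the text as given; it reads the 8 models as those missing only the hexamer thread, which is an interpretation rather than something stated explicitly.

```python
# Per-site-type models drawn in Fig. 4 (count taken from the text).
n_shown = 32

# Models missing the hexamer thread each split into 2 (1 extra each).
extra_from_hexamer = 8 * (2 - 1)

# Models missing tetramer and hexamer threads each split into 4
# (3 extra each).
extra_from_tetramer_hexamer = 3 * (4 - 1)

per_site_type = n_shown + extra_from_hexamer + extra_from_tetramer_hexamer

# The a-site and h-site spaces are analogous, so the joint space is
# the product of the two.
n_grid_without_heads = n_shown ** 2
n_grid_with_heads = per_site_type ** 2
```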
Models that contain hexamer terms
Since external data confirm ATP induced R1 hexamerization, the model space was reduced to only models that contain at least one hexamer term. This reduced the number of grid models with 2 and 3 parameters to 2 and 15 (from 7 and 36) and the number of spur models with 1, 2 or 3 parameters to 13, 286 and 3094 (from 29, 406 and 3654), i.e. the number of models is now 17 + 3393 = 3410.
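The reduced spur counts can be confirmed by brute-force enumeration. The sketch below is in Python with hypothetical names; the grid counts of 2 and 15 are taken from the text rather than derived.

```python
from itertools import combinations

# Complexes as (j, i) pairs: j R1 subunits and i bound ATP.
complexes = [(1, i) for i in (1, 2)]
complexes += [(j, i) for j in (2, 4, 6) for i in range(j, 3 * j + 1)]

def count_with_hexamer(k):
    """Number of k-term spur models with at least one hexamer term."""
    return sum(
        1
        for model in combinations(complexes, k)
        if any(j == 6 for j, _ in model)
    )

spur_counts = [count_with_hexamer(k) for k in (1, 2, 3)]

grid_counts = [2, 15]  # 2- and 3-parameter grid models, from the text
n_models = sum(spur_counts) + sum(grid_counts)

print(spur_counts, n_models)
```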
The top 3 models (lowest AIC) of the RX model space.
The space of 2088 models contains 99 models that do not have any terms with ratios of ATP bound per R1 > 1.5 (i.e. models consistent with ≤ 50% a-site filling in oligomers). Of the top 30 models, however, there were 14 such models, which is significant, 14/30 > 99/2088 with p < 3 × 10^-16. In the reduced spaces of 1420 and 1287 models (of the previous paragraph), the proportions are 15/30 > 75/1420 and 13/30 > 54/1287, which also yield p < 3 × 10^-16. Thus, this dataset does not support the existence of hexamers with >3 a-sites occupied by ATP. It should be noted that this statement implies a lack of h-site evidence if a-sites fill before h-sites.
No terms higher than R6X9 were needed to explain the ATP induced R1 hexamerization data found in figure 1 of Kashlan et al. If s-sites fill before a-sites, this implies that ~1/2 of the hexamer a-sites are not bound by ATP under the experimental conditions of this dataset. If h-sites fill after a-sites, this also implies that h-sites need not exist to explain this dataset. Since the s-site is at the dimer interface in yeast, and since it is reasonable that hexamers form as trimers of dimers, it is likely that s-sites do fill first.
If it is true that s-sites are always bound in oligomers and never bound in monomers, dNTP access to hexamer s-sites, as is needed for RNR control, implies that either the monomer-dimer-tetramer-hexamer equilibrium is rapid enough that changes in ligand bound to the s-site can occur via the monomer-dimer equilibrium, or perhaps the hexamer stabilizes internal dimers enough that hexamer s-sites can vacate without hexamer decomposition. The latter case would complicate the analysis as the term R6X9 for example might then describe more than 3 filled a-sites.
Regarding a- before h-site filling, since a-sites are known to exist and h-sites are in question, this is a reasonable default. The alternative, to assume h-site existence and instead challenge a-site existence, is much less reasonable.
The approach used selects model terms (and thus parameters) based on how needed they are to explain the data analyzed. If, in solution, hexamers rarely have more than 3 ATPs bound to their a-sites, no parameters are allocated to complexes with higher numbers of bound ATP. The analysis presented does not claim that ≥ 3 a-sites will remain unoccupied under R1 crystallization conditions that may differ greatly from those used to generate the data analyzed here.
For the data analyzed, [RT] is 7 μM (i.e. yielding up to 21 μM of ATP binding sites if h-sites exist) and the minimum [XT] is 46 μM, so the approximation [X] = [XT] would not have been valid for this lowest [ATP] data point. The value of such approximations is less with ≥ 2 oligomerization states than with one (which has an analytic solution, see accompanying paper), as the univariate polynomial that results still requires a numerical solution (e.g. [X] = [XT] in Eq. (5) yields a 6th order polynomial in free [R]). Nevertheless, univariate polynomials are solved much faster than multivariate polynomial systems (which are solved here using ODEs as in Eq. (3)), so the computational savings are worthwhile if the approximation is tolerable. For the data used here, as [X] = [XT] is valid for most of the data points, this approximation caused deviations of only 1% in the parameter estimates of Table 1 but it gave a 30-fold increase in computation speed.
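To make the univariate case concrete, the sketch below fixes [X] = [XT] and solves the resulting 6th order polynomial in free [R] for a hypothetical one-term model containing only the hexamer with 8 bound ATP. The Python function name, the use of bisection, the ATP level and the K value are assumptions made for illustration, and the factor of j in the constraint reflects that each j-mer contains j copies of R1.

```python
def free_R_fixed_X(RT, X, K, j=6, i=8, tol=1e-12):
    """Solve RT = R + j * R**j * X**i / K for free R by bisection,
    with free X held fixed at the supplied value."""
    c = X**i / K  # lumps the fixed ligand term and K together

    def residual(R):
        return R + j * c * R**j - RT

    lo, hi = 0.0, RT  # residual is increasing and changes sign here
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Illustrative values: 7 uM R1, 1000 uM ATP, K = (100 uM)^13.
RT, XT = 7.0, 1000.0
K = 100.0 ** 13
R_free = free_R_fixed_X(RT, XT, K)
hexamer = R_free**6 * XT**8 / K  # concentration of the hexamer complex
```

Because the left-hand side increases monotonically in free [R], a bracketing method is guaranteed to find the single root between 0 and [RT], which is part of why the univariate problem is so much cheaper than the coupled two-variable problem.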
Since there were 13, 286 and 3094 K infinity spur graphs with 1, 2 or 3 parameters, compared to 0, 2 and 15 K equality grid graphs, and since models with few parameters have an AIC advantage when dataset sizes are modest, it is not surprising that with 15 data points, the top models were all spur models. In the future, as automation affords richer datasets, grid graphs may become more competitive. Thus, though the grid graph enumeration efforts expended in this paper did not pay immediate dividends, they may in the future.
Contiguous stretches of equal binary K parameters within threads were not explored because binary K models were already non-competitive due to over-parameterization, and because additional ATP ligands on a j-mer would not have changed DLS masses detectably, so K cooperativity within threads would not have been detectable.
Microfluidic chip technology [28–31] will eventually enable 5-dimensional RNR studies where [ATP] and [NDP]s are fixed to in vivo levels and [R1], [R2], [dATP], [dTTP] and [dGTP] vary across ranges centered about physiological operating points. If ccems can automatically analyze new RNR data as it arrives, it could find uses in sequential experimental designs where the chip conditions of the next measurements are determined in real time to implement efficient 5-D sampling strategies.
As protein expression and purification core facilities become more common, reconstituted network analyses where alleged protein-protein interactions are mathematically characterized for applications in systems biology [33, 34] will eventually also become more common. It is anticipated here that many of these interactions will be combinatorially complex and that ccems will then find broader uses.
No terms higher than R6X9 were needed to explain the ATP induced R1 hexamerization data found in figure 1 of Kashlan et al. This suggests that under the experimental conditions of this dataset, ~1/2 of the hexamer a-sites are not bound by ATP, and that if a-sites fill before h-sites, that h-sites need not exist to explain this dataset.
The R package ccems currently solves 2-reactant problems where total reactant concentrations are known and manipulated, free reactant concentrations are determined by a system of mass action-based total concentration constraint polynomials, expected measurements are determined by model predicted complex concentrations, and the number of models is large due to combinatorial complexity. This is a generic in vitro synthetic biochemical system problem statement, so ccems could have a broad impact.
Data were digitized by plotDigitizer and analyzed using ccems. Hessians of SSEs obtained using optim were divided by 2, inverted, multiplied by SSE/(N - P), and the square roots of the main diagonal were then multiplied by 1.96 to form 95% Wald CIs. Parameters were estimated in exponentiated forms to constrain them to positive values.
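The confidence interval recipe can be sketched for a two-parameter model as follows. This is an illustrative Python transcription with a hypothetical function name and made-up numbers, restricted to a 2 × 2 Hessian so that the inverse can be written in closed form; it is not the ccems code.

```python
import math

def wald_half_widths(hessian, sse, n, p):
    """95% Wald CI half-widths from the Hessian of the SSE.

    hessian: 2x2 nested sequence of second derivatives of the SSE
    with respect to the two fitted parameters; n: data points;
    p: number of parameters used in the variance estimate.
    """
    (a, b), (c, d) = hessian
    a, b, c, d = a / 2.0, b / 2.0, c / 2.0, d / 2.0  # divide by 2
    det = a * d - b * c
    inv_diag = (d / det, a / det)  # diagonal of the inverse matrix
    s2 = sse / (n - p)             # residual variance estimate
    return tuple(1.96 * math.sqrt(s2 * v) for v in inv_diag)

# Made-up example: diagonal Hessian, SSE = 13, N = 15, P = 2.
half_widths = wald_half_widths([[8.0, 0.0], [0.0, 2.0]], 13.0, 15, 2)
```

Dividing the Hessian of the SSE by 2 mirrors linear least squares, where that Hessian equals twice the cross-product matrix of the regressors, so the remaining steps reproduce the usual covariance estimate.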
Reviewer's report 1
Ossama B. Kashlan, University of Pittsburgh (nominated by Philip Hahnfeldt, Tufts)
As you've shown, you don't need to invoke the h-site to fit our figure 1 data. But we did need it for the other data in the paper, e.g. the global fit of DLS and activity data in figure 5 of our paper (with dTTP saturating the s-site). Since we wanted to use a single model for all the data, we therefore used an h-site to fit our figure 1 data. You should include a discussion of the potential of the h-site-less models to account for our figure 5 DLS data (as you noted, modeling activity data greatly increases the model space).
Regarding the assumption that s- fills before a- fills before h-site, and that oligomers always have certain sites filled. This assumption ignores a few important observations and possibilities. First is the ability of R1 to dimerize in the absence of a filled s-site, e.g. we observed that CDP reduction occurs (at a low rate) in the absence of an s-site ligand. Second, as you pointed out, is the ability to switch s-site ligands while the a- (and/or h-) site(s) are occupied. This is important, because you base your conclusion of a lack of h-site evidence on the fact that not enough ATP are bound to fill the a-sites. But unless h-site binding is dramatically weaker than a-site binding, binding the first few h-sites may be more favorable than filling the last few a-sites.
The minimal models presented as 'best', and from which physical conclusions are drawn, should be able to account for both (all) datasets. Having a different model framework for each given set of ligands adds a new level of complexity. Can you combine the AIC scores for the fits to both sets of data to find the best model(s), based at least on these data?
The conclusion on p. 3 should be edited to reflect the above comments and your response to my previous comment.
The background on p. 3 should be edited to read, "ATP binds to both of these sites and there is some evidence, based on RNR *mass and* activity versus [ATP] data."
By stating model assumptions I did not mean to say that I thought they must be true (no model is ever correct). What I meant to say is that my inferences are all conditional on their truth. The situation is such that unless such an assumption is made about the binding order, the meaning of a polynomial term that exists in a model is ambiguous. This is a big weakness, but I do not see any way around it. I now state this weakness at the end of the Limitations Section.
Note that we really have two different hub proteins for these two datasets, one with R as R1 monomer and the other with R as dTTP and GDP saturated R1 dimer. All complete K parameters downstream of these two hubs would thus be independent and there would thus be no way to pool the parameter estimates (beyond the error estimate). The joint model would thus be the sum of the terms.
The conclusion in the abstract has been softened to reflect conditionality on the experimental conditions of your figure 1 dataset.
Agreed, but this sentence no longer exists.
Reviewer's report 2
Bin Hu, Los Alamos National Laboratory (nominated by William Hlavacek, LANL)
An introduction to RNR, its function and regulation is needed. Currently there is only one sentence in the beginning of the Introduction about RNR.
It is not mentioned in the paper how the R1 subunit forms RNR with R2 units and what kind of multimers the R1 subunit can form and their biological importance. References, especially for the crystallography data, are needed.
In the Results section, the meaning of "complete dissociation constants" is not clear.
The R package ccems was first introduced without reference.
The author submitted two papers to this journal simultaneously. Instead of using "the accompanying paper," the author or editorial office may want to change it to some other description that may help readers to find out which paper the author is referencing.
I cannot tell whether the assumption "s-sites are always filled in oligomers and never filled in monomers" is acceptable in this study.
It would be interesting if the author compared the parameters used in this work with those used in Kashlan et al.
The model of ATP induced R1 hexamerization previously proposed by Kashlan et al. assumed: a) that the binary dissociation constants K of the ATP binding sites s, a and h are the same within oligomers (within site types); b) that these Ks, Ka and Kh are infinite in structures smaller than dimers, tetramers and hexamers, respectively (note that all three are thus infinite in monomers); c) that finite K are equal wherever it is plausible that they might be, i.e. that Ka in tetramers equals Ka in hexamers and that Ks is the same across dimers, tetramers and hexamers; d) that the dissociation constants for R1 binding to itself (KR_R), R1 dimers binding to themselves (KR2_R2), and R1 tetramers binding to R1 dimers (KR4_R2), are independently adjustable; and e) that R1 tetramers can isomerize with an isomerization constant Kis. Assumptions a) to c) constrain the model and d) to e) broaden it. When Kashlan et al. fitted their model to their DLS data, KR_R, KR2_R2, KR4_R2 and Kis were treated as being independent of R1 ligands, and consistent with these assumptions, the data in their figures 1 and 3-9 were fitted to single values of these constants such that the fits in these figures did not appear too poor. With respect to their figure 1 data, however, the first five residuals of their fit were negative and thus correlated, so although the residuals were small and the lack of fit subtle, the fit was nonetheless poor. This paper focuses on their figure 1 data alone.
Background material regarding dNTP supply and RNR has now been added to the Introduction. However, since I do not model RNR activity data, this work has limited relevance to dNTP supply metabolism. My focus is thus on the methods developed. Indeed, it may be best to view R1 as merely some protein that has either 2 or 3 binding sites on it for a ligand that induces its hexamerization.
A) R2 is irrelevant here since this paper does not delve into activity data and since R2 was not present in the experiment that yielded the DLS average mass data analyzed. B) The first eukaryotic (yeast) R1 structure showed a dimer and this was referenced . Though a dATP induced R1 tetramer was observed in Ref , it was not observed in , and no lab has observed it directly using the more relevant ligand (for this paper) ATP. Thus, tetramers could perhaps have been left out of the model space, but there is strong support for R1 monomers (e.g. the low [ATP] DLS data in Fig. 7), dimers (the structure in ) and hexamers (e.g. the high [ATP] DLS data in Fig. 7).
By complete dissociation constants I mean those where the Gibbs Free energies are with respect to all reactants being completely separated from one another by infinite distances. In contrast, by binary dissociation constants I mean situations where only one reactant (or perhaps a subcomplex) is separated out at infinity while all of the other reactants remain bound together.
The link to my ccems page is now referenced earlier.
They should end up back-to-back and I hope readers will read, and know of, both.
The conclusions remain the same if I drop h-site terms and blow up the model space by introducing s-site terms, i.e. there is some support for the assumption besides dTTP induced R1 dimerization results in  and the structure in .
The model used by Kashlan et al. has 7 parameters. Their binary K values for ATP binding were 100 μM for dimers and tetramers and 1.1 mM for hexamers. For R1 oligomerization and isomerization their values were KR_R = 170 μM, KR2_R2 = 2-5 mM, KR4_R2 = 2-6 mM and Kis = 10-40 where ranges depend on different tetrameric activity assumptions. Without the same parameters in my models, comparisons are difficult. The best model in Table 1 is R6X8 and it has a geometric mean binary binding constant of 63 μM. Since all of the binary K values of Kashlan et al. are ≥ 100 μM, their geometric mean must also be ≥ 100 μM. Indeed, assuming ATP fills 6 s-sites and 2 a-sites in R6X8, one obtains a geometric mean binary K of 190 μM = ((100)^8 (170)^3 (3000)^1 (3000)^1)^(1/13), i.e. there is a difference of a factor of 3 in parameter estimates between analyses.
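The geometric mean quoted in this response can be re-derived from the listed values. The Python below is an illustrative check with hypothetical variable names; the choice of 3000 μM for the two higher-order association constants follows the arithmetic shown above.

```python
import math

# Binary K values in uM: 8 ATP bindings at 100, 3 dimerizations at
# 170, and one tetramer-forming and one hexamer-forming step at 3000.
binary_K = [100.0] * 8 + [170.0] * 3 + [3000.0] * 2

# Geometric mean via logs to avoid overflow in the product.
geo_mean = math.exp(sum(math.log(k) for k in binary_K) / len(binary_K))

# Ratio to the 63 uM geometric mean reported for the R6X8 model.
ratio = geo_mean / 63.0

print(f"geometric mean = {geo_mean:.0f} uM, ratio = {ratio:.1f}")
```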
Reviewer's report 3
Rainer K. Sachs, UC Berkeley
In general, is there some systematic rationale and/or underlying reasoning on what criteria to use to distinguish hypotheses? Do you estimate that almost any criterion would give the same final answers qualitatively? How did you decide on 30 models in your comparison of model proportions? What is your main motivation to study RNR? Can your software handle more than two reactants? Can you provide a code use example?
The AIC was used (without consideration of alternatives such as the Bayesian Information Criterion, BIC) only because it is the most popular. The idea was to pick a criterion to present my main contribution, which is in model space generation rather than model selection. Though there may be reasons to switch to a different criterion that I have yet to learn of, in the interim, the AIC is my default. I suspect that the conclusions made here would be robust to such changes.
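To make the AIC versus BIC comparison concrete, here is a minimal sketch of both criteria for least-squares fits. It assumes the standard Gaussian-error forms and counts the error variance as one extra parameter; the exact form used in ccems may differ, and the SSE values and point count below are hypothetical.

```python
import math

def aic_least_squares(sse, n_points, n_params):
    """AIC for a least-squares fit with Gaussian errors."""
    k = n_params + 1  # +1 for the estimated error variance
    return n_points * math.log(sse / n_points) + 2 * k

def bic_least_squares(sse, n_points, n_params):
    """BIC for the same fit; the penalty per parameter is log(N) instead of 2."""
    k = n_params + 1
    return n_points * math.log(sse / n_points) + k * math.log(n_points)

# Hypothetical example: a 2-parameter model and a 3-parameter model
# that fits only slightly better, both fitted to 20 data points.
n = 20
aic_a, aic_b = aic_least_squares(1.00, n, 2), aic_least_squares(0.95, n, 3)
bic_a, bic_b = bic_least_squares(1.00, n, 2), bic_least_squares(0.95, n, 3)

print("AIC prefers the smaller model:", aic_a < aic_b)
print("BIC prefers the smaller model:", bic_a < bic_b)
```

For more than about 7 data points log(N) exceeds 2, so BIC penalizes extra parameters more heavily than AIC; a switch to BIC would therefore tend to favor the smaller models even more.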
The choice of 30 models involved data snooping as you may have guessed, i.e. 30 looked like a good breaking point for a claim that the data does not support an h-site. Thus, the difference in proportions p-value that I reported may be overly significant. Nevertheless, the Kolmogorov-Smirnov test is with respect to entire distributions in Fig. 6, so the conclusion that this particular dataset does not demand the existence of an h-site is robust. There is no indication that this conclusion will hold under different experimental conditions, however, e.g. see Kashlan's review above.
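The proportion comparison (28 of the top 30 models versus 508 of all 2088 models lacking an h-site term) can be sketched as a one-sided exact binomial test. This is an assumption: the text does not state which test produced the reported p-value, and the function name is hypothetical.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly term by term."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Null hypothesis: the top 30 models are no more likely to lack an h-site
# term than a model drawn at random from the 2088 retained models.
p_null = 508 / 2088
p_value = binomial_upper_tail(28, 30, p_null)

print(f"one-sided binomial p-value = {p_value:.2e}")
```

Because the cutoff of 30 was chosen after looking at the ranking, a p-value computed this way understates the uncertainty, which is the caveat made above.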
The project described was supported by Award Number K25CA104791 from the National Cancer Institute. The content is solely the responsibility of the author and does not necessarily represent the official views of the National Cancer Institute or the National Institutes of Health.
- Radivoyevitch T, Loparo KA, Jackson RC, Sedwick WD: On systems and control approaches to therapeutic gain. BMC Cancer. 2006, 6 (1): 104. 10.1186/1471-2407-6-104.
- Manegold C: Gemcitabine (Gemzar) in non-small cell lung cancer. Expert Rev Anticancer Ther. 2004, 4 (3): 345-360. 10.1586/14737126.96.36.1995.
- Kroep JR, Van Moorsel CJ, Veerman G, Voorn DA, Schultz RM, Worzalla JF, Tanzer LR, Merriman RL, Pinedo HM, Peters GJ: Role of deoxycytidine kinase (dCK), thymidine kinase 2 (TK2), and deoxycytidine deaminase (dCDA) in the antitumor activity of gemcitabine (dFdC). Adv Exp Med Biol. 1998, 431: 657-660.
- Shewach DS, Reynolds KK, Hertel L: Nucleotide specificity of human deoxycytidine kinase. Mol Pharmacol. 1992, 42 (3): 518-524.
- Wang J, Lohman GJ, Stubbe J: Enhanced subunit interactions with gemcitabine-5'-diphosphate inhibit ribonucleotide reductases. Proc Natl Acad Sci USA. 2007, 104 (36): 14324-14329. 10.1073/pnas.0706803104.
- Ferrandina G, Mey V, Nannizzi S, Ricciardi S, Petrillo M, Ferlini C, Danesi R, Scambia G, Del Tacca M: Expression of nucleoside transporters, deoxycitidine kinase, ribonucleotide reductase regulatory subunits, and gemcitabine catabolic enzymes in primary ovarian cancer. Cancer Chemother Pharmacol. 2009.
- Ohtaka K, Kohya N, Sato K, Kitajima Y, Ide T, Mitsuno M, Miyazaki K: Ribonucleotide reductase subunit M1 is a possible chemoresistance marker to gemcitabine in biliary tract carcinoma. Oncol Rep. 2008, 20 (2): 279-286.
- Flanagan SA, Robinson BW, Krokosky CM, Shewach DS: Mismatched nucleotides as the lesions responsible for radiosensitization with gemcitabine: a new paradigm for antimetabolite radiosensitizers. Mol Cancer Ther. 2007, 6 (6): 1858-1868. 10.1158/1535-7163.MCT-07-0068.
- Muller A, Fishel R: Mismatch repair and the hereditary non-polyposis colorectal cancer syndrome (HNPCC). Cancer Invest. 2002, 20 (1): 102-109. 10.1081/CNV-120000371.
- Leung WK, Kim JJ, Kim JG, Graham DY, Sepulveda AR: Microsatellite instability in gastric intestinal metaplasia in patients with and without gastric cancer. Am J Pathol. 2000, 156 (2): 537-543.
- Nakata B, Wang YQ, Yashiro M, Nishioka N, Tanaka H, Ohira M, Ishikawa T, Nishino H, Hirakawa K: Prognostic value of microsatellite instability in resectable pancreatic cancer. Clin Cancer Res. 2002, 8 (8): 2536-2540.
- Hartmann A, Zanardo L, Bocker-Edmonston T, Blaszyk H, Dietmaier W, Stoehr R, Cheville JC, Junker K, Wieland W, Knuechel R, Rueschoff J, Hofstaedter F, Fishel R: Frequent microsatellite instability in sporadic tumors of the upper urinary tract. Cancer Res. 2002, 62 (23): 6796-6802.
- Ichikawa Y, Lemon SJ, Wang S, Franklin B, Watson P, Knezetic JA, Bewtra C, Lynch HT: Microsatellite instability and expression of MLH1 and MSH2 in normal and malignant endometrial and ovarian epithelium in hereditary nonpolyposis colorectal cancer family members. Cancer Genet Cytogenet. 1999, 112 (1): 2-8. 10.1016/S0165-4608(98)00252-0.
- Simpkins SB, Bocker T, Swisher EM, Mutch DG, Gersell DJ, Kovatich AJ, Palazzo JP, Fishel R, Goodfellow PJ: MLH1 promoter methylation and gene silencing is the primary cause of microsatellite instability in sporadic endometrial cancers. Hum Mol Genet. 1999, 8 (4): 661-666. 10.1093/hmg/8.4.661.
- Leung SY, Chan TL, Chung LP, Chan AS, Fan YW, Hung KN, Kwong WK, Ho JW, Yuen ST: Microsatellite instability and mutation of DNA mismatch repair genes in gliomas. Am J Pathol. 1998, 153 (4): 1181-1188.
- Thelander L, Reichard P: Reduction of ribonucleotides. Annu Rev Biochem. 1979, 48: 133-158. 10.1146/annurev.bi.48.070179.001025.
- Tanaka H, Arakawa H, Yamaguchi T, Shiraishi K, Fukuda S, Matsui K, Takei Y, Nakamura Y: A ribonucleotide reductase gene involved in a p53-dependent cell-cycle checkpoint for DNA damage. Nature. 2000, 404 (6773): 42-49. 10.1038/35003506.
- Nakano K, Balint E, Ashcroft M, Vousden KH: A ribonucleotide reductase gene is a transcriptional target of p53 and p73. Oncogene. 2000, 19 (37): 4283-4289. 10.1038/sj.onc.1203774.
- Feng IJ, Radivoyevitch T: SNP-SNP interactions between the dNTP supply and mismatch DNA repair systems in breast cancer. Ohio Collaborative Conference on Bioinformatics. 2009.
- Kashlan OB, Scott CP, Lear JD, Cooperman BS: A comprehensive model for the allosteric regulation of mammalian ribonucleotide reductase. Functional consequences of ATP- and dATP-induced oligomerization of the large subunit. Biochemistry. 2002, 41 (2): 462-474. 10.1021/bi011653a.
- Ihaka R, Gentleman R: R: a language for data analysis and graphics. Journal of Computational and Graphical Statistics. 1996, 5: 299-314. 10.2307/1390807.
- Combinatorially Complex Equilibrium Model Selection. [http://epbi-radivot.cwru.edu/ccems/overview.html]
- Akaike H: A new look at the statistical model identification. IEEE Transactions on Automatic Control. 1974, 19: 716-723. 10.1109/TAC.1974.1100705.
- Burnham KP, Anderson DR: Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. 2002, Springer-Verlag.
- Radivoyevitch T: Equilibrium model selection: dTTP induced R1 dimerization. BMC Syst Biol. 2008, 2 (1): 15. 10.1186/1752-0509-2-15.
- Xu H, Faber C, Uchiki T, Fairman JW, Racca J, Dealwis C: Structures of eukaryotic ribonucleotide reductase I provide insights into dNTP regulation. Proc Natl Acad Sci USA. 2006, 103 (11): 4022-4027. 10.1073/pnas.0600443103.
- Rofougaran R, Vodnala M, Hofer A: Enzymatically active mammalian ribonucleotide reductase exists primarily as an alpha6beta2 octamer. J Biol Chem. 2006, 281 (38): 27705-27711. 10.1074/jbc.M605573200.
- Roach LS, Song H, Ismagilov RF: Controlling nonspecific protein adsorption in a plug-based microfluidic system by controlling interfacial chemistry using fluorous-phase surfactants. Anal Chem. 2005, 77 (3): 785-796. 10.1021/ac049061w.
- Song H, Ismagilov RF: Millisecond kinetics on a microfluidic chip using nanoliters of reagents. J Am Chem Soc. 2003, 125 (47): 14613-14619. 10.1021/ja0354566.
- Huebner A, Olguin LF, Bratton D, Whyte G, Huck WT, de Mello AJ, Edel JB, Abell C, Hollfelder F: Development of quantitative cell-based enzyme assays in microdroplets. Anal Chem. 2008, 80 (10): 3890-3896. 10.1021/ac800338z.
- Huebner A, Bratton D, Whyte G, Yang M, Demello AJ, Abell C, Hollfelder F: Static microdroplet arrays: a microfluidic device for droplet trapping, incubation and release for enzymatic and cell-based assays. Lab Chip. 2009, 9 (5): 692-698. 10.1039/b813709a.
- Atkinson AC, Donev AN, Tobias RD: Optimum Experimental Designs, with SAS. 2007, Oxford: Oxford University Press.
- Hucka M, Finney A, Sauro HM, Bolouri H, Doyle JC, Kitano H, Arkin AP, Bornstein BJ, Bray D, Cornish-Bowden A, et al: The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics. 2003, 19 (4): 524-531. 10.1093/bioinformatics/btg015.
- Le Novere N, Bornstein B, Broicher A, Courtot M, Donizelli M, Dharuri H, Li L, Sauro H, Schilstra M, Shapiro B, Snoep JL, Hucka M: BioModels Database: a free, centralized database of curated, published, quantitative kinetic models of biochemical and cellular systems. Nucleic Acids Res. 2006, 34 (Database): D689-691. 10.1093/nar/gkj092.
- Plot Digitizer. [http://plotdigitizer.sourceforge.net/]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.