The archaeo-eukaryotic GINS proteins and the archaeal primase catalytic subunit PriS share a common domain
© Swiatek and MacNeill; licensee BioMed Central Ltd. 2010
Received: 6 April 2010
Accepted: 12 April 2010
Published: 12 April 2010
Primase and GINS are essential factors for chromosomal DNA replication in eukaryotic and archaeal cells. Here we describe a previously undetected relationship between the C-terminal domain of the catalytic subunit (PriS) of archaeal primase and the B-domains of the archaeo-eukaryotic GINS proteins in the form of a conserved structural domain comprising a three-stranded antiparallel β-sheet adjacent to an α-helix and a two-stranded β-sheet or hairpin. The presence of a shared domain in archaeal PriS and GINS proteins, the genes for which are often found adjacent on the chromosome, suggests simple mechanisms for the evolution of these proteins.
This article was reviewed by Zvi Kelman (nominated by Michael Galperin) and Kira Makarova.
Primases are specialised DNA-dependent RNA polymerases that function in chromosome replication, synthesising the oligoribonucleotide primers used by the replicative DNA polymerases [1, 2]. Structurally, primases fall into two classes. One class comprises the DnaG family enzymes found in bacteria and archaea. The second class comprises the heterodimeric primases of the archaeo-eukaryotic primase (AEP) superfamily, which are found in the eukarya and archaea but are also present in some bacteria. The AEP enzymes comprise a catalytic and a non-catalytic subunit. In the archaea these are designated PriS and PriL, respectively. In eukaryotes, the dimeric primase forms part of the replicative DNA polymerase α-primase complex that initiates Okazaki fragment synthesis.
The first structural insights into archaeal primase function came from the crystal structures of the PriS proteins from the euryarchaeal organisms Pyrococcus furiosus and P. horikoshii. The latter was co-crystallised with UTP (uridine-5'-triphosphate), allowing confirmation of the location of the active site of the enzyme. The P. furiosus and P. horikoshii PriS proteins are composed of two distinct domains: a mixed α/β domain (the Prim domain) that includes the catalytic site of the enzyme and a smaller α-helical domain of unknown function [4, 5].
In addition to the Pyrococcus PriS structures, the structure of the PriS protein from the crenarchaeal organism Sulfolobus solfataricus has also been determined. Three significant differences are apparent when comparing the S. solfataricus PriS structure with those of the Pyrococcus PriS proteins: the α-helical domain observed in the latter proteins is reduced to a single irregular helix in S. solfataricus PriS, the zinc-binding motif in S. solfataricus PriS is located at the end of an extended β hairpin structure that is absent from the Pyrococcus proteins, and a mixed α/β domain of ~50 amino acids (termed the PriS-CTD) is found at the C-terminal end of the S. solfataricus protein but is also absent from the Pyrococcus proteins. The PriS-CTD, which is the subject of this report, comprises a three-stranded antiparallel β-sheet adjacent to an α-helix and a two-stranded antiparallel β-sheet. Multiple sequence alignments (data not shown) indicate that the PriS-CTD is conserved in all archaeal lineages with the exception of the Thermococcales (including Pyrococcus and Thermococcus species) and the Methanobacteriales (Methanosphaera and Methanothermobacter species), implying that these latter groups have undergone lineage-specific loss of this domain. In addition, the PriS-CTD does not appear to be present in the eukaryotic primase small subunit proteins. The role of the PriS-CTD is unclear, but it has been suggested that it may play a role in supporting and positioning the extended β hairpin structure that forms the stem of the zinc-binding motif. In the Pyrococcus PriS proteins, which lack the extended β hairpin, a single α-helix replaces the PriS-CTD [4, 5].
The function of the non-catalytic primase subunit is less clear, but experiments suggest that this protein might have a role in determining (or limiting) the length of the RNA primer synthesised by the catalytic subunit. Three-dimensional structures for truncated S. solfataricus and P. horikoshii PriL proteins have been determined and the PriS-PriL subunit interface defined [6, 8]. Missing from both PriL structures is the C-terminal [4Fe-4S] cluster-containing domain that is conserved in the eukaryotic non-catalytic primase subunit and which has been shown to be essential for primer synthesis [9, 10].
DNA unwinding during eukaryotic chromosome replication is most likely catalysed by the CMG (Cdc45-MCM-GINS) complex comprising the hexameric MCM DNA helicase and its accessory factors, the Cdc45 protein and GINS [11, 12]. Eukaryotic GINS is a heterotetramer consisting of the Sld5, Psf1, Psf2 and Psf3 subunits, each of which comprises two distinct protein domains [13, 14]: an A-domain composed largely of α-helices and a smaller B-domain made up largely of β-strands [15–17]. Intriguingly, the order of the two domains is circularly permuted in the Sld5 and Psf1 subunits compared to the Psf2 and Psf3 subunits [18, 19]. In Sld5 and Psf1 the A-domain is N-terminal to the B-domain, whereas in Psf2 and Psf3 it is the B-domain that is N-terminal. In the complex, the four subunits of GINS are arranged in two layers and the B-domains appear to function both to stabilise the interfaces between the layers of the complex and to mediate protein-protein interactions with additional factors [15–17]. The broader function of GINS within the CMG complex is not known and, although several models have been proposed, significant uncertainty remains over the mode of action of the MCM helicase itself [13, 14]. It has been suggested, for example, that MCM acts primarily as a double-stranded DNA translocase, pumping dsDNA through its central cavity in an ATP-dependent manner; DNA exiting the central channel might then encounter the GINS protein acting as a ploughshare to sterically separate the two DNA strands. Further biochemical analysis of CMG function will be required to resolve this uncertainty.
All archaeal genomes sequenced to date encode a single protein (called Gins51) with similarity to the eukaryotic Sld5 and Psf1 proteins and their characteristic A-B domain order [18, 19]. A subset of species, including representatives of the deeply-branching Thaumarchaeota and Korarchaeota, encode an additional protein (called Gins23) with similarity to the eukaryotic Psf2 and Psf3 proteins and their B-A domain order. In S. solfataricus and P. furiosus, the Gins51 and Gins23 proteins form a tetrameric complex comprising two molecules of Gins51 and two of Gins23 that is likely to be similar in structure to eukaryotic GINS [18, 23]. The structure of the GINS complex in those archaea that apparently lack Gins23 is not known; in particular, it is not known if the Gins51 protein can form tetramers. In evolutionary terms, it is likely that the last common archaeo-eukaryotic ancestor encoded proteins with both A-B (Gins51) and B-A (Gins23) domain order [18, 19]. In eukaryotic cells, subsequent duplication of the ancestral genes encoding Gins51 and Gins23 produced Sld5 and Psf1, and Psf2 and Psf3, respectively, while in the archaea, lineage-specific loss of the gene encoding Gins23 led to the appearance of species lacking this protein [18, 19].
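The domain orders described in the two preceding paragraphs can be summarised in a small lookup table. The Python sketch below is purely illustrative and is not part of the original analysis: the names `DOMAIN_ORDER`, `rotations`, `is_circular_permutation` and `group_by_order` are hypothetical, and the only data it encodes are the A-B and B-A orders stated in the text. It treats a circular permutation as a rotation of the domain tuple.

```python
# Domain order (N-terminus to C-terminus) for each GINS family protein,
# as stated in the text. "A" = largely alpha-helical A-domain,
# "B" = smaller, largely beta-strand B-domain.
DOMAIN_ORDER = {
    "Sld5": ("A", "B"),    # eukaryotic
    "Psf1": ("A", "B"),    # eukaryotic
    "Psf2": ("B", "A"),    # eukaryotic
    "Psf3": ("B", "A"),    # eukaryotic
    "Gins51": ("A", "B"),  # archaeal, Sld5/Psf1-like
    "Gins23": ("B", "A"),  # archaeal, Psf2/Psf3-like
}


def rotations(order):
    """Return every rotation of a tuple of domain labels."""
    return {order[i:] + order[:i] for i in range(len(order))}


def is_circular_permutation(first, second):
    """True if `second` is a rotation of `first`."""
    return len(first) == len(second) and tuple(second) in rotations(tuple(first))


def group_by_order(table):
    """Group protein names by their domain order, e.g. 'A-B' or 'B-A'."""
    groups = {}
    for name, order in table.items():
        groups.setdefault("-".join(order), []).append(name)
    return groups


if __name__ == "__main__":
    for order, names in group_by_order(DOMAIN_ORDER).items():
        print(order, ", ".join(names))
    print(is_circular_permutation(DOMAIN_ORDER["Sld5"], DOMAIN_ORDER["Psf2"]))
```

With only two domains, swapping the order is the only non-trivial rotation, so the check is deliberately simple; it says nothing about how the permutation arose during evolution.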
In the course of database searching to identify GINS proteins in diverse archaeal species, we observed that sequences corresponding to the C-terminal domain (CTD) of the catalytic subunit of the archaeal primase, PriS, were often detected when archaeal GINS proteins were used as the query sequence. For example, BLAST searching using default parameters against archaeal proteins in the NCBI Reference Sequence database with the Cenarchaeum symbiosum (strain A) Gins23 protein [CENSYa_1724; GI: 118576897] as the query identifies the PriS protein [PAE3036; GI: 1463797] from Pyrobaculum aerophilum with an E-value of 0.003 (amino acids 14-63 of the C. symbiosum Gins23 are 42% identical to residues 261-310 of P. aerophilum PriS). Additional Pyrobaculum PriS proteins, from P. calidifontis [Pcal_0991, GI: 4909914], P. arsenaticum [Pars_1787, GI: 5055591] and P. islandicum [Pisl_0437, GI: 4617745], are found with E-values of 0.058, 0.063 and 0.22, respectively, while PriS from Thermoproteus neutrophilus [Tneu 1683; GI: 6165219] is found with an E-value of 5.6.
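The figures quoted for these searches can be checked with some simple arithmetic. The following Python sketch is an illustration only, not the procedure used in the study: the function names and the 0.1 cutoff are hypothetical, the hit table simply restates the E-values given above, and the identity count assumes that the two 50-residue ranges are aligned without gaps (which the text does not state). The `percent_identity` helper follows one common convention, identical columns divided by total aligned columns.

```python
# E-values reported in the text for PriS hits retrieved with the
# C. symbiosum Gins23 query. Labels follow the text as written.
REPORTED_HITS = {
    "P. aerophilum PAE3036": 0.003,
    "P. calidifontis Pcal_0991": 0.058,
    "P. arsenaticum Pars_1787": 0.063,
    "P. islandicum Pisl_0437": 0.22,
    "T. neutrophilus Tneu 1683": 5.6,
}


def span_length(start, end):
    """Number of residues in an inclusive residue range."""
    return end - start + 1


def identical_positions(percent, aligned_length):
    """Approximate count of identical columns implied by a percent identity."""
    return round(percent / 100 * aligned_length)


def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns holding the same residue ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        return 0.0
    matches = sum(a == b and a != "-" for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


def hits_below(hits, max_evalue):
    """Names of hits with E-value <= max_evalue, most significant first."""
    kept = [name for name, evalue in hits.items() if evalue <= max_evalue]
    return sorted(kept, key=hits.get)


if __name__ == "__main__":
    query_len = span_length(14, 63)      # C. symbiosum Gins23 residues
    subject_len = span_length(261, 310)  # P. aerophilum PriS residues
    print(query_len, subject_len, identical_positions(42, query_len))
    # Toy aligned strings (hypothetical, not real protein sequences).
    print(percent_identity("ACDE", "ACDF"))
    print(hits_below(REPORTED_HITS, 0.1))  # 0.1 is an arbitrary example cutoff
```

Both residue ranges span 50 positions, so under the ungapped assumption 42% identity corresponds to about 21 identical residues. The weaker hits would be excluded by any stringent E-value cutoff, which is consistent with the relationship having gone undetected.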
In conclusion, the observations described here highlight a previously undetected relationship between two key components of the archaeal replication machinery and suggest a simple mechanism to account for the evolution of the PriS protein.
Reviewer's report 1
Zvi Kelman, University of Maryland Biotechnology Institute (nominated by Michael Galperin, National Center for Biotechnology Information)
The manuscript by Swiatek and MacNeill describes a structural comparison between domains of the eukaryotic GINS and the archaeal primase. It was found that although the domains share limited sequence similarities, they have similar three-dimensional folds. Using these observations, the authors proposed several mechanisms involving gene duplication that could result in the two protein families. These are interesting observations regarding essential replication enzymes in archaea and eukarya. I have only two minor comments. It would be useful for readers who are not familiar with the archaeal replication system to briefly describe the dimeric archaeal primase and the role of each subunit. A sentence or two regarding the proposed function(s) of the GINS complex would also be useful (in addition to the references provided).
Authors' response: We are grateful for the reviewer's suggestions and have modified the text of the manuscript accordingly.
Reviewer's report 2
Kira Makarova, National Center for Biotechnology Information
Swiatek and MacNeill have made an interesting observation about the similarity of the archaeal small primase subunit (PriS) C-terminal domain and the B-domain of GINS-like proteins and have presented a plausible evolutionary scenario showing how the fusion of ancestral PriS and the B-domain of GINS (specifically Gins51) could have occurred. This paper definitely extends the horizons of our understanding of complex events in the evolution of the molecular machinery for DNA replication initiation in archaea and eukaryotes. Importantly, it also provokes further discussion and analysis of the proteins and domains involved in this process. Specifically, the absence of the CTD in PriS in Thermococcales and Methanobacteriales, and especially in eukaryotes, raises further questions about the actual ancestral state and the involvement of horizontal transfer in the chain of evolutionary events. In this respect it would be interesting to see a phylogenetic tree reconstructed for the Prim domain of PriS (of archaea and eukaryotes). While the evolutionary scenario suggested in this paper is really tempting because of the physical proximity of PriS and Gins51 in some archaea, the ancestral state of this gene arrangement is not certain, since many archaea do not have it, including the Thaumarchaea, one of the deeply branching groups. Moreover, the suggested scenario does not seem to take into account the observation that the CTD of archaeal PriS is a little more similar to the B-domain of Gins23 (this follows from data reported in the paper, such as the PSI-BLAST search results and the multiple alignment, which shows that only eukaryotic Psf2 has a structure and sequence fully compatible with the CTD while Sld5 has a specific insertion between β2 and β3). Hopefully, structures of archaeal Gins51 and Gins23 will help to resolve some of these issues. Thus I would not be surprised if the evolutionary scenario of PriS/GINS evolution is revised when new data become available.
And, of course, many questions still remain about the configuration of the molecular complex that includes PriS and a variety of GINS proteins and/or the CTD.
Authors' response: The reviewer is correct to state that the organisation of the genes encoding the PriS and Gins51 proteins in the last common archaeal ancestor is not certain. The reviewer is also right to hint that the lack of physical proximity between these genes in the deeply-branching Thaumarchaeota might indicate that the Prim-Gins51 gene organisation proposed in our model for the acquisition of the CTD by PriS (Figure 3) is problematic, despite the widespread co-localisation of these genes in many diverse archaeal species including representatives of the Euryarchaeota, Korarchaeota and Crenarchaeota (see Additional file 1). The sequencing of additional archaeal genomes, particularly from the deeply-branching clades, will be of great importance in clarifying this issue.
In addition, while it is true that the findings reported here could be construed as suggesting a closer relationship between the PriS CTD and the B-domains of the Gins23 family proteins (Psf2 and Psf3 in eukaryotes), the low levels of sequence similarity displayed by the CTD and B-domains (Figure 1) and the substantial evolutionary distance between the archaeal and human proteins whose structures have been solved (Figure 2) do not allow firm conclusions to be drawn on this point. It also seems unlikely on the basis of the sequence alignment shown in Figure 1 that the sequence insertion in the human Sld5 will be present in archaeal Gins51 B-domain. As the reviewer rightly points out, representative structures of archaeal Gins51 and Gins23 proteins may well help to resolve this issue.
List of abbreviations
BLAST: basic local alignment search tool
ORF: open reading frame.
We are grateful to Malcolm White (University of St Andrews) for constructive comments on the manuscript. This work was funded by the Scottish Universities Life Sciences Alliance (SULSA).
- Arezi B, Kuchta RD: Eukaryotic DNA primase. Trends Biochem Sci. 2000, 25 (11): 572-576. 10.1016/S0968-0004(00)01680-7.
- Frick DN, Richardson CC: DNA primases. Annu Rev Biochem. 2001, 70: 39-80. 10.1146/annurev.biochem.70.1.39.
- Iyer LM, Koonin EV, Leipe DD, Aravind L: Origin and evolution of the archaeo-eukaryotic primase superfamily and related palm-domain proteins: structural insights and new members. Nucleic Acids Res. 2005, 33 (12): 3875-3896. 10.1093/nar/gki702.
- Augustin MA, Huber R, Kaiser JT: Crystal structure of a DNA-dependent RNA polymerase (DNA primase). Nature Struct Biol. 2001, 8 (1): 57-61. 10.1038/83060.
- Ito N, Nureki O, Shirouzu M, Yokoyama S, Hanaoka F: Crystal structure of the Pyrococcus horikoshii DNA primase-UTP complex: implications for the mechanism of primer synthesis. Genes Cells. 2003, 8 (12): 913-923. 10.1111/j.1365-2443.2003.00693.x.
- Lao-Sirieix SH, Nookala RK, Roversi P, Bell SD, Pellegrini L: Structure of the heterodimeric core primase. Nat Struct Mol Biol. 2005, 12 (12): 1137-1144. 10.1038/nsmb1013.
- Arezi B, Kirk BW, Copeland WC, Kuchta RD: Interactions of DNA with human DNA primase monitored with photoactivatable cross-linking agents: implications for the role of the p58 subunit. Biochemistry. 1999, 38 (39): 12899-12907. 10.1021/bi9908991.
- Ito N, Matsui I, Matsui E: Molecular basis for the subunit assembly of the primase from an archaeon Pyrococcus horikoshii. FEBS J. 2007, 274 (5): 1340-1351. 10.1111/j.1742-4658.2007.05690.x.
- Klinge S, Hirst J, Maman JD, Krude T, Pellegrini L: An iron-sulfur domain of the eukaryotic primase is essential for RNA primer synthesis. Nat Struct Mol Biol. 2007, 14 (9): 875-877. 10.1038/nsmb1288.
- Weiner BE, Huang H, Dattilo BM, Nilges MJ, Fanning E, Chazin WJ: An iron-sulfur cluster in the C-terminal domain of the p58 subunit of human DNA primase. J Biol Chem. 2007, 282 (46): 33444-33451. 10.1074/jbc.M705826200.
- Moyer SE, Lewis PW, Botchan MR: Isolation of the Cdc45/Mcm2-7/GINS (CMG) complex, a candidate for the eukaryotic DNA replication fork helicase. Proc Natl Acad Sci USA. 2006, 103 (27): 10236-10241. 10.1073/pnas.0602400103.
- Ilves I, Petojevic T, Pesavento JJ, Botchan MR: Activation of the MCM2-7 helicase by association with Cdc45 and GINS proteins. Mol Cell. 2010, 37 (2): 247-258. 10.1016/j.molcel.2009.12.030.
- MacNeill SA: Structure and function of the GINS complex, a key component of the eukaryotic replisome. Biochem J. 2010, 425 (3): 489-500.
- Labib K, Gambus A: A key role for the GINS complex at DNA replication forks. Trends Cell Biol. 2007, 17 (6): 271-278. 10.1016/j.tcb.2007.04.002.
- Kamada K, Kubota Y, Arata T, Shindo Y, Hanaoka F: Structure of the human GINS complex and its assembly and functional interface in replication initiation. Nat Struct Mol Biol. 2007, 14 (5): 388-396. 10.1038/nsmb1231.
- Choi JM, Lim HS, Kim JJ, Song OK, Cho Y: Crystal structure of the human GINS complex. Genes Dev. 2007, 21 (11): 1316-1321. 10.1101/gad.1548107.
- Chang YP, Wang G, Bermudez V, Hurwitz J, Chen XS: Crystal structure of the GINS complex and functional insights into its role in DNA replication. Proc Natl Acad Sci USA. 2007, 104 (31): 12685-12690. 10.1073/pnas.0705558104.
- Marinsek N, Barry ER, Makarova KS, Dionne I, Koonin EV, Bell SD: GINS, a central nexus in the archaeal DNA replication fork. EMBO Rep. 2006, 7 (5): 539-545.
- Makarova KS, Wolf YI, Mekhedov SL, Mirkin BG, Koonin EV: Ancestral paralogs and pseudoparalogs and their role in the emergence of the eukaryotic cell. Nucleic Acids Res. 2005, 33 (14): 4626-4638. 10.1093/nar/gki775.
- Takahashi TS, Wigley DB, Walter JC: Pumps, paradoxes and ploughshares: mechanism of the MCM2-7 DNA helicase. Trends Biochem Sci. 2005, 30 (8): 437-444. 10.1016/j.tibs.2005.06.007.
- Brochier-Armanet C, Boussau B, Gribaldo S, Forterre P: Mesophilic Crenarchaeota: proposal for a third archaeal phylum, the Thaumarchaeota. Nat Rev Microbiol. 2008, 6 (3): 245-252. 10.1038/nrmicro1852.
- Elkins JG, Podar M, Graham DE, Makarova KS, Wolf Y, Randau L, Hedlund BP, Brochier-Armanet C, Kunin V, Anderson I, Lapidus A, Goltsman E, Barry K, Koonin EV, Hugenholtz P, Kyrpides N, Wanner G, Richardson P, Keller M, Stetter KO: A korarchaeal genome reveals insights into the evolution of the Archaea. Proc Natl Acad Sci USA. 2008, 105 (23): 8102-8107. 10.1073/pnas.0801980105.
- Yoshimochi T, Fujikane R, Kawanami M, Matsunaga F, Ishino Y: The GINS complex from Pyrococcus furiosus stimulates the MCM helicase activity. J Biol Chem. 2008, 283 (3): 1601-1609. 10.1074/jbc.M707654200.
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol. 1990, 215 (3): 403-410.
- Pruitt KD, Tatusova T, Klimke W, Maglott DR: NCBI Reference Sequences: current status, policy and new initiatives. Nucl Acids Res. 2009, 37 (Database issue): D32-36. 10.1093/nar/gkn721.
- Holm L, Kaariainen S, Wilton C, Plewczynski D: Using Dali for structural comparison of proteins. Curr Protoc Bioinformatics. 2006, Chapter 5 (5.5):
- Holm L, Park J: DaliLite workbench for protein structure comparison. Bioinformatics. 2000, 16 (6): 566-567. 10.1093/bioinformatics/16.6.566.
- Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG: The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucl Acids Res. 1997, 25 (24): 4876-4882. 10.1093/nar/25.24.4876.
- Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG: Clustal W and Clustal X version 2.0. Bioinformatics. 2007, 23 (21): 2947-2948. 10.1093/bioinformatics/btm404.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.