Reviewer's report Title: AID/APOBEC cytosine deaminase induces genome-wide mutation clusters Version: 1 Date: 1 November 2012 Reviewer number: 1 Professor Sandor Pongor
Regions of localized hypermutations – so-called kataegis regions – were recently found to be colocalised with regions of somatic genome rearrangements in cancer genomes. As C:G->T:A transitions were overabundant in these regions, it was hypothesized that AID/APOBEC editing deaminases, which are responsible for cytosine to uracil deamination in single-stranded DNA or RNA, may be among the causative agents generating localized hypermutations. However plausible in the chemical sense, this hypothesis is difficult to prove by experiment. In this Discovery Note, Lada et al. describe an experiment designed to provide a very interesting piece of supporting evidence for this hypothesis. The authors used a diploid yeast sensitized to deamination effects by the removal of the uracil DNA glycosylase gene (ung1) as the model organism. They then expressed a hyperactive AID/APOBEC protein from sea lamprey in this organism and explored the distribution of mutations along the chromosomes. It was found that the distribution of these mutations is highly uneven, and the differences are especially striking in comparison with those induced by the base analog mutagen 6-hydroxylaminopurine (HAP), which was used as a control. The findings are straightforward and provide strong support for the hypothesis that unleashed AID/APOBEC may be the causative agents of hypermutations found in cancer genomes. The link between the two phenomena is that clusters of deaminase-induced mutations in yeast are very similar to those found in cancer cells. It would be very interesting to see a more detailed description of this similarity. Are there similarities in the sequence contexts? It is a convincing argument that HAP-induced mutations in ung1- mutants are entirely random, but perhaps there are other examples of, or analogies with, more uneven mutations in the literature where, for instance, the context of the mutations is different.
Author’s response: We are glad that Dr. Pongor considered our study interesting and convincing. We would like to thank Dr. Pongor for the constructive suggestions for improving the manuscript. We have discussed the sequence context of the mutations found in our study and in breast cancer samples, as well as the recent paper (Roberts et al.) in which clustered mutations were induced in yeast by a different mutagen, MMS.
The experiments are complex, and the details are described in a paper under review. I would suggest the authors add a brief description of the experiment as an appendix to this note.
Author’s response: As suggested, we have added a short overview of experiments undertaken. We also provide more details in the responses to Reviewer #3. In addition, since the format of the Discovery Notes is adapted to the short communications, we refer the interested readers to the paper (Lada et al.) that is currently under review. This paper contains the details of experiments undertaken.
In summary, I find the experiment well-designed and the conclusions convincing.
Reviewer’s response: I accept the revisions.
Quality of written English: Acceptable
Reviewer's report Title: AID/APOBEC cytosine deaminase induces genome-wide mutation clusters Version: 1 Date: 15 November 2012 Reviewer number: 2 Professor Shamil R. Sunyaev
I find this manuscript to be of great interest. Two recent publications reported the presence of mutation clusters induced by APOBEC proteins in cancer genomes, shedding new light on the nature of spontaneous somatic mutagenesis. This manuscript provides experimental evidence supporting the hypothesis put forward in these recent observational studies. The authors report that genomes of yeast mutants carrying the hypermutagenic deaminase contained mutation clusters highly similar to clusters (putatively caused by APOBECs) observed in tumor genomes. This is an important result and I have no suggestions for improvements.
Author’s response: We are excited that Professor Sunyaev found our work to be of great interest and that it provides new information on spontaneous mutagenesis.
Reviewer’s response: I did not have any concerns with the manuscript.
Quality of written English: Acceptable
Reviewer's report Title: AID/APOBEC cytosine deaminase induces genome-wide mutation clusters Version: 1 Date: 15 November 2012 Reviewer number: 3 Dr Vladimir Kuznetsov
Report form: Comments
Recent papers (2,3) have provided detailed descriptions of clustered mutation sites in the genomes of four human cancers and in yeast cells. In (3), a functional and structural association of APOBEC proteins with clustered mutations has been suggested. However, more direct functional associations of APOBEC family member proteins with clustered mutations have yet to be established. In this study, the authors used a sequencing technique and their yeast model to study genomic clusters of hypermutation caused by the deaminases PmCDA1 and AID.
Major concerns and my recommendations:
1. Analysis of the literature is essentially incomplete
The authors claimed: “… a direct link between APOBEC deaminase activity and genome-wide hypermutagenesis is still lacking.” However, this claim is open to debate. Atomic force microscopy studies provided direct evidence of the structural details of the direct interaction of APOBEC3G with ssDNA at a specific site, at the single-molecule level and at nanometer resolution (Shlyakhtenko et al., 2011). Yamane et al. (2011) reported a deep sequencing analysis of mutations and the identification of the genomic targets of AID in mouse B-cells, and provided evidence of an association of ssDNA hypermutation sites with the APOBEC binding motif. At least two papers reported a functional and structural connection between AID/APOBECs and genome-wide hypermutation (Klein et al. 2011; Yamane et al. 2011). Both papers studied the impact of AID in mouse B-cells at the genome scale.
Author’s response to 1: We thank Dr. Kuznetsov for the extensive review of our paper, which took significant effort and almost two months. In response to the critique we added a more balanced discussion of the papers that we deemed ultimately related to our study (references 2–8). As to the papers mentioned by the reviewer, a very interesting article by Yamane et al. (2011) is devoted to the construction of ChIP-based whole-genome maps of AID and RPA occupancies; it neither analyzes genome-wide mutation distributions nor reports the discovery of clustered mutations. Moreover, in our opinion, the interpretation of the very solid experimental results of this study should be reconsidered, because they are in direct disagreement with data obtained in our lab (Lada et al., 2011) and by Dr. Myron Goodman’s group (Pham et al., 2008; Chelico et al., 2009). The paper by Klein et al. (2011) presents a very thorough genome-wide study in which the authors used a powerful translocation-capture sequencing method to map chromosomal rearrangements in B lymphocytes. Although the authors do report that translocation hotspots were accompanied by base substitutions, we would like to point out that, as in the paper by Yamane et al. (2011), a genome-wide mutagenesis analysis is not performed in that study and mutational clusters are not detected. Moreover, the paper by Nik-Zainal (2012), reporting the discovery of kataegis and discussing the potential involvement of the APOBEC proteins in the formation of clusters of mutations, was published later than all of the mentioned papers. In addition, studies of activated B-cells, which provide the natural environment for AID activity, do not explain how the genomes of breast cells become edited by the APOBEC proteins.
2. There is no description of the sequencing methods. Even the number of reads was not reported.
3. Raw and processed data are not available.
4. The sequence generation, sequence data analysis, genome assembly and mapping procedures, and the results of these steps, are omitted.
Author’s response to 2–4: The format of the Discovery Notes does not allow us to include all the Materials and Methods related to our data. We refer to our parallel paper (Lada et al., currently under review), where the experimental procedures and data analysis are described in detail. However, we have added a short description of the materials and methods used in this manuscript, including the numbers of reads, the coverage and the NCBI accession number for the raw data. This text is available as an Additional File.
5. The authors did not provide systematic evidence of the accuracy of their findings.
Statistical model(s) of background noise, testing methods, and the analysis of experimental results are not reported. There are no estimates of the specificity and sensitivity of the proposed experimentally detected mutation sites and clustered mutations associated with PmCDA1 and AID activity.
Author’s response to 5: All draft reference genome assemblies performed in this study were manually edited and assembly errors were excluded from the analysis. The few remaining questionable regions were sequenced using the Sanger method to confirm or reject the SNVs detected. A detailed description of these procedures is beyond the scope of the Discovery Notes; see the response to comments 2–4.
6. The work needs an analysis of the boundaries of the mutation clusters; the results should include frequency tables of all substitution types observed within the clusters as well as in the regions outside the clusters.
Author’s response to 6: There are methods to analyze the clustering of mutations that attempt to locate the boundaries of the regions with an elevated frequency of mutations (P.J. Gearhart, D.F. Bogenhagen, 1983. Clusters of point mutations are found exclusively around rearranged antibody variable genes. Proc. Natl. Acad. Sci. U.S.A. 80, 3439–3443; H. Tang, R.C. Lewontin 1999. Locating regions of differential variability in DNA and protein sequences. Genetics 153, 485–495). These methods, however, require a much higher frequency of mutations per nucleotide and were tested for relatively short sequences.
We have used a classification approach to analyze the distribution of mutations across yeast chromosomes using non-overlapping windows. This method is not capable of finding the exact boundaries of hypermutable regions; however, it allows for the detection of general trends in a robust way. It is described in more detail in the revised draft and in the new Additional File.
7. There are no final lists of clustered mutations and their genome coordinates and biological interpretation.
Author’s response to 7: We have added Table 1, which contains the distribution of mutations in 1 kb windows.
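For illustration only, the window-based view described in the responses to comments 6 and 7 can be sketched as follows. This is a minimal sketch, not the authors' classification procedure, which is described in their revised draft and Additional File. The function names, the toy coordinates and the use of the variance-to-mean ratio (a common check against a Poisson-like random background) are all assumptions introduced here.

```python
from statistics import mean, pvariance

WINDOW = 1000  # 1 kb windows, the size mentioned for Table 1


def window_counts(positions, chrom_length, window=WINDOW):
    """Count mutations in consecutive non-overlapping windows.

    positions: 1-based mutation coordinates on one chromosome.
    """
    n_windows = -(-chrom_length // window)  # ceiling division
    counts = [0] * n_windows
    for pos in positions:
        counts[(pos - 1) // window] += 1
    return counts


def dispersion_index(counts):
    """Variance-to-mean ratio of the window counts.

    Values near 1 are what a Poisson-like random scatter would give;
    values well above 1 suggest clustering. (Assumed summary statistic,
    not taken from the manuscript.)
    """
    m = mean(counts)
    return pvariance(counts) / m if m else float("nan")


# Hypothetical toy data: six mutations on an 8 kb stretch.
clustered = [101, 150, 220, 480, 900, 7300]
spread = [500, 1500, 2500, 3500, 4500, 5500]

clustered_counts = window_counts(clustered, 8000)
spread_counts = window_counts(spread, 8000)
print(clustered_counts, dispersion_index(clustered_counts))
print(spread_counts, dispersion_index(spread_counts))
```

Non-overlapping windows cannot place cluster boundaries more precisely than the window size, which matches the limitation the authors acknowledge; the gain is that the per-window counts are simple to tabulate and compare between mutagens.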
8–9. The number of C to T substitution mutations is reported only for two genes on chrX. There are no quantitative data or numerical/statistical characteristics for clustered mutation sites or for any other genes, regions and chromosomes. 9. Statistical distributions of all base substitutions (e.g. % of C to T, G to A, etc.) should be presented and discussed. The work should provide a classification of the mutations, include a description of the substitution mutations in the clusters on the positive and negative strands, and support it with co-localization of APOBEC/AID motif(s).
Author’s response to 8–9: See response to comments 2–4.
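As an illustration of the kind of tabulation requested in comments 6 and 8–9, the sketch below tallies substitution types inside and outside clusters, both as read on the reference strand and collapsed onto the pyrimidine strand (so that G to A is counted with C to T). It is a hedged example under stated assumptions: the function names and the toy mutation lists are hypothetical and do not come from the manuscript or its data.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def pyrimidine_class(ref, alt):
    """Collapse a substitution onto the pyrimidine strand (G>A becomes C>T)."""
    if ref in "AG":
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def spectrum(mutations):
    """Fraction of each strand-collapsed substitution class.

    mutations: iterable of (ref, alt) base pairs on the reference strand.
    """
    counts = Counter(pyrimidine_class(r, a) for r, a in mutations)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


def strand_split(mutations):
    """Counts as read on the reference strand (keeps C>T and G>A apart)."""
    return Counter(f"{r}>{a}" for r, a in mutations)


# Hypothetical toy input: substitutions inside and outside a cluster.
in_cluster = [("C", "T"), ("C", "T"), ("G", "A"), ("C", "T")]
outside = [("C", "T"), ("G", "A"), ("A", "G"), ("G", "A")]

print(spectrum(in_cluster), strand_split(in_cluster))
print(spectrum(outside), strand_split(outside))
```

Keeping the reference-strand counts alongside the collapsed spectrum is what allows a strand bias within a cluster (runs of C to T versus runs of G to A) to be reported, as the reviewer asks.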
10. There is no comparison of the results of this genome-wide finding with alternative studies.
Author’s response to 10: We have included a more extensive discussion of the results by Roberts et al. (2012).
11. The reason for using 6-hydroxylaminopurine (HAP)-treated cells as a negative control should be explained.
Author’s response to 11: We are especially grateful for this comment. One of the major emphases of the paper is to study mutagenesis in diploid yeast independent of recombination, which is uniquely frequent in this organism. We have chosen conditions and mutagens under which induced recombination is suppressed and the situation is closer to the processes in human cells (neither HAP nor PmCDA1 induces recombination in ung1- yeast strains). We have updated the text accordingly to make this more transparent.
This work is essentially incomplete and poorly performed; there is no way to reproduce its methods, results and evaluate their actual value.
Author’s response: See answers to comments 2–4 and 8–9. We also think that even without the knowledge of fine experimental details there is a straightforward way to reproduce the results of this work by expression of PmCDA1 gene in diploid ung1- yeast strain or treatment by HAP, selection of mutants and genome sequencing.
Quality of written English: Not suitable for publication unless extensively edited
Author’s response: Please see evaluation by reviewers 1 and 2. Nevertheless, we have put forth additional effort and we have carefully edited the manuscript.
1. “….clusters very similar to those found in tumors”.
What kind of parameter(s) is similar? What kind of similarity/dissimilarity measure(s) between mutation clusters in yeast and human cancer genomes was used? Is there some statistical estimation? If it is statistical-based analysis, the test and confidence values should be reported.
2. “We also think that even without the knowledge of fine experimental details there is a straightforward way to reproduce the results of this work by expression of PmCDA1 gene in diploid ung1- yeast strain or treatment by HAP, selection of mutants and genome sequencing.”
Unfortunately, NGS is not yet a mature and standardized technique, specifically with respect to the details of experimental procedures, data analysis and interpretation. Many readers of BD who have experience with NGS techniques and the corresponding analytical methods may well disagree with the authors' point. Specifically, processing, alignment, mapping and data analysis are not trivial steps and are usually reported in publications in the regular way (not by reference to unpublished data). In line with usual publication practice, they should be presented in a supplementary file.
3. “primer sequences are available upon request”.
Why? This information should be presented in the work if there is no commercial interest.
4. “Importantly, our study reveals that the AID/APOBEC proteins can induce kataegis in the genome”
This conclusion may be too strong. The inducer(s) of “kataegis” were not defined; they might be identified in future work.
5. “Our data provide evidence that unleashed cytosine deaminase activity is an evolutionary conserved, prominent source of genome-wide kataegis events.”
This conclusion might be too strong. Evidence for the evolutionary conservation of cytosine deaminase activity at “genome-wide kataegis” loci across species was not reported; it should be provided for specific kataegis loci, if any.
6. Minor: The NGS instrument model should be indicated in the manuscript.
Quality of written English: Acceptable.