Responses to Reviewer 1 (Jaap Heringa)
We appreciate the reviewer’s efforts and thorough criticism in reviewing our manuscript.
The authors come up with a few potentially interesting findings on regulation and PPI in schizophrenia, these work more like snapshots than elaborately researched aspects. Furthermore, none of the findings are placed in a biological perspective. For example, if true, why would schizophrenia be more regulated via miRNA than other diseases or normal cell situations?
We do not say that schizophrenia is “more regulated” by miRNAs than other diseases or the normal situation, only that miRNAs are implicated in schizophrenia. Several papers, some of which we cite, point out the involvement of microRNAs in schizophrenia.
Stoichiometry of interacting target proteins seems not to be an argument that would hold only for schizophrenia.
Certainly true, but here we studied only schizophrenia-related phenomena and abnormalities.
I have the following reservations against this manuscript that I think the authors should address:
1. As a first point: Somewhat annoyingly, the authors did not insert page numbers in their manuscript, while their line numbering is of no help since line numbers start at 1 at every new page.
We are sorry about that; the pages are now numbered.
2. Although the paper is written compactly, it was at times difficult to grasp what exactly the authors have done. For example, the authors present data on a number of data sets, but it remains unclear how they have assembled these. There are descriptions of data collection in Methods and results, but these appear inconsistent. The authors should improve clarity here.
We have now tried to clarify these issues, referring back to the appropriate Methods section.
Figure 1 provides some information, but the main text should provide the information. Later on in the paper, the authors talk about “data sets 1 and 2”, where some more explanation would be helpful.
Data sets 1, 2 and 3, as now explained in the text, refer to the three supplementary tables in the large-scale methylome study by Wockner et al. (reference 6 in our manuscript), which list (1) all differentially methylated probes between patients and controls; (2) all differentially methylated probes between patients and controls, corrected for age and post-mortem interval (PMI); and (3) differentially methylated probes between two patient subgroups, respectively.
3. Considering the alluded bias for miRNA-directed regulation in schizophrenia, the authors write: “Remarkably, approximately 80 % of all schizophrenia-related genes are targeted by only the ten most frequently occurring microRNAs. We also calculated this ratio for the total number of genes regulated by this set of microRNAs, where it also proved to be a value close to 80 %.” Assuming that “the total number of genes” are all the genes in the Wockner et al. dataset, these statements appear to imply that if there is anything remarkable about the 80 %, it has nothing to do with schizophrenia, as there is no difference between schizophrenia and the general situation in the human brain in this regard.
4. In the text, the authors say that 1547 out of 2931 (53 %) of the genes implicated in schizophrenia (differential methylation – Wockner et al.) are regulated by one or more miRNAs. However, in Fig. 1 the percentages for only the top-10 and top-2 miRNAs (Wockner) are close to 80 and 40, respectively. Something must be wrong here.
Great point, thanks for pointing this out! As now mentioned in the text, not all schizophrenia-related genes are regulated by microRNAs (according to current knowledge as of August 2015; this ratio may change, as the topic is being actively researched). According to others (see Caputo et al., ref 36 in the manuscript), only half of all protein-coding genes are regulated by microRNAs.
5. The authors make a point on the regulatory control of the top-2 miRNAs by writing: “As shown in Fig. 1, the ratio of the targets of the top two microRNAs is around 40 % for the first data set (where all differentially methylated genes from Wockner et al. [6] are taken into account), however it increases to 46.4 % and 52.6 %, respectively, for the last two data set combinations, which combines the schizophrenia evidence from Genecards and Malacards, respectively. This may support the role of microRNAs in schizophrenia in general and that of the top two microRNAs in particular.” The conclusion that more confirmed data is more correlated with top-2 miRNA-directed regulation is just based on a single dataset of 19 elements. This is not a compelling argument.
There are 3 data sets: the Wockner set, the Genecards set and the Malacards set. We also added the ratio of targets for the top microRNA, miR-335-5p and also for the top five (in Fig. 1). While there is a more dramatic increase in ratios between the Genecards set and the Malacards set, there is also a slight increase between the Wockner set and the Genecards set. We hope the figure looks more convincing now.
6. Why did the authors only present data on the top 10 and top 2 miRNAs? They could draw a plot from top 1 to top 15 to show how the coverage falls off.
We have now extended Fig. 1, also calculating the ratios for the top five microRNAs and for the single top microRNA. We hope the trend is clear and convincing.
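A minimal sketch of how such top-k ratios can be computed from a list of miRNA-target pairs is given below. It assumes that the denominator is the set of disease genes with at least one known miRNA regulator (not all disease genes); the function name and the toy identifiers are hypothetical and are not taken from the manuscript.

```python
from collections import defaultdict


def top_k_coverage(pairs, disease_genes, ks=(1, 2, 5, 10)):
    """Fraction of miRNA-regulated disease genes targeted by the top-k miRNAs."""
    targets = defaultdict(set)
    for mirna, gene in pairs:
        if gene in disease_genes:
            targets[mirna].add(gene)
    # Rank miRNAs by their number of disease-gene targets (ties broken by name).
    ranked = sorted(targets, key=lambda m: (-len(targets[m]), m))
    # Assumption: the denominator is every disease gene with >= 1 known regulator.
    regulated = set()
    for genes in targets.values():
        regulated |= genes
    coverage = {}
    for k in ks:
        covered = set()
        for mirna in ranked[:k]:
            covered |= targets[mirna]
        coverage[k] = len(covered) / len(regulated) if regulated else 0.0
    return coverage


# Toy data with hypothetical identifiers.
toy_pairs = [("a", "g1"), ("a", "g2"), ("a", "g3"),
             ("b", "g3"), ("b", "g4"), ("c", "g5")]
toy_genes = {"g1", "g2", "g3", "g4", "g5"}
toy_coverage = top_k_coverage(toy_pairs, toy_genes, ks=(1, 2, 3))
```

Plotting the resulting ratio against k for each gene set would give the kind of fall-off curve the reviewer asks for.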
7. To check the scale free network property, the authors have made a network of miRNAs, but it is unclear to this reviewer what are the nodes and edges in the network. They use miRNAs and the numbers of genes each of those regulate, but how these data are converted in a network is obscure. An edge might have been declared whenever two miRNAs regulate the same target gene, but the authors should be clear on this. If the latter is indeed the procedure that is followed by the authors, then the biological importance of this network is questionable, and so are the findings that this network has scale free and small world properties. - In any event, the fashion of deriving scale free and small world properties is now mainly moot in the biological literature, since this has hardly led to increased biological understanding. As to inferring a power law from double-logarithmic plots, it is quite difficult in practice not to get a straight line using these, so the scale-freeness is not compelling, at least not to this reviewer.
Thank you for pointing out this lack of clarity in the text. We fixed this in the new version of the paper.
The network consists of miRNAs and the genes regulated by them. If we do not include the protein-protein interactions between these genes, it can be considered a bipartite directed network (for a biological example, see Martinez and Walhout, 2009 *) with directed miRNA -> target links. This network can be characterized by its in-degree distribution (in this case, the number of miRNAs controlling a given gene) and its out-degree distribution (the number of genes controlled by a given miRNA). In Fig. 2 we depicted the out-degree distribution of this network. In addition to the plot, we performed a goodness-of-fit test using the Kolmogorov-Smirnov statistic, as described in the Methods, which demonstrated that this distribution follows a power law.
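The degree distributions and a power-law fit of this kind can be sketched as follows. This is an illustration under stated assumptions, not the procedure from the Methods: it uses the continuous maximum-likelihood estimate of the exponent with a fixed, user-supplied `xmin`, and it returns only the Kolmogorov-Smirnov distance (a p-value would require an additional bootstrap step that is omitted here). The function names are hypothetical.

```python
import math
from bisect import bisect_left, bisect_right
from collections import Counter


def degree_counts(pairs):
    """Return (out-degree per miRNA, in-degree per gene) for miRNA -> target links."""
    links = set(pairs)  # ignore duplicate records of the same link
    out_degree = Counter(mirna for mirna, _ in links)
    in_degree = Counter(gene for _, gene in links)
    return out_degree, in_degree


def fit_power_law(degrees, xmin=1):
    """Continuous MLE of the exponent and the KS distance to the fitted model."""
    tail = sorted(d for d in degrees if d >= xmin)
    log_sum = sum(math.log(d / xmin) for d in tail)
    if log_sum == 0:
        raise ValueError("need at least one degree above xmin")
    n = len(tail)
    alpha = 1 + n / log_sum
    ks = 0.0
    for value in set(tail):
        # Model CDF of a continuous power law: F(x) = 1 - (x / xmin)^(1 - alpha).
        model = 1 - (value / xmin) ** (1 - alpha)
        below = bisect_left(tail, value) / n   # empirical CDF just below value
        at = bisect_right(tail, value) / n     # empirical CDF at value
        ks = max(ks, abs(at - model), abs(below - model))
    return alpha, ks
```

For example, `fit_power_law(out_degree.values())` would fit the out-degree distribution plotted in Fig. 2; a small KS distance indicates that the fitted power law is close to the empirical distribution.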
*Martinez NJ, Walhout AJ. The interplay between transcription factors and microRNAs in genome-scale regulatory networks. Bioessays. 2009 Apr;31(4):435–45.
We also constructed a miRNA-miRNA network based on the genes shared between any two miRNAs. We found that this network has the small-world property: a small-world network has a high average clustering coefficient and a small characteristic path length.
Regarding the biological meaning: the clustering coefficient shows how tightly any miRNA is connected to its neighbors in the miRNA-miRNA network. A high clustering coefficient can therefore mean that a gene is typically regulated by several microRNAs, which can indicate the robustness of this control: the malfunction of one miRNA, or even several of them, can be compensated for by the remaining miRNAs.
Our graph parameters:
Clustering coefficient: 0.807
Characteristic path length: 1.717
After generating a random graph using the Erdos-Renyi algorithm with the same number of nodes and the same average number of links per node (52.327) we found:
Clustering coefficient: 0.330
Characteristic path length: 1.669
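The real network thus has a much higher clustering coefficient (0.807) than the random Erdos-Renyi graph (0.330), while the characteristic path lengths are similar (1.717 versus 1.669), which is the pattern expected for a small-world network.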
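The comparison above can be sketched in plain Python as follows. The sketch assumes that two miRNAs are linked whenever they share at least one target gene, and that the random reference is drawn with exactly the same number of nodes and edges (the G(n, m) variant of the Erdos-Renyi model); all function names are hypothetical.

```python
import random
from collections import deque
from itertools import combinations


def project_mirnas(pairs):
    """Link two miRNAs whenever they share at least one target gene."""
    by_gene = {}
    for mirna, gene in pairs:
        by_gene.setdefault(gene, set()).add(mirna)
    adj = {}
    for mirnas in by_gene.values():
        for mirna in mirnas:
            adj.setdefault(mirna, set())
        for a, b in combinations(sorted(mirnas), 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient (nodes with < 2 neighbors count as 0)."""
    total = 0.0
    for neighbors in adj.values():
        k = len(neighbors)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
        total += links / (k * (k - 1) / 2)
    return total / len(adj)


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected ordered node pairs (BFS)."""
    total, n_pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        n_pairs += len(dist) - 1
    return total / n_pairs if n_pairs else float("nan")


def random_graph_like(adj, seed=0):
    """Random graph with the same node set and the same number of edges."""
    nodes = sorted(adj)
    n_edges = sum(len(neighbors) for neighbors in adj.values()) // 2
    edges = random.Random(seed).sample(list(combinations(nodes, 2)), n_edges)
    rand = {node: set() for node in nodes}
    for a, b in edges:
        rand[a].add(b)
        rand[b].add(a)
    return rand
```

Computing both statistics on the real projection and on `random_graph_like(adj)` gives the two pairs of values being compared; averaging over several seeds would make the random reference less dependent on a single draw.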
8. On the potential role of miRNAs in schizophrenia, the authors write: “To see if they have significantly more interactions than those proteins not regulated by the same microRNAs we performed randomization and a statistical test described in the Methods section. Apparently, proteins that are regulated by the same microRNAs tend to have more interactions and one of the main regulatory roles of the microRNAs might be actually this coordinating effect, to make sure that interacting proteins have the correct stoichiometry in the cell [37].” Could the authors provide some data to show that proteins regulated by the same miRNA do indeed interact more?
We provide the results of the simulation in Additional file 4: Table S3 for the top 10 schizophrenia-related miRNAs. For each miRNA, the table gives the actual number of connections between targets regulated by that miRNA and the average number of connections across 3000 simulations.
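A randomization of this kind could look like the sketch below. The exact null model is described in the Methods and is not reproduced here; this version assumes that random gene sets of the same size as the target set are drawn from a background gene list and that the empirical p-value is one-sided. The function names and arguments are hypothetical.

```python
import random
from itertools import combinations


def count_links(genes, ppi):
    """Number of known interactions among a set of genes."""
    return sum(1 for a, b in combinations(sorted(genes), 2)
               if frozenset((a, b)) in ppi)


def randomization_test(targets, background, ppi, n_sims=3000, seed=0):
    """Compare observed interactions among targets with random gene sets.

    ppi is a set of frozenset({gene_a, gene_b}) interaction pairs.
    Returns (observed links, mean simulated links, empirical one-sided p-value).
    """
    rng = random.Random(seed)
    observed = count_links(targets, ppi)
    pool = sorted(background)
    sims = [count_links(rng.sample(pool, len(targets)), ppi)
            for _ in range(n_sims)]
    mean_sim = sum(sims) / n_sims
    # Add-one correction so that the p-value is never exactly zero.
    p_value = (1 + sum(s >= observed for s in sims)) / (1 + n_sims)
    return observed, mean_sim, p_value
```

The first two returned values correspond to the two columns described for Table S3 (actual connections and the simulation average).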
9. In addition to a bias in interaction for proteins regulated by the same miRNA, did the authors check whether PPI in schizophrenia is biased in general, so regardless of miRNA regulation?
Perhaps the referee is asking whether there are more PPIs among the schizophrenia-related genes than between schizophrenia-related and unrelated genes? We did not check this. Considering that there are more than 2000 schizophrenia-related genes, the results would probably be inconclusive.
10. On miRNA expression, the authors write: “We also checked the abundance of the microRNAs taken from the resource mirbase.org, which has the most comprehensive annotation about microRNAs in human tissues across tens of different experiments [29]. The data for the most abundant microRNAs are shown in Fig. 4, plotted against the number of known targets. Apparently there is a positive correlation between the two values, supporting the theory of “competing endogenous RNAs” [39], which inherently assumes that microRNAs with more targets are also expressed in higher quantities, to carry out their regulatory functions.” Although the authors claim there is a positive correlation (what is the r-value?) the plot looks rather erratic. It would be interesting to check the correlation when some outliers are removed. As another point, was the data used here specifically for schizophrenia? The resource mirbase.org seems more general.
Because the data deviated significantly from the normal distribution, we ran Spearman's rank correlation test, which is also robust to outliers. As described in the Methods section “2.6. Correlation between the abundance of miRNAs and the number of miRNA targets”, we obtained a statistically significant correlation, with Spearman's rank correlation coefficient rho = 0.52 between target counts and mature miRNA read counts and rho = 0.67 between target counts and stem-loop transcript read counts (p-value < 2.2e-16 in both cases). We provide the plots on a logarithmic scale.
We also tested the correlation between schizophrenia-related miRNA target counts and the read counts for these miRNAs, and found a statistically significant correlation, with Spearman's rho = 0.47 (p-value = 1.012e-09) for mature schizophrenia-related miRNA read counts and rho = 0.49 (p-value = 2.255e-11) for schizophrenia-related miRNA stem-loops.
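To illustrate why a rank-based coefficient is insensitive to outliers, a minimal stand-alone implementation of Spearman's rho is sketched below (average ranks for ties, followed by the Pearson correlation of the ranks). It is not the code used for the analysis, the function names are hypothetical, and it does not compute the p-value.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)
```

Because only the ordering of the values enters the calculation, a single extreme read count changes rho no more than any other value of the same rank would.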
11. On differential methylation of repetitive elements, the authors write: “We studied the methylation of repetitive elements by comparing the sequences of the differentially methylated probes in all three data sets in [6] to Repbase, a collection of all repetitive elements in eukaryotic genomes [35]. While we did not find any particular repetitive element enriched in the differentially methylated probes, using Student’s t-test we did find data sets 1 and 2 significantly more methylated (p-value < 1e-5) if they matched a repetitive element when compared to the methylation value distributions derived from the entire sets (Fig. 5a and b) whereas for data set 3 the repetitive element matching probes were slightly less methylated than the entire set (p-value = 0.017).” This section is not clear to this reviewer, while the results seem inconsistent. This section should either be elaborated or deleted.
We expanded this section and hope it is clear now. In principle, we compared the sequence of each probe (that was found to be differentially methylated in the Wockner paper) to each repetitive element in Repbase. When we compare the methylation value distributions for the repetitive probes only (Fig. 5b) to the distributions of the methylation values of all differentially methylated probes (Fig. 5a), it is clear that for set 1 and set 2 (x1 and x2 in Fig. 5) the patients have higher methylation levels (the positive side of the histograms in Fig. 5b increases when compared to Fig. 5a), whereas for the x3 data set (also from the Wockner set, comparing two patient subgroups) the methylation distribution (histogram) remains nearly the same for the repetitive subset (Fig. 5b) when compared to the total set (to be more precise, it is marginally different, the p-value being 0.017).
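The comparison of two methylation value distributions can be sketched with a classical two-sample Student's t statistic (pooled variance), as below. This only illustrates the statistic: converting it to a p-value requires the t distribution, which is not in the Python standard library, and the sketch treats the two samples as independent, which is a simplifying assumption when one sample is a subset of the other. The function name is hypothetical.

```python
import math
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees of freedom). A positive t means sample_a has the
    higher mean, e.g. repeat-matching probes versus the comparison set.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    pooled = ((n_a - 1) * variance(sample_a)
              + (n_b - 1) * variance(sample_b)) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, n_a + n_b - 2
```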
12. In their Conclusions section the authors only reiterate earlier findings that are not the topic of the current manuscript. In summary, this paper presents a number of very interesting findings, but the work needs to be elaborated and the results should be placed in a biological/evolutionary context.
We changed the Conclusions section accordingly and expanded on the biological meaning in the Discussion. Thanks for the suggestions and the overall positive impression about our findings.
Responses to Reviewer 2 (Sandor Pongor)
In this work the authors analyzed several recent data sets: (i) a methylome study, (ii) microRNAs’ experimentally verified targets collected from the literature, (iii) STRING, a protein-protein interaction database, (iv) Genecards, (v) regulatory regions of human genes and transcription factor binding sites mapped to the human genome, concluding that GABBR1 plays a significant role in the etiology of schizophrenia. It is certainly of interest, considering that schizophrenia affects 1 % of the population with heavy toll on society both in financial and human terms. Interestingly, the work is in line with a previous finding by the authors, also published in Biology Direct, where they also concluded that the downregulation of GABBR1 via an endogenous retroviral element might be the cause of schizophrenia. In this study they start their investigation with microRNAs implicated in schizophrenia, finding that microRNA targets form a scale-free network and, accordingly, the top ten microRNAs regulate 80 % of all schizophrenia-related genes. The top two microRNAs regulate 40-52 % of all genes, the ratio depending on the data set they use. They suggest that the more relevant the gene set is to schizophrenia, the higher this ratio is, highlighting the importance of microRNAs in schizophrenia. The top two microRNAs both regulate GABBR1, from which they conclude again that this gene is of special interest in schizophrenia.
Thanks for the overall positive view.
The authors may want to deal with the following issues: 1. The selection of genes in Additional file 6: Table S2 seems rather arbitrary. While they also implicate AKT1 as a gene of importance from the protein-protein interaction network they draw of genes regulated by one of the top 2 microRNAs, there is no data about transcription factor binding sites (TFBSs) for AKT1.
We tried to select the most important genes. No data on the cis-regulatory and promoter regions of AKT1 were available in the Thurman data set.
2. There is no data for KCNJ9 (in the same table) whereas it directly interacts with GABBR1.
We fixed this; the gene is now included in Additional file 4: Table S2.
3. It is difficult to see the significance of the TFBSs. The authors should at least calculate the correlation between the genes they list in Additional file 6: Table S2.
We felt that this would place too much emphasis on the TFBSs and draw the attention away from the main focus of the paper, which was the microRNA-regulation and protein interaction networks.
4. Does the - approximately - scale-free nature of the network have a biological significance?
See our response on this point to reviewer 1 above. It is the robustness of the network that is supported by its scale-free nature, which will probably increase over time as more microRNA-target relations are discovered, considering how novel and intensely researched the subject is.
Responses to Reviewer 3 (Zoltan Gaspari)
Schizophrenia is a disease with unknown aetiology despite numerous efforts to find its cause for over more than a century. In this study Gumerov and Hegyi approach the subject from the point of view of microRNAs. They combine microRNA target data with a methylation study conducted in schizophrenic brains, a protein-protein interaction database, STRING, and various gene sets such as Genecards and Malacards, mostly derived from text mining of the literature. The authors find that the mostly hypermethylated gene in the methylation study, GABBR1 is also the target of the top two microRNAs with the highest number of targets in their set. Combining this with protein-protein interaction data they find that most proteins form a network of interactions with two hubs where one of the hubs is again GABBR1 while the other hub is AKT1, protein kinase B, an important protein in signal transduction. From this they conclude that GABBR1 might play a causative role in schizophrenia. I think the work is original and describes observations that can be of importance in understanding schizophrenia.
Thanks for taking on our manuscript and for the positive tone.
I have a few notes that I ask the authors to address in the final published version:
1. If the genes in question form a network by being targeted by shared microRNAs as they suggest, should not these microRNAs also be hyper- or hypomethylated in the methylation study they analyze in this study? There is no mentioning of this in the manuscript.
There were only a handful of differentially methylated microRNAs in the methylome study we based our study on, none of them particularly outstanding.
2. How do the transcription factors with binding sites in the cis regions of the genes in the network in Fig. 3 relate to the shared microRNAs? Aren’t they more important in the regulation of the interacting genes than the microRNAs they focus on? (Also, there is no data on promoter transcription factor binding sites, only on cis regulatory elements). This gives an incomplete picture.
We found that the two different types of regulatory mechanisms (transcription factor binding sites in the cis regions and microRNA-targeting) are two unrelated mechanisms, which we mention in the paper. We wanted to focus on the latter (microRNAs – protein targets) in this study.
3. There seems to be an inconsistency regarding the hyper- and hypomethylation of the genes that interact with each other. E.g. the authors claim that AKT1 is downregulated in schizophrenics whereas AKT1 seems to be only mildly hypermethylated based on its shade of grey in Fig. 3.
This is a very good point. However, we found that most genes are both hyper- and hypomethylated in the Wockner study. Also, the difference between earlier studies and the current methylome study may lie in the fact that the study that found AKT1 consistently downregulated in schizophrenic patients investigated gene expression patterns in the blood of patients with recent-onset disease, whereas the methylome study analyzed the brains of deceased schizophrenic patients, who apparently had longer histories of the disease.
4. It would be interesting to see whether in the full available interaction network of the proteins in question (i.e. not only those regulated by the selected miRNAs and/or indicated in schizophrenia) GABBR1 and AKT1 still stand out with some properties.
We tried to generate a network using PPIs in STRING for the fully available interaction network. However, AKT1 seems to have a lot more (known) interactions than GABBR1, therefore the picture seemed a lot more complex and nonspecific. As far as we know, there is no PPI database that would show only tissue-specific protein interactions.
Minor issues: When using data from mirbase.org, were the tissue origins of the miRNAs considered?
Yes, we did additional plots, as there was available data for the prefrontal cortex only. Our findings are still valid (Additional file 5: Figure S3).
In Fig. 4, might the existence of the correlation be better visualized on a logarithmic scale?
Yes, thanks for the suggestion, we did it and it looks better.
In the abstract the authors mention a scale-free network. I would refrain from this notation even if the power-law distribution for miRNA targets is valid because the nodes in the full network are not equivalent (miRNAs and regulated proteins).
We expanded on this issue, clarifying the networks we had in mind, and removed the term “scale-free network”. Using the known miRNA-target relations, we constructed two types of networks: a miRNA -> target network and a miRNA-miRNA network. The network of miRNAs and the genes regulated by them, not including interactions between these genes, is a bipartite directed network with directed miRNA -> target links. This network can be characterized by its in-degree distribution (in this case, the number of miRNAs controlling a given gene) and its out-degree distribution (the number of genes controlled by a given miRNA). In Fig. 2 we depicted the out-degree distribution of this network. We tested goodness of fit using the Kolmogorov-Smirnov statistic, as described in Methods, which demonstrated that this distribution follows a power law.
Supplementary information is available at the journal’s website.