Reviewer's report 1
Paul Harrison, Department of Biology, McGill University, Canada (nominated by Mark Gerstein, Biomedical Informatics, Yale University, USA)
This paper describes a database of Nuclear Receptors and Receptor Tyrosine Kinases, and some analysis of their protein interactions.
* Firstly, it would be of benefit to researchers to make clear how much novel curation is involved in the construction of the database. At present, this is not entirely clear.
Author's response: We have added an explanation of the curation process in "NR-RTK database content".
* Secondly, in the analysis of the interaction networks, the authors do not entertain other possible equation fits for the distribution of K_interactors versus Frequency (K_interactors). One should really check other possible equations also (not just a power law equation).
Author's response: We agree with the reviewer that we should check other equations. However, we feel that testing the scale-free topology and its biological interpretation is sufficient for publication on this topic. As we are carrying out further work with the data, a more detailed comparison with other equations is desirable, and we hope to complete it in the near future.
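The comparison the reviewer asks for can be sketched as follows. This is a minimal illustration, not the authors' procedure: it fits a power law and an exponential to a degree-frequency table by least squares on log-transformed frequencies and reports R² for each. The function names and the toy counts are hypothetical, and R² on transformed data is only a crude diagnostic, not a formal goodness-of-fit test.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return a, b, 1.0 - ss_res / ss_tot

def compare_fits(degree_counts):
    """degree_counts maps a degree k to the number of nodes with that degree."""
    ks = sorted(k for k, c in degree_counts.items() if k > 0 and c > 0)
    log_f = [math.log(degree_counts[k]) for k in ks]
    # Power law: log f = a - gamma * log k (straight line on a log-log plot).
    _, slope_pl, r2_pl = linear_fit([math.log(k) for k in ks], log_f)
    # Exponential: log f = a - rate * k (straight line on a semi-log plot).
    _, slope_ex, r2_ex = linear_fit([float(k) for k in ks], log_f)
    return {
        "power_law": {"exponent": -slope_pl, "r2": r2_pl},
        "exponential": {"rate": -slope_ex, "r2": r2_ex},
    }

# Hypothetical counts, roughly proportional to k**-2 (not data from the paper).
toy_counts = {1: 1000, 2: 250, 3: 111, 4: 62, 5: 40,
              6: 28, 7: 20, 8: 16, 9: 12, 10: 10}

result = compare_fits(toy_counts)
for name, fit in result.items():
    print(name, fit)
```

On real data, the two R² values would be read side by side; a clearly better exponential fit would argue against a scale-free interpretation.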
* Thirdly, in the database, the PubMed links in the lists of families do not appear to work.
Author's response: Done, all links work.
* Fourthly, the authors should make sure that the format of the webpages and download files is adequately explained. Currently, there is not sufficient explanation. For example, there is not a detailed explanation of the columns in the Excel file of downloadable interactions. Help pages with this information should be provided.
Author's response: This information is explained in the manuscript, in the Results and Discussion section ("NR-RTK module"). Furthermore, we are now in the process of generating a new website to house these data.
Reviewer's report 2
Arcady Mushegian, Department of Bioinformatics, Stowers Institute for Medical Research, Kansas City, Missouri, USA.
* This study is suitable for publication as a Discovery Note, not as a Research Article. Targeted reconstruction of a protein-centered interaction subnetwork by extracting and scoring all relationships of that protein may be of interest if it results in novel observations about the biological system, and I would suggest showing more of this in the note.
Author's response: We don't think that this study can be considered a Discovery Note. We believe that the significance of this work lies not just in the approach, but in the combination of methodology and biology to gain new insights into the important receptors in signal transduction.
* For instance, instead of primarily focusing on the node degree of the network, the authors may discuss in more detail the actual gene content of the modules that they discover. How many of the connections and of module composition are well-known and how many are novel/not covered in the literature? Any new hypotheses suggested by the module inference? What type of evidence makes the most significant contribution to the network? Are there types of evidence that do/do not improve the score?
Author's response: The sub-networks discovered are the RTK-RTK and NR-NR networks. The gene content of each sub-network is clearly described in the database, as are their potential connections and their citations in the literature. Module inference is precisely the main aim of the present paper. Our approach may serve as a predictive tool for identifying key interactions and for guiding experimental validation (in vivo and in vitro assays) (Table 1). Experimental assays and co-citations in the literature for a given interaction improve the score.
The figures as they stand now are typical of many "systems-biology" papers, but it is not clear what to make of them. Are we supposed to eyeball the list of gene names (in which case the table would suffice)? Is the density of links supposed to be the main message? A position of particular nodes?
Author's response: The figures represent the interactions derived from different sources of evidence and show the central role of some proteins, the so-called "hubs". The main point is the identification of these hubs according to their position in the network and other topological parameters (supplemental file 2).
Finally, about the "scale-free" character of the network. First, I am not sure that it is useful to compute the node degree distribution of the local, i.e., protein-centric network: if it is by construction a set of interactors of one protein, it is guaranteed that at least one protein will be very highly connected. More formally, it is known that fitting to the power law is not the right test here: many purported "scale-free" networks have been shown to reject the hypothesis in standard tests (e.g., Khanin and Wit, JCB 2006), or fit the power law only on an interval, or on different intervals with different values of the gamma parameter (discussed, for example, in several of Mark Newman's papers). Moreover, suppose that the distribution can be fit to a mixture of functions - what would the biological conclusion be?
Author's response: We agree with the reviewer. Although the statistical test shows that the network is scale-free, we are aware that many more tests should be performed, especially for such a complex system.
Regarding your question supposing a mixture of functions, this would be explained by two types of interactions: physical interactions and cross-talking with the two types of receptors that modulate gene expression.
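The reviewer's first point, that a protein-centric network contains a highly connected node by construction, can be illustrated with a toy example. The sketch below is only an illustration under stated assumptions: the protein names and the extra interactor-interactor edges are hypothetical and are not taken from the database.

```python
from collections import defaultdict

def degrees(edges):
    """Return node -> degree for an undirected edge list without duplicates."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)

def ego_network(center, interactors, extra_edges=()):
    """Edges of a protein-centric network: the center is linked to every
    interactor, plus optional interactor-interactor edges."""
    edges = [(center, p) for p in interactors]
    edges.extend(extra_edges)
    return edges

# Hypothetical proteins: one query protein and six interactors.
interactors = ["P1", "P2", "P3", "P4", "P5", "P6"]
edges = ego_network("CENTER", interactors,
                    extra_edges=[("P1", "P2"), ("P3", "P4")])

deg = degrees(edges)
hub = max(deg, key=deg.get)
print(hub, deg[hub])
```

In such a network with n interactors, the query protein has degree n, while an interactor can reach at most n (one link to the center and n - 1 to the other interactors). The presence of a hub therefore carries no information about the topology on its own.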
Reviewer's report 3
Anthony Almudevar, Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, NY
The authors are concerned with two classes of proteins, nuclear receptors (NR) and receptor tyrosine kinases (RTK). These have broad functionality, with variants implicated in many disease states, including cancer. A database is constructed based on a protein-protein interaction (PPI) network of 48 NRs and 53 RTKs. The PPIs are compiled using various existing knowledge sources.
Author's response: We corrected the typing error in manuscript, 56 RTKs instead of 53 RTKs.
Three networks are analyzed, with NR proteins only (NR-NR), with RTK proteins only (RTK-RTK), and with all proteins (NR-RTK). It is reported that the RTK-RTK and NR-RTK networks have scale-free topology (i.e., they possess a power-law node degree distribution, with a small number of highly connected hubs, as is common in cellular networks). This does not hold for the NR-NR network, an observation which conforms to another published finding. A hypothesis for this observation is given, involving the presence of a large number of negative feedback loops.
Proteins which form highly connected hubs in the NR-RTK network (11 listed) are conjectured to influence the transmission of information between the NR and RTK networks.
The NR-RTK database is available at a URL given in the article. It is fairly basic in structure, but provides a useful way to explore the properties of the network and its components.
Overall, the paper is concerned with the construction of a knowledge database, and is not concerned with new methodology. As such, some potentially useful hypotheses concerning signaling pathways, and their role in disease states, are generated. The methods used seem sound. The given biological background, and hence the motivation for the database, is interesting.
Some suggestions:
(1) Abstract - Results: "We constructed a human signalling network ... that identified a much more connected network topology than previously thought." Is it possible to provide citations, or to elaborate on this claim?
Author's response: To elaborate on this result, we rephrased this sentence.
(2) In the "Results and discussion" and "Methods" sections: The method used by the authors for testing the scale-free property is given in Barabási and Albert (1999) Science, vol. 286, p. 509-512. The exponent of the power law (the slope of the regression line) might also be reported, since it is sometimes used to characterize network properties, and might allow for a useful comparison to other cellular networks.
Author's response: We added this reference.
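For reference, the relation between the power-law exponent and the slope of the regression line can be written out as below. The symbols P(k) (fraction of nodes with k interactors), C (a normalizing constant) and \gamma (the exponent) are notation assumed here for illustration; the manuscript may use different names.

```latex
P(k) = C\,k^{-\gamma}
\quad\Longrightarrow\quad
\log P(k) = \log C - \gamma \log k
```

On a log-log plot of frequency against number of interactors, a power law therefore appears as a straight line with slope -\gamma, so the reported exponent is the negative of the fitted slope.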
Minor points:
Background
Results and discussion - NR-RTK signalling network
- paragraph 2: enclose "especially EGFR" in commas.
- paragraph 4: rephrase sentence starting "The remaining ..."
- paragraph 5: "and the PPI" -> "and the time at which the PPI ..."
Results and discussion - Topology of the interaction networks
- paragraph 2: "few nodes" -> "a few nodes"
- paragraph 2: "network" -> "networks"
- paragraph 2: "on large" -> "on a large"
Author's response: We corrected these points accordingly.