Reviewer number 1: Dr Robert Murphy
Comment-1: This manuscript describes a fairly simple design of a machine learning system for predicting whether a chemical structure is similar to previously approved drugs. It describes a web server to provide predictions about new structures.
The manuscript does not provide sufficient discussion of relevant prior work and quantitative comparison with other published approaches for which code is available (e.g., Bickerton et al. 2012). Approaches such as features reflecting drug dynamics (e.g., Vistoli et al. (2008) Drug Discovery Today 13:285–294 (doi:10.1016/j.drudis.2007.11.007)) are also not discussed.
Response: In the revised version, we have discussed the previous studies as suggested by the reviewer. Following the reviewer's comments, we evaluated the performance of the QED model on our datasets: QED correctly predicted 44.8% of the approved drugs (sensitivity) and 81.28% of the experimental drugs (specificity). On the independent dataset, it showed only 40% sensitivity and 52.5% specificity. QED (Bickerton et al. 2012) performs poorly on our dataset because it was developed for predicting the oral drug-likeness of a molecule. The high sensitivity and specificity of the models described in this study imply that they are useful for predicting the drug-likeness of a molecule.
Comment-2: There is a potentially serious concern with the validity of the results due to the fact that the experimental design may result in overfitting. Even though cross-validation was used internally for combinations of features and learners to evaluate predictive accuracies, when these results are subsequently used to make decisions (such as which features to use) it compromises any conclusions from further analysis of the same training and testing data. A related problem may also arise from maximization of ROC area when some of the experimental drugs may indeed be drug-like. These concerns were shown to be warranted because the final evaluation using an independent dataset showed much lower accuracy. However, it is somewhat encouraging that twenty-one molecules in the test set that were recently approved as drugs were classified as “drug-like” by the authors’ model.
Response: We are thankful to the reviewer for this valuable comment. To further validate our prediction model, we used a Monte-Carlo approach in which we randomly created training and testing datasets 30 times and computed the average performance. We achieved a sensitivity of 87.88%, a specificity of 90.36% and an accuracy of 89.63% when evaluated using the Monte-Carlo approach. The results for every set are provided in the supplementary document (Additional file 1: Table-S2) in the form of sensitivity, specificity, accuracy and MCC, along with their mean and standard deviation. These results were broadly the same as the previous five-fold cross-validation results. The results indicate that our models are not over-fitted and will be useful in real scenarios.
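The Monte-Carlo validation described in this response can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names, the `fit_predict` callback, the 20% test fraction and the fixed seed are all assumptions, since the response only states that the random split was repeated 30 times and that sensitivity, specificity, accuracy and MCC were reported with their mean and standard deviation.

```python
import math
import random
import statistics


def confusion_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity and accuracy (in %) plus MCC from confusion counts."""
    sensitivity = 100.0 * tp / (tp + fn) if (tp + fn) else 0.0
    specificity = 100.0 * tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity,
            "accuracy": accuracy, "mcc": mcc}


def monte_carlo_validation(samples, labels, fit_predict,
                           n_repeats=30, test_fraction=0.2, seed=0):
    """Repeat a random train/test split and summarise the metrics.

    fit_predict(train_x, train_y, test_x) is a hypothetical callback that
    trains any classifier and returns 0/1 predictions for test_x
    (1 = approved, 0 = experimental). test_fraction is an assumption.
    """
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    n_test = max(1, int(round(test_fraction * len(indices))))
    runs = []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        predicted = fit_predict([samples[i] for i in train_idx],
                                [labels[i] for i in train_idx],
                                [samples[i] for i in test_idx])
        truth = [labels[i] for i in test_idx]
        pairs = list(zip(predicted, truth))
        tp = sum(1 for p, t in pairs if p == 1 and t == 1)
        tn = sum(1 for p, t in pairs if p == 0 and t == 0)
        fp = sum(1 for p, t in pairs if p == 1 and t == 0)
        fn = sum(1 for p, t in pairs if p == 0 and t == 1)
        runs.append(confusion_metrics(tp, tn, fp, fn))
    # Mean and standard deviation of each metric across the repeats.
    summary = {}
    for name in runs[0]:
        values = [run[name] for run in runs]
        spread = statistics.stdev(values) if len(values) > 1 else 0.0
        summary[name] = (statistics.mean(values), spread)
    return runs, summary
```

Averaging over many random splits reduces the dependence of the reported figures on any single partition, although it does not by itself remove bias introduced by selecting features on the same data.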
Comment-3: The web server model does not seem appropriate for the primary use case, which is envisaged to be making predictions for users with novel structures. Since users may wish to keep their structures private, an open source approach would be strongly preferable to a public server. This would secure use of the system and also permit inspection and modification of the methods used.
Response: We are thankful for this suggestion. We understand the limitation of using a web server for prediction. For the sake of user privacy, we have developed a standalone version of this software, which is available for download from http://osddlinux.osdd.net; users can now run our software on their local machines.
Additional comment-1: The author list contains “Open Source Drug Discovery Consortium” which is not a person and is not mentioned elsewhere in the manuscript.
Response: We are thankful for this comment. In the revised version, we have acknowledged the Open Source Drug Discovery Consortium instead of including it in the author list.
Additional comment-2: The abstract refers to screening but the manuscript does not describe any screening results.
Response: The authors are thankful for this suggestion. In the revised manuscript, we have provided the details of the chemical libraries and their screening results under the paragraph on screening of databases.
Quality of written English: Needs some language corrections before being published.
Response: We have corrected the language in the revised manuscript.
Reviewer number 2: Prof Difei Wang (nominated by Dr Yuriy Gusev)
In general, this is an interesting work and it is important to predict drug-like molecules using various types of molecular fingerprints. However, I do have some questions about the manuscript.
Comment-1: On page 7, the authors stated that “Similarly, MACCS fingerprint elements 112, 122, 144, and 150 were highly desirable and present with higher frequency in the approved drugs [Table 2, Figure 3]”. How should this observation be interpreted? What are the definitions of MACCS-144, MACCS-150, etc.? It would be very useful if the authors could clearly explain what these features are. Also, MACCS-66 is missing here but it does show up in the Table. Is there any reason to exclude MACCS-66 here?
Response: We are thankful to the reviewer for this suggestion. Here, we provide descriptions of the selected MACCS keys, which are useful for interpreting the results [Additional file 1: Table-S1].
a) MACCS 66: A tetrahedral carbon atom connected to three carbons and one other atom (which may or may not be carbon).
b) MACCS 112: Any atom connected to four atoms by any kind of bond (single, double or triple).
c) MACCS 122: A nitrogen atom joined to three other atoms by any kind of bond.
d) MACCS 138: An aliphatic carbon connected with 3 atoms, of which one is not carbon or hydrogen, the second is any atom and the third is with 2 further hydrogens.
e) MACCS 144: Any four atoms connected by non-aromatic bonds.
f) MACCS 150: Any four atoms connected such that atoms 1,2 and atoms 3,4 are joined by non-ring bonds and atoms 2,3 are joined by a ring bond.
Comment-2: What is the score cutoff value for drug like and non drug like molecules for database screening results? What is the meaning of “drug like, low”, “drug like, high” and “non drug like, low”? What false-positive rate do we expect here?
Response: The authors are thankful for this comment. In this study, we have used a threshold value of 0 to discriminate between approved and experimental drugs.
The SVM score is categorized into three groups:
a) Very High: the score is > 1.0 (drug-like) or < −1.0 (non drug-like).
b) High: the score is between 0.5 and 1.0 (drug-like) or between −1.0 and −0.5 (non drug-like).
c) Low: the score is between 0 and 0.5 (drug-like) or between −0.5 and 0 (non drug-like).
The false-positive rate (FPR) was calculated by shuffling the dataset 30 times in five-fold cross-validation; the average FPR is 9.64% (Additional file 2: Table-S2).
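The threshold and the three score groups above can be expressed as a small function. This is only an illustrative sketch: the function name is hypothetical, and the handling of boundary values is an assumption, because the response does not say which group a score of exactly 0, ±0.5 or ±1.0 belongs to. Here a score of exactly 0 is treated as non drug-like, and ±0.5 and ±1.0 are placed in the "high" group.

```python
def categorize_svm_score(score):
    """Map an SVM score to (predicted class, confidence group).

    Threshold 0 separates the classes, as stated in the response.
    Boundary handling (0, +/-0.5, +/-1.0) is an assumption.
    """
    label = "drug-like" if score > 0 else "non drug-like"
    magnitude = abs(score)
    if magnitude > 1.0:
        confidence = "very high"
    elif magnitude >= 0.5:
        confidence = "high"
    else:
        confidence = "low"
    return label, confidence


if __name__ == "__main__":
    for value in (1.3, 0.7, 0.2, -0.2, -0.7, -1.5):
        print(value, categorize_svm_score(value))
```

Under this reading, a label such as "drug like, low" means a positive score below 0.5, that is, a prediction close to the decision threshold. The reported average FPR of 9.64% is also what one obtains as 100% minus the 90.36% specificity from the Monte-Carlo evaluation.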
Comment-3: How many distinct structural families are there in drugbank3.0? How structurally diverse is this dataset? Are there many drugs having similar structures? If the answer is yes, will it bias the fingerprint selection and model creation?
Response: We are thankful for this valuable comment. After receiving this comment, we analyzed the structural families of drugs in drugbank3.0 and found that they are currently classified into 233 different families (http://www.drugbank.ca/drug_classes). This shows that the dataset is highly diverse and suitable for model development.
Comment-4: I tried the example on the web server. But it seems slow and could not give me the result. Is this server really functional?
Response: We are thankful to the reviewer for this comment. Now, the server is completely functional.
Comment-5: Will it be possible to have a standalone version of the web server? It will be great if there is a standalone version available to the community.
Response: We are thankful for such a nice suggestion. To improve the visibility of this work, we have developed a standalone version of this software. This is available to the users at http://osddlinux.osdd.net.
Comment-6: On page 1, "can predict drug-likeness of molecules with precession." Is "precession" a typo?
Response: We are thankful to the reviewer for pointing out this typographical error. In the revised version, we have corrected this mistake and have also taken care of other grammatical errors.
Comment-7: I am not sure if this topic is suitable for this computational biology-centric journal. Maybe, this work is more suitable for publishing in journals like BMC.
Response: We are thankful for this suggestion and we think this kind of work is well suited for this journal.
Quality of written English: Acceptable
Reviewer number 3: Mr Ahmet Bakan (nominated by Prof James Faeder)
Comment-1: The authors developed various classification models using an exhaustive set of chemical fingerprints for discriminating approved drugs from experimental drugs and made these models available via a web server. In the past years, many newly approved drug molecules are breaking the widely accepted rule of 5 for drug-likeness, thus improving and updating methods for calculating drug-likeness is an important problem. However, I don't understand why authors developed models that discriminate "approved" drugs from "experimental" drugs. Experimental drugs are molecules that are under investigation. Being experimental does not mean the compound is not drug-like, so any model that discriminates approved from experimental does not have any value. The exhaustive approach would be valuable if models were developed to discriminate drug-like, safe compounds from potentially toxic, non-drug-like compounds.
Response: We completely agree with the reviewer's comment. Previous studies have focused on discriminating drug-like molecules from non-drug-like ones, but most of these were based on commercial datasets, with MDDR and CMC as the drug-like set and ACD as the non drug-like set. Thus, availability of the dataset is the major issue. In contrast, our method is an attempt to discriminate between two closely related classes of drug-like molecules. This will be an advance in the drug design process because, despite having in vitro drug-like properties, many drugs fail in clinical trials (the experimental stage). Thus, it is very important to discriminate between these two classes of molecules. This is the only dataset that is available for public use and will be an excellent asset for the development of public domain servers.
Quality of written English: Not suitable for publication unless extensively edited
Response: We are thankful to the reviewer for this comment. We have tried our best to improve the quality of the English in the revised manuscript and hope that it is now suitable for publication.