- Open Access
Designing of interferon-gamma inducing MHC class-II binders
© Dhanda et al.; licensee BioMed Central Ltd. 2013
- Received: 10 June 2013
- Accepted: 25 November 2013
- Published: 5 December 2013
The generation of interferon-gamma (IFN-γ) by MHC class II activated CD4+ T helper cells contributes substantially to the control of infections such as those caused by Mycobacterium tuberculosis. In the past, numerous methods have been developed for predicting MHC class II binders that can activate T-helper cells. To the best of the authors' knowledge, no method has been developed so far that can predict the type of cytokine that will be secreted in response to these MHC class II binders or T-helper epitopes. In this study, an attempt has been made to predict IFN-γ inducing peptides. The main dataset used in this study contains 3705 IFN-γ inducing and 6728 non-IFN-γ inducing MHC class II binders. Another dataset, called IFNgOnly, contains 4483 IFN-γ inducing epitopes and 2160 epitopes that induce cytokines other than IFN-γ. In addition, we have an alternate dataset that contains IFN-γ inducing peptides and an equal number of random peptides.
It was observed that peptide length, positional conservation of residues and amino acid composition affect the IFN-γ inducing capability of these peptides. We identified motifs in IFN-γ inducing binders/peptides using the MERCI software. Our analysis indicates that IFN-γ inducing and non-inducing peptides can be discriminated using the above features. We developed models for predicting IFN-γ inducing peptides using several approaches: machine learning, motif-based search, and a hybrid of the two. Our best model, based on the hybrid approach, achieved a maximum prediction accuracy of 82.10% with an MCC of 0.62 on the main dataset. We also developed a hybrid model on the IFNgOnly dataset and achieved a maximum accuracy of 81.39% with an MCC of 0.57.
Based on this study, we have developed a webserver for i) predicting IFN-γ inducing peptides, ii) virtual screening of peptide libraries and iii) identifying IFN-γ inducing regions in antigens (http://crdd.osdd.net/raghava/ifnepitope/).
This article was reviewed by Prof Kurt Blaser, Prof Laurence Eisenlohr and Dr Manabu Sugai.
- Support Vector Machine
- Negative Sequence
- Negative Dataset
- Dipeptide Composition
- Main Dataset
Present vaccination strategies are considering subunit vaccines as an alternative to the traditional attenuation approach. A subunit vaccine consists of only a part of the pathogen, generally peptides or proteins [1, 2]. This strategy has motivated research towards the development of subunit vaccines to combat a number of diseases such as tuberculosis, malaria, anthrax, cancer and swine fever [3–7]. The major challenge in designing a subunit vaccine is the identification of antigenic regions (peptides or proteins) in the pathogen proteome that can induce the desired immune response in the host organism, mainly human. Ideally, one should experimentally check the immune response for each possible fragment/peptide of the pathogen proteome. In practice, this is not possible for two reasons: i) the possible fragments number in the millions and ii) experimental techniques are costly and time consuming [8–11]. There is therefore a need to assist experimental scientists with alternative approaches such as computational techniques.
The field of immunology has changed tremendously in the last few years due to the exponential growth of the new field of immunoinformatics, or computational immunology. In the last decade, numerous software packages, databases and web servers have been developed to identify antigenic regions that can activate various arms of the immune system, such as humoral, cellular and innate immunity. Broadly, these in silico tools can be divided into the following categories: i) linear/conformational B-cell epitopes for activating the humoral response, ii) MHC class I/II binders, TAP binders and protease cleavage for understanding cell-mediated immunity and iii) pathogen-associated molecular patterns for activating innate immunity [12–40].
In the past, numerous methods have been developed to predict MHC class II binders that can activate T-helper cells. To the best of the authors' knowledge, no method has been developed so far that can predict which type of T-helper cell will be activated, or which type of cytokine will be released. The role of epitopes in deciding the immune response is well documented in the literature [49–52]. In order to design subunit vaccines with more precision, there is a need for a method that can predict peptides that induce a specific type of cytokine. In this study, a systematic attempt has been made for the first time to predict IFN-γ inducing MHC class II binders or peptides.
We extracted 10,433 experimentally validated MHC class II binders or T-helper epitopes from the Immune Epitope Database (IEDB). Out of these 10,433 MHC class II binders, 3705 induced IFN-γ, whereas the remaining 6728 unique peptides did not induce IFN-γ. Thus, our main dataset contains 3705 positive examples (IFN-γ inducing peptides) and 6728 negative examples (IFN-γ non-inducing peptides).
The IFNgOnly dataset was created to address the question of whether a peptide that does not induce interferon-gamma would induce another cytokine after binding to MHC class II. The dataset was compiled from IEDB; we obtained 4483 MHC class II binders or epitopes that induce only IFN-γ and 2160 epitopes that induce cytokines other than IFN-γ. The number of IFN-γ inducing epitopes is greater in this dataset than in our main dataset because IEDB was updated in the meantime. While creating this dataset, we removed redundant epitopes and epitopes that induced two or more cytokines.
IFNrandom or alternate dataset
This is an alternative dataset in which the IFN-γ inducing epitopes were taken as positive examples and an equal number of peptides (3705) with the same length variation were generated at random from Swiss-Prot as negative examples. A model developed on this dataset would be useful for discriminating IFN-γ inducing epitopes from peptides whose MHC binding status is not known.
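The source does not describe the exact sampling procedure, so the following is only a minimal sketch of one way to draw length-matched random negatives. It assumes that each negative is a random fragment of a randomly chosen protein and that it has the same length as one positive peptide; the function name `sample_random_peptides` and its parameters are hypothetical.

```python
import random


def sample_random_peptides(positive_peptides, protein_sequences, seed=0):
    """Draw one random protein fragment per positive peptide, matching its length.

    Assumption: the negatives mirror the length distribution of the positives
    one-to-one; the original sampling scheme may differ.
    """
    rng = random.Random(seed)
    negatives = []
    for peptide in positive_peptides:
        length = len(peptide)
        # Only proteins at least as long as the peptide can supply a fragment.
        candidates = [seq for seq in protein_sequences if len(seq) >= length]
        if not candidates:
            raise ValueError(f"no protein is long enough for length {length}")
        protein = rng.choice(candidates)
        start = rng.randrange(len(protein) - length + 1)
        negatives.append(protein[start:start + length])
    return negatives
```

In practice `protein_sequences` would be read from a Swiss-Prot FASTA file; a real pipeline would probably also discard fragments that coincide with known epitopes.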
Analysis of length and positional conservation of peptides
In order to understand the length preference of positive and negative peptides, we used an R package to create boxplots. To understand the position-specific preference of each residue, we used the two-sample logo software, creating a two-sample logo from the first 15 N-terminal amino acids of each peptide. For this analysis, we removed all peptides shorter than 15 residues; the remaining 89% of peptides comprised 2965 positive and 6336 negative instances. In the IFNgOnly dataset, 3682 positive and 1641 negative epitopes remained after applying the same filter.
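The length filter and N-terminal truncation described above can be sketched as follows. This is an illustrative helper, not the authors' code; the name `n_terminal_windows` is hypothetical.

```python
def n_terminal_windows(peptides, window=15):
    """Keep peptides with at least `window` residues and truncate each
    to its first `window` N-terminal residues."""
    return [peptide[:window] for peptide in peptides if len(peptide) >= window]
```

The resulting equal-length strings are the kind of aligned input that a two-sample logo tool expects, one list for the positive set and one for the negative set.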
Motif based approach
Identification of functional motifs in peptides or proteins is extremely valuable for the functional annotation of proteins/peptides. In this study, we used the MERCI software to search for exclusive motifs in positive and negative examples. Although MERCI takes positive and negative examples simultaneously as input, it reports motifs for the positive examples only. We therefore applied a two-step strategy. First, we used the IFN-γ inducing peptides as positive input and the non-IFN-γ inducing peptides as negative input and extracted motifs for the IFN-γ inducing examples. Second, to extract motifs for the non-IFN-γ inducing examples, we used the IFN-γ inducing examples as negative input and the non-inducing examples as positive input. In this way, we extracted motifs for both IFN-γ inducing and non-inducing examples. We searched for 100 degenerate motifs under each of the following three classifications: i) None, ii) Koolman-Rohm and iii) Betts-Russell. The Betts-Russell classification can be further divided into three categories: i) Polar, ii) Hydrophobic and iii) Small. These classification methods produce different motifs in both positive and negative peptides. We therefore selected the unique motif-containing peptides from both datasets in order to calculate the overall motif coverage of the dataset. IFN-γ inducing peptides containing positive motifs were counted as true positives, and non-inducing peptides containing negative motifs were counted as true negatives.
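To illustrate the two-step strategy of swapping the positive and negative inputs, the sketch below uses a deliberately naive stand-in for MERCI: it ranks contiguous k-mers that occur in the target set and never in the background set. This is not MERCI's algorithm (MERCI also handles gaps and residue-class degeneracy), and the names `exclusive_kmers` and `coverage` are hypothetical.

```python
from collections import Counter


def exclusive_kmers(target, background, k=3, top=100):
    """Return up to `top` k-mers found in `target` peptides but in no
    `background` peptide, ranked by the number of target peptides they occur in."""
    background_kmers = {
        peptide[i:i + k]
        for peptide in background
        for i in range(len(peptide) - k + 1)
    }
    counts = Counter()
    for peptide in target:
        # Use a set so each peptide contributes at most once per k-mer.
        kmers = {peptide[i:i + k] for i in range(len(peptide) - k + 1)}
        counts.update(kmers - background_kmers)
    return [kmer for kmer, _ in counts.most_common(top)]


def coverage(peptides, motifs):
    """Count the peptides that contain at least one of the motifs."""
    return sum(1 for peptide in peptides if any(m in peptide for m in motifs))
```

Calling `exclusive_kmers(positives, negatives)` and then `exclusive_kmers(negatives, positives)` mirrors the two passes described in the text; `coverage` then gives the number of unique motif-containing peptides in each set.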
Amino acid compositions
We applied a binary approach, in which positive and negative examples were converted into binary patterns. Each of the 20 standard amino acids is represented by a unique vector of 20 dimensions (e.g. Ala by 1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0; Cys by 0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0). For example, a 15-residue peptide is represented as an input vector of 300 (15 × 20) dimensions.
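The binary encoding can be sketched in a few lines. The residue ordering below (alphabetical by one-letter code, so Ala comes first and Cys second) is consistent with the example in the text but is otherwise an assumption, as is the function name `binary_profile`.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed ordering: Ala first, Cys second


def binary_profile(peptide, length=15):
    """One-hot encode the first `length` residues of a peptide.

    Returns a flat list of length * 20 zeros and ones.
    """
    window = peptide[:length]
    if len(window) < length:
        raise ValueError("peptide is shorter than the requested window")
    vector = []
    for residue in window:
        one_hot = [0] * len(AMINO_ACIDS)
        one_hot[AMINO_ACIDS.index(residue)] = 1  # ValueError on non-standard residues
        vector.extend(one_hot)
    return vector
```

For a 15-residue window this yields the 300-dimensional vector mentioned above, with exactly one non-zero entry per position.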
Machine learning approach
In this study, a support vector machine (SVM) was used for the machine learning approach. Using the features generated above (amino acid composition and length), the SVM was optimized over different parameters of various kernels (linear, sigmoidal and radial basis function), and the best-optimized model was selected for software implementation.
In the hybrid approach, we combined the predictions from the motif approach and the machine learning approach. First, the sequences that could be correctly predicted by the motif-based approach were separated, and the remaining sequences were then predicted using the SVM. Various hybrid models were developed depending on the type of input vector used for the SVM-based prediction. Finally, the performance was evaluated by adding the peptides correctly predicted by the motif-based method to the SVM-based predictions.
To test the robustness of the model, it was evaluated with five-fold cross-validation, in which the complete dataset was divided into five equal parts; four parts were used for training and the remaining fifth part was used for testing. This process was repeated five times so that each part was used once for testing and four times for training. The overall performance was calculated by averaging the results of the five tests. The best model was also validated with 10-fold cross-validation. In cross-validation of the hybrid approach, the motif results were added directly to the results of the five- or ten-fold cross-validation of the SVM-based approach.
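The motif-first, SVM-second decision flow can be sketched as below. This is a simplified illustration under stated assumptions: motifs are treated as plain substrings, a peptide hitting both positive and negative motifs is passed on to the SVM, and `svm_predict` is a hypothetical callable standing in for a trained model.

```python
def hybrid_predict(peptide, positive_motifs, negative_motifs, svm_predict):
    """Return 1 (IFN-γ inducing) or 0 (non-inducing).

    Peptides with an unambiguous motif hit are labelled by the motif;
    all others fall back to the SVM-like `svm_predict` callable.
    """
    has_positive = any(motif in peptide for motif in positive_motifs)
    has_negative = any(motif in peptide for motif in negative_motifs)
    if has_positive and not has_negative:
        return 1
    if has_negative and not has_positive:
        return 0
    # No motif hit, or conflicting hits (assumed handling): defer to the SVM.
    return svm_predict(peptide)
```

Keeping the classifier as a callable means the same wrapper can be reused for each SVM input type (residue composition, dipeptide composition, with or without length).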
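A minimal sketch of the fold construction follows. It assumes a random shuffle before splitting, which the text does not specify, and the generator name `k_fold_indices` is hypothetical. When the dataset size is not divisible by k, fold sizes differ by at most one.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k folds.

    Every sample appears in exactly one test fold and in k - 1 training sets.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

Performance measures would be computed on each test fold and then averaged, as described above; setting `k=10` gives the ten-fold variant.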
TP = True Positive, FP = False Positive, TN = True Negative, FN = False Negative.
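The formulas that use these four counts did not survive in this copy of the text. As a hedged sketch, the function below computes the standard definitions of sensitivity, specificity, accuracy and the Matthews correlation coefficient (MCC), the last two being the measures reported in this article. It is assumed, not confirmed by the surviving text, that the authors used these standard forms; the function name `performance` is hypothetical.

```python
import math


def performance(tp, fp, tn, fn):
    """Compute standard binary classification measures from confusion counts.

    Sensitivity, specificity and accuracy are returned as percentages;
    MCC lies in [-1, 1] and is set to 0.0 when its denominator is zero.
    """
    sensitivity = 100.0 * tp / (tp + fn) if (tp + fn) else 0.0
    specificity = 100.0 * tn / (tn + fp) if (tn + fp) else 0.0
    total = tp + fp + tn + fn
    accuracy = 100.0 * (tp + tn) / total if total else 0.0
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }
```

MCC is informative here because the main dataset is imbalanced (3705 positives versus 6728 negatives), a setting in which accuracy alone can look favourable for a classifier biased toward the larger class.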
Examination of dataset
The peptides in the main dataset were obtained from 17,752 assays, of which 5962 were positive for interferon-gamma secretion. These peptides were derived from 281 source organisms and were presented through 153 MHC alleles in 181 different host species/strains. The epitopes in the IFNgOnly dataset were extracted from 15,778 assays. Of these 15,778 assays, 7302 showed induction of IFN-γ and the remaining 8476 showed secretion of cytokines other than IFN-γ. The epitopes in the IFNgOnly dataset were extracted from 394 different sources and were presented through 183 MHC alleles in 232 host strains. A detailed analysis of the epitopes with respect to MHC alleles, host strains and source organisms is available in the supplementary Excel sheet (Additional file 1).
Positional preference of residues
Table: Exclusive motifs of different classes found in IFN-γ inducing and non-inducing peptides. Columns: class of motifs; no. of exclusive positive peptides; no. of exclusive negative peptides.
Table: Frequency of best motifs discovered using MERCI software in IFN-γ epitopes and non-epitopes. Columns: class of motifs; found in IFN epitopes; found in non-epitopes.
Table: Exclusive motifs of different classes found in IFN-γ inducing and other cytokine (except IFN-γ) inducing peptides from the IFNgOnly dataset. Columns: class of motifs; no. of exclusive IFN-γ peptides; no. of exclusive peptides inducing the rest of the cytokines.
Table: Frequency of best motifs discovered using MERCI software in IFN-γ epitopes and rest-of-cytokine inducing epitopes from the IFNgOnly dataset. Columns: class of motifs; found in IFN epitopes; found in rest-epitopes.
Model based on machine learning technique
Table: The performance of SVM-based models developed using residue (amino acid) and dipeptide composition, with and without peptide length, on our main dataset. Rows include: Residue Composition + Length; Dipeptide Composition + Length.
Table: The performance of SVM-based models developed using residue (amino acid) and dipeptide composition, with and without peptide length, on the IFNgOnly dataset. Rows include: Residue Composition + Length; Dipeptide Composition + Length.
Additionally, we developed SVM-based models using binary profiles, in which each position is represented by a vector of dimension 20 (each element represents the presence or absence of a specific type of residue). The performance of the models developed using binary profiles of N-/C-terminal residues is shown in Additional file 2: Table S1, along with the composition variation plot for each residue in Additional file 2: Figures SF1 and SF2.
Table: The performance of hybrid models that combine the motif-based approach with SVM models developed using residue and dipeptide composition, with or without length, on our main dataset. Rows include: Residue Composition + Length; Dipeptide Composition + Length; Dipeptide Composition + Length (10 fold).
Table: The performance of hybrid models that combine the motif-based approach with SVM models developed using residue and dipeptide composition, with or without length, on the IFNgOnly dataset. Rows include: Residue Composition + Length; Dipeptide Composition + Length.
Models for discovering IFN-γ inducing peptides
Performance of SVM on IFNrandom dataset with compositional features of residues
In the era of computer-aided vaccine design, researchers are trying to find the best epitope that can induce the desired immune response. To be the best vaccine candidate, a peptide should not only be an epitope for B and T cells, but it should also be able to evoke the desired type of immune cells to generate the required response. For example, in tuberculosis the vaccine candidate must be able to induce IFN-γ to eradicate the infection [62–64]. Therefore, there is a need for a method that can predict peptides responsible for the secretion of IFN-γ. The MHC-peptide complex may be crucial in deciding which transcription factors are activated after this association, which in turn determines the type of cytokine released [65, 66]. Therefore, the bias of MHC alleles with respect to interferon-gamma secretion was analyzed. In the main dataset, alleles were not determined for 10,767 of the 17,752 interferon-gamma assays; similarly, in the IFNgOnly dataset, alleles were not determined for 6576 of the 15,778 assays. Source organisms and host species/strains are also very important in deciding the secretion of interferon-gamma.
The binding affinity of these peptides for MHC was shown to depend on peptide length. We therefore analyzed the length variation in our datasets. Peptide length varied from 9 to 30 residues, with some exceptions, which is consistent with the analysis of SYFPEITHI and MHCPEP reported previously by Nielsen et al. [68–70]. The positive dataset was skewed toward lengths of more than 15 amino acid residues. It has been reported that peptides longer than 15–16 amino acids show lower affinity for MHC class II, and this lower affinity might create an environment that leads to the release of IFN-γ. We also observed that the length of IFN-γ inducing peptides is not significantly different from the length of peptides that induce other cytokines.
Besides length, the conservation of residues at specific positions may also be important. We therefore compared the positive and negative epitope data to identify the residues associated with IFN-γ release. We observed that charged residues are preferred at positions 4, 9, 10, 13 and 14, whereas leucine and isoleucine residues dominate in the peptides that do not induce the release of IFN-γ. This differential preference may be significant for the different activation factors that are triggered. In the comparison between IFN-γ and the rest of the cytokines using the IFNgOnly dataset, charged residues were more prevalent at positions 4, 9 and 10 in IFN-γ inducing peptides. This observation agrees with the observation from our main dataset. It was also found that negatively charged residues dominate at positions 4, 6, 8, 11 and 13 in peptides inducing cytokines other than IFN-γ. This kind of discrimination could be utilized for designing Th1-inducing peptides based upon amino acid properties.
The positional features of a sequence can be encoded for machine learning by generating binary feature input. Binary feature input can only be applied to a fixed-length pattern; therefore, different binary inputs were created by varying the number of amino acids from 9 to 15, taken from both the N- and C-terminus of a peptide. The performance of the SVM model on these input vectors was nearly the same in terms of MCC. The amino acid and dipeptide composition vectors of a sequence have a fixed number of features (20 and 400, respectively) irrespective of the length of the peptide. The SVM performed better on these feature inputs than on the binary vectors.
The performance of the SVM-based models increased after adding length as a feature alongside the compositional vector. This may be correlated with the earlier report of variation in MHC-peptide binding affinity with peptide length. Overhanging and short peptides may interfere with the ternary complex of peptide, MHC and T cell. Exclusive motifs in the positive or negative dataset may be a major driver of this differential behavior; these motifs were explored using the MERCI software. Motifs can be searched using different classifications of amino acids proposed in the literature. For our dataset, the best classification was Betts-Russell with the hydrophobic root. The top 100 motifs found under the hydrophobic root of the Betts-Russell classification cover 532 positive peptides and 1835 negative peptides. The significance of motifs can be estimated by their coverage, and hydrophobic motifs are most commonly found in the negative dataset.
In the past, a large number of methods have been developed for predicting MHC class II binders or T-helper epitopes. In this study, an attempt has been made to classify MHC class II binders based on their IFN-γ induction. We classify MHC class II binders into two categories: binders that are able to induce IFN-γ and binders that are not. In order to discriminate between the two categories, models were developed using various features of the binder/peptide sequence, including binary patterns, compositions and motifs. Our models were able to predict IFN-γ inducing peptides with high precision, which suggests that it is possible to design peptides that can induce IFN-γ. This study also indicates that certain MHC alleles and host strains/species tend to skew the immune response toward the release of interferon-gamma. In the near future, these prediction models will be useful in advancing computer-aided vaccine design, where researchers will be able to design subunit vaccines with the desired immune response.
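The fixed-size composition vectors mentioned above can be sketched as follows. Percent composition and the alphabetical residue ordering are assumptions, and the function names are hypothetical; the point is only that the output has 20 or 400 entries whatever the peptide length, with length optionally appended as one extra feature.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def residue_composition(peptide):
    """Percentage of each of the 20 standard residues (20 features)."""
    return [100.0 * peptide.count(aa) / len(peptide) for aa in AMINO_ACIDS]


def dipeptide_composition(peptide):
    """Percentage of each overlapping residue pair (400 features)."""
    total = len(peptide) - 1
    if total < 1:
        raise ValueError("peptide must contain at least two residues")
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(total):
        pair = peptide[i:i + 2]
        if pair in counts:  # pairs with non-standard residues are skipped
            counts[pair] += 1
    return [100.0 * counts[dp] / total for dp in DIPEPTIDES]


def with_length(features, peptide):
    """Append the peptide length as one additional feature."""
    return features + [len(peptide)]
```

Overlapping pairs are counted with an explicit loop because `str.count` only counts non-overlapping occurrences and would undercount repeats such as "AAA".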
Webserver for designing IFN-γ inducing peptides
In order to serve the scientific community, we developed a web server, IFNepitope, using PHP, Perl, HTML and JavaScript. The web server has three major modules called Predict, Design and Scan. The Predict module allows users to screen a peptide library to predict the best IFN-γ inducing epitopes. The Design module allows users to identify the minimum mutations required to turn a peptide into an IFN-γ inducing epitope. In the Design module, all possible single-residue mutants of the peptide are generated first, and the module then predicts IFN-γ inducing epitopes among the mutant peptides. Similarly, the Scan module predicts antigenic or IFN-γ inducing regions in an antigen. Overall, this server will be useful for researchers working in the field of subunit vaccines.
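The peptide generation steps behind the Design and Scan modules can be sketched as below. This is not the server's code: the function names are hypothetical, and the window length of 15 used for scanning is an assumption. Each generated peptide would then be scored by a prediction model.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_mutants(peptide):
    """Generate every peptide that differs from `peptide` at exactly one position
    (19 substitutions per position)."""
    mutants = []
    for position, original in enumerate(peptide):
        for residue in AMINO_ACIDS:
            if residue != original:
                mutants.append(peptide[:position] + residue + peptide[position + 1:])
    return mutants


def scan_windows(antigen, window=15):
    """Return (start, fragment) pairs for all overlapping windows of an antigen."""
    return [
        (start, antigen[start:start + window])
        for start in range(len(antigen) - window + 1)
    ]
```

A peptide of n standard residues yields 19 × n single mutants, so exhaustive single-residue design stays cheap even for 30-residue peptides.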
Reviewer number 1: Prof Kurt Blaser
Comment: The paper is interesting and helpful for prediction and modulation of antigenic compounds and generation of Type I cytokine pattern mainly in protective immunizations but also for allergen-immunotherapy. The approach, although it is based on published experimental observations, relies on theoretical mathematical models. Thus it is difficult for me to evaluate the value and correctness of these predictions. What to my mind is missing are some experimental data from human in vitro experiments, either with specific T cell clones or adequate PBMC cultures that are stimulated with synthetic epitope-peptides from these models. It is furthermore important that in such experiments not only IFN-gamma but also a broader pattern of the most important cytokines are measured, as in many cases rather the ratio of IFN-gamma to other cytokines (e.g. IL4) is important and not the absolute amount of IFN-gamma.
Thus, I would strongly recommend to add such experimental data in order to prove the effectiveness of the described models.
Response: We understand the reviewer's concern about the experimental validation of our model, but it is also noteworthy that we have developed this model on an experimentally proven dataset and evaluated it using well-established computational cross-validation approaches.
Quality of written English: Needs some language corrections before being published.
Reviewer number 2: Prof Laurence Eisenlohr
In this manuscript, Dhanda et al. describe their efforts to develop tools to predict those MHC class II-binding peptides that induce interferon-gamma production and those that do not.
My concerns with this paper are as follows:
Comment: 1) The authors mined the peptide sequences they analyzed from the Immune Epitope Database. My understanding (from communicating with IEDB staff) is that those peptide sequences listed as “Negative” for cytokine production have not been shown to bind any particular MHC class II molecule (they are not “epitopes” per se, just sequences that failed to elicit a T cell response with the MHC class II restrictions that were tested). Thus, there may not be much basis for comparison; if they don’t bind MHC class II to begin with, then of course they won’t elicit interferon-gamma.
Response: We are thankful to the reviewer for this comment. To address it, we have created another dataset, "IFNgOnly", which comprises peptides that induce cytokines other than interferon-gamma. We applied the same approach and achieved convincing results (Tables 3, 4, 6, 8 and 9).
Comment: 2) I imagine that most of these negative sequences are derived from overlapping 15-16-mer peptide libraries that were comprehensively screened for immunogenicity. This seems to be the likely explanation for the skewing of negative sequences toward that size range.
Response: We agree with the reviewer and have therefore produced results with an alternative negative dataset. We observed no length-wise preference for the induction of interferon-gamma when compared with peptides that induced other cytokines (except IFN-gamma) in the IFNgOnly dataset, as shown in Figure 4.
Comment: 3) There is no consideration of species origin of the class II molecule or MHC polymorphism. A negative peptide sequence for one animal or MHC allele could be strongly positive in other conditions if they were to be tested.
Response: This is an important issue raised by the reviewer, and we examined our main dataset again after this comment. We analyzed the IFN-gamma response with respect to MHC alleles and found that there are 38 peptides in our dataset that elicit an IFN-gamma response in one host/MHC allele and do not elicit such a response in another host/MHC allele. We have included these epitopes in our positive examples.
Comment: 4) As far as I can tell, there is also no consideration of peptide binding register (where the peptide is positioned with respect to the binding pockets). Therefore, I do not know what to make of the reported positional effects.
Response: We agree with the reviewer that the positional preference analyzed by us could not be correlated with the MHC groove because positional information for the peptides is lacking. It is also a fact that, for most of the peptides/binders, the position in the MHC binding groove is not known. This is the first study of its kind, and in future these points should be addressed when sufficient data are available.
Quality of written English: Needs some language corrections before being published
Response: We have revised the manuscript and corrections have been made to improve the English.
Reviewer number 3: Dr Manabu Sugai
The authors have an idea to find ideal peptides to elicit Th1 response for developing novel vaccination strategy. To this end, they developed a webserver for predicting IFN-gamma inducing peptides by analyzing the dataset from IEDB.
Comment: The paper is interesting, and I think would be of interest to readers of Biology Direct. However, the validation of their program is not enough for providing functional rationale to support their concept. Author’s idea depends on the notion that the specific peptides promote specific helper T cell differentiation. However, such an idea is not easily accepted, because various cytokines, but not TCR-signals, play dominant role in instructing helper T cell differentiation.
Response: Thank you for this comment. We would like to draw your attention to some references in which substitution of a single amino acid in a peptide skewed the immune response from Th1 to Th2 or vice versa [49–52]. The notion of peptide-based immune modulation therefore exists in the literature. We are developing a prediction model for designing peptides that have the potential to induce IFN-gamma and hope that this model will be useful in peptide-based vaccination and therapeutics.
Comment: On the other hand, IFN-gamma inducing activities of the peptides were usually estimated as a memory reaction, as indicated by the comments from IEDB. Therefore, we can speculate that some peptide-specific immune reactions occur specifically in Th1-skewed conditions. According to this notion, we can use IFN-gamma inducing peptides as an adjuvant to induce a memory reaction in vivo.
To validate this notion, the authors need to examine whether other cytokine-inducing-peptides, such as IL4, IL17, TGF-b etc., are selected or not by IFN-gamma inducing-peptides finding program. If your program excludes other cytokine-inducing-peptides, your ideas are supported partially and provide the meaning of your program for future use. If other cytokine-inducing-peptides are also included in the selected IFN-gamma inducing-peptides, author’s concept is not correct or program itself is incomplete.
Response: We are thankful to the reviewer for providing detailed information on IFN-gamma inducing peptides. Our aim in this study is to discriminate inducing from non-inducing peptides. In order to address the issue raised by the reviewer, we developed models for discriminating IFN-gamma inducing peptides from non-IFN-gamma peptides (those that induce cytokines other than IFN-gamma). First, we created a dataset called IFNgOnly that contains IFN-gamma inducing and other cytokine inducing peptides. The performance of our models on this IFNgOnly dataset is shown in Tables 3, 4, 6 and 8. As shown in the Results section, our models were able to discriminate peptides that induce IFN-gamma from peptides that induce other cytokines.
Quality of written English: Acceptable
Responses to reviewer’s comments after revision
Reviewer number 2: Prof Laurence Eisenlohr
This paper by Dhanda et al. proposes both length and sequence biases for MHC class II presented peptides that elicit interferon-gamma responses (vs. those that do not). I remain skeptical about the conclusions in this paper due to lack of information on:
Comment: 1) the basis for lack of interferon production. Are peptides that have been shown to bind to an MHC class II molecule but not elicit any response (“non-epitopes”) included in the analyses? This would be problematic. In our hands essentially all epitopes elicit some interferon-gamma response following natural virus infection so the distribution of epitopes (3705 inducers vs. 6728 non-inducers) is worrisome.
Response: Our main dataset contains 10,433 peptides obtained from 17,752 IFN-gamma assays; 6,728 peptides do not release IFN-gamma (as per IEDB), and we call these negative peptides in this study. We agree with the reviewer that not all MHC class II binders are epitopes; thus the negative peptides in our main dataset may also include non-epitopes. In order to overcome this limitation, we created another dataset called IFNgOnly, in which the negative peptides/epitopes are only those peptides that induce cytokines other than IFN-gamma. In simple terms, the negative peptides in IFNgOnly are true non-IFN-gamma inducing epitopes.
Comment: 2) inclusion/exclusion criteria. As an exercise, I went to the IEDB and searched for all MHC class II binders that do not elicit an interferon-gamma response. I then randomly chose listeriolysin O (LLO), residues 216-227, described by Skoberne et al., 2002, J Immunol., 169:1410-8. In fact, this epitope does elicit an interferon-gamma response in BALB/c mice (because it is a CD4+ T cell epitope in that strain) but does not elicit an interferon-gamma response in C57Bl/6 mice (because it is not an epitope in that strain). Thus, it is listed in both categories. How did the authors deal with this? Exclude? Include in both categories? Include in only one category? How many other peptides in the database also fall into both categories?
Response: We followed the IEDB recommendations: when multiple assays are performed to test a peptide, it is considered positive even if only a single assay is positive. There are 667 epitopes/peptides that fall into both categories, and we have included them in our positive dataset.
Comment: 3) the peptide lengths that are entered into the database.
Many epitopes are now identified via overlapping 15-mer libraries, with no subsequent attempts to map the minimal epitope or the effects of flanking residues due to the added expense. This seems to be the likely reason for the predominance of peptides of that size (Figure 2). In fact, the analysis can only be done with peptides that have been stringently defined with respect to the minimal core and flanking sequence effects, a much smaller set than was analyzed.
Many of the longer peptides may not have been identified by the library method but by the previous method of enzymatic digestion of antigen and in vitro assay with a T cell line, clone or hybridoma, again with no subsequent mapping of core and flanking sequences.
Also, the longer peptides may have been deduced in mapping a known response, and this could be the reason for bias toward interferon-gamma production in this cohort.
There is no discussion of the bimodal length distribution, which, for the reasons discussed, may have a technical vs. biological basis.
Response: Our main dataset was created without considering epitopic information. Most of the peptides in our datasets are either 'exact epitopes' or 'epitope containing regions', as annotated in the IEDB database.
Comment: 4) how several other potential biasing factors, all of which can strongly influence both parameters (length and sequence bias) were accounted for, including:
origin of the peptide (pathogen, self-protein, natural sequence or variant, …)
method of immunization (peptide, organism-experimental infection, organism-natural infection, adjuvant, …)
host strain (BALB/c vs. C57Bl/6-Type I vs. Type 2)
identity of the class II molecule. Some class II molecules are over-represented in the database and this alone could account for the deduced sequence preferences.
Response: We agree with the reviewer that issues related to host species, immunization protocol and MHC alleles should also be considered, but this is the first study to predict the immune response to a peptide sequence. These limitations will have to be addressed in future research. To address this issue, we have examined our dataset and provided insight in the ‘Examination of dataset’ paragraph of the ‘Result’ section.
Quality of written English: Needs some language corrections before being published.
Reviewer number: 3, Dr Manabu Sugai
The revised manuscript from Dhanda et al. has been significantly improved. This paper is acceptable now.
Quality of written English: Acceptable
The authors are thankful to the funding agencies, the Council of Scientific and Industrial Research (project OSDD and GENESIS BSC0121) and the Department of Biotechnology (project BTISNET), Govt. of India. The authors also acknowledge the Indian Council of Medical Research (ICMR) for providing a fellowship to Mr. Sandeep. We also acknowledge Mr. Bharat Panwar for his valuable suggestions and positive comments in improving the manuscript.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.