In silico Prediction of Immunogenic T Cell Epitopes of Leishmania donovani donovani GP63 Protein: an Alternative Approach for Anti-parasite Vaccine Development
Mona E. E. Elfaki1, Anne S. De Groot 2,3, Andres H. Gutierrez 2, Brema M. Younis 1, Rayan Tassone 3, Francis Terry 3, Ahmed M. Musa 1, Ahmed M. Elhassan 1, Eltahir A. G. Khalil 1*
1Department of Clinical Pathology and Immunology, Institute of Endemic Diseases, University of Khartoum, Khartoum, Sudan.
2Institute for Immunology & Informatics, University of Rhode Island, Providence, Rhode Island, USA. 3EpiVax, Inc., Providence, Rhode Island, USA.
Visceral leishmaniasis (VL) is a major parasitic childhood disease in sub-Saharan Africa. Current control relies on expensive and toxic anti-leishmanial drugs. Safe, effective and cheap vaccines are potentially powerful strategies to control VL. Traditional vaccine development techniques have failed to deliver an effective vaccine. Leishmania vaccine development may benefit from immunoinformatics tools. This paper describes an improved in silico prediction method for immunogenic Leishmania donovani donovani GP63 protein T cell epitopes as VL candidate vaccines. Using the EpiMatrix algorithm, the amino acid sequence of Leishmania donovani donovani GP63 protein (GenBank accession: ACT31401) was screened for putative T cell cluster epitopes that would bind to the most common HLA class II alleles among at-risk populations. Nine epitopes were initially identified using EpiMatrix. Based on cluster score, number of EpiMatrix hits, hydrophobicity, and number of EpiBars (an EpiBar is a 9 amino acid frame predicted to bind at least 4 different HLA molecules), four peptides (P1-P4) were selected for synthesis. In a proof of concept study, blood samples from consenting healthy, leishmanin skin test (LST) reactive and non-reactive volunteers were stimulated and IFN-γ, IL-4, and IL-10 were measured. IFN-γ and IL-4 levels were similar in both groups. However, mean IL-10 levels were significantly reduced in LST reactive individuals. To evaluate whether cross-reactivity with the human genome (HG), the human gut microbiome (HM) and common human pathogens (HP) was responsible for these differences, the sequences of the evaluated peptides were screened using JanusMatrix. One of the peptides (P1), which increased IL-10 in the LST reactive volunteers, showed high cross-reactivity with HG, suggesting that P1 might induce a regulatory immune response in humans. In conclusion, immunoinformatics tools provide a promising alternative approach for anti-parasite vaccine development. The data obtained can be used in the development of an epitope-based Leishmania vaccine.
Keywords: HLA Class II Alleles; Regulatory T Cell Epitope; Visceral Leishmaniasis; EpiMatrix
Visceral leishmaniasis (VL) is recognized as a major public health problem in a number of countries spanning four continents, with an incidence of 0.5 million new cases per year and considerable fatalities. Currently, prevention of VL focuses on case detection/treatment, vector and animal reservoir control and vaccine development. Treatment options are limited, however, and no effective vaccine has been developed to date [1-6]. Safe, effective and cheap vaccines are critically important for the effective control of VL. It is likely that a vaccine can be developed for leishmaniasis because spontaneous or drug-induced recovery from cutaneous or visceral leishmaniasis is associated with solid and life-long immunity against re-infection. First-generation and available second-generation vaccines have failed to deliver a candidate that was successful in Phase III trials. Ideally, a vaccine for VL should elicit a strong Th1 response and little or no Th2 response [7-11]. Immunoinformatics tools that analyze the sequence of parasite proteins for binding to common human leukocyte antigen (HLA) class II alleles can provide an expedited approach to vaccine development.
Glycoprotein-63 (GP63; Leishmanolysin) has been implicated in various aspects of the host-parasite relationship. It functions as a receptor for host macrophages, converting complement component C3b to C3bi in order to facilitate parasite survival. GP63 is able to cleave multiple intracellular signaling proteins of the host, contributing to the infectivity of Leishmania [13,14]. In mouse models, GP63 vaccination has been shown to induce protective immune responses against leishmaniasis through induction of antigen-specific T cells [15,16]. GP63 is highly conserved among all species of Leishmania and is the major surface protein expressed in both promastigote and amastigote life stages [17-20].
Protective immune responses to leishmaniasis depend entirely on Th1 cells, which produce IFN-γ and other Th1 cytokines. The activation of CD4+ helper T cells is essential for the development of adaptive immunity against intracellular pathogens [22-25]. A critical step in CD4+ T cell activation is the recognition of epitopes presented by HLA class II molecules. HLA class II molecules are heterodimers expressed on the surface of professional antigen presenting cells that bind linear peptide fragments derived from protein antigens. A hallmark of the HLA class II binding peptide groove is that there are four major pockets. These pockets accommodate side chains of residues 1, 4, 6, and 9 of a 9-mer core region of the binding peptide; this core region interaction largely determines binding affinity and specificity. In addition, peptide residues immediately flanking the core region have been shown to make contact with the HLA molecule outside of the binding groove and to contribute to HLA-peptide interaction.
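The 9-mer core and its four pocket-facing positions can be illustrated with a short sketch. This is only an illustration of the framing described above, not part of EpiMatrix: the function names are hypothetical, and the example peptide is the P2 sequence reported later in this paper.

```python
# Minimal sketch: enumerate overlapping 9-mer frames of a peptide and pull out
# the residues at core positions 1, 4, 6 and 9 (the pocket-facing positions).
# Function names are illustrative assumptions, not an EpiMatrix API.

ANCHOR_POSITIONS = (1, 4, 6, 9)  # 1-based positions within the 9-mer core


def nine_mer_frames(sequence):
    """Yield (start, frame) for every overlapping 9-mer; start is 1-based."""
    for i in range(len(sequence) - 8):
        yield i + 1, sequence[i:i + 9]


def anchor_residues(frame):
    """Return {position: residue} for the four pocket positions of a 9-mer."""
    if len(frame) != 9:
        raise ValueError("a core frame must be exactly 9 residues long")
    return {pos: frame[pos - 1] for pos in ANCHOR_POSITIONS}


peptide = "DILVKHLIPQALQLH"  # P2 from this study (15 residues)
for start, frame in nine_mer_frames(peptide):
    print(start, frame, anchor_residues(frame))
```

A peptide of length n yields n - 8 frames, so a 15-residue cluster is scored as seven overlapping candidate cores.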
Binding motifs derived for HLA class II molecules are highly variable. An epitope’s HLA binding promiscuity is an important feature for vaccine development and immunotherapy due to its ability to bind multiple HLA class II molecules (alleles) and the population coverage conferred by targeting multiple alleles at once [30, 31]. The goal of T cell epitope prediction is therefore to accurately identify peptide sequences within any protein that, in the context of a defined HLA molecule or across multiple families of HLA haplotypes, will elicit desired T cell responses when presented at the cell surface to be recognized by the immune system. These epitopes should ideally be dominant and promiscuous, so that they are recognized by most human populations. Immunoinformatics tools significantly accelerate the discovery of peptide epitope sequences, which account for an overwhelmingly large proportion of T cell responses to pathogens and self-proteins [12,32]. One such tool is EpiMatrix, a T cell epitope mapping algorithm developed by De Groot and Martin (EpiVax). EpiMatrix screens protein sequences for 9–10 amino acid long peptide segments predicted to bind to one or more MHC alleles [33,34]. EpiMatrix uses the pocket profile method to build matrices for epitope prediction, which was first described by Sturniolo and Hammer. EpiMatrix has been successfully applied to the analysis of previously published epitopes. For example, researchers affiliated with GAIA (Global Alliance to Immunize against AIDS) have used EpiMatrix to identify highly conserved and immunogenic epitopes from available HIV sequences [37-40]. EpiMatrix was also used to identify epitopes for inclusion in novel, broadly reactive epitope-based vaccines for F. tularensis, M. tuberculosis, vaccinia virus [43, 44], H. pylori and influenza A (H1N1) virus [46,47].
In this communication, we describe the application of EpiMatrix and JanusMatrix, a new tool that predicts pathogen-host cross-reactivity at the T cell receptor level, to scan a parasite surface protein sequence (Leishmania donovani donovani GP63 protein, GenBank accession: ACT31401) for T cell epitopes that will bind the most common HLA-DRB1 alleles among at-risk populations.
We also describe the application of JanusMatrix to examine potential cross-reactivity of the peptides with the human genome (HG), the human gut microbiome (HM), and common human pathogens (HP). JanusMatrix describes patterns of HG, HM, and HP cross-reactivity (XR; the number of cross-reactive hits detected) that are distinct for epitopes associated with regulatory T cell and effector T cell responses. Greater XR with HG compared to HM seems to distinguish defined regulatory and effector T cell epitopes [48,49]. JanusMatrix was used to refine our in vitro observations since it provides some additional insight into the phenotype of T cell responses that may occur. The application of these bioinformatics tools represents an alternative approach to identify possible candidates for anti-Leishmania vaccine development.
HLA-DRB1 allele frequencies
A total of eleven HLA-DRB1 allelic groups were assessed among the studied population from the Institute of Endemic Diseases Database, Khartoum, Sudan. The most common HLA-DRB1 alleles within the study population were DRB1*1101 and DRB1*0804, with frequencies of 32.4% and 24.1%, respectively.
Secretion analysis for the Leishmania donovani donovani GP63 protein
GP63 protein was predicted to have a signal peptide (1-39), a glycosylphosphatidylinositol (GPI) anchor and no transmembrane regions or lipoprotein attachment sites, confirming that it is a surface protein.
EpiMatrix and ClustiMer Analysis results
Using EpiMatrix predictive matrices for HLA DRB1*0804 and DRB1*1101, the immunogenicity score of whole GP63 was calculated to be -55.83 (Fig. 1a); no clusters were identified, as only two alleles were scored. However, some 9-mers were found with scores above 1.64 (Table 1). Accounting for the set of eight common HLA-DR supertype alleles, the GP63 protein immunogenicity score was -41.35. Although this score was very low, indicating that the whole protein is unlikely to be very immunogenic (Figure 1), ClustiMer analysis detected several putative T cell epitope clusters that may be useful to drive an immune response. The cluster with the highest score (25.29) was within the signal peptide of the GP63 protein (Table 1).
Figure 1. Immunogenicity Scale Report. EpiMatrix was used to score GP63 for putative binding to eight common HLA alleles (GP63 – Standard) and to DRB1*0804 and DRB1*1101, the two most commonly found class II HLA alleles among VL at-risk populations (GP63 – 0804 and 1101); scores are displayed in Fig. 1a as compared to standard immunogens. GP63 was then scanned for immunogenic clusters using ClustiMer; clusters identified in this analysis are displayed in Fig. 1b as compared to standard immunogenic clusters.
Using EpiMatrix predictive matrices for the set of eight common HLA-DR supertype alleles, nine clusters were identified in the GP63 protein (Table 1). The cluster addresses (Table 1) indicate the location of the peptides within the GP63 sequence. The core peptide (underlined middle amino acids in bold) defines the actual epitope cluster containing the highest binding probability. The flanking residues (N-terminal and C-terminal, in italics) are included to stabilize the synthesized core sequence in the open-ended binding groove and to enhance TCR recognition. The EpiMatrix cluster score was derived from the deviation from the number of predicted ligands expected by random chance in a peptide of the same length (Table 1).
Four clusters were synthesized for this study: cluster 7 (STHRHRSVAARLVRLAAAGAAVIA); cluster 151 (DILVKHLIPQALQLH); cluster 205 (TDFVMYVASVPSEGDVL); and cluster 496 (SHGIIKSYAGLCANVRCDT) (Table 1). Selection was based on the EpiMatrix cluster score, number of EpiMatrix hits, hydrophobicity, number of EpiBars and homology to human proteins. Cluster 7 (cluster address 7-30) was selected for synthesis because it had the highest EpiMatrix cluster score of 25.29 as well as three EpiBars and 17 EpiMatrix hits. In addition, it was identified as part of the signal peptide with no significant homology to human proteins by BLAST. Cluster 7 is expected to be highly immunogenic as compared to well-known immunogens.
Table 1. Leishmania donovani donovani GP63 protein (GenBank accession: ACT31401) clusters identified by EpiMatrix and
*Cluster addresses for the predicted peptides P1, P2, P3 and P4.
Clusters 205 (cluster address 205-221) and 496 (cluster address 496-514) were selected because both have EpiMatrix cluster scores >15 (16.58 and 15.82, respectively), and each contains two EpiBars and ten EpiMatrix hits. These clusters showed no significant homology to human proteins based on BLAST analyses and compared well to standard immunogens (Figure 1b). Clusters 151 (cluster address 151-165) and 347 (cluster address 347-361) had almost the same EpiMatrix cluster scores (14.45 and 14.23, respectively), similar numbers of EpiBars, and similar number of EpiMatrix hits. However, Cluster 151 was chosen for synthesis because it included a top 1% EpiMatrix hit (Z-score > 2.32). This hit is specific for DRB1*0804, one of the two alleles associated with at-risk populations.
Clusters 94 (cluster address 94-117), 126 (cluster address 126-140) and 248 (cluster address 248-270) were excluded from selection because they have comparatively low EpiMatrix cluster scores (13.66, 11.31 and 12.17, respectively). Cluster 586 (cluster address 586-599) had an EpiMatrix cluster score of 16.18, placing it relatively high on the immunogenicity scale (Figure 1), but it was excluded due to elevated hydrophobicity (Table 1) and identity with sequences in the human genome (70%). None of the other predicted clusters had any significant linear homology to the human genome based on BLAST analysis. Many of the clusters were highly conserved among the other Leishmania species, specifically the four clusters/epitopes selected for synthesis. For simplicity, the predicted and synthesized peptides were labeled P1, P2, P3 and P4.
P1: STHRHRSVAARLVRLAAAGAAVIA (Cluster 7)
P2: DILVKHLIPQALQLH (Cluster 151)
P3: TDFVMYVASVPSEGDVL (Cluster 205)
P4: SHGIIKSYAGLCANVRCDT (Cluster 496)
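Hydrophobicity was one of the selection criteria for these peptides. As a rough sketch of how a grand average of hydropathicity (GRAVY) value can be computed, the snippet below averages the standard Kyte-Doolittle hydropathy values over each peptide. It is an assumption that this reproduces the hydrophobicity figures reported by ClustiMer; the function name is illustrative and no output values are claimed here.

```python
# Minimal GRAVY sketch using the Kyte-Doolittle hydropathy scale.
# Assumption: ClustiMer's hydrophobicity column may be computed differently;
# this only illustrates the textbook definition of GRAVY.

KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(peptide):
    """Return the mean Kyte-Doolittle hydropathy of a peptide sequence."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide.upper()) / len(peptide)


PEPTIDES = {
    "P1": "STHRHRSVAARLVRLAAAGAAVIA",
    "P2": "DILVKHLIPQALQLH",
    "P3": "TDFVMYVASVPSEGDVL",
    "P4": "SHGIIKSYAGLCANVRCDT",
}

for name, seq in PEPTIDES.items():
    print(f"{name}: GRAVY = {gravy(seq):.2f}")
```

Positive values indicate a more hydrophobic peptide, which is the property the authors triaged on to avoid synthesis and solubility problems.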
In vitro whole blood stimulation results
The in vitro immunogenicity studies for the predicted peptides showed that IFN-γ was not significantly increased in any of the volunteer groups. The peptide pool (P1+P2+P3+P4) produced a moderate IFN-γ increase that was not statistically significant in LST non-reactive volunteers. IL-4 was not significantly increased in any group (LST-reactive = skin induration ≥ 5 mm; LST non-reactive = skin induration of 0 mm). IL-10 was significantly increased following stimulation with the P1 peptide in all volunteers, especially in the LST-reactive group. Significant IL-10 production was observed following stimulation with the P3 peptide in LST non-reactive volunteers. The peptide pool significantly decreased IL-10 production in all volunteers, especially in the LST-reactive group.
JanusMatrix analysis results
Despite its lack of homology based on BLAST analyses, the P1 peptide was found to be highly cross-reactive with the HG (XR=33, Ratio=2.92) using JanusMatrix, suggesting that P1 may be a regulatory T cell epitope (Table 2). Furthermore, one 9-mer epitope within P1 had 14 of the 33 cross-reactive hits (Fig. 2) with HG (Ratio = 1.24). This contrasts with P4, which has a low XR ratio (0.35) and induced only low levels of IL-10; P4 appears to be an effector T cell epitope. P2 and P3 showed a minimal degree of cross-reactivity with the HG, HM and HP (Table 2).
Table 2. Cross-reactivity of the predicted peptides with the Human Genome (HG), Human Microbiome (HM) and Human Pathogens (HP).
a XR is the number of cross-reactive hits detected.
b Ratio is the number of cross-reactive hits per 1 × 10^6 amino acids in the comparison database.
c Janus Human Score represents the average depth of coverage in the search database for each EpiMatrix hit in the input sequence.
Figure 2. TCR-Epitope network for P1 peptide. Epitopes with TCR-facing residues identical to P1 were identified in protein sequences from the human genome database. Green diamonds represent P1 (source peptide); gray squares are predicted 9-mer epitopes derived from the source peptide (predicted using EpiMatrix); blue triangles are 9-mers that are 100% identical to the TCR face of the source epitope and that are predicted to bind to the identical HLA; and light purple circles are proteins containing the cross-reactive epitope.
First- and second-generation anti-parasite vaccines against leishmaniasis have not yet shown long-term protective potential. This could be attributed to a number of obstacles, including low immunogenicity, lack of dose standardization and variability in evaluating the immune responses in challenged individuals [53, 54]. Alternative approaches for anti-parasite vaccine development are urgently needed. Recently, T cell epitope prediction techniques using advanced immunoinformatics tools have provided real hope as a new strategy for subunit vaccine development. Common HLA class II alleles of at-risk populations can be a good starting point for T cell epitope prediction. A major step in identifying potential T cell epitopes involves identifying the peptides that bind to a target HLA molecule. Because of the high cost of experimental identification of such peptides, computational methods for predicting HLA binding peptides are highly useful, as they save both time and money.
Compared to other vaccine types, synthetic peptide vaccines do not carry whole gene sequences and can be standardized; multiple epitopes accounting for variants of a pathogen can be included, and vaccines can be developed for pathogens that are difficult to culture in the laboratory. In addition, an appropriate peptide-based vaccine would also decrease the chance of stimulating a response against self-antigens. Poor immunogenicity, necessitating the use of potent adjuvants, is a potential problem that arises with short peptide vaccines.
High-frequency HLA class II alleles of at-risk populations (DRB1*0804 and DRB1*1101) were used to predict potential immunogenic peptides of Leishmania donovani donovani GP63 protein.
To confirm the immunogenicity of the predicted peptides, a small study was conducted. The study concluded that the synthetic peptides (P1-P4) produced no clear Th1 response individually. The peptide pool marginally increased IFN-γ secretion in whole blood assays, suggesting that the peptides could form the basis of a subunit vaccine in combination with a potent adjuvant. There was no significant IL-4 production upon peptide stimulation. Two peptides (P1 and P3) were observed to induce strong IL-10 up-regulatory effects in vitro.
The antigen-presenting pathway constantly processes proteins from infectious organisms and exposes peptides derived from these proteins, bound to HLA molecules, to the immune system at the cell surface. Infectious organisms can camouflage themselves and avoid immune recognition by reducing HLA binding (deimmunization) and/or by modifying the TCR profile of their constituent peptides to resemble host sequences (tolerization), as the P1 peptide appears to do. P4 and the peptide pool have marked IL-10 reduction effects, making them possible immune therapies for infections with increased IL-10 levels and good candidates for potential inclusion in a L. donovani vaccine based on their high putative immunogenicity (Figure 1b) [48, 56-59].
These first studies show significant differences in immune responses to specific peptides in patients who had been exposed to L. donovani infections. The phenotype of the response correlates with JanusMatrix results, at least for IL-10 production in LST-reactive individuals.
Immunoinformatics analysis of the potential MHC-binding affinity of parasite surface protein sequences to common HLA class II DRB1 molecules provides a useful alternative method for anti-parasite vaccine development and predicts the phenotype of their immune responses.
The study protocol was scientifically and ethically reviewed and approved by the Scientific and Ethics Committees of the Institute of Endemic Diseases, University of Khartoum. Written informed consent was obtained from all volunteers for the pilot study.
Common HLA-DRB1 alleles among population at risk
The most common HLA-DRB1 alleles among patients with parasitologically confirmed VL or post-kala-azar dermal leishmaniasis (PKDL), and among apparently healthy volunteers with and without a reactive leishmanin skin test (LST) and without a history of VL/PKDL, were obtained from the Institute of Endemic Diseases database. Volunteers were excluded if they did not give written informed consent or if they were blood relatives of any of the enrolled patients or volunteers.
Leishmania donovani donovani GP63 protein sequence analysis
Standard analysis of the sequence of the Leishmania donovani donovani GP63 protein was performed as previously described. Leishmania donovani donovani GP63 protein (GenBank accession: ACT31401) was analyzed using the Phobius and LipoP servers. Phobius was used to identify signal peptides and transmembrane segments, while LipoP was used to identify lipoprotein attachment sites in the protein. The presence of glycosylphosphatidylinositol (GPI) anchors was also evaluated using the GPI-SOM algorithm available at ExPASy. Each 9-mer frame (9-amino acid sequence) was assessed using EpiMatrix to determine binding affinities to the most commonly found class II HLA alleles among VL at-risk populations (DRB1*0804 and DRB1*1101) and to eight common HLA alleles that cover >90% of human populations worldwide (DRB1*0101, DRB1*0301, DRB1*0401, DRB1*0701, DRB1*0801, DRB1*1101, DRB1*1301, and DRB1*1501). Each 9-mer frame was assigned a Z-score ranging from approximately -3 to +3. Z-scores ≥ 1.64, generally comprising the top 5% of any given peptide set, were defined as “Hits” and considered potentially immunogenic. Z-scores ≥ 2.32 comprise the top 1% and were considered extremely likely to bind the HLA molecule for which the score was assigned. A 9-mer frame predicted to bind to at least 4 different HLA alleles is termed an EpiBar. EpiBars may be the signature feature of highly immunogenic, promiscuous class II epitopes. For the analysis of the eight common HLA alleles, regions of high epitope density, or clusters, were identified and scored by the ClustiMer algorithm. ClustiMer, an ancillary algorithm, identifies “clustered” or promiscuous epitopes [40, 41]. ClustiMer was also used to perform a grand average of hydropathicity (GRAVY) analysis, which identified epitopes with elevated hydrophobicity; triaging these can minimize technical difficulties with peptide synthesis and low water solubility.
BlastiMer identifies homologies between the putative epitopes identified by EpiMatrix and any protein sequence on file at GenBank. EpiMatrix cluster score and EpiMatrix whole protein score were derived from the number of hits normalized for the length of the cluster or whole protein. The EpiMatrix cluster score is a representation of predicted promiscuity across multiple HLA alleles; clusters scoring above 10 are considered potentially immunogenic and therefore good candidates for T cell epitope-based vaccines. The predicted epitopes were screened for immunogenicity on a scale that compares the protein to vaccine antigens and other biologics already in clinical use (Figure 1).
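The Z-score thresholds and the EpiBar rule described in this section can be sketched as follows. The per-allele scores in the example are invented for illustration and are not EpiMatrix output; the function names are hypothetical. The tail-fraction helper additionally assumes that Z-scores follow a standard normal distribution, which is how the 1.64 and 2.32 cut-offs correspond to roughly the top 5% and top 1%.

```python
from statistics import NormalDist

# Thresholds taken from the text; everything else is an illustrative assumption.
HIT_THRESHOLD = 1.64       # "Hit": roughly the top 5% of scores
TOP1_THRESHOLD = 2.32      # roughly the top 1% of scores
EPIBAR_MIN_ALLELES = 4     # a frame hitting >= 4 alleles is an EpiBar


def classify_frame(z_by_allele):
    """Classify one 9-mer frame from its per-allele Z-scores."""
    hits = sorted(a for a, z in z_by_allele.items() if z >= HIT_THRESHOLD)
    top1 = sorted(a for a, z in z_by_allele.items() if z >= TOP1_THRESHOLD)
    return {
        "hits": hits,
        "top1": top1,
        "is_epibar": len(hits) >= EPIBAR_MIN_ALLELES,
    }


def upper_tail_fraction(z):
    """Fraction of a standard normal distribution lying above z."""
    return 1.0 - NormalDist().cdf(z)


# Hypothetical scores for a single frame (NOT real EpiMatrix results).
example_scores = {
    "DRB1*0101": 1.90, "DRB1*0301": 0.40, "DRB1*0401": 2.40,
    "DRB1*0701": 1.70, "DRB1*0801": 1.65, "DRB1*1101": 1.10,
    "DRB1*1301": -0.30, "DRB1*1501": 0.95,
}
print(classify_frame(example_scores))
print(f"{upper_tail_fraction(HIT_THRESHOLD):.3f}",
      f"{upper_tail_fraction(TOP1_THRESHOLD):.3f}")
```

In this made-up case the frame scores as a hit for four of the eight alleles, so it would count as an EpiBar, and one of those hits falls in the top 1% band.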
The pilot study: In vitro whole-blood assays on healthy and LST reactive volunteers
Four of the predicted epitopes were selected for synthesis by 21st Century Biochemicals (Marlborough, MA, USA) and tested in vitro in whole blood samples from 22 volunteers as a proof of principle. Twelve volunteers were LST non-reactive (induration = 0 mm), while the rest (n = 10) were LST reactive with a mean LST induration of 12.6 ± 6 mm. Cytokine ELISA was used to measure IFN-γ, IL-4 and IL-10 as per the manufacturer’s instructions (Komabiotech Inc., Seoul, South Korea).
JanusMatrix Analysis
Peptide sequences of P1-P4 were analyzed using JanusMatrix to identify potential cross-reactive TCR-facing epitopes with HG, HM, and HP databases. Ratios of cross-reactive hits per cluster (the number of cross-reactive hits divided by the total number of amino acids in the comparison database, multiplied by 1 × 10^6) were calculated to determine potential regulatory T cell epitopes within the clusters. In addition, JanusMatrix results were aggregated into a Janus Human Score for each cluster, which represents the average depth of coverage in the search database for each EpiMatrix hit in the input sequence. For example, an input peptide with eight EpiMatrix hits, all of which have one match in the search database, has a Janus Human Score of 1. An input peptide with four EpiMatrix hits, all of which have two matches in the search database, has a Janus Human Score of 2. Finally, epitope network diagrams were created from JanusMatrix output using Cytoscape (Fig. 2) to visualize the cross-reactivity between the peptides and the target genome databases.
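The two summary quantities defined above reduce to simple arithmetic, sketched below. The function names and the database size in the usage example are hypothetical; the score is interpreted here as the mean number of database matches per EpiMatrix hit, which is an assumption that agrees with both worked examples in the text.

```python
# Minimal sketch of the two JanusMatrix summary quantities described above.
# Names are illustrative; this is not the JanusMatrix implementation.


def cross_reactivity_ratio(xr_hits, database_size_aa):
    """Cross-reactive hits per 1 x 10^6 amino acids in the comparison database."""
    if database_size_aa <= 0:
        raise ValueError("database size must be positive")
    return xr_hits / database_size_aa * 1_000_000


def janus_score(matches_per_hit):
    """Average number of database matches per EpiMatrix hit in the input peptide.

    matches_per_hit holds one entry per EpiMatrix hit: the number of matching
    9-mers found for that hit in the search database.
    """
    if not matches_per_hit:
        raise ValueError("at least one EpiMatrix hit is required")
    return sum(matches_per_hit) / len(matches_per_hit)


# The two worked examples from the text.
print(janus_score([1] * 8))   # eight hits, one match each
print(janus_score([2] * 4))   # four hits, two matches each

# Hypothetical database size, chosen only to show the scaling.
print(cross_reactivity_ratio(33, 10_000_000))
```

Normalizing by database size is what makes ratios comparable across the HG, HM and HP databases, which differ in the number of amino acids they contain.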
Cite this article: Khalil E A G. In silico Prediction of Immunogenic T Cell Epitopes of Leishmania donovani donovani GP63 Protein: an Alternative Approach for Anti-parasite Vaccine Development. J J Vaccine Vaccination. 2015, 1(2): 008.