AmylPred 2

A Consensus Method for Amyloid Propensity Prediction

Click here to proceed to the submission form.

General description of the Method

AMYLPRED2 is an improved version of a web tool (http://biophysics.biol.uoa.gr/AMYLPRED/) developed in our lab in 2009. It employs a consensus of different methods that have been found, or were specifically developed, to predict features related to the formation of amyloid fibrils. The consensus of these methods is defined as the hit overlap of at least n/2 (rounded down) out of n selected methods (e.g. 5 out of 11 methods, if the user chooses to use all available methods). This is the primary output of the program. The individual predictions of these methods are also made available by pressing the button "Show/hide methods", and a consensus histogram is shown by pressing the button "Show/hide consensus". All results are also made available in the form of a text file, maintained on the server for 1 (one) day. AMYLPRED2 is a useful tool for identifying amyloid-forming regions in proteins that are associated with several conformational diseases, called amyloidoses, such as Alzheimer's, Parkinson's, prion diseases and type II diabetes. It may also be useful for understanding the properties of protein folding and misfolding and for helping to control protein aggregation/solubility in biotechnology (recombinant proteins forming bacterial inclusion bodies) and biotherapeutics (monoclonal antibodies and biopharmaceutical proteins).
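The consensus rule can be illustrated with a minimal Python sketch. This is not the server's code: the function name `consensus_hits`, the input format (method name mapped to a set of 1-based residue positions) and the floor of one method are assumptions made for illustration only.

```python
from collections import Counter


def consensus_hits(method_hits):
    """Return residue positions hit by at least n // 2 of the n selected methods.

    method_hits maps a method name to the set of 1-based residue positions
    flagged by that method (hypothetical input format).
    """
    n = len(method_hits)
    # n/2 rounded down, as in the description. The floor of 1 is an
    # assumption, so that a single selected method does not need 0 hits.
    threshold = max(1, n // 2)
    # Count, for each residue, how many methods flagged it.
    counts = Counter(pos for hits in method_hits.values() for pos in set(hits))
    return sorted(pos for pos, count in counts.items() if count >= threshold)
```

With all 11 methods selected the threshold is 11 // 2 = 5, so a residue flagged by five methods is reported and one flagged by four is not. The per-residue counts are the kind of values a consensus histogram would display.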

If you are using this tool, please cite the following reference: Tsolis, A.C., Papandreou, N.C., Iconomidou, V.A., Hamodrakas, S.J. (2013) A Consensus Method for the Prediction of "Aggregation-Prone" Peptides in Globular Proteins. PLoS ONE, 8(1): e54175.

This service is freely available to academic users only!!! Non-academic users should contact:
Assist. Prof. V.A. Iconomidou (veconom@biol.uoa.gr)
Em. Prof. S.J. Hamodrakas (shamodr@biol.uoa.gr)

Methods used and Modifications

AGGRESCAN

AGGRESCAN is a web tool (http://bioinf.uab.es/aggrescan/) for the prediction of 'aggregation-prone' segments in protein sequences. It is based on an aggregation propensity scale for natural amino acids derived from in vivo experiments and on the assumption that short and specific sequence stretches modulate protein aggregation.

O. Conchillo-Sole, N.S. de Groot, F.X. Aviles, J. Vendrell, X. Daura, A. Ventura, (2007) BMC Bioinformatics, 8:65-81.

AmyloidMutants

AmyloidMutants is a web-based tool (http://amyloid.csail.mit.edu/) for predicting the structural and mutational landscapes of amyloid fibrils using an ensemble algorithm. Each peptide sequence is considered to fold into a complete set of millions (or billions) of unique structural states, with a single energetic value calculated for each state according to its entire conformation (McCaskill, 1990). From this quantified set of all possible structures, clusters of low-energy states with similar conformations can be extracted as predictions of likely real-world structures, with relative probabilities of occurrence. Because calculating the energy of all mathematically possible interactions would produce a number of states that grows exponentially with sequence length, the method uses "schemas", an algorithmic construct that partitions fibrillar from non-fibrillar conformations, enforces steric consistency and restricts energetic calculations to amyloid fibril sequence/structure states. AmyloidMutants requires a set of parameters for each submission. The default values and the cross-beta pleat (serpentine) structural schema are used here.

C.W. O'Donnell, J. Waldispuhl, M. Lis, R. Halfmann, S. Devadas, S. Lindquist, and B. Berger, (2011) Bioinformatics, 27: i34-i42.

Amyloidogenic Pattern

A sequence pattern has been identified as highly related to the formation of amyloid fibrils. Submissions are scanned for the existence of this pattern {P}-{PKRHW}-[VLSCWFNQE]-[ILTYWFNE]-[FIY]-{PKRH} at identity level, with the use of a simple custom script.
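As an illustration, the pattern can be translated into a regular expression, reading {..} as "any residue except these" and [..] as "one of these". The sketch below is not the custom script mentioned above; the function name `find_pattern_hits` and the 1-based output format are assumptions, and the input is assumed to contain only the 20 standard one-letter residue codes.

```python
import re

# {P}-{PKRHW}-[VLSCWFNQE]-[ILTYWFNE]-[FIY]-{PKRH} as a regular expression.
# The lookahead lets the scan report overlapping matches as well.
PATTERN = re.compile(r"(?=([^P][^PKRHW][VLSCWFNQE][ILTYWFNE][FIY][^PKRH]))")


def find_pattern_hits(sequence):
    """Return (start, end, segment) tuples with 1-based, inclusive positions."""
    seq = sequence.upper()
    return [(m.start() + 1, m.start() + 6, m.group(1)) for m in PATTERN.finditer(seq)]
```

For example, `find_pattern_hits("GGSTVIIEGG")` reports the hexapeptide STVIIE at positions 3 to 8, since each of its six residues satisfies the corresponding position of the pattern.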

M. Lopez de la Paz, L. Serrano, (2004) Proc. Natl. Acad. Sci. U.S.A. vol.101 no.1 87-92.

Average Packing Density

This method relates the Average Packing Density of stretches of residues to the formation of amyloid fibrils. A script implementing the method has been written in our lab (Hamodrakas et al., Int. Journ. of Biol. Macromolecules 41 (2007) 295-300). Values above 21.4 obtained from a five-residue sliding window are considered as hits.
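The sliding-window rule can be sketched as follows. This is only an illustration, not the lab's script: the per-residue packing density scale must be supplied by the caller, and the sketch assumes that the window value is the arithmetic mean and that every residue of a qualifying window counts as a hit. Neither assumption is stated in the text above.

```python
def packing_density_hits(sequence, scale, window=5, threshold=21.4):
    """Return 1-based positions covered by any window whose mean exceeds threshold.

    scale maps a one-letter residue code to its packing density value
    (to be taken from the published scale; not reproduced here).
    """
    values = [scale[res] for res in sequence.upper()]
    hits = set()
    for start in range(len(values) - window + 1):
        mean = sum(values[start:start + window]) / window
        if mean > threshold:
            # Assumption: all residues of the window are marked as hits.
            hits.update(range(start + 1, start + window + 1))
    return sorted(hits)


# Hypothetical two-letter scale, for demonstration only.
TOY_SCALE = {"G": 17.0, "I": 25.0}
```

With `TOY_SCALE`, a window needs at least three I residues to exceed 21.4, so `packing_density_hits("GGGGGIIIIIGGGGG", TOY_SCALE)` flags positions 4 to 12.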

O.V. Galzitskaya, S.O. Garbuzynskiy, M.Y. Lobanov, (2006) PLoS Comput Biol 2(12) 1639-1648.

Beta-strand contiguity

This is a simple algorithm that locates β-strands in the amyloid fibril core using the amino acid sequence alone. The algorithm calculates an average β-strand propensity score for peptide windows of different sizes (from 4 to 20 residues) within a polypeptide's sequence, sampling the full length of the sequence by sliding each window along, one residue at a time. A mean β-strand propensity (MβP) score is calculated for each peptide window according to the equation MβP = ΣP_β / [0.5 × (ΣP_α + ΣP_t)], where ΣP_β, ΣP_α and ΣP_t are the sums of the Chou and Fasman β-strand, α-helix and reverse-turn preference parameters, respectively (Chou and Fasman, 1978), over every residue in the peptide window. These MβP scores are then compiled into an x-y plot, with the amino acid sequence along the x-axis. For each residue, the y value is obtained by adding together the MβP scores that reach a minimum threshold (MβP >= 1.2) from every window that contains that residue. Total y values above 20 are considered as hits. A script implementing the algorithm has been written in our lab.
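The procedure can be sketched in Python as below. This is an illustrative reading of the description, not the lab's script. The function names are hypothetical, the Chou and Fasman parameters must be supplied by the caller as (P_α, P_β, P_t) tuples, and `TOY_SCALE` holds rounded placeholder values for two residues rather than the published table.

```python
def beta_contiguity_profile(sequence, scale, min_win=4, max_win=20, mbp_min=1.2):
    """Sum, for each residue, the MβP scores >= mbp_min of all windows containing it.

    scale maps a one-letter residue code to (P_alpha, P_beta, P_turn);
    the parameters are assumed to be positive.
    """
    seq = sequence.upper()
    totals = [0.0] * len(seq)
    for win in range(min_win, min(max_win, len(seq)) + 1):
        for start in range(len(seq) - win + 1):
            params = [scale[res] for res in seq[start:start + win]]
            sum_alpha = sum(p[0] for p in params)
            sum_beta = sum(p[1] for p in params)
            sum_turn = sum(p[2] for p in params)
            # MβP = ΣP_β / [0.5 × (ΣP_α + ΣP_t)]
            mbp = sum_beta / (0.5 * (sum_alpha + sum_turn))
            if mbp >= mbp_min:
                for i in range(start, start + win):
                    totals[i] += mbp
    return totals


def beta_contiguity_hits(totals, cutoff=20.0):
    """Return 1-based positions whose summed score is above the cutoff."""
    return [i + 1 for i, y in enumerate(totals) if y > cutoff]


# Placeholder values for demonstration only, not the published parameters.
TOY_SCALE = {"V": (1.0, 1.7, 0.5), "G": (0.6, 0.75, 1.5)}
```

Because windows of many sizes overlap each residue, a long run of strongly β-favouring residues accumulates a large total, while a single short window scoring just over 1.2 contributes far less than the cutoff of 20.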

S. Zibaee, O. S. Makin, M. Goedert, L. C. Serpell, Protein Sci. 2007 May; 16(5): 906–918.

Hexapeptide Conformational Energy

This program threads all hexapeptides of a submitted protein onto the microcrystalline structure of NNQQNY. Alternatively, the program can use a set of over 2500 templates produced by small shifts in the structure of NNQQNY. In our consensus method, the version using only the original structure is used, for the sake of speed. Energy values below -27.00 are considered as hits.

Z. Zhang, H. Chen, L. Lai, (2007) Bioinformatics vol.23 no.17 2218-2225.

NOTE: The original program runs interactively. For use in an automated method, it was modified so that options can be passed as arguments. Its memory demands were also reduced: because only one structure is used as template, the large multidimensional array that stores energy results when all 2500+ templates are used is unnecessary. This large array prevented the program from running on many computers.

NetCSSP

This is a computational method (http://cssp2.sookmyung.ac.kr/) that quantifies the influence of tertiary interaction on secondary structural preference. Artificial neural network (ANN)-based algorithms that use preparameterized tertiary interactions with sequence inputs from users are designed to predict contact-dependent secondary structure propensities (CSSPs). NetCSSP calculates these propensities for the center residue in a seven-residue sliding window. There is a choice between a single and a dual network. Here, we use the dual network, as it has greater accuracy. The dual network architecture consists of two single networks that use distinct output nodes. One network has output nodes for α-helix and non-helix, thus predicting helical propensity P(helix). The other network has output nodes for β-strand and non-β-strand, thus predicting beta propensity P(beta). NetCSSP can be used to predict the core sequences of amyloid fibril formation where the switch in local secondary structure occurs. The amyloidogenic hidden beta propensity (HβP) is calculated using the formula HβP = P(beta)/P(helix). Residues with values of HβP above 1 and of P(beta) above 6 are considered as hits.
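The hit criterion can be written as a short Python sketch. It assumes that the per-residue values of P(beta) and P(helix) have already been obtained from NetCSSP. The function name `hidden_beta_hits` and the handling of non-positive P(helix) values are assumptions, not part of NetCSSP.

```python
def hidden_beta_hits(p_beta, p_helix, hbp_min=1.0, p_beta_min=6.0):
    """Return 1-based positions where HβP > hbp_min and P(beta) > p_beta_min.

    p_beta and p_helix are per-residue lists of equal length.
    """
    hits = []
    for pos, (pb, ph) in enumerate(zip(p_beta, p_helix), start=1):
        if ph <= 0:
            # Assumption: the ratio is undefined here, so the residue is skipped.
            continue
        hbp = pb / ph  # HβP = P(beta) / P(helix)
        if hbp > hbp_min and pb > p_beta_min:
            hits.append(pos)
    return hits
```

Both conditions must hold: a residue with a high ratio but P(beta) of 5 is not a hit, and neither is a residue with P(beta) of 8 whose P(helix) is higher still.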

C. Kim, J. Choi, S. J. Lee, W. J. Welsh, and S. Yoon, (2009) Nucl. Acids Res. 37 (Web Server Issue): W469-473

Pafig

Pafig (Prediction of amyloid fibril-forming segments, http://www.mobioinfor.cn/pafig/) is a method based on Support Vector Machines (SVM) for the identification of hexapeptides associated with amyloid fibrillar aggregates. The predictive model of Pafig is a phenomenological model based on 41 physicochemical properties, selected in two rounds from the 531 physicochemical properties in the Amino acid index database (AAindex). Pafig was trained on hexapeptides obtained by scanning fibril-forming segments with a six-residue sliding window. Predictions with a Reliability Index (RI) greater than or equal to 7 are considered as hits.

J. Tian, N. Wu, J. Guo and Y. Fan, (2009) BMC Bioinformatics, Jan 30;10 Suppl 1:S45.

SecStr (Possible Conformational Switches)

Using our own lab's consensus secondary structure prediction program SecStr (S.J. Hamodrakas, CABIOS 4 (4) (1988) 473), potential 'conformational switches' are identified as areas strongly predicted, simultaneously, both as alpha-helices and beta-sheet strands. The criterion is that, simultaneously, at least 3 secondary structure prediction methods predict both alpha-helix and beta-sheet.

S.J. Hamodrakas, C. Liappa, V.A. Iconomidou, Int. Journ. of Biol. Macromolecules 41(2007) 295-300.

TANGO

TANGO (http://tango.crg.es/) is a program that calculates the tendency of peptides for beta aggregation, which differs from the tendency to form amyloid fibrils but is highly correlated with it. Tango 2.1 is used and scores above 5.00% for beta aggregation are considered as hits. TANGO requires a set of environmental parameters for each submission. The default values from the TANGO online submission form are used.

A-M. Fernandez-Escamilla, F. Rousseau, J. Schymkowitz, L. Serrano, (2004) Nature Biotechnology vol.22 no.10 1302-1306.

Waltz

This method (http://waltz.switchlab.org/) uses a position-specific scoring matrix to determine amyloid-forming sequences. A sequence score S_profile is calculated from the log-odds-based position-specific scoring matrix (PSSM). Nineteen selected physical properties that best describe amyloid propensity enter the scoring function as a physical property term S_physprop, consisting of the sum of the products of the amino acid frequency with the normalized property value of the respective amino acid for each position. The final component of the scoring function, S_struct, is the position-specific pseudoenergy matrix from structural modeling using amyloid backbone structures. Relative weights a for the individual terms were introduced to balance the scoring function (S_total = a_profile S_profile + a_physprop S_physprop + a_struct S_struct). There are two options to be set, threshold and pH. Here, a threshold value of 79.0 (High Sensitivity) and pH = 7.0 are used.

S. Maurer-Stroh, M. Debulpaep, N. Kuemmerer, M. Lopez de la Paz, I.C. Martins, J. Reumers, K.L. Morris, A. Copland, L. Serpell, L. Serrano, J.W.H. Schymkowitz & F. Rousseau, (2010) Nature Methods 7, 237-242.

Note also that several of these methods limit the sequence length they can process. To overcome this problem, long submissions are broken into shorter overlapping segments that are processed/submitted individually. This may cause an extra delay in obtaining results.
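One way such a segmentation could work is sketched below. The segment length and overlap actually used by the server are not stated here, so `max_len=200` and `overlap=20` are purely hypothetical defaults, as are the function names.

```python
def split_overlapping(sequence, max_len=200, overlap=20):
    """Return (offset, segment) pairs; offset is the 0-based start of the segment."""
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must be non-negative and smaller than max_len")
    step = max_len - overlap
    segments = []
    start = 0
    while True:
        segments.append((start, sequence[start:start + max_len]))
        if start + max_len >= len(sequence):
            break  # this segment reaches the end of the sequence
        start += step
    return segments


def merge_hits(per_segment_hits):
    """Map segment-local 1-based hit positions back to full-sequence positions.

    per_segment_hits is a list of (offset, hits) pairs, one per segment.
    """
    merged = set()
    for offset, hits in per_segment_hits:
        merged.update(offset + pos for pos in hits)
    return sorted(merged)
```

The overlap ensures that a window-based method still sees every stretch of residues in full within at least one segment, provided the overlap is at least as long as the method's largest window minus one. Hits reported twice in an overlap region collapse to a single position after merging.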
