- AGGRESCAN
AGGRESCAN is a web tool (http://bioinf.uab.es/aggrescan/)
for the prediction of 'aggregation-prone' segments in protein sequences. It is based on an aggregation propensity scale for
natural amino acids derived from in vivo experiments and on the assumption that short and specific sequence stretches
modulate protein aggregation.
O. Conchillo-Sole, N.S. de Groot, F.X. Aviles, J. Vendrell, X. Daura, A. Ventura, (2007) BMC Bioinformatics, 8:65-81.
- AmyloidMutants
AmyloidMutants is a web-based tool (http://amyloid.csail.mit.edu/)
for predicting the structural and mutational landscapes of amyloid fibrils using an ensemble algorithm.
Each peptide sequence is considered to fold into a complete set of millions (or billions) of unique structural states,
with a single energetic value calculated for each state according to its entire conformation (McCaskill, 1990). From this
quantified set of all possible structures, clusters of low-energy states with similar conformations can be extracted as
predictions of likely real-world structures, with relative probabilities of occurrence. Because calculating the energy
of all mathematically possible interactions would produce a number of states that grows exponentially with sequence
length, the method uses «schemas», an algorithmic construct that keeps the calculation tractable by partitioning fibrillar
from non-fibrillar conformations, enforcing steric consistency and restricting energetic calculations to amyloid fibril
sequence/structure states. AmyloidMutants requires a set of parameters for each submission. The default values and the
cross-beta pleat (serpentine) structural scheme are used here.
C.W. O'Donnell, J. Waldispuhl, M. Lis, R. Halfmann, S. Devadas, S. Lindquist, and B. Berger, (2011) Bioinformatics, 27: i34-i42.
- Amyloidogenic Pattern
A sequence pattern has been identified as highly related to the formation of amyloid fibrils.
Submissions are scanned for exact (identity-level) matches to this pattern,
{P}-{PKRHW}-[VLSCWFNQE]-[ILTYWFNE]-[FIY]-{PKRH}, using a simple custom script.
M. Lopez de la Paz, L. Serrano, (2004) Proc. Natl. Acad. Sci. U.S.A. vol.101 no.1 87-92.
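The pattern scan described above can be sketched as follows. This is a minimal illustration, not the custom script used by the server: it assumes the pattern follows the usual PROSITE-style convention (square brackets list allowed residues, curly braces list excluded ones), and the names `AMYLOID_PATTERN` and `find_pattern_hits` are hypothetical.

```python
import re

# Assumed translation of the PROSITE-style pattern into a regular expression:
# {..} (any residue except those listed) becomes [^..]; [..] stays unchanged.
# The lookahead wrapper allows overlapping hexapeptides to be reported.
AMYLOID_PATTERN = re.compile(
    r"(?=([^P][^PKRHW][VLSCWFNQE][ILTYWFNE][FIY][^PKRH]))"
)


def find_pattern_hits(sequence):
    """Return (start, hexapeptide) pairs; start positions are 1-based."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in AMYLOID_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # STVIIE satisfies every position of the pattern.
    print(find_pattern_hits("AASTVIIEAA"))
```

The lookahead is a design choice: a plain `finditer` would skip matches that overlap an earlier hit, whereas a per-residue scan should see every hexapeptide.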
- Average Packing Density
This method relates the Average Packing Density of stretches of residues to the formation of amyloid
fibrils. A script implementing the method has been written in our lab (Hamodrakas et al., Int. Journ. of Biol. Macromolecules 41(2007) 295-300).
Values above 21.4 obtained from a five-residue-long sliding window are considered as hits.
O.V. Galzitskaya, S.O. Garbuzynskiy, M.Y. Lobanov, (2006) PLoS Comput Biol 2(12) 1639-1648.
- Beta-strand contiguity
This is a simple algorithm that locates β-strands in the amyloid fibril core using the amino acid
sequence alone. The algorithm calculates an average β-strand propensity score for peptide windows of different sizes
(from 4 to 20 residues) within a polypeptide's sequence. The algorithm samples the full length of the protein’s sequence
by sliding each window along, one residue at a time. A mean β-strand propensity (MβP) score is calculated for each
peptide window, according to the equation MβP = ΣPβ / [0.5 × (ΣPα + ΣPt)], where ΣPβ, ΣPα and ΣPt
are the sums of the Chou and Fasman β-strand, α-helix and reverse-turn preference parameters, respectively
(Chou and Fasman, 1978), over every residue in the peptide window. These MβP scores are then compiled into an x-y plot,
with the amino acid sequence along the x-axis. For each residue along the sequence, the y value is the sum of the
MβP scores that reach a minimum threshold (MβP >= 1.2) from every window that contains that particular residue.
Total y values above 20 are considered as hits. A script implementing the algorithm has been written in our lab.
S. Zibaee, O. S. Makin, M. Goedert, L. C. Serpell, Protein Sci. 2007 May; 16(5): 906–918.
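The window scoring and per-residue summation described above can be sketched as follows. This is an illustrative reading of the text, not the lab's script: the function names are hypothetical, and the Chou and Fasman preference tables are passed in as dictionaries rather than embedded, so no parameter values are assumed here.

```python
def contiguity_profile(seq, p_beta, p_alpha, p_turn,
                       min_window=4, max_window=20, mbp_threshold=1.2):
    """Per-residue sum of the window MβP scores that reach the threshold.

    p_beta, p_alpha and p_turn map one-letter residue codes to preference
    parameters supplied by the caller.
    """
    profile = [0.0] * len(seq)
    for size in range(min_window, max_window + 1):
        # Slide the window along the sequence one residue at a time.
        for start in range(len(seq) - size + 1):
            window = seq[start:start + size]
            sum_beta = sum(p_beta[r] for r in window)
            sum_alpha = sum(p_alpha[r] for r in window)
            sum_turn = sum(p_turn[r] for r in window)
            mbp = sum_beta / (0.5 * (sum_alpha + sum_turn))
            if mbp >= mbp_threshold:
                # Every residue inside the window receives this window's score.
                for i in range(start, start + size):
                    profile[i] += mbp
    return profile


def contiguity_hits(profile, total_threshold=20.0):
    """Return 1-based positions whose summed score exceeds the threshold."""
    return [i + 1 for i, y in enumerate(profile) if y > total_threshold]
```

A caller would build the three dictionaries from the published Chou and Fasman tables and pass the result of `contiguity_profile` to `contiguity_hits`.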
- Hexapeptide Conformational Energy
This program threads all hexapeptides of a submitted protein onto the microcrystal structure
of NNQQNY. Alternatively, the program can use a set of over 2500 templates produced by small shifts in the structure
of NNQQNY. In our consensus method, the version using only the original structure is used, for the sake of speed. Energy
values below -27.00 are considered as hits.
Z. Zhang, H. Chen, L. Lai, (2007) Bioinformatics vol.23 no.17 2218-2225.
NOTE: The original program runs interactively. For use in an automated method, it was necessary to modify it so
that options can be passed as arguments. A small modification was also made to reduce the memory demands of the
program. Because only one structure is used as template, the large multidimensional array (used to
store energy results when all 2500+ templates are used) is unnecessary. This large array prevented the program from
executing on many computers.
- NetCSSP
This is a computational method (http://cssp2.sookmyung.ac.kr/)
that quantifies the influence of tertiary interaction on secondary structural preference. Artificial neural network
(ANN)-based algorithms that use preparameterized tertiary interactions with sequence inputs from users are designed
to predict contact-dependent secondary structure propensities (CSSPs). NetCSSP calculates these propensities for the
center residue in a seven-residue sliding window. There is a choice between a single and a dual network. Here, we use the dual
network as it has greater accuracy. The dual network architecture consists of two single networks that use distinct
output nodes. One network has output nodes for α-helix and non-helix, thus predicting helical propensity P(helix).
The other network has output nodes for β-strand and non-beta strand, thus predicting beta propensity P(beta).
NetCSSP can be used to predict the core sequences of amyloid fibril formation where the switch in local secondary
structure occurs. The amyloidogenic hidden beta propensity (HβP) is calculated as HβP = P(beta)/P(helix).
Residues with values of HβP above 1 and of P(beta) above 6 are considered as hits.
C. Kim, J. Choi, S. J. Lee, W. J. Welsh, and S. Yoon, (2009) Nucl. Acids Res. 37 (Web Server Issue): W469-473
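The hit criterion above can be written as a short filter. This is a sketch under stated assumptions: the per-residue P(beta) and P(helix) values are taken to be already available as two equal-length lists (how they are obtained from NetCSSP is not shown), the function name `hidden_beta_hits` is hypothetical, and skipping residues with a non-positive P(helix) is an assumption made only to avoid division by zero.

```python
def hidden_beta_hits(p_beta, p_helix, ratio_cutoff=1.0, beta_cutoff=6.0):
    """Return 1-based positions with HβP > ratio_cutoff and P(beta) > beta_cutoff."""
    hits = []
    for pos, (pb, ph) in enumerate(zip(p_beta, p_helix), start=1):
        if ph <= 0:
            # Assumption: residues without a usable helix score are skipped.
            continue
        hbp = pb / ph  # HβP = P(beta) / P(helix)
        if hbp > ratio_cutoff and pb > beta_cutoff:
            hits.append(pos)
    return hits
```

Both conditions must hold: a high ratio alone is not enough when P(beta) itself is low.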
- Pafig
Pafig (Prediction of amyloid fibril-forming segments, http://www.mobioinfor.cn/pafig/)
is a method based on Support Vector Machines (SVM) for the identification of hexapeptides associated with
amyloid fibrillar aggregates. The predictive model of Pafig is a phenomenological model based
on 41 physicochemical properties, chosen in a two-round selection from 531 physicochemical properties in
the Amino acid index database (AAindex). Pafig was trained on hexapeptides obtained by scanning
segments that could form fibrils with a six-residue sliding window. Predictions with Reliability Index
(RI) greater than or equal to 7 are considered as hits.
J. Tian, N. Wu, J. Guo and Y. Fan, (2009) BMC Bioinformatics, Jan 30;10 Suppl 1:S45.
- SecStr (Possible Conformational Switches)
Using our own lab's consensus secondary structure prediction program SecStr
(S.J. Hamodrakas, CABIOS 4 (4) (1988) 473), potential 'conformational switches' are identified as areas strongly
predicted, simultaneously, both as alpha-helices and beta-sheet strands. The criterion is that, simultaneously, at least
3 secondary structure prediction methods predict both alpha-helix and beta-sheet.
S.J. Hamodrakas, C. Liappa, V.A. Iconomidou, Int. Journ. of Biol. Macromolecules 41(2007) 295-300.
- TANGO
TANGO (http://tango.crg.es/) is a program that
calculates the tendency of peptides for beta aggregation, which is distinct from, but highly correlated with, the
tendency to form amyloid fibrils. Tango 2.1 is used and scores above 5.00% for beta aggregation are considered as hits. TANGO
requires a set of environmental parameters for each submission. The default values from the TANGO online submission
form are used.
A-M. Fernandez-Escamilla, F. Rousseau, J. Schymkowitz, L. Serrano, (2004) Nature Biotechnology vol.22 no.10 1302-1306.
- Waltz
This method (http://waltz.switchlab.org/)
uses a position-specific scoring matrix to determine amyloid-forming sequences. A sequence score S_profile is
calculated from the log-odds-based position-specific scoring matrix (PSSM). Nineteen selected physical properties that best
describe amyloid propensity enter the scoring function as a physical property term S_physprop, consisting of the sum of
the products of the amino acid frequency with the normalized property value of the respective amino acid for each
position. The final component of the scoring function, S_struct, is the position-specific pseudoenergy matrix from
structural modeling using amyloid backbone structures. Relative weightings a of the individual terms were introduced for
a balanced scoring function (S_total = a_profile S_profile + a_physprop S_physprop + a_struct S_struct).
There are two options to be set, threshold and pH. Here, a threshold value of 79.0
(High Sensitivity) and pH = 7.0 are used.
S. Maurer-Stroh, M. Debulpaep, N. Kuemmerer, M. Lopez de la Paz, I.C. Martins, J. Reumers, K.L. Morris, A. Copland, L. Serpell, L. Serrano, J.W.H. Schymkowitz & F. Rousseau, (2010) Nature Methods 7, 237-242.
Also note that several of these methods have a limit on the sequence length they can process. To overcome this problem, long submissions are broken into shorter overlapping segments that are processed/submitted individually. This may delay the delivery of results.
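Splitting a long submission into overlapping segments can be sketched as follows. The function name `split_overlapping` and the default values for `max_len` and `overlap` are hypothetical; the actual limits differ per method and are not stated here.

```python
def split_overlapping(sequence, max_len=400, overlap=50):
    """Return (offset, segment) pairs; offset is the 0-based start of the
    segment within the full sequence. max_len and overlap are illustrative."""
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must be non-negative and smaller than max_len")
    step = max_len - overlap
    segments = []
    start = 0
    while True:
        segments.append((start, sequence[start:start + max_len]))
        if start + max_len >= len(sequence):
            break  # the current segment already reaches the end
        start += step
    return segments
```

Keeping the offset with each segment lets per-segment hit positions be mapped back onto the full sequence. An overlap at least as long as the largest window a method uses would ensure that every window appears whole in some segment, although that is a design suggestion rather than a documented property of the server.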