ProFAT - User Help
Sequence Entry Dialog
Enter the sequence to be annotated here, in FASTA format. ProFAT
will remove all non-sequence characters. Sequences longer than
800 amino acids take a long time to preprocess, so it is
recommended to select only one domain for further processing.
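As a rough illustration of the clean-up described above, the sketch below strips the FASTA header and any non-sequence characters and flags sequences over the 800-residue mark. It is not ProFAT's own code: the function names and the choice to accept only the 20 standard amino acid letters are assumptions made for this example.

```python
# Hypothetical sketch; ProFAT's actual cleaning rules may differ.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")  # assumption: 20 standard residues only
LONG_SEQUENCE = 800  # length above which preprocessing is described as slow


def clean_fasta(text):
    """Return (header, sequence) with non-residue characters dropped."""
    header = ""
    residues = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            # Keep the first header line, ignore any later ones.
            if not header:
                header = line[1:].strip()
            continue
        # Drop digits, whitespace, gaps and any other non-residue symbols.
        residues.extend(ch for ch in line.upper() if ch in AMINO_ACIDS)
    return header, "".join(residues)


def is_long(sequence):
    """True if the sequence exceeds the 800-residue guideline."""
    return len(sequence) > LONG_SEQUENCE
```

A sequence for which is_long returns True is a candidate for trimming to a single domain before submission.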
PSI-BLAST Iterations
Specifies the number of iterations for PSI-BLAST sequence-based homology
searches. More iterations can detect remote homologs; however,
they can also introduce profile drift, which leads to false
positives. Increasing this number will also increase the
processing time required.
Threading Depth
This option determines the number of Threading hits to be
annotated and considered in the processing run. The default
is the top 10, although more can be used to detect weaker
homologs. Using more than 20 will increase the processing
time and the size of the output files, and will reduce the
specificity of the results.
RPS-BLAST Threshold
Sets the E-value threshold for the RPS-BLAST domain search in the preprocessor (see Domain Predictor).
HMMer Threshold
Sets the E-value threshold for the HMMer-based domain search in the preprocessor (see HMMerThread).
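To show how an E-value threshold of this kind narrows a result set, here is a minimal sketch. Lower E-values indicate more significant hits, so a smaller threshold keeps fewer hits. The hit names and values are made up, and whether ProFAT treats the threshold as inclusive is an assumption.

```python
# Illustrative only: hit names and E-values below are invented.
def filter_by_evalue(hits, threshold):
    """Keep (name, evalue) pairs whose E-value is at or below the threshold."""
    return [(name, evalue) for name, evalue in hits if evalue <= threshold]


hits = [("domain_A", 1e-30), ("domain_B", 0.002), ("domain_C", 5.0)]

strict = filter_by_evalue(hits, 1e-5)   # keeps only the strongest hit
lenient = filter_by_evalue(hits, 0.01)  # also admits the weaker domain_B hit
```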
E-mail notification
An e-mail address is required for e-mail notification of completion of the processing run.
Keyword Lists
Here one can select various predefined keyword lists for use
directly with ProFAT. The real power of APArT is in
the use of user-defined keyword lists. If you wish to add a keyword
list of your own to this list, please contact bradshaw@mpi-cbg.de.
User Defined Keyword List
In this text box you can enter a keyword list for the text mining component of ProFAT. Keywords entered here can undergo a processing step in which basic
endings and prefixes are removed (stemming); see Preprocessing below. This saves you from
having to think of every permutation of a word and enter each one
separately. The keyword list is not case-sensitive unless
you select the option for it to be; see Case Sensitivity below.
Case Sensitivity
Determines whether the text mining component is case-sensitive and
therefore differentiates between "Binding" and "binding". If this
option is used, the mining component will only detect exact matches.
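The effect of this option can be sketched as follows. This is a simplified stand-in for the text mining component, not its actual implementation: the function name, the tokenising rule, and the whole-word matching are all assumptions made for the example.

```python
import re


def match_keywords(text, keywords, case_sensitive=False):
    """Return the keywords that occur as whole words in text."""
    # Assumption: words are runs of letters, digits and hyphens.
    tokens = re.findall(r"[A-Za-z0-9-]+", text)
    if case_sensitive:
        token_set = set(tokens)
        return [k for k in keywords if k in token_set]
    # Case-insensitive mode: compare lowercased forms on both sides.
    token_set = {t.lower() for t in tokens}
    return [k for k in keywords if k.lower() in token_set]
```

With the option off, the keyword "Binding" matches the text "ATP binding site"; with it on, it does not.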
Preprocessing
Determines whether the preprocessing step (stemming) is used.
Preprocessing removes basic suffixes from the words in the
keyword list. It is based on the Porter Stemmer (Porter M, 1980). Using the
preprocessor is recommended to make the most of the text mining component of ProFAT.
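The idea behind stemming can be illustrated with a deliberately crude suffix stripper. This is not the Porter algorithm, which applies a much larger set of ordered rules, and it is not ProFAT's code. The suffix list and the minimum stem length are arbitrary choices for the example.

```python
# Crude illustration of suffix stripping; NOT the Porter algorithm.
SUFFIXES = ("ing", "ed", "es", "s")  # arbitrary, illustrative suffix list
MIN_STEM = 3  # assumption: never shorten a word below three letters


def crude_stem(word):
    """Lowercase the word and strip the first matching suffix, if any."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM:
            return word[: -len(suffix)]
    return word


def stem_keywords(keywords):
    """Reduce a keyword list to its set of distinct stems."""
    return {crude_stem(k) for k in keywords}
```

Here "binding", "binds" and "bind" all reduce to the same stem, so a single keyword entry covers all three forms.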
Domain Prediction Hits
Specifies the number of BLASTP hits to be considered for domain prediction.
Increasing this value will increase the number of domains detected but will also
decrease the specificity of the results, as the hits become less similar to
the detected domain.
HMMerThread Extension
Specifies the number of amino acids upstream and downstream of the domain sequence
to extract for Threading. This value can be varied to get optimal results. The
default value of 15 amino acids works well for most domains that are not at the N or C terminus of the protein.
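A minimal sketch of the extension step is shown below. It assumes 0-based domain coordinates with an exclusive end, and it assumes the window is simply cut off at the ends of the protein. Neither detail is confirmed for ProFAT. The clamping shows why domains at the N or C terminus behave differently: fewer than 15 flanking residues are available on that side.

```python
def extract_with_extension(sequence, start, end, extension=15):
    """Return (subsequence, left, right) for a domain plus flanking residues.

    start and end are 0-based with end exclusive (an assumption for this sketch).
    """
    # Clamp the window so it never runs past either terminus.
    left = max(0, start - extension)
    right = min(len(sequence), end + extension)
    return sequence[left:right], left, right


protein = "A" * 100  # placeholder sequence, not real data
internal = extract_with_extension(protein, 30, 60)    # full 15 residues each side
n_terminal = extract_with_extension(protein, 5, 40)   # only 5 residues upstream
```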