ProFAT - User Help

Sequence Entry Dialog

The sequence to be annotated should be entered here, in FASTA format.  ProFAT will remove all non-sequence characters.  Sequences longer than 800 amino acids take a long time to preprocess; to avoid extended processing times, it is recommended to select only one domain for further processing.
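
As an illustration of what removing non-sequence characters might involve, the following is a minimal sketch and not ProFAT's actual code.  The function name clean_fasta, the single-record assumption and the letters-only rule are all assumptions made for this example.

```python
import re

def clean_fasta(text: str) -> tuple[str, str]:
    """Return (header, sequence) from a single-record FASTA string.

    Hypothetical helper: assumes one record and keeps letters only.
    """
    header = ""
    residues = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            # FASTA header line: keep the description, drop the ">"
            header = line[1:].strip()
        else:
            # Drop digits, whitespace and punctuation; keep letters only
            residues.append(re.sub(r"[^A-Za-z]", "", line))
    return header, "".join(residues).upper()

# Input with line numbers, spaces and a trailing stop symbol
header, seq = clean_fasta(">example protein\n  1 MKTAY IAKQR\n 11 QISFV*\n")

# Flag sequences above the 800 amino acid guideline mentioned above
needs_domain_selection = len(seq) > 800
```

Checking the cleaned length before submission gives an early indication of whether a single domain should be selected instead of the full sequence.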

PSI-BLAST Iterations

Specify the number of iterations for PSI-BLAST sequence-based homology searches.  More iterations can detect remote homologs; however, they can also introduce profile drift, leading to false positives.  Increasing this number will also increase the processing time required.

Threading Depth

This option determines the number of Threading hits to be annotated and considered in the processing run.  The default is the top 10, although more can be used to detect weaker homologs.  Using more than 20 will increase the processing time and the size of the output files, and will reduce the specificity of the results.

RPS-BLAST Threshold

Sets the e-value threshold for the RPS-BLAST domain search in the preprocessor.  (See Domain Predictor.)

HMMer Threshold

Sets the e-value threshold for the HMMer-based domain search in the preprocessor.  (See HMMerThread.)
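
The effect of an e-value threshold, as used by both the RPS-BLAST and HMMer options, can be sketched as a simple filter.  This is an illustration only: the hit tuples, the function name filter_hits and the choice of "less than or equal to" as the comparison are assumptions, not ProFAT's documented behaviour.

```python
def filter_hits(hits, evalue_threshold):
    """Keep hits whose e-value does not exceed the threshold.

    hits is a list of (identifier, e-value) pairs (hypothetical format).
    Lower e-values indicate more significant matches.
    """
    return [hit for hit in hits if hit[1] <= evalue_threshold]

# Made-up example hits for illustration
hits = [("domain_A", 1e-30), ("domain_B", 0.002), ("domain_C", 5.0)]

strict = filter_hits(hits, 1e-5)    # only the strongest match survives
relaxed = filter_hits(hits, 0.01)   # also admits the weaker domain_B
```

A stricter (smaller) threshold returns fewer, more reliable domains; a relaxed threshold returns more candidates at the cost of more false positives.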

E-mail notification

An e-mail address is required for e-mail notification of completion of the processing run.

Keyword Lists

Here you can select various predefined keyword lists for use directly with ProFAT.  The real power of APArT is in the use of user-defined keyword lists.  If you wish to add a keyword list of your own to this list, please contact bradshaw@mpi-cbg.de.

User Defined Keyword List

In this text box you can enter a keyword list for the text mining component of ProFAT.  Keywords entered into this window can undergo a processing step in which basic endings and prefixes are removed (stemming) - see Preprocessing below.  This saves you from having to think of every permutation of a word and enter each one separately.  The keyword list is not case-sensitive unless you select the option for it to be - see Case Sensitivity below.

Case Sensitivity

Determines whether the text mining component will be case-sensitive and therefore differentiate between "Binding" and "binding".  If this option is used, the mining component will only detect EXACT matches.
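
The difference between the two modes can be sketched as follows.  This is a minimal illustration, not ProFAT's text mining code; the function name count_keyword_hits and the simple alphanumeric tokenisation are assumptions.

```python
import re
from collections import Counter

def count_keyword_hits(text, keywords, case_sensitive=False):
    """Count whole-word occurrences of each keyword in text."""
    # Hypothetical tokenisation: runs of letters and digits
    tokens = re.findall(r"[A-Za-z0-9]+", text)
    if not case_sensitive:
        # Fold both sides to lower case so "Binding" equals "binding"
        tokens = [t.lower() for t in tokens]
        keywords = [k.lower() for k in keywords]
    counts = Counter(tokens)
    return {k: counts[k] for k in keywords}

text = "ATP Binding cassette; binding of ATP"
insensitive = count_keyword_hits(text, ["binding"])
sensitive = count_keyword_hits(text, ["binding"], case_sensitive=True)
```

With case sensitivity switched off, both "Binding" and "binding" count as hits; with it switched on, only the exact lower-case form is counted.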

Preprocessing

Determines whether the preprocessing step (stemming) will be used.  The preprocessing removes basic suffixes from the words in the keyword list.  This is based on the Porter Stemmer (Porter M, 1980).  It is recommended to use the preprocessor to make the most of the text mining component of ProFAT.
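
To show the general idea of suffix removal, here is a deliberately crude sketch.  It is NOT the Porter algorithm, which applies a much larger, ordered set of rules; the suffix list, the minimum stem length of three letters and the name crude_stem are assumptions chosen for illustration.

```python
# Hypothetical, heavily simplified suffix list (longest first)
SUFFIXES = ("ing", "ed", "es", "s")

def crude_stem(word):
    """Strip one common English suffix; a toy stand-in for real stemming."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Only strip if at least three letters of stem would remain
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

# Several surface forms collapse to the same stem
stems = {w: crude_stem(w) for w in ["binding", "binds", "bind"]}
```

Because all three forms reduce to the same stem, entering any one of them in the keyword list is enough to match the others.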

Domain Prediction Hits

Specifies the number of BLASTP hits to be considered for domain prediction.  Increasing this value will increase the number of domains detected but will also decrease the specificity of the results, as the additional hits are less similar to the detected domain.

HMMerThread Extension

Specifies the number of amino acids upstream and downstream of the domain sequence to extract for Threading.  This value can be varied to get optimal results.  The default value of 15 amino acids works well for most domains that are not at the N or C terminus of the protein.
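
The extension can be pictured as widening the domain boundaries and clipping them at the ends of the protein.  The sketch below is an illustration under stated assumptions: the function name extract_with_flanks, the 1-based inclusive coordinates and the clipping behaviour at the termini are not taken from ProFAT itself.

```python
def extract_with_flanks(sequence, start, end, extension=15):
    """Return (left, right, subsequence) for a domain plus flanking residues.

    start and end are 1-based, inclusive domain coordinates (assumed).
    The window is clipped so it never runs past either terminus.
    """
    left = max(1, start - extension)
    right = min(len(sequence), end + extension)
    return left, right, sequence[left - 1:right]

protein = "A" * 100  # placeholder sequence of 100 residues

# Domain in the middle: full 15 residues added on each side
mid_left, mid_right, mid_seq = extract_with_flanks(protein, 40, 60)

# Domain near the N terminus: only 4 residues are available upstream
n_left, n_right, n_seq = extract_with_flanks(protein, 5, 30)
```

The second call shows why domains close to a terminus behave differently: fewer flanking residues are available than the extension value requests.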

