Server output

Output explanation

The servers use WHAT IF, and thus you get WHAT IF-like output. For regular WHAT IF users that output makes sense, but that might not be the case for you. We therefore explain here some of the output formats that are often used.
  1. Atomic information.
  2. PDB file content.
  3. Residue numbers.
  4. Residue number input.

Explanation for the atomic output per residue

The WHAT IF option for displaying all available atomic information is called LISTA. A typical LISTA output is given below.

The first line gives the information for one residue. Prp is the residue property value; it is zero in the output of most servers. A few servers calculate a value for every residue and store the result in this so-called residue property value.

The second line is just a header. Between these first two lines more information can be given if this residue is a member of a family or a cluster (and if this server uses families or clusters), or if WHAT IF has corrected or mutated this residue.

Residue:    37 ASP  (  37 ) E     (Prp= 0.00)
Atom    X     Y     Z   Acc   B   WT   VdW  Colr   AtOK  Val
 N    18.2  59.6  -5.1  0.0 16.7  1.0  1.7  340     +    0.00
 CA   17.0  58.8  -5.2  1.7 16.0  1.0  1.8  240     +    0.00
 C    16.9  57.7  -4.1  1.6 23.4  1.0  1.8  240     +    0.00
 O    16.1  56.9  -4.2  2.7 19.6  1.0  1.4  120     +    0.00
 CB   16.8  58.2  -6.6  3.5 16.8  1.0  1.8  240     +    0.00
 CG   16.6  59.3  -7.6  4.0 43.8  1.0  1.8  240     +    0.00
 OD1  16.0  60.4  -7.1  7.6 41.3  1.0  1.4  120     +    0.00
 OD2  17.0  59.2  -8.7  6.0 42.4  1.0  1.4  120     +    0.00
  *1    *2    *2    *2   *3   *4   *5   *6   *7    *8      *9
The last line (the one with *1, *2, etc. on it) is not part of the output; it is added here to guide you to the column-by-column explanation given below.
  1. The atom names.
  2. The coordinates in Ångström.
  3. The accessible molecular surface area. If all values are zero, the atoms are either buried or the accessibility has not been calculated yet, depending on which server you used.
  4. The crystallographic B-factor. A value above 60 means that this atom is almost certainly not at the listed position.
  5. Weight. This is almost always 1.0. If it is 0.0, the coordinates were modeled. If it is between 0.0 and 1.0, alternative conformations have been observed.
  6. The Van der Waals radius for this atom, using the WHAT IF defaults (C: 1.8 Ångström; O: 1.4 Ångström; N: 1.7 Ångström; S: 2.0 Ångström).
  7. The colour for this atom. (Only used by servers that also produce graphics output).
  8. Is-atom-OK flag. Atoms that are wrong (or missing) according to WHAT IF get a minus in this column.
  9. The atomic value. Several servers calculate values for each atom. Those values are displayed in this so-called atomic value column.
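
The column layout above can be read mechanically. The following is a minimal Python sketch, not part of WHAT IF itself; the `Atom` class and `parse_atom_line` function are hypothetical names introduced here. It assumes the eleven columns are always separated by whitespace, which holds for the sample shown but might fail if two wide numbers run together.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    name: str          # column *1
    x: float           # column *2
    y: float           # column *2
    z: float           # column *2
    acc: float         # column *3, accessible surface
    b_factor: float    # column *4
    weight: float      # column *5
    vdw_radius: float  # column *6
    colour: int        # column *7
    ok: bool           # column *8, True for "+"
    value: float       # column *9


def parse_atom_line(line: str) -> Atom:
    """Split one LISTA atom line into its eleven whitespace-separated columns."""
    f = line.split()
    if len(f) != 11:
        raise ValueError(f"expected 11 columns, got {len(f)}: {line!r}")
    return Atom(
        name=f[0],
        x=float(f[1]), y=float(f[2]), z=float(f[3]),
        acc=float(f[4]),
        b_factor=float(f[5]),
        weight=float(f[6]),
        vdw_radius=float(f[7]),
        colour=int(f[8]),
        ok=(f[9] == "+"),
        value=float(f[10]),
    )


# Three atom lines copied from the ASP 37 example above.
sample = """\
 N    18.2  59.6  -5.1  0.0 16.7  1.0  1.7  340     +    0.00
 CA   17.0  58.8  -5.2  1.7 16.0  1.0  1.8  240     +    0.00
 OD2  17.0  59.2  -8.7  6.0 42.4  1.0  1.4  120     +    0.00
"""
atoms = [parse_atom_line(line) for line in sample.splitlines() if line.strip()]
for atom in atoms:
    print(atom.name, atom.acc, atom.b_factor, atom.ok)
```

Servers that add columns (such as the vacuum accessibility server) would need a different column count, so the length check is deliberately strict.
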
Sometimes columns are added to the type of output described above. For example, the vacuum accessibility server produces output like:
Residue:    46 ASN  (  46 )       (Prp= 0.00)
 Phi=-112.9 Psi= 162.3 Omega= 178.4
Atom    X     Y     Z   Acc   B   WT   VdW Colr   OK  Use  Vac.   %
  N    14.0   6.5  13.7  0.0  5.8  1.0  1.7 340    +   -   0.8   0.0
  CA   13.5   5.4  12.9  1.0  6.2  1.0  1.8 240    +   -   1.2  85.7
  C    13.3   5.9  11.5  2.3  6.6  1.0  1.8 240    +   -  12.9  17.6
  O    13.7   6.9  11.0  0.7  7.2  1.0  1.4 120    +   -   8.8   8.4
  CB   12.3   4.8  13.5  0.5  7.3  1.0  1.8 240    +   -   9.4   5.6
  CG   12.5   4.3  14.9  0.0  8.0  1.0  1.8 240    +   -   2.4   0.0
  OD1  12.0   4.8  15.9  3.4 11.0  1.0  1.4 120    +   -   8.9  38.1
  ND2  13.4   3.3  15.0  9.7 10.3  1.0  1.7 340    +   -  15.4  62.6
                        17.6                              59.9  29.4
In such cases the extra output is self-explanatory. Here the two rightmost columns give the accessibility in vacuum and the ratio between normal and vacuum accessibility as a percentage. The extra numbers on the bottom line are residue-wide summaries.
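
As a small worked example of the percentage column, the Python sketch below recomputes the residue-wide summary from the two totals printed on the bottom line. The function name is hypothetical, and returning 0.0 when the vacuum accessibility is zero is an assumption made here, not documented server behaviour. Per-atom percentages recomputed from the rounded columns can differ slightly from the printed ones, presumably because the server works with unrounded values.

```python
def accessibility_percentage(acc: float, vac: float) -> float:
    """Return normal accessibility as a percentage of vacuum accessibility."""
    if vac == 0.0:
        # Assumption: report 0.0 rather than dividing by zero.
        return 0.0
    return 100.0 * acc / vac


# Residue-wide totals from the bottom line of the ASN 46 example.
residue_acc = 17.6
residue_vac = 59.9
print(round(accessibility_percentage(residue_acc, residue_vac), 1))  # 29.4
```
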

File content

The WHAT IF program uses the famous 'SHOSOU' command to analyse the contents of a PDB entry.
A typical result from the SHOSOU command looks like:
    Contents of the SOUP:                                      *1   
 
Protein .................... : 2                               *2
Drug, ligand or co-factor .. : 1
DNA or RNA ................. : 0
Single atom entity ......... : 7
(Groups of) water .......... : 1
Drug with known topology ... : 0
 
 Molecule      Range              Type              Set name   *3
     1    1 (    1)  316 (  316)E Protein           set        *4
     2  317 (  322)  318 (  323)D Protein           set        *4
     3  319 (  O2 )  319 (  O2 )E K O2 <-           set        *5
     4  320 (  317)  320 (  317)   CA               set        *6
     5  321 (  318)  321 (  318)   CA               set
     6  322 (  319)  322 (  319)   CA               set
     7  323 (  320)  323 (  320)   CA               set
     8  324 (  321)  324 (  321)   ZN               set
     9  325 (  324)  325 (  324)  DMS               set        *7
    10  326 (  O2 )  326 (  O2 )D L O2 <-           set        *8
    11  327 ( HOH )  327 ( HOH )  water   ( 157)    tnl        *9
   *10  *11   *12    *13    *14   *15               *16
  1. This is the header of the SHOSOU output
  2. First, the contents of the soup are counted.
  3. This is the header of the actual SHOSOU listing. The set name is the name the user gave to the ensemble of molecules added to the soup with one single GETMOL, GETGRO, etc., command.
  4. Molecule one is a protein with chain identifier E. This protein has 316 amino acids. The second protein is a two residue peptide with chain identifier D.
  5. The third molecule is the C-terminal oxygen of chain E. It is attached to a lysine (indicated by the character K), and the arrow indicates that it is bound to something.
  6. Molecules 4 through 8 are single atomic entities. Together with the two C-terminal oxygens they form the seven single atomic entities mentioned in the top half of the output.
  7. DMS probably stands for DMSO, and is a drug, ligand or co-factor. For WHAT IF drug, ligand, and co-factor are all the same thing.
  8. This is the C-terminal oxygen of the second molecule. The O2 indicates that it is a C-terminal oxygen, the D indicates that it is part of the D chain, the L indicates that it is bound to a leucine, and the arrow indicates that it is bound to something.
  9. This is a group of 157 water molecules.
  10. The 'molecule' number.
  11. The WHAT IF number of the first residue in this molecule.
  12. The PDB number of the first residue in this molecule.
  13. The WHAT IF number of the last residue in this molecule.
  14. The PDB number of the last residue in this molecule.
  15. A short description of this molecule.
  16. The so-called set-name is only relevant when WHAT IF is used interactively.
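
The counting block at the top of the SHOSOU output (item 2 above) has a regular "label, dots, colon, number" shape. The Python sketch below shows one way to read it. The regular expression and the function name are illustrative assumptions based only on the sample shown here, not a description of how WHAT IF formats every entry.

```python
import re

# A label, a run of at least two dots, a colon, and an integer count.
COUNT_LINE = re.compile(r"^\s*(?P<label>.+?)\s*\.{2,}\s*:\s*(?P<count>\d+)")


def parse_soup_counts(text: str) -> dict[str, int]:
    """Collect the 'label ..... : n' lines of a SHOSOU listing into a dict."""
    counts = {}
    for line in text.splitlines():
        match = COUNT_LINE.match(line)
        if match:
            counts[match.group("label")] = int(match.group("count"))
    return counts


# The counting block copied from the example above.
sample = """\
Protein .................... : 2
Drug, ligand or co-factor .. : 1
DNA or RNA ................. : 0
Single atom entity ......... : 7
(Groups of) water .......... : 1
Drug with known topology ... : 0
"""
counts = parse_soup_counts(sample)
print(counts["Protein"], counts["Single atom entity"])
```

In the example, the seven single atom entities correspond to the five ions (four CA, one ZN) plus the two C-terminal oxygens in the molecule table.
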

Residue numbers

When WHAT IF lists a residue number, it gives a lot of information. E.g.:
3 LYS  (   5 ) A 12 
This means, from left to right:
  1. This is the third residue in the PDB file.
  2. It is a lysine.
  3. The number in brackets is the number found in the PDB file. This example strongly suggests that the first two residues could not be seen by the crystallographer or NMR spectroscopist.
  4. The character A is the chain identifier.
  5. The number 12 indicates that this residue sits in the 12th NMR model.
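
The five fields can be pulled apart with a regular expression. The Python sketch below is an illustration under stated assumptions: the names `ResidueId` and `parse_residue_id` are hypothetical, the chain identifier is assumed to be a single letter, and both the chain identifier and the NMR model number are treated as optional.

```python
import re
from typing import NamedTuple, Optional


class ResidueId(NamedTuple):
    sequential_number: int    # WHAT IF's own sequential number
    residue_type: str         # e.g. LYS
    pdb_number: str           # kept as text, since names like 17A occur
    chain: Optional[str]      # chain identifier, if present
    nmr_model: Optional[int]  # NMR model number, if present


RESIDUE_ID = re.compile(
    r"^\s*(\d+)\s+(\w+)\s+\(\s*(\w+)\s*\)"  # number, type, (PDB number)
    r"(?:\s+([A-Za-z]))?"                   # optional one-letter chain
    r"(?:\s+(\d+))?\s*$"                    # optional NMR model number
)


def parse_residue_id(text: str) -> ResidueId:
    match = RESIDUE_ID.match(text)
    if match is None:
        raise ValueError(f"not a residue identifier: {text!r}")
    seq, res_type, pdb, chain, model = match.groups()
    return ResidueId(int(seq), res_type, pdb, chain,
                     int(model) if model else None)


rid = parse_residue_id("3 LYS  (   5 ) A 12")
print(rid)
```
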

Residue number input

If you are supposed to enter a residue number or a residue range, you can respond in several ways. The first possibility is to just type the residue number(s) which WHAT IF has assigned to your residue(s) (or drugs, or waters). These are just sequential numbers, starting with 1 for the first residue encountered, etc. Use the List a file server in the Administration server class to see how WHAT IF assigned the numbers to your PDB entry.

Whenever you are prompted for a residue or residue range without any specification of the residue type, you can also enter drugs, ions or water groups. In the few cases where that is not considered valid input, WHAT IF will tell you what it thinks about the input.

WARNING: Be aware that most servers will prompt you for only one residue and not for a residue range.

Use the PDB names

If your input file used a different scheme for the numbering of residues, you can give those numbers by typing O (the character O, not the digit zero) followed by the original residue number(s). In contrast to the strict PDB rules, these do not need to be numerical; WHAT IF will also accept names like 17A. Use O only once, as the first character of the line, and not before every residue. This holds for all options throughout WHAT IF. The original names are always listed by WHAT IF in brackets.

Residue input via picking

Input by picking an atom is not possible when using the servers, but for the interactive version of WHAT IF the help looks like:
If you give just P, you will be asked to pick the residue(s) in the graphics window. In this case you can pick any atom in the residue(s) you want. I suggest you test whether certain options function as expected with P input the day before you have to give the big demonstration to the director general of your company...

Input all residues

If you want to input all residues (protein, sugar and DNA/RNA) as a range, you can just type ALL.

Input the total PDB entry

If you want to input all amino acids, DNA/RNA, sugars, co-factors, and water in one shot, you can type TOT.

Input by molecule number

In case you want to enter one entire molecule you can give M followed by the molecule number (as assigned to the molecule by WHAT IF). Of course, when asked for a molecule you can type just the molecule number, although giving an M in front of the molecule number will not hurt the WHAT IF performance.

Separating between identical molecules with U

In case you have multiple copies of one molecule (for example before and after a Molecular Dynamics run) you can type U followed by first the molecule number and then the two original residue names. U3 17A 123 will use the residues 17A till 123 (according to the original, PDB given numbering scheme) from the third molecule in the soup.

Separating between identical molecules with S

In case you have multiple copies of one molecule (for example before and after a molecular dynamics run) you can type S followed by first the molecule number and then the two sequential residue names. S3 18 24 will use the 18-th till 24-th residue from the third molecule.
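
The difference between the S and U forms can be illustrated with a toy model. In the Python sketch below, everything is hypothetical: the `Residue` tuple, the two helper functions, and the five-residue molecule with invented numbers. It only shows that S counts positions within the molecule while U looks up the original PDB names.

```python
from typing import NamedTuple


class Residue(NamedTuple):
    whatif_number: int  # sequential number in the soup
    pdb_name: str       # original name, as listed in brackets


def select_sequential(molecule, first, last):
    """S-style: the first-th through last-th residue of one molecule."""
    return molecule[first - 1:last]


def select_original(molecule, first_name, last_name):
    """U-style: from the residue named first_name through last_name."""
    names = [residue.pdb_name for residue in molecule]
    start = names.index(first_name)
    end = names.index(last_name)
    return molecule[start:end + 1]


# Invented molecule: WHAT IF numbers 201 to 205, with an insertion named 17A.
molecule = [
    Residue(201, "16"),
    Residue(202, "17"),
    Residue(203, "17A"),
    Residue(204, "18"),
    Residue(205, "19"),
]

print(select_sequential(molecule, 2, 4))      # second through fourth residue
print(select_original(molecule, "17A", "19"))  # 17A through 19 by PDB name
```
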

Addressing groups of residues

Input of families is not possible when using the servers, but for the interactive version of WHAT IF the help looks like:
A family is defined as a group of one or more amino acids consecutively located in the sequence. Families are nothing very intelligent; they are just a way of giving names to stretches of amino acids. One can, for example, give all major secondary structure elements their own name. Families can be used on several occasions as input for options. It is, for example, possible to give families a color, or to delete all residues from a family.
Commands that are related to the usage of families are easily recognized because they have the three-letter combination FAM in their name. The CLUFAM option takes you to the menu that deals with families and clusters.
Whenever you are prompted for one or more ranges you can also enter a family name.

Addressing groups of residues

Input of clusters is not possible when using the servers, but for the interactive version of WHAT IF the help looks like:
A cluster is a group of residues that do not need to sit next to each other in the sequence. In a way, clusters are sets of families.
Commands that are related to the usage of clusters are easily recognized because they have the three-letter combination CLU in their name. The CLUFAM option takes you to the menu that deals with families and clusters.
Whenever you are prompted for multiple ranges you can also enter a cluster name.

Addressing by type

When you are prompted for a range of residues you can also type PROT, WATER, DRUG or NUC. This will add the protein, water, ligands or nucleic acids, respectively, to the list of selected residues. -PROT, -WATER, -DRUG and -NUC will remove protein, water, ligands or nucleic acids, respectively, from the list of selected residues.
So, the command LISTAA TOT -PROT -WATER will list all nucleic acids and co-factors in the soup.
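
The add-and-remove behaviour of these keywords amounts to set arithmetic applied from left to right. The Python sketch below models it on an invented five-entity soup. The function, the type labels and the soup itself are illustrative assumptions, not WHAT IF internals.

```python
# Map each type keyword to a label used in the toy soup below.
KEYWORD_TO_TYPE = {
    "PROT": "protein",
    "WATER": "water",
    "DRUG": "drug",
    "NUC": "nucleic",
}


def select_by_type(soup, tokens):
    """Apply tokens such as TOT, PROT or -WATER in order; return sorted numbers."""
    selected = set()
    for token in tokens:
        if token == "TOT":
            selected |= set(soup)  # everything in the soup
            continue
        keyword = token.lstrip("-")
        if keyword not in KEYWORD_TO_TYPE:
            raise ValueError(f"unknown keyword: {token!r}")
        wanted = KEYWORD_TO_TYPE[keyword]
        matching = {number for number, kind in soup.items() if kind == wanted}
        if token.startswith("-"):
            selected -= matching   # removing unselected entries is harmless
        else:
            selected |= matching
    return sorted(selected)


# Invented soup: entity number -> type.
soup = {1: "protein", 2: "protein", 3: "nucleic", 4: "drug", 5: "water"}

print(select_by_type(soup, ["TOT", "-PROT", "-WATER"]))  # [3, 4]
```

With this soup, TOT -PROT -WATER leaves the nucleic acid and the ligand, which matches the LISTAA example above.
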

Addressing by position in the soup

If you are prompted for a residue or a range of residues, you can give LAST if you want to address the last residue (amino acid, nucleic acid or sugar) in the soup. If you want to address the last entity (which can also be a water group, C-terminal oxygen, N-terminal protons, ligand, etc.) you can enter END (actually END-OF-SOUP, but just END is enough).

Residue input summary

The table below lists all possible inputs when you are prompted for residue ranges (X and Y are residue numbers in the WHAT IF soup; P and Q are PDB residue identifiers, * means any number):
P        Ask the user to pick residues in the graphics window
PL       Take the residue that was last picked
X Y      Simple residue range
OP Q     PDB identifiers
M*       Whole molecule *
S* X Y   The X-th till Y-th residue in molecule *
U* P Q   The residues P till Q in molecule *
ALL      All residues (amino acids, nucleic acids and sugars)
TOT      The whole soup
LAST     The last residue (amino acid, nucleic acid or sugar) 
END      (actually END-OF-SOUP) the last thing in the soup
PROT     All amino acids
NUC      All nucleic acids
SUG      All sugars
WATER    All water groups
DRUG     All drugs, ligands, ions
ION      All single atomic molecules (not C-terminal oxygens)
OXT      All C-terminal oxygens
HELIX    All helical residues (in proteins)
STRAND   All residues in strands
4BUNDL   All residues in 4-helix bundles
BURIED   All buried residues (default cutoff < 2 A surface area)
ACCESS   All accessible residues (default cutoff > 2 A surface area)
CLUNAM   All residues in cluster are selected by giving cluster name
FAMNAM   All residues in family are selected by giving family name
The inputs listed below remove things from the list of selected residues. Nothing goes wrong if you try to remove a residue that was not selected yet.
-PROT    All proteins
-WATER   All water
-DRUG    All drugs, ligands, ions
-NUC     All nucleic acids
-SUG     All sugars
-ALL     All residues
-HELIX   All helical residues
-STRAND  All strand residues
-ION     All ions
-OXT     All C-terminal oxygens

Use E-mail if you have questions about these servers.
(C) G.V. 18-Aug-1998
Last updated: 13-Feb-2003 EB