SETTER Help index
The web server uses the SETTER (SEcondary sTructure-based TERtiary Structure Similarity Algorithm) method for fast and accurate pairwise structural alignment. The efficiency of the algorithm comes from decomposing the RNA structure into a set of non-overlapping generalized secondary structure units (GSSUs). A GSSU usually resembles a hairpin motif, possibly containing bulges and/or internal loops in its stem part. Segmentation into GSSUs offers good scalability with respect to structure size: SETTER scales linearly with the structure size and quadratically with the GSSU size, and the number of residues in a GSSU generally does not increase as the RNA structure grows. The underlying SETTER algorithm is both accurate and very fast, and it does not impose limits on the size of the aligned RNA structures.
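The scaling statement above can be illustrated with a toy cost model. This is only a sketch, not the SETTER implementation: the function name and the assumption of one quadratic-time comparison per GSSU are hypothetical and simply encode the two scaling claims (linear in structure size, quadratic in GSSU size).

```python
def toy_alignment_cost(num_gssus: int, gssu_size: int) -> int:
    """Toy cost model: one comparison per GSSU, each quadratic in GSSU size.

    Hypothetical illustration only; the real algorithm's cost may differ.
    """
    if num_gssus < 0 or gssu_size < 0:
        raise ValueError("counts must be non-negative")
    return num_gssus * gssu_size ** 2


# Doubling the number of GSSUs (a larger structure) doubles the cost,
# while doubling the GSSU size quadruples it.
base = toy_alignment_cost(10, 20)
larger_structure = toy_alignment_cost(20, 20)
larger_gssus = toy_alignment_cost(10, 40)
```

Because GSSU size tends to stay bounded as structures grow, the quadratic factor behaves like a constant in this model and the overall cost grows linearly.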
Users can switch between two alignment types: pairwise (SETTER) and multiple (MultiSETTER). Pairwise alignment can be run in single or batch mode. In single mode, two RNA structures are aligned. In batch mode, one structure is aligned against each structure in a batch of RNA structures. MultiSETTER is an extension of SETTER for multiple RNA structure alignment. The main idea of MultiSETTER is to build an average structure and to superpose all the input structures onto this average structure.
Format of the entries for the RNA molecule in the single mode of the pairwise alignment should look like the following example:
The PDB code entry takes a standard 4-character PDB ID. You can also upload your own file in the PDB format. If both the PDB code and the upload PDB file fields are filled in, the PDB code is used in the alignment process. The Chain ID entry is not mandatory. You can fill in an example structure by pressing the icon. Clear all entries in the current block by pressing the icon.
The appearance of the entries for the second molecule or molecules depends on the currently selected mode. In single mode, the entries for the second RNA molecule are identical to those described above. After switching to the batch of RNA molecules mode or the multiple RNA alignment mode, the entries look like this:
The PDB codes entry takes standard 4-character PDB IDs separated by spaces or commas. If you wish to specify a chain of the structure, type its chain ID in parentheses or after a colon immediately behind the PDB code. You can also upload a plain text file with PDB codes written in the same format. It is also possible to upload one or more PDB files. If the PDB codes field, the file with PDB codes, and the PDB file upload are filled in together, the first non-empty entry is used in the alignment process.
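The entry format described above can be sketched as a small parser. This is a minimal illustration, not the server's own input handling: the names `Entry` and `parse_pdb_codes` are hypothetical, the assumption that chain IDs are alphanumeric is ours, and the PDB IDs in the usage line are only example inputs.

```python
import re
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    pdb_id: str
    chain_id: Optional[str]


# A 4-character PDB ID, optionally followed by "(X)" or ":X" as the chain ID.
_ENTRY_RE = re.compile(
    r"^([0-9A-Za-z]{4})(?:\(([0-9A-Za-z]+)\)|:([0-9A-Za-z]+))?$"
)


def parse_pdb_codes(text: str) -> List[Entry]:
    """Split on spaces or commas and extract the optional chain IDs."""
    entries = []
    for token in re.split(r"[,\s]+", text.strip()):
        if not token:
            continue  # empty input or stray separators
        match = _ENTRY_RE.match(token)
        if match is None:
            raise ValueError(f"Unrecognized PDB entry: {token!r}")
        pdb_id, paren_chain, colon_chain = match.groups()
        entries.append(Entry(pdb_id, paren_chain or colon_chain))
    return entries


parsed = parse_pdb_codes("1ehz(A), 1fir:B 2tra")
```

Both chain notations end up in the same `chain_id` field, and an entry without a chain keeps `None`, which mirrors the chain ID being optional.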
A few parameters can be set for the SETTER algorithm. See the list of parameters below. The statistics provided on the result page are valid only for the default parameter values.
- neck shift
Since WC hydrogen bonds are identified by 3DNA using simple geometric criteria, their detection may sometimes be inaccurate. This can shift the neck position within the GSSU. SETTER simulates the neck shift by also aligning the stem residues under the necks. This parameter gives the number of residues for which the position of the neck is examined. A higher value increases both accuracy and runtime. Generally, values higher than the stem length are meaningless, and our experience also shows that values above 10 do not lead to a significant increase in alignment accuracy. The default value is 8.
- identical pair nt type modifier
If two aligned residues are of the same nucleotide type (i.e., A, C, U, G), the distance between them is multiplied by this parameter. The meaningful range of this parameter is 0 < ζ ≤ 1. The lower its value, the more the matching of identical nucleotides is rewarded. The value of this parameter influences only the accuracy of the alignment, not its speed. The default value is 0.1.
- pair distance threshold
To favor high-quality alignments, the S-distance is divided by the number of residues lying within the distance ε (in Å). The value of this parameter influences only the accuracy of the alignment, not its speed. The default value is 6.0 Å.
- top k gssu
For structures with multiple GSSUs, only the first κ GSSU pairs are considered in the S-distance calculation. A higher value increases both accuracy and runtime. The highest meaningful value is m × n, where m is the number of GSSUs in the first structure and n is the number of GSSUs in the second structure. The default value of 3 represents a very good compromise between the speed and the accuracy of the algorithm.
- no loop - stem percentage
For a structure without a loop, the third point necessary for the alignment (two points are already given by the neck residues) must be chosen from the stem. This parameter (in %) specifies the percentage of the stem residues considered in the search for the third point. A higher value increases both accuracy and runtime. The default value is 10 %.
- early termination
This value controls how often the early termination heuristic, which speeds up the algorithm, is applied. The higher the value, the less often early termination occurs, and the more accurate and slower the algorithm is. The parameter takes an integer value λ ≥ 1; however, values higher than 10 do not lead to a significant improvement in alignment accuracy. The default value is 1.
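The parameters listed above can be summarized in a small configuration sketch. The class and field names are hypothetical and do not correspond to the server's actual form fields or to any SETTER API; only the default values and the ranges for ζ and λ come from the text, while the remaining range checks are assumed sanity bounds. The two helper functions illustrate how ζ and ε enter the scoring as described, and treating "within ε" as inclusive is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SetterParams:
    """Documented defaults; the field names are hypothetical."""

    neck_shift: int = 8
    identical_pair_modifier: float = 0.1   # zeta, documented range 0 < zeta <= 1
    pair_distance_threshold: float = 6.0   # epsilon, in angstroms
    top_k_gssu: int = 3                    # kappa
    no_loop_stem_percentage: float = 10.0  # percent of stem residues
    early_termination: int = 1             # lambda, documented as integer >= 1

    def __post_init__(self) -> None:
        # Ranges for zeta and lambda are taken from the text.
        if not 0 < self.identical_pair_modifier <= 1:
            raise ValueError("identical_pair_modifier must satisfy 0 < zeta <= 1")
        if self.early_termination < 1:
            raise ValueError("early_termination must be >= 1")
        # The checks below are assumed sanity bounds, not documented limits.
        if self.neck_shift < 0 or self.top_k_gssu < 1:
            raise ValueError("neck_shift must be >= 0 and top_k_gssu >= 1")
        if self.pair_distance_threshold <= 0:
            raise ValueError("pair_distance_threshold must be positive")
        if not 0 < self.no_loop_stem_percentage <= 100:
            raise ValueError("no_loop_stem_percentage must be in (0, 100]")

    def is_default(self) -> bool:
        """True when every parameter has its default value."""
        return self == SetterParams()


def modified_pair_distance(distance: float, nt_a: str, nt_b: str, zeta: float) -> float:
    """Multiply the distance by zeta when both residues share a nucleotide type."""
    return distance * zeta if nt_a == nt_b else distance


def count_close_pairs(pair_distances: Iterable[float], epsilon: float) -> int:
    """Count residue pairs within epsilon (the divisor applied to the S-distance)."""
    return sum(1 for d in pair_distances if d <= epsilon)
```

A check such as `SetterParams(top_k_gssu=5).is_default()` returning `False` corresponds to the note that the reported statistics are valid only for the default parameter values.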
The reset button clears all entries of the query form and sets all parameters to their default values.
After the query form is submitted with the run SETTER button, the Processing screen appears.
The processing screen displays the progress of the running SETTER job. The whole process consists of a few steps. First, the existence of the entered structures is verified and, where needed, the PDB files are downloaded. A list of all involved structures is provided, together with the unique task ID and the URL of the results. Each running step is indicated with a blue spinner. If any step fails, it is indicated with a red point and the SETTER process stops. Successful steps are indicated with a green point. Results are not available until the status of the last step, "running SETTER", is "done". After that message, the screen automatically switches to the Results tab.
The details of the alignment are displayed in the Results tab. Each alignment result contains the S-distance, the statistical significance of the alignment given as its p-value, the running time of the algorithm, and other details. The structure can be downloaded in either PDB or mmCIF format under the "format" link. In addition to the S-distance, several commonly used measures of alignment quality are reported under the "alignment quality measures" link. These include:
RMSD is the root-mean-square deviation between P atoms. RMSD captures the general 3D shape of the RNA, but it can be misleading because the errors are spread over the whole molecule.
PSI is defined as a percentage of superimposed residues within 4.0 Å with respect to the length of the shorter of the two structures.
PID is the percentage of aligned nucleotides of the same type with respect to the length of the shorter of the two structures.
- number of aligned nucleotides, number of exact base matches
These values provide information similar to PSI and PID.
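The definitions of RMSD, PSI, and PID above can be written out as a short sketch. This is not the server's code: the function names are hypothetical, the inputs are assumed to be already superimposed P-atom coordinates of aligned residue pairs, and counting a distance of exactly 4.0 Å as "within" the cutoff is an assumption.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Root-mean-square deviation between paired, superimposed P atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def psi(pair_distances: Sequence[float], len_a: int, len_b: int,
        cutoff: float = 4.0) -> float:
    """Percentage of superimposed residues within the cutoff, relative to
    the length of the shorter structure."""
    within = sum(1 for d in pair_distances if d <= cutoff)
    return 100.0 * within / min(len_a, len_b)


def pid(aligned_types: Sequence[Tuple[str, str]], len_a: int, len_b: int) -> float:
    """Percentage of aligned nucleotides of the same type, relative to
    the length of the shorter structure."""
    same = sum(1 for x, y in aligned_types if x == y)
    return 100.0 * same / min(len_a, len_b)


example_rmsd = rmsd([(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
                    [(3.0, 4.0, 0.0), (0.0, 0.0, 0.0)])
example_psi = psi([1.0, 3.9, 5.0], len_a=4, len_b=5)
example_pid = pid([("A", "A"), ("C", "G"), ("U", "U")], len_a=4, len_b=6)
```

In the RMSD example a single pair deviating by 5 Å is averaged with a perfectly matching pair, which illustrates how the error is spread over the whole molecule.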
Next, the table shows the number of GSSUs and the number of nucleotides in each structure. Links below this table allow you to download the alignment report in plain text format. In addition, the superimposed structures can be downloaded in the PDB format, and a picture of the aligned structures can be downloaded as a JPEG image.
The two superimposed structures are visualized on the right side using JSmol. The visualization can be controlled with the top panel, which makes it easy to adjust the color and the molecular display scheme and to turn each of the models on or off independently. In addition, a menu for more detailed control of various aspects of the visualization is available by right-clicking in the JSmol window.
Individual GSSUs in both structures can be inspected by checking the box in the "Show GSSU pairs" panel. Two aligned GSSUs are shown in different colors (red and blue). The displayed GSSUs can either be cycled through by clicking on the left/right arrows or they can be selected from the drop-down box.
The aligned residues are displayed by checking the box in the "Show aligned residues" panel. The alignment precision can be studied further by highlighting the nearest-neighbor nucleotides within an adjustable distance range given in ångströms.
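One way to read the nearest-neighbor highlighting is sketched below. This is an assumed interpretation, not the viewer's actual logic: for every residue of the first structure, the closest residue of the second structure is found, and the pair is kept only if its distance falls inside the chosen range. The function name and the use of one representative coordinate per residue are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def nearest_neighbors_in_range(
    coords_a: Sequence[Point],
    coords_b: Sequence[Point],
    min_dist: float,
    max_dist: float,
) -> List[Tuple[int, int, float]]:
    """Return (index_a, index_b, distance) for nearest pairs inside the range."""
    hits: List[Tuple[int, int, float]] = []
    if not coords_b:
        return hits  # nothing to compare against
    for i, a in enumerate(coords_a):
        # Closest residue of the second structure to residue i of the first.
        j, d = min(
            ((j, math.dist(a, b)) for j, b in enumerate(coords_b)),
            key=lambda pair: pair[1],
        )
        if min_dist <= d <= max_dist:
            hits.append((i, j, d))
    return hits


highlighted = nearest_neighbors_in_range(
    [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)],
    [(1.0, 0.0, 0.0), (20.0, 0.0, 0.0)],
    min_dist=0.0,
    max_dist=2.0,
)
```

Narrowing the range keeps only the most tightly superimposed residues, while widening it reveals regions where the two structures diverge.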
In batch mode or multiple alignment, the resulting alignments are displayed as vertically stacked boxes (see the next figure). The boxes are sorted in descending order of S-distance. Clicking a box expands the alignment details into the result window.
In the case of multiple alignment, the result page begins with the JSmol window, where the resulting average structure is displayed together with all the other involved RNA structures. The pairwise alignment results then follow, as described above.
-  BERMAN, H., et al. The Protein Data Bank. Nucleic Acids Research, 2000, 28(1), 235-242.