REPuter - Manual

Changes to previous online versions of REPuter

We no longer offer precalculated genomes. The online version of REPuter now has only few restrictions (our server capacity keeps growing), so there is no reason to offer precomputed genomes, which are mostly not needed. The textual output of REPuter can optionally be filtered before downloading. The graphical visualisation is available as a static image (as before) and as a partly interactive version (JavaScript must be enabled). A fully dynamic, interactive visualisation will be part of a future release.

Parameter Description

REPuter offers various parameters. All of them are explained in this chapter.

Sequence Format: REPuter supports sequences in FASTA format via file upload or copy and paste into a text field. A sequence in FASTA format begins with a single-line description, followed by lines of sequence data.
>sequence_1
caagcacagaaacctatggcataaatccctctgagacgcgttgtactatggttatctaat
tctccggcgacacaagttgtctaaccgtgatcaccttaaagggcaagccgcccaatagat
gttagttaatactacgtaccaagtatgcctgcgcttggtaaagccgcctgtccatagttc
tactagggtagagcttcaggatgctctatagttcgagcggttctttgatcaactcgacta
gctaccaccatgtctgtgttttattgcacgcaaagtcgtaagtttaaacggaccaagaag
ccttcttcggtcagtagcaggttaagggccaagtacaagcctctccaggaatgcttaacg
gcatcgatgcaacttggacaagtaaacatcctgaagctta

Match Direction: REPuter offers four possibilities of searching for repeats:
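As a rough illustration of these search directions, the sketch below checks whether two parts of a sequence form a repeat in a given direction. It is not REPuter's implementation: the function name `matches` is hypothetical, and the reading of the mode letters (F = forward, R = reverse, C = complemented, P = reverse complemented) is an assumption based on the repeat kinds named in the Theoretical Background section and the mode letters used in the sample run.

```python
# Hypothetical helper, not part of REPuter: exact-match check for one direction.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def matches(seq, length, i, j, direction):
    """Return True if seq[i:i+length] and seq[j:j+length] form an exact
    repeat in the given direction (assumed meaning of F, R, C, P)."""
    first = seq[i:i + length]
    second = seq[j:j + length]
    if len(first) != length or len(second) != length:
        return False  # one part runs past the end of the sequence
    if direction == "F":      # forward: both parts are identical
        expected = first
    elif direction == "R":    # reverse: second part is the first read backwards
        expected = first[::-1]
    elif direction == "C":    # complemented: second part is the base-wise complement
        expected = first.translate(COMPLEMENT)
    elif direction == "P":    # reverse complemented (assumed meaning of P)
        expected = first[::-1].translate(COMPLEMENT)
    else:
        raise ValueError(f"unknown direction: {direction!r}")
    return second == expected
```

For example, `matches("aacgcgtt", 4, 0, 4, "P")` holds because `cgtt` is the reverse complement of `aacg`.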
Maximum Computed Repeats: show the repeats with the smallest E-values (default: 50).

Minimal Repeat Size: specify the minimum length that repeats must have. Attention: long sequences combined with a small minimum repeat size result in a long computation time.

Error Distance: search for repeats up to the given Hamming/edit distance.

Output Description

After the REPuter run has finished, a result page is shown, which offers various options to view the result.

Textual Output: The result of a run can be viewed or downloaded as a space-separated table. Optionally, the output can be filtered. The head of a sample output looks like:

# 235 -3 8 reputer_bibitest_1091788224_479525172.xmlrpc
9 150 F 9 151 0 5.92e-02
8 150 F 8 152 0 2.37e-01
10 150 F 10 153 -1 4.44e-01
9 150 F 9 154 -1 1.60e+00
[1][2][3][4][5] [6] [7]
...

The first line, starting with '#', is a comment. The sequence length (235), the maximum allowed distance ([-]3), the minimum repeat size (8) and the processed file are described here. The following lines contain the repeats found, one line each.
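For downstream processing, a table like the sample above can be read with a few lines of Python. This is a minimal sketch, not an official parser: the names `Repeat` and `parse_reputer_table` are hypothetical, and the meaning assigned to the seven columns (length and start of the first part, direction, length and start of the second part, distance, E-value) is an assumption, since only the header fields are described above.

```python
from dataclasses import dataclass


@dataclass
class Repeat:
    # Column meanings are assumed, not taken from the manual.
    len1: int
    start1: int
    direction: str
    len2: int
    start2: int
    distance: int
    evalue: float


def parse_reputer_table(text):
    """Parse the space-separated textual output into (header, repeats)."""
    header = None
    repeats = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith("#"):
            # Comment line: sequence length, maximum distance, minimum size, file.
            tokens = line.lstrip("#").split()
            header = {
                "sequence_length": int(tokens[0]),
                "max_distance": int(tokens[1]),  # kept signed, as written in the file
                "min_repeat_size": int(tokens[2]),
                "file": tokens[3],
            }
            continue
        if len(fields) != 7:
            continue  # skip anything that is not a seven-column repeat line
        repeats.append(Repeat(int(fields[0]), int(fields[1]), fields[2],
                              int(fields[3]), int(fields[4]),
                              int(fields[5]), float(fields[6])))
    return header, repeats
```

Calling `parse_reputer_table` on the sample shown above yields a header with sequence length 235 and four `Repeat` records.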
Graphical Output: The output of REPuter is processed to give an overview of the number, length, and location of repeats in the uploaded sequence. In this version of online REPuter we offer two kinds of visualisation: a static image and a partly interactive version (which requires a modern browser such as IE 5.5 and above, Mozilla, Netscape 6 and above, or Opera, with JavaScript enabled). A fully interactive visualisation (as a Java applet) is planned for the next release.

Theoretical Background

This tool reports maximal forward, reverse, complemented, and reverse complemented repeats for a given input sequence. The definition of 'maximality' as in [1] basically limits the output to only the longest repeats in the sequence. These may contain shorter repeats which are not explicitly reported. Let your input sequence be a text string s of length n. REPuter distinguishes four different kinds of repeats:
The triple (l, i, j) is a maximal forward repeat (MFR) if:
[1] Gusfield, D., Algorithms on Strings, Trees, and Sequences, Cambridge University Press, 1997

REPuter Sample Run

THIS EXAMPLE HAS NOT BEEN UPDATED YET, SO BE CAREFUL WHEN READING THIS CHAPTER.

Consider the following 30-base input sequence, which is a three-fold repetition of 'gacagtcagt':

>5.seq
gacagtcagtgacagtcagtgacagtcagt

The REPuter engine produces the following raw data output, starting with the input sequence name. After that, each line describes one repeat: its size, the starting position of the first part, one of the four possible modes (F, P, R, C), and the starting position of the second part. The output below therefore reports two repeats, both starting at position 0. The first part of the first repeat starts at position 0, its second part at position 20.

# /tmp/5.seq.flat 30
10 0 F 20
20 0 F 10

Drawing the sequence in dark blue and the repeats in light blue, this might look like this:

[Figure: input sequence in dark blue, repeats in light blue]
Note that according to the 'left character' rule 3 for MFRs in the Theoretical Background section, we do not report a repeat like "10 0 F 10", since this shorter repeat is part of "20 0 F 10". Additionally, to keep the starting position information visible, each part of a repeat is displayed on a separate strand:

[Figure: each part of a repeat displayed on a separate strand]
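The sample run can be cross-checked with a small brute-force search for exact forward repeats. This is only a sketch under stated assumptions, not the REPuter algorithm: the function name `maximal_forward_repeats` is hypothetical, maximality is taken in the usual sense of [1] (a repeat cannot be extended to the left or to the right), and the minimum length of 8 is chosen here for illustration because the sample run does not state which value was used.

```python
def maximal_forward_repeats(seq, min_len):
    """Brute-force search for exact maximal forward repeats.

    Returns (length, i, j) triples with i < j, using 0-based positions.
    Quadratic in the number of position pairs, so only suitable for
    short sequences such as the 30-base example.
    """
    n = len(seq)
    found = []
    for i in range(n):
        for j in range(i + 1, n):
            # Left-maximality: if the preceding characters are equal, the
            # match can be extended to the left and is covered by a longer repeat.
            if i > 0 and seq[i - 1] == seq[j - 1]:
                continue
            # Right-maximality: extend the match as far as possible.
            length = 0
            while j + length < n and seq[i + length] == seq[j + length]:
                length += 1
            if length >= min_len:
                found.append((length, i, j))
    return found


sequence = "gacagtcagt" * 3
for length, i, j in sorted(maximal_forward_repeats(sequence, min_len=8)):
    print(length, i, "F", j)
```

With these assumptions the search reports the same two repeats as the raw output above, and the pair at positions 0 and 10 of length 10 is not reported separately because it extends to length 20.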