REPvis - Repeats Visualization



Configuring repvis

Some features of repvis can optionally be influenced via configuration files stored in the directory .reputer in the user's home directory.



Command Line Options

To list all available command line options call repvis with -help:
> repvis -help

*** repvis - Repeats Visualizer v0.3
*** Compiled by chris@kakerlake, Sun Apr 23 13:19:13 CEST 2000
*** gcc version 2.95 19990728 (release)
*** (c) 1998-2000 Stefan Kurtz and Chris Schleiermacher
*** Contact: reputer@genomes.de
*** For updates visit: http://www.genomes.de

Name
      repvis

Synopsis
      repvis [Options] directory

Description
      This program visualizes 4 kinds of repeats as square, circle or
      dot plot graph.

Options
       -seqpath     Override path for input sequence stored in .bin file.
       -help        Print help message.
       -noseq       Do not read in source sequence.
       -png         Generate a png-image in batch mode.
       -pngprefix   Prefix for png image.
       -l size      In batch mode: display repeats of this size and above.
       -f           In batch mode: display forward repeats
       -r           In batch mode: display reverse repeats
       -c           In batch mode: display complemented repeats
       -p           In batch mode: display reverse complemented repeats
       -width       In batch mode: png x size in pixels
       -height      In batch mode: png y size in pixels
       directory    Directory containing repfind binary output files
                    with .bin extensions.
                    In batch mode this must be a single repfind file.

-seqpath
Each REPuter binary format file (as generated by repfind or repselect) contains the path to the input sequence used for the repeats calculation. This option overrides the path to the sequence source file.

-help
Display available command line options. Also -h or -?.

-noseq
Do not read in the source sequence used to generate the repfind result. Consequently some features dealing with source sequence data are not available, e.g. BLAST database queries.

-png
Generate a portable network graphics (png) image in batch mode. The X interface is not launched.

The following options can only be used in connection with the -png option.

-pngprefix
The portable network graphics (png) image is not created in the current working directory. Instead, the prefix supplied here is prepended to the image file name.

-l
Draw repeats of this size and above.

-f -r -c -p
Draw repeats of this kind.

-width w
Create an image of this width. The default width is 790 pixels.

-height h
Create an image of this height. The default height is width/2.257 pixels.
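
The batch options above can be combined into a single call. The following Python sketch only assembles an argument list from the flags documented here; it does not run repvis. The helper name build_batch_command, the sample file name genome.bin and the prefix plots/ are hypothetical, and placing the options before the input file is an assumption based on the synopsis. How repvis rounds the default height is not documented, so the sketch leaves width/2.257 as a floating-point value.

```python
import shlex

DEFAULT_WIDTH = 790      # default png width in pixels (see -width)
HEIGHT_DIVISOR = 2.257   # default height is width/2.257 (see -height)


def default_height(width=DEFAULT_WIDTH):
    """Return the documented default height for a width, left unrounded."""
    return width / HEIGHT_DIVISOR


def build_batch_command(bin_file, prefix=None, min_size=None, kinds="",
                        width=None, height=None):
    """Assemble a repvis batch-mode argument list (hypothetical helper).

    kinds is a string drawn from "frcp": forward, reverse, complemented,
    reverse complemented.
    """
    unknown = set(kinds) - set("frcp")
    if unknown:
        raise ValueError(f"unknown repeat kinds: {sorted(unknown)}")
    args = ["repvis", "-png"]
    if prefix is not None:
        args += ["-pngprefix", prefix]
    if min_size is not None:
        args += ["-l", str(min_size)]
    # Emit the kind flags in a fixed order, regardless of input order.
    args += [f"-{kind}" for kind in "frcp" if kind in kinds]
    if width is not None:
        args += ["-width", str(width)]
    if height is not None:
        args += ["-height", str(height)]
    args.append(bin_file)  # batch mode expects a single repfind file
    return args


if __name__ == "__main__":
    command = build_batch_command("genome.bin", prefix="plots/", min_size=20,
                                  kinds="fp", width=1000)
    print(shlex.join(command))
    print(default_height())
```

Building the call as a list rather than a single string keeps file names with spaces intact if the list is later passed to a process launcher.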



The Graphical User Interface

repvis expects repfind/repselect result data files to have the file extension '.bin'. By specifying a directory at startup, this location is scanned for .bin files.
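
To check in advance which files repvis would find in a directory, a scan like the following Python sketch can be used. It is an illustration only, not repvis's own code: the function name find_bin_files is hypothetical, and the sketch assumes that the scan is not recursive and that the extension is matched case-sensitively.

```python
from pathlib import Path


def find_bin_files(directory):
    """Return the regular files ending in .bin directly inside directory."""
    return sorted(
        entry for entry in Path(directory).iterdir()
        if entry.is_file() and entry.suffix == ".bin"
    )


if __name__ == "__main__":
    # List the candidate result files in the current working directory.
    for path in find_bin_files("."):
        print(path.name)
```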

If repvis encounters a result data file that was not generated in your filesystem, or whose original DNA sequence was moved or removed, the program asks you to browse for the missing file.

Selecting No disables certain sequence-related features such as BLAST database queries. The same is achieved with the command line option -noseq. The advantage of omitting the DNA sequence is increased program performance.


Eventually, the repvis user interface is launched. The following sections describe the regions of its main window.


Repeats Graph
According to the type of graph and the kind of repeats selected by the user, this panel displays either forward, reverse complemented (palindromic), complemented or reversed repeats as square graph, circle graph or dot plot.

Clicking the graph window evokes the Inspector window.

Color Key
The color key associates a color with a range of repeat sizes. In the example graph, repeats of sizes 4877 to 5502 are displayed as yellow lines.
The lengths of the shortest and the longest repeat are marked in red. In the example these are 2378 and 8626.
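
The numbers in this example are consistent with dividing the span between the shortest and the longest repeat into ten equal ranges: (8626 - 2378) / 10 = 624.8, which puts the fifth range at roughly 4877 to 5502. The Python sketch below illustrates that kind of equal-width binning. The number of ranges and the function names are assumptions inferred from the example, not a description of how repvis actually computes its color key.

```python
def color_ranges(shortest, longest, bins=10):
    """Split [shortest, longest] into equal-width (low, high) ranges."""
    width = (longest - shortest) / bins
    return [(shortest + i * width, shortest + (i + 1) * width)
            for i in range(bins)]


def range_index(size, shortest, longest, bins=10):
    """Return the 0-based index of the range that a repeat size falls into."""
    if not shortest <= size <= longest:
        raise ValueError("size lies outside the shortest/longest interval")
    if longest == shortest:
        return 0
    index = int((size - shortest) * bins / (longest - shortest))
    # The longest repeat sits on the upper edge and belongs to the last range.
    return min(index, bins - 1)


if __name__ == "__main__":
    for low, high in color_ranges(2378, 8626):
        print(round(low), round(high))
    print(range_index(5000, 2378, 8626))
```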

Genome Selector
Clicking this button evokes a browser containing all files with the extension .bin in the directory specified when calling repvis.

For better readability, a translation list may be used to display full organism names in addition to the filenames (see section Configuring).


Graphics Type
Three different types of graphical representation are available:

Square Graph
Circle Graph
Dot Plot

Color Scheme Selector
Two color schemes are available. The first creates a more 'esthetic' image of the repeats while the second emphasizes the longest, i.e. most important repeats.



Repeats Data Investigator

Depending on the current settings of the selection slider and the limit choice box, this window displays data for all repeats, as extracted with repselect, that fulfill the selection slider setting. The "Show Sequence" check button adds the DNA sequence to the list. "Show mismatch as IUB Code" switches between the IUB representation of incompletely specified residues and a list of all possible bases.
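
The relation between an IUB code and its list of possible bases can be sketched in Python as follows. The table is the standard IUB/IUPAC nucleotide ambiguity code; the function names are hypothetical, and how repvis itself formats the list of bases is not documented here.

```python
# Standard IUB/IUPAC nucleotide codes and the bases each one stands for.
IUB_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Reverse lookup: set of bases -> single-letter code.
BASES_TO_CODE = {frozenset(bases): code for code, bases in IUB_CODES.items()}


def expand(code):
    """Return all bases represented by a single IUB code letter."""
    return IUB_CODES[code.upper()]


def mismatch_code(base_a, base_b):
    """Return the IUB code covering the two bases found at a mismatch."""
    return BASES_TO_CODE[frozenset((base_a.upper(), base_b.upper()))]


if __name__ == "__main__":
    print(mismatch_code("A", "G"))  # code for an A/G mismatch
    print(expand("R"))              # bases behind that code
```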


Repeats Type Selector
These four buttons specify the kind of repeat to display:

The button label contains the number of repeats extracted from the binary input file for each kind. Our example above lists 500 forward repeats. Since there are no P, C or R repeats, the respective buttons are disabled.

Selection Criterium Selector
This choice box determines the repeats selection criterion.


Selection Value Slider
The Selection Value slider allows the selection of repeats according to the setting of the Selection Criterium Selector. The boundaries of the slider are either the shortest and longest repeat of the current repeat kind or the smallest and largest E-value.

Program Status
This line reports various program status messages. Clicking on the status line brings up a status history.


The Inspector Window

After the Inspector window is launched, the user can zoom in to, or out from, a certain region for closer examination of repeat structures. This is achieved by left or right clicking the repeats graph.

Note that the lines connecting the starting positions of a repeat are not drawn under the following condition:

Instead a short vertical marker represents the connecting line.

When a repeat region in the upper sequence symbol is clicked directly, the repeat data associated with this position is displayed in the Data Browser. The repeat sequence can then be displayed with "View Sequence" or submitted to a database query with "Transfer to Netscape".
This feature uses the remote control facility of the Netscape Navigator browser as described in [3].





Exporting a Subsequence

In the Inspector window, the "Save Subsequence" button lets you save a part of the current source sequence.



Annotating the Visualization

An additional file can be created by hand or by shell scripts to annotate the repeats visualization. The annotation file must be a text file in the following format:

Text_Symbol   Position_1   Position_2   Color_Code   #Text_Comment

E.g.:

- 76 81 #00FF00 #1.14 PlyA
= 125 517 #000000 #1.13 Term
<= 682 966 #0000FF #1.12 Intr
<= 1494 2479 #0000FF #1.11 Intr
<= 2644 2702 #0000FF #1.10 Intr
<= 2772 2935 #0000FF #1.09 Intr
<= 3393 4154 #0000FF #1.08 Intr
<= 4327 4431 #0000FF #1.07 Intr

The Text_Symbol is translated into a graphical symbol.

Position_1 and Position_2 denote the starting and ending position of the annotation symbol. Note that some symbols, such as the vertical line, in principle require only one position. Nevertheless, a second position must be supplied; it can have an arbitrary value.

The Color_Code can be a hexadecimal value of the form #RRGGBB: a hash character followed by three two-digit hexadecimal values for red, green and blue. For example, #FF0000 means red and #00FF00 means green. Alternatively, Color_Code can be a color name.
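
The line format described above can be read with a few lines of Python. This is a minimal sketch, not the parser repvis uses: the names Annotation, parse_color and parse_annotation_line are hypothetical, the sketch assumes that fields are separated by whitespace, and it treats the trailing comment as optional.

```python
import re
from typing import NamedTuple

HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")


class Annotation(NamedTuple):
    symbol: str
    start: int
    end: int
    color: str
    comment: str


def parse_color(code):
    """Return (red, green, blue) for #RRGGBB, or None for a color name."""
    if code.startswith("#"):
        if not HEX_COLOR.fullmatch(code):
            raise ValueError(f"malformed hexadecimal color: {code!r}")
        return tuple(int(code[i:i + 2], 16) for i in (1, 3, 5))
    return None  # color names are passed through unresolved


def parse_annotation_line(line):
    """Split one annotation line into its five fields."""
    fields = line.split(None, 4)  # the comment may itself contain spaces
    if len(fields) < 4:
        raise ValueError(f"expected at least 4 fields: {line!r}")
    symbol, start, end, color = fields[:4]
    comment = fields[4].strip() if len(fields) == 5 else ""
    if comment.startswith("#"):
        comment = comment[1:]  # drop the hash that introduces the comment
    return Annotation(symbol, int(start), int(end), color, comment)


if __name__ == "__main__":
    record = parse_annotation_line("<= 682 966 #0000FF #1.12 Intr")
    print(record)
    print(parse_color(record.color))
```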

Text Symbol Graphics Symbol
=
<=
=>
<=>
-
<-
->
<->
V
O
*
|

Available Text Symbols
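
Since annotation files may be produced by scripts, the following Python sketch shows one way to generate them. It accepts only the text symbols listed above and writes one line per record in the format Text_Symbol Position_1 Position_2 Color_Code #Text_Comment. The function names and the output file name are hypothetical; how repvis locates the annotation file is not covered in this section.

```python
# Text symbols taken from the list of available text symbols.
TEXT_SYMBOLS = {"=", "<=", "=>", "<=>", "-", "<-", "->", "<->",
                "V", "O", "*", "|"}


def format_annotation(symbol, start, end, color, comment):
    """Return one annotation line; both positions are always written."""
    if symbol not in TEXT_SYMBOLS:
        raise ValueError(f"unknown text symbol: {symbol!r}")
    return f"{symbol} {int(start)} {int(end)} {color} #{comment}"


def write_annotations(path, records):
    """Write (symbol, start, end, color, comment) tuples, one per line."""
    with open(path, "w") as handle:
        for record in records:
            handle.write(format_annotation(*record) + "\n")


if __name__ == "__main__":
    write_annotations("annotation.txt", [
        ("-", 76, 81, "#00FF00", "1.14 PlyA"),
        ("=", 125, 517, "#000000", "1.13 Term"),
        # A vertical line needs one position only; the second is arbitrary.
        ("|", 600, 600, "#FF0000", "marker"),
    ])
```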