Rose - Manual


$Id: manual.html,v 1.8 2003/11/24 14:44:38 hmersch Exp $

NAME
        ROSE - Random-model Of Sequence Evolution 
        Rose implements a new probabilistic model of RNA-, DNA-, or
        protein-sequence evolution.

SYNOPSIS
        rose [-I <dir>[:<dir>]] <input file> | -

AVAILABILITY
        http://bibiserv.TechFak.Uni-Bielefeld.DE/rose/

DESCRIPTION

Rose: generating sequence families

Jens Stoye (1), Dirk Evers (2) and Folker Meyer (2)

1 Research Center for Interdisciplinary Studies on Structure Formation (FSPM)
2 Technische Fakultaet, University of Bielefeld, Postfach 100 131,
  33501 Bielefeld, Germany 

Motivation: We present a new probabilistic model of the evolution of
RNA-, DNA-, or protein-like sequences and a software tool, Rose, that
implements this model. Guided by an evolutionary tree, a family of
related sequences is created from a common ancestor sequence by
insertion, deletion and substitution of characters. During this
artificial evolutionary process, the `true' history is logged and the
`correct' multiple sequence alignment is created simultaneously. The
model also allows for varying rates of mutation within the sequences,
making it possible to establish so-called sequence motifs.  

Results: The data created by Rose are suitable for the evaluation of
methods in multiple sequence alignment computation and the prediction of
phylogenetic relationships. It can also be useful when teaching courses
in or developing models of sequence evolution and in the study of
evolutionary processes.
 
OPTIONS
        -I dir[:dir]
                A colon-separated list of directories that the input
                parser searches for include files.

USAGE

        rose <input file> | -

        Input can come from stdin (specify a '-' (minus) on the command line)
        or from an input file.
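
The two invocation forms can be driven from a script. The following is a
minimal Python sketch, not part of Rose itself: the helper names
build_rose_command and run_rose are made up for illustration, and the sketch
assumes a rose executable is on the PATH and that results go to stdout
(the StdOut default).

```python
import subprocess
from typing import List, Optional, Sequence


def build_rose_command(input_file: Optional[str] = None,
                       include_dirs: Sequence[str] = ()) -> List[str]:
    """Return the argument list for one rose invocation (hypothetical helper)."""
    command = ["rose"]
    if include_dirs:
        # -I takes a single colon-separated list of directories.
        command += ["-I", ":".join(include_dirs)]
    # A lone '-' tells rose to read its input from stdin.
    command.append(input_file if input_file is not None else "-")
    return command


def run_rose(input_text: str, include_dirs: Sequence[str] = ()) -> str:
    """Feed input_text to rose on stdin and return what it writes to stdout.

    Assumes 'rose' is installed and on the PATH.
    """
    result = subprocess.run(build_rose_command(None, include_dirs),
                            input=input_text, capture_output=True,
                            text=True, check=True)
    return result.stdout
```

build_rose_command("sample2") yields the file form, and calling it without a
file name yields the stdin form ending in '-'.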

        The input stream may contain the following parameters:

Name                    Type        Default           Optional  Comment

StdOut                  Boolean     True              Yes       output to stdout
OutputFilename          String      None              Yes       output to single filename
OutputFilebase          String      None              Yes       out to separate files named...
SequenceSuffix          String      ".fas"            Yes       sequence file suffix
AlignmentFormat         String      "PHYLIP"          Yes       "FASTA" or "PHYLIP"
AlignmentWithAncestors  Boolean     False             Yes       alignment will contain ancestors
AlignmentSuffix         String      ".fa" or ".phy"   Yes       alignment file suffix
TreeSuffix              String      ".tree"           Yes       tree file suffix
SequenceOutputLen       Integer     60                Yes       Length of Seq on a Line
SeedVal                 Integer     None              Yes       Seed of random num gen
SequenceLen             Integer     100               Yes       average sequence length
SequenceNum             Integer     10                Yes       How many sequences?
InputType               Integer     1                 Yes       1=Protein, 4=DNA
Relatedness             Integer     1                 Yes       nonsense default value!
ChooseFromLeaves        Boolean     True              Yes       Output only leaf seqs
TreeWithSequences       Boolean     False             Yes       Tree with seqs attached
TreeSequencesWithGaps   Boolean     False             Yes       Sequences in tree will contain alignment gaps
TreeWithAncestors       Boolean     False             Yes       Give all ancestors in the tree
TheTree                 Tree        None              Yes       Tree in Phylip format
TheSequence             String      None              Yes       Start Sequence
ThePAMMatrix            FP Matrix   None              No!*      The Mutation Matrix
TheAlphabet             String      None              No!       The used Alphabet
TheFreq                 FP Vector   None              No!       The average freq of Elem
TheInsertThreshold      FP          0.03              Yes       Insertion only % time
TheDeleteThreshold      FP          0.03              Yes       Deletion only % time
TheMutationProbability  FP Vector   [1.0+]            Yes       at a given site
TheDNAmodel             String      None              No!*      "JC","HKY","F81","F84","K2P"
MeanSubstitution        Double      0.01342302        Yes       Mean Subst. Rate (all)
TransitionBias          Double      1.0               Yes       needed for HKY, K2P
TTratio                 Double      0.0               Yes       Transition/Transversion (F84)
NumberOfRuns            Integer     1                 Yes       number of rose-runs
TheInsFunc              FP Vector   None              No!       Prob of certain length
TheDelFunc              FP Vector   None              No!       Prob of certain length


* either ThePAMMatrix or TheDNAModel has to be specified !!
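
Before starting a run it can help to check that every parameter marked "No!"
in the Optional column is present. The Python sketch below is only an
illustration under stated assumptions: missing_parameters is a made-up helper,
the parameters are assumed to be held in a plain dict keyed by name, and the
DNA model is spelled TheDNAmodel as in the table (the footnote spells it
TheDNAModel).

```python
# Parameters marked "No!" in the Optional column of the table.
REQUIRED = ("TheAlphabet", "TheFreq", "TheInsFunc", "TheDelFunc")
# Parameters marked "No!*": at least one of them has to be given.
MODEL_CHOICES = ("ThePAMMatrix", "TheDNAmodel")


def missing_parameters(params):
    """Return the names of mandatory parameters absent from params."""
    missing = [name for name in REQUIRED if name not in params]
    if not any(name in params for name in MODEL_CHOICES):
        missing.append(" or ".join(MODEL_CHOICES))
    return missing


dna_like = {
    "TheAlphabet": "ACGT",
    "TheFreq": [0.25, 0.25, 0.25, 0.25],
    "TheInsFunc": [0.2, 0.2, 0.2, 1, 1, 1, 1],
    "TheDelFunc": [0.2, 0.2, 0.2, 1, 1, 1, 1],
    "TheDNAmodel": "JC",
}
print(missing_parameters(dna_like))
print(missing_parameters({"TheAlphabet": "ACGT"}))
```

An empty list means nothing mandatory is missing. The check says nothing about
whether the values themselves are acceptable to Rose.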

Assignment
==========

{Tag} = {Value} [;]

Example:
OutputFilename = "myoutput";
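
Input files in this format can also be generated by a script. The Python
sketch below is a hedged illustration, not a Rose API: format_value and
format_assignment are made-up names, only Boolean, Integer, FP, String and
Vector values are covered (Tree values are not), and the optional trailing
semicolon is always written.

```python
def format_value(value):
    """Render a Python value as a literal for a Rose input file (sketch)."""
    if isinstance(value, bool):              # test bool first: bool is an int
        return "True" if value else "False"
    if isinstance(value, int):
        if value < 0:
            raise ValueError("Integer literals are digits only")
        return str(value)
    if isinstance(value, float):
        text = repr(value)
        # Reject forms such as '1e-05', '-0.5', 'inf' or 'nan', which do not
        # consist of digits and a single decimal point.
        if not all(ch.isdigit() or ch == "." for ch in text):
            raise ValueError("cannot write %r as a plain FP literal" % value)
        return text
    if isinstance(value, str):
        if '"' in value or "\n" in value:
            raise ValueError("strings may not contain quotes or newlines")
        return '"' + value + '"'
    if isinstance(value, (list, tuple)):     # vectors and nested matrices
        return "[" + ",".join(format_value(item) for item in value) + "]"
    raise TypeError("unsupported value type: %s" % type(value).__name__)


def format_assignment(tag, value):
    """Return one '{Tag} = {Value};' line."""
    return "%s = %s;" % (tag, format_value(value))


print(format_assignment("OutputFilename", "myoutput"))
print(format_assignment("TheFreq", [0.25, 0.25, 0.25, 0.25]))
```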

Includes
========

Include directives may be placed anywhere between complete assignments
in the input file and may be nested to a given depth.

%include {include filename}

Example:
%include protein-defaults
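
To see which files an input file pulls in, the include directives can be
listed with a few lines of Python. This is a sketch under assumptions, not a
description of Rose's parser: find_includes and resolve_include are made-up
helpers, the file name is assumed to be a single unquoted token on its own
line, and trying the directories in the order given is a guess at how the -I
list is used.

```python
import os
import re

# One directive per line: %include <filename>
INCLUDE_LINE = re.compile(r"^\s*%include\s+(\S+)\s*$")


def find_includes(text):
    """Return the file names named by %include lines, in order."""
    names = []
    for line in text.splitlines():
        match = INCLUDE_LINE.match(line)
        if match:
            names.append(match.group(1))
    return names


def resolve_include(name, include_dirs):
    """Return the first existing path for name in include_dirs, else None."""
    for directory in include_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


sample = "# Sample input\n%include dna-defaults\nSequenceNum = 5\n"
print(find_includes(sample))
```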

Comments
========

Can be any of:

C type comments:        /* A comment
                        stretching several lines */
C++ type comments:      //Another comment ending with the line

Bourne Shell comments:  # The hash has to be the first character on the line

Type Description
================

Name            Regexp like             Description             Example

Integer         {DIGIT}+                1 or more digits        123
FP              {DIGIT}+"."{DIGIT}*     FP has to have "."      3.4 or .5
                "."{DIGIT}*
Boolean         [Tt]"rue"                                       True or false
                [Ff]"alse"
String          \"[^\"\n]*\"            double quoted
                                        text no newlines        "An Example"
Vector          [\[\{]{Objects}[\]\}]                           [4], {.4,5.5}
Matrix                                                          [[3,2],[5,5]]
Tree                                    Phylip Tree             (a,b,(c,d:5,e));
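
The scalar types in this table translate directly into regular expressions.
The Python sketch below is an approximation for illustration only; classify is
a made-up helper, not Rose's lexer. It reads the Boolean pattern as
[Tt]rue or [Ff]alse, and it assumes that a literal starting with "." needs at
least one digit after it, which is slightly stricter than the pattern printed
in the table.

```python
import re

# Order matters: FP is tried before Integer, although fullmatch keeps the
# two from overlapping anyway.
PATTERNS = [
    ("Boolean", re.compile(r"[Tt]rue|[Ff]alse")),
    ("FP", re.compile(r"\d+\.\d*|\.\d+")),
    ("Integer", re.compile(r"\d+")),
    ("String", re.compile(r'"[^"\n]*"')),
]


def classify(token):
    """Return the type name whose pattern matches the whole token, or None."""
    for name, pattern in PATTERNS:
        if pattern.fullmatch(token):
            return name
    return None


for token in ["123", "3.4", ".5", "True", "false", '"An Example"', "abc"]:
    print(token, classify(token))
```

Vector, Matrix and Tree values are nested structures and would need a real
parser rather than a single pattern.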


Parse Errors
============

Parse errors are reported in compiler style, giving file names and line
numbers in a nested fashion together with the expected symbol.

Example:
In file included from sample1:2:
protein-defaults:13: parse error, expecting `EQ' or `OBRACE'
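
Messages in this style are easy to pick apart when Rose is run from a script.
The Python sketch below handles only the two line shapes shown in the example;
parse_error_report is a made-up helper, and how Rose prints deeper include
nesting is an assumption (one "In file included from" line per level).

```python
import re

INCLUDED_FROM = re.compile(
    r"^In file included from (?P<file>[^:\s]+):(?P<line>\d+):$")
ERROR_LINE = re.compile(
    r"^(?P<file>[^:\s]+):(?P<line>\d+): (?P<message>.+)$")


def parse_error_report(text):
    """Return one dict per error line, with the include chain leading to it."""
    errors = []
    include_chain = []
    for raw in text.splitlines():
        line = raw.strip()
        match = INCLUDED_FROM.match(line)
        if match:
            include_chain.append((match.group("file"),
                                  int(match.group("line"))))
            continue
        match = ERROR_LINE.match(line)
        if match:
            errors.append({
                "file": match.group("file"),
                "line": int(match.group("line")),
                "message": match.group("message"),
                "included_from": include_chain,
            })
            include_chain = []      # start a fresh chain for the next error
    return errors


report = ("In file included from sample1:2:\n"
          "protein-defaults:13: parse error, expecting `EQ' or `OBRACE'\n")
for error in parse_error_report(report):
    print(error)
```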


EXAMPLES

rose sample2

Takes this input file
----------------------------------------------
# Sample2 for ISMB97 Poster

%include dna-defaults

SequenceNum = 5
ChooseFromLeaves = True
TheSequence = "AGTCTGTACTATAATGGGAGGAAAGCC"
TheTree = ((a:3,b:5):5,(c:4,d:2,e:4):5,(f:3,g:4):6,(h:3,i:3):4);
TheMutationProbability =
        [1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,
        0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0]
----------------------------------------------

includes this default file
----------------------------------------------
#
# default rose include file for DNA
#

InputType = 4 // DNA
TheAlphabet = "ACGT"
TheFreq = [.25,.25,.25,.25]

TheInsertThreshold = 0.09
TheDeleteThreshold = 0.09

TheInsFunc = [.2,.2,.2,1,1,1,1]
TheDelFunc = [.2,.2,.2,1,1,1,1]

ThePAMMatrix = [[.97,.01,.01,.01],
                [.01,.97,.01,.01],
                [.01,.01,.97,.01],
                [.01,.01,.01,.97]]
----------------------------------------------

results in something like this
----------------------------------------------
#i
ACGCTGTAGTATAATGGGAGGAACGCT

#h
ACTATGTCCAATCAACTATAATGGGAGGAACCCT

#e
AGTCCGTACTATAATGGGTTCCAGGAATGC

#d
AGTCAGTACTATAATGGGTTCCAGGAAAGC

#c
AGTCCGTAATATAATGTGTTCCAGGAATCC


Alignment:
       i  ACGCTGT-------AGTATAATGGG----AGGAACGCT
       h  ACTATGTCCAATCAACTATAATGGG----AGGAACCCT
       e  AGTCCGT-------ACTATAATGGGTTCCAGGAATGC-
       d  AGTCAGT-------ACTATAATGGGTTCCAGGAAAGC-
       c  AGTCCGT-------AATATAATGTGTTCCAGGAATCC-


(
(
(
i:3,
h:3):4,
(
e:4,
d:2,
c:4):5));
----------------------------------------------
Giving you:

1. The chosen sequences (leaves only, since ChooseFromLeaves = True)
2. Their alignment
3. The corresponding tree with distances
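
The sample2 run also shows how TheMutationProbability can protect a motif. The
Python sketch below checks this on the data printed in this example; it is an
illustration only. conserved_segments is a made-up helper, and reading a
value of 0.0 as "this site does not mutate" is an interpretation of the
parameter table, not a statement from the Rose source.

```python
sequence = "AGTCTGTACTATAATGGGAGGAAAGCC"
# Same 27 values as in sample2: 9 ones, 6 zeros, 12 ones.
mutation_probability = [1.0] * 9 + [0.0] * 6 + [1.0] * 12


def conserved_segments(seq, probabilities):
    """Return (start, text) for each maximal run of sites with value 0.0."""
    if len(seq) != len(probabilities):
        raise ValueError("one probability per sequence position is expected")
    segments = []
    start = None
    for index, value in enumerate(probabilities):
        if value == 0.0 and start is None:
            start = index                       # a protected run begins
        elif value != 0.0 and start is not None:
            segments.append((start, seq[start:index]))
            start = None
    if start is not None:                       # run reaches the sequence end
        segments.append((start, seq[start:]))
    return segments


alignment = {
    "i": "ACGCTGT-------AGTATAATGGG----AGGAACGCT",
    "h": "ACTATGTCCAATCAACTATAATGGG----AGGAACCCT",
    "e": "AGTCCGT-------ACTATAATGGGTTCCAGGAATGC-",
    "d": "AGTCAGT-------ACTATAATGGGTTCCAGGAAAGC-",
    "c": "AGTCCGT-------AATATAATGTGTTCCAGGAATCC-",
}

segments = conserved_segments(sequence, mutation_probability)
motif = segments[0][1]
# First alignment column at which the motif occurs in each row.
motif_columns = {name: row.find(motif) for name, row in alignment.items()}
print(segments)
print(motif_columns)
```

The protected stretch is TATAAT, starting at position 9 (counting from 0) of
the start sequence, and it occurs unchanged at alignment column 16 in all five
output rows.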

ENVIRONMENT
        No environment variables are used.

FILES
        protein-defaults        default config file to include for protein seqs
        dna-defaults            default config file to include for dna seqs

SEE ALSO

        For a complete description of the functionality of ROSE see:

        Stoye, J., Evers, D., & Meyer, F. (1998)
        Rose: generating sequence families.
        In Bioinformatics, Vol. 14, Issue 2, pp. 157-163.

http://www.oup.co.uk/bioinformatics/hdb/Volume_14/Issue_02/ps/btb005_gml.ps.gz

preprint version:
ftp://ftp.uni-bielefeld.de/pub/papers/techfak/pi/Report97-04.ps.gz

BUGS
        If you encounter strange behaviour please contact:
        mailto:folker@TechFak.Uni-Bielefeld.DE
        mailto:dirk@TechFak.Uni-Bielefeld.DE
        mailto:Jens.Stoye@CeBiTec.Uni-Bielefeld.DE

Fri Dec 14 13:01:22 2012