DFG-Subject - Advanced Projects

A system for the declarative description and efficient search of hybrid patterns in large genomic data sets

This project is supported by a grant from the "Deutsche Forschungsgemeinschaft". It is part of the special program on Computational Methods for the Analysis and Interpretation of large genomic data sets.

Abstract


This project is a collaboration of the Institute for Biophysics at the University of Düsseldorf and the Technical Faculty at the University of Bielefeld.

Project Team

Institut für Physikalische Biologie
Heinrich-Heine-Universität Düsseldorf
Universitätsstraße 1, Gebäude 26.12.U1
D-40225 Düsseldorf
Germany

Dr. Gerhard Steger
steger@biophys.uni-duesseldorf.de
Phone: +49-211-81-14927
Fax: +49-211-81-15167
    Development of bioinformatic tools
      Thermodynamic stability of double-stranded nucleic acids
      Thermodynamic Prediction of Conserved Secondary Structure in RNA
      Thermodynamic stability of single-stranded nucleic acids
      Kinetic/sequential folding of single-stranded nucleic acids
      Search for and description of hybrid patterns
    Biophysics/Bioinformatics
      Thermodynamics and kinetics of nucleic acids
    Methods
      Temperature-gradient gel-electrophoresis (TGGE)
      Optical denaturation curves
      Hydrodynamics
    Objects
      Viroids
      Viral RNAs
      Ribozymes

Stefan Gräf
graef@biophys.uni-duesseldorf.de
Phone: +49-211-81-14927
Fax: +49-211-81-15167

AG Praktische Informatik
Technische Fakultät
Universität Bielefeld
Postfach 10031
D-33501 Bielefeld
Germany

Dr. Stefan Kurtz
kurtz@techfak.uni-bielefeld.de
Phone: +49-521-106-2910
Fax: +49-521-106-6411
    Data Structures and Algorithms and their Efficient Implementation
    Approximate Patterns
    Index Structures for Sequences (e.g. suffix trees)
    Structural Patterns (e.g. repeats and palindromes)
    Implementation of Programming Languages

Dirk Strothmann
dirks@techfak.uni-bielefeld.de
Phone: +49-521-106-2914
Fax: +49-521-106-6411
    Combinatorics
    Information Theory
    Hybrid Patterns

Current Project Status

We are currently developing HyPa, a system for searching for hybrid patterns in large genomic data sets. Hybrid patterns combine sequence similarities, structural similarities, and other user-defined properties such as thermodynamic constraints. An important part of HyPa is HyPaL, the pattern-matching language used to describe hybrid patterns.
The accompanying database, HyPaLib (Hybrid Pattern Library), contains annotated structural elements that are characteristic of certain classes of structural and/or functional RNAs. These elements are described in HyPaL, which was designed specifically for this purpose and allows hybrid patterns to be specified conveniently. We are developing software tools that let a user search sequence databases for any pattern in HyPaLib, thus providing functionality similar to PROSITE, but dedicated to the more complex patterns found in RNA sequences.
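To make the idea of a hybrid pattern concrete, the following Python sketch combines the three kinds of constraint named above: a sequence motif, a structural requirement, and a user-defined property. It is not HyPaL syntax and not the search algorithm used by HyPa; it is a naive scan written only for illustration. The function name find_hybrid_hits, the choice of a GNRA tetraloop as the motif, and the use of a minimum G-C pair count as a crude stand-in for a thermodynamic constraint are all assumptions made for this example.

```python
import re

# Watson-Crick pairs plus the G-U wobble pair commonly allowed in RNA stems.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

# Sequence component: a GNRA tetraloop (N = any base, R = A or G).
# The lookahead lets overlapping occurrences be reported.
LOOP = re.compile(r"(?=(G[ACGU][AG]A))")


def find_hybrid_hits(seq, stem_len=4, min_gc=2):
    """Return (stem_start, stem5, loop, stem3) tuples for hairpins in seq.

    Hypothetical helper for illustration. A hit needs:
    (1) a GNRA loop (sequence constraint),
    (2) stem_len bases on each side of the loop that pair in antiparallel
        orientation (structural constraint),
    (3) at least min_gc G-C pairs in the stem (a crude proxy for a
        thermodynamic constraint, not a real energy model).
    """
    seq = seq.upper().replace("T", "U")  # accept DNA input as well
    hits = []
    for m in LOOP.finditer(seq):
        loop_start = m.start()
        loop_end = loop_start + 4
        if loop_start < stem_len or loop_end + stem_len > len(seq):
            continue  # not enough flanking sequence for a full stem
        stem5 = seq[loop_start - stem_len:loop_start]
        stem3 = seq[loop_end:loop_end + stem_len]
        # Pair the i-th base of the 5' arm with the i-th base counted
        # from the end of the 3' arm.
        pairs = list(zip(stem5, reversed(stem3)))
        if not all(p in PAIRS for p in pairs):
            continue
        gc = sum(1 for p in pairs if set(p) == {"G", "C"})
        if gc >= min_gc:
            hits.append((loop_start - stem_len, stem5, m.group(1), stem3))
    return hits


if __name__ == "__main__":
    print(find_hybrid_hits("AAGGCCGAAAGGCCAA"))
```

Relaxing min_gc to 0 turns the search into a purely sequence-plus-structure pattern, which shows how the user-defined property acts as an additional filter on top of the other two components.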
Documentation of HyPaL
    Syntax and Semantics of the Hybrid Pattern Language HyPaL
HyPaLib Documentation
    Description of the keywords in the Library
HyPaLib (HTML)
    Hybrid Pattern Library in HTML format
HyPaLib (TXT)
    Hybrid Pattern Library in ASCII format


Dirk Strothmann
Last modified: Wed Apr 21 14:30:12 MET 2004