The structure consists of two arrays: one that maps characters to symbols, and one that maps symbols back to characters. Its purpose is to map printable characters to binary symbols, which are consecutive numbers. There are always at least as many characters as there are symbols. Each position in fid_Alphabet::char_to_sym stores the symbol of the corresponding character, and each position in fid_Alphabet::sym_to_char stores one of the characters that map to the corresponding symbol. The set of characters that are mapped to the same symbol is called a character class. The sym_to_char array is only useful for printing sequences to present them to the user; all algorithms should work on the binary representation.
Note that there is no entry for fid_SEPARATOR in the array of symbols (sym_to_char). Also note that up to fid_SYMBOLMAX symbols can be supported. The array of characters (char_to_sym) has UCHAR_MAX+1 entries because an unsigned char can take only UCHAR_MAX+1 distinct values (256 on platforms with 8-bit bytes), which covers ASCII and its 8-bit extensions. That array could easily be extended to contain many more entries to support larger character sets, but the array of symbols cannot grow as easily unless one accepts that a symbol occupies more than one byte. A larger range of characters would therefore still be grouped into at most fid_SYMBOLMAX character classes.
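The two-array layout described above can be illustrated with a minimal sketch. It is not the library's implementation: the names sketch_Symbol, sketch_Alphabet, SKETCH_UNDEF, SKETCH_MAX_SYMS and sketch_alphabet_init are hypothetical, and the '|'-separated class string is an assumed input format chosen only for this example. The real types and constants live in alphabet.h and may differ.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical stand-ins; the real fid_Symbol, fid_WILDCARD and
   fid_Alphabet are defined in alphabet.h and may differ. */
typedef unsigned char sketch_Symbol;
#define SKETCH_UNDEF    UCHAR_MAX  /* marks characters outside the alphabet */
#define SKETCH_MAX_SYMS 16         /* arbitrary limit for this sketch */

typedef struct {
    unsigned num_of_chars;                    /* characters that have a symbol */
    unsigned num_of_syms;                     /* distinct symbols (classes) */
    sketch_Symbol char_to_sym[UCHAR_MAX + 1]; /* indexed by character */
    char sym_to_char[SKETCH_MAX_SYMS];        /* indexed by symbol */
} sketch_Alphabet;

/* Build both maps from character classes separated by '|',
   e.g. "aA|cC|gG|tT". Every character of a class maps to the same
   symbol; the first character of a class becomes its printable
   representative. '|' itself cannot be part of the alphabet here. */
static void sketch_alphabet_init(sketch_Alphabet *a, const char *classes)
{
    int class_open = 0;
    const char *p;

    assert(a != NULL && classes != NULL);
    memset(a->char_to_sym, SKETCH_UNDEF, sizeof a->char_to_sym);
    memset(a->sym_to_char, 0, sizeof a->sym_to_char);
    a->num_of_chars = 0;
    a->num_of_syms = 0;

    for (p = classes; *p != '\0'; ++p) {
        unsigned char c = (unsigned char)*p;

        if (c == '|') {          /* separator: close the current class */
            class_open = 0;
            continue;
        }
        if (!class_open) {       /* first character opens a new class */
            assert(a->num_of_syms < SKETCH_MAX_SYMS);
            a->sym_to_char[a->num_of_syms++] = (char)c;
            class_open = 1;
        }
        /* each character may belong to one class only */
        assert(a->char_to_sym[c] == SKETCH_UNDEF);
        a->char_to_sym[c] = (sketch_Symbol)(a->num_of_syms - 1);
        a->num_of_chars++;
    }
}
```

Initialising such a structure with "aA|cC|gG|tT" gives eight characters in four classes: both 'a' and 'A' map to symbol 0, while symbol 0 maps back to 'a' only, which is why the count of characters is never smaller than the count of symbols.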
Definition at line 117 of file alphabet.h.
#include <alphabet.h>
Data Fields
Type | Name | Description
fid_Uint16 | num_of_chars | Number of printable characters defined by this alphabet.
fid_Uint16 | num_of_syms | Number of symbols defined by this alphabet.
fid_Symbol | char_to_sym [UCHAR_MAX+1] | Mapping from printable characters to binary symbols.
char | sym_to_char [fid_WILDCARD+1] | Mapping from binary symbols to printable characters.
fid_Uint16 fid_Alphabet::num_of_chars
Number of printable characters defined by this alphabet.
Definition at line 119 of file alphabet.h.
Referenced by fid_alphabet_add_wildcard(), fid_alphabet_dump(), fid_alphabet_init_from_speclines(), fid_alphabet_init_from_string(), fid_alphabet_write_to_file(), and fid_suffixarray_dump().
fid_Uint16 fid_Alphabet::num_of_syms
Number of symbols defined by this alphabet.
Definition at line 121 of file alphabet.h.
Referenced by fid_alphabet_add_wildcard(), fid_alphabet_dump(), fid_alphabet_init_from_speclines(), fid_alphabet_init_from_string(), fid_alphabet_write_to_file(), fid_sequences_compute_distribution(), fid_suffixarray_compute_distribution(), fid_suffixarray_dump(), fid_suffixarray_dump_intervals(), fid_suffixarray_get_intervals(), and fid_suffixarray_traverse().
fid_Symbol fid_Alphabet::char_to_sym[UCHAR_MAX+1]
Mapping from printable characters to binary symbols.
Definition at line 123 of file alphabet.h.
Referenced by fid_alphabet_add_wildcard(), fid_alphabet_dump(), fid_alphabet_init_from_speclines(), fid_alphabet_init_from_string(), fid_alphabet_transform_string(), and fid_alphabet_write_to_file().
char fid_Alphabet::sym_to_char[fid_WILDCARD+1]
Mapping from binary symbols to printable characters.
Definition at line 125 of file alphabet.h.
Referenced by fid_alphabet_add_wildcard(), fid_alphabet_dump(), fid_alphabet_init_from_speclines(), and fid_alphabet_init_from_string().
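How the two mappings are typically used together can be sketched as a round trip: encode text into symbols for the algorithms, then decode symbols only for display. The helpers sketch_encode and sketch_decode and the marker SKETCH_UNDEF are hypothetical names for this illustration and are not part of the library's API; the tables are passed in as plain arrays shaped like char_to_sym and sym_to_char.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_UNDEF UCHAR_MAX /* hypothetical marker for unmapped characters */

/* Encode a NUL-terminated string into symbols using a char_to_sym-style
   table. out must hold strlen(in) entries. Returns 0 on success and -1
   if a character has no symbol. */
static int sketch_encode(const unsigned char char_to_sym[UCHAR_MAX + 1],
                         const char *in, unsigned char *out)
{
    size_t i, n = strlen(in);

    for (i = 0; i < n; ++i) {
        unsigned char s = char_to_sym[(unsigned char)in[i]];
        if (s == SKETCH_UNDEF)
            return -1;
        out[i] = s;
    }
    return 0;
}

/* Decode n symbols into printable characters using a sym_to_char-style
   table. out must hold n + 1 bytes; a terminating '\0' is written. */
static void sketch_decode(const char *sym_to_char, size_t num_of_syms,
                          const unsigned char *in, char *out, size_t n)
{
    size_t i;

    for (i = 0; i < n; ++i) {
        assert(in[i] < num_of_syms); /* symbol must be within the table */
        out[i] = sym_to_char[in[i]];
    }
    out[n] = '\0';
}
```

With tables in which 'a'/'A', 'c'/'C', 'g'/'G' and 't'/'T' form four classes represented by the lowercase letters, encoding "AcGt" and decoding the result yields "acgt". The round trip is lossy by design, because sym_to_char keeps only one representative per character class.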