On similarity codes
Published in: | IEEE Transactions on Information Theory, 2000-07, Vol. 46 (4), p. 1558-1564 |
---|---|
Main authors: | , |
Format: | Article |
Language: | English |
Abstract: | We introduce a biologically motivated measure of sequence similarity for quaternary N-sequences, extending Hamming similarity. This measure is the sum, over the length of the sequences, of "alphabetic" similarities at all positions. Alphabetic similarities are defined, symmetrically, on the Cartesian square of the alphabet. These similarities equal zero whenever the two elements differ. In contrast to Hamming similarity, however, our alphabetic similarities take individual values whenever the two elements are identical. In this correspondence we derive lower and upper bounds on the rate of the corresponding quaternary nonlinear and linear codes, called similarity codes, which are applied to DNA sequences. |
---|---|
ISSN: | 0018-9448 1557-9654 |
DOI: | 10.1109/18.850695 |
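
The measure described in the abstract can be illustrated with a minimal sketch. The function name `similarity` and both weight tables are hypothetical and chosen only for illustration; the record does not give the actual alphabetic similarity values used in the paper. The sketch assumes the alphabet {A, C, G, T}, scores zero at every position where the two symbols differ, and adds a symbol-specific weight where they agree. With all weights equal to 1 it reduces to Hamming similarity (the number of agreeing positions).

```python
from typing import Mapping

# Hypothetical per-symbol weights, for illustration only (not from the paper).
EXAMPLE_WEIGHTS = {"A": 2, "T": 2, "C": 3, "G": 3}

# With unit weights the measure reduces to Hamming similarity.
HAMMING_WEIGHTS = {"A": 1, "C": 1, "G": 1, "T": 1}


def similarity(x: str, y: str, weights: Mapping[str, float]) -> float:
    """Sum of alphabetic similarities over all positions.

    A position contributes weights[a] when both sequences carry the same
    symbol a, and 0 when the symbols differ.
    """
    if len(x) != len(y):
        raise ValueError("sequences must have equal length")
    return sum(weights[a] for a, b in zip(x, y) if a == b)


if __name__ == "__main__":
    # Positions 1-3 agree (A, C, G); position 4 differs (T vs A).
    print(similarity("ACGT", "ACGA", EXAMPLE_WEIGHTS))  # 2 + 3 + 3 = 8
    print(similarity("ACGT", "ACGA", HAMMING_WEIGHTS))  # 3 agreeing positions
```

Because the weight depends only on the shared symbol, the measure is symmetric in its two arguments, matching the symmetric definition on the Cartesian square of the alphabet.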