On similarity codes

Bibliographic details
Published in:	IEEE Transactions on Information Theory, 2000-07, Vol. 46 (4), pp. 1558-1564
Main authors:	D'yachkov, A.G., Torney, D.C.
Format:	Article
Language:	English
Subjects:	Alphabets; Cartesian; Deoxyribonucleic acid; Information theory; Linear codes; Nonlinearity; Similarity; Upper bounds

Description
Abstract:	We introduce a biologically motivated measure of sequence similarity for quaternary N-sequences, extending Hamming similarity. This measure is the sum over the length of the sequences of "alphabetic" similarities at all positions. Alphabetic similarities are defined, symmetrically, on the Cartesian square of the alphabet. These similarities equal zero whenever the two elements differ. In distinction to Hamming similarity, however, our alphabetic similarities take individual values whenever the two elements are identical. In this correspondence we derive lower and upper bounds on the rate of the corresponding quaternary nonlinear and linear codes called similarity codes and applied to DNA sequences.
ISSN:	0018-9448, 1557-9654
DOI:	10.1109/18.850695
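
The similarity measure described in the abstract can be illustrated with a minimal sketch. It assumes the DNA alphabet {A, C, G, T} and uses hypothetical per-letter weights: the abstract states only that identical letters receive individual values and differing letters receive zero, so the numbers in `WEIGHTS` and the function names are illustrative assumptions, not values or notation taken from the paper.

```python
# Hypothetical weights for matching letters; NOT taken from the paper.
# The abstract only says each identical pair has its own value.
WEIGHTS = {"A": 2, "C": 3, "G": 3, "T": 2}


def alphabetic_similarity(a, b, weights=WEIGHTS):
    """Similarity of two letters: the letter's weight if equal, else zero."""
    return weights[a] if a == b else 0


def sequence_similarity(x, y, weights=WEIGHTS):
    """Sum of alphabetic similarities over all positions of two N-sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length N")
    return sum(alphabetic_similarity(a, b, weights) for a, b in zip(x, y))


def hamming_similarity(x, y):
    """Special case with every weight equal to 1: the number of matching positions."""
    return sequence_similarity(x, y, {s: 1 for s in "ACGT"})


x, y = "ACGT", "ACCT"
print(sequence_similarity(x, y))  # matches at A, C, T: 2 + 3 + 2 = 7
print(hamming_similarity(x, y))   # three matching positions: 3
```

Setting all weights to 1 recovers Hamming similarity, which is the sense in which the weighted measure extends it.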