2-Adic clustering of the PAM matrix
Published in: | Journal of theoretical biology 2009-12, Vol.261 (3), p.396-406 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | In this paper we demonstrate that the system of 2-adic numbers provides new insight into some problems of genetics, in particular the degeneracy of the genetic code and the structure of the PAM matrix in bioinformatics. The 2-adic distance is an ultrametric, and applications of ultrametrics in bioinformatics are not surprising. However, by using the 2-adic numbers we match the ultrametric with a number-theoretic structure. In this way we find new applications of an ultrametric that differ from those known so far in bioinformatics.
We obtain the following results. We show that the PAM matrix $A$ can be expanded into the sum of two matrices, $A = A^{(2)} + A^{(\infty)}$, where the matrix $A^{(2)}$ is 2-adically regular (i.e. its matrix elements are close to locally constant with respect to the 2-adic parametrization of the genetic code discussed earlier by the authors), and the matrix $A^{(\infty)}$ is sparse. We discuss the structure of the matrix $A^{(\infty)}$ in relation to the side chain properties of the corresponding amino acids.
We introduce the family of substitution matrices $A(\alpha, \beta) = \alpha A^{(2)} + \beta A^{(\infty)}$, $\alpha, \beta \geq 0$, which should make it possible to vary the alignment procedure in order to take into account the different chemical and geometric properties of the amino acids. |
---|---|
ISSN: | 0022-5193 1095-8541 |
DOI: | 10.1016/j.jtbi.2009.08.014 |
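The two ideas named in the abstract, the 2-adic distance as an ultrametric and the two-parameter family of substitution matrices, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`two_adic_distance`, `is_ultrametric`, `combine`) are hypothetical, the distance is computed on plain integers rather than on the authors' 2-adic parametrization of the genetic code (which this record does not reproduce), and the small matrices passed to `combine` are placeholder values, not PAM entries.

```python
from fractions import Fraction
from itertools import product


def two_adic_valuation(n: int) -> int:
    """Return the largest k such that 2**k divides the nonzero integer n."""
    if n == 0:
        raise ValueError("the 2-adic valuation of 0 is infinite")
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    return k


def two_adic_distance(x: int, y: int) -> Fraction:
    """2-adic distance |x - y|_2 = 2**(-v), where v is the valuation of x - y."""
    if x == y:
        return Fraction(0)
    return Fraction(1, 2 ** two_adic_valuation(x - y))


def is_ultrametric(points) -> bool:
    """Check the strong triangle inequality d(x, z) <= max(d(x, y), d(y, z))."""
    pts = list(points)
    return all(
        two_adic_distance(x, z)
        <= max(two_adic_distance(x, y), two_adic_distance(y, z))
        for x, y, z in product(pts, repeat=3)
    )


def combine(alpha, beta, regular, sparse):
    """Entrywise alpha * regular + beta * sparse, with alpha, beta >= 0.

    `regular` and `sparse` stand in for the two summands of the expansion;
    here they are equally sized nested lists holding toy values.
    """
    if alpha < 0 or beta < 0:
        raise ValueError("alpha and beta must be non-negative")
    return [
        [alpha * r + beta * s for r, s in zip(row_r, row_s)]
        for row_r, row_s in zip(regular, sparse)
    ]
```

With `alpha = beta = 1` the function `combine` returns the entrywise sum of its two inputs, mirroring the expansion of the matrix into a regular and a sparse part; other non-negative weights shift the balance between the two parts.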