Using Huffman coding method to visualize and analyze DNA sequences

Bibliographic details
Published in: Journal of Computational Chemistry, 2011-11, Vol. 32 (15), p. 3233-3240
Main authors: Qi, Zhao-Hui; Li, Ling; Qi, Xiao-Qin
Format: Article
Language: English
Description
Abstract: On the basis of the Huffman coding method, we propose a new graphical representation of DNA sequences. The representation can avoid degeneracy and loss of information in the transfer of data from a DNA sequence to its graphical representation. A multicomponent vector is then introduced to characterize DNA sequences quantitatively. The components of the vector are derived from the graphical representation of the DNA primary sequence. An examination of the similarities and dissimilarities among the complete coding sequences of the β‐globin gene of 11 species and among six ND6 proteins shows the utility of the scheme. © 2011 Wiley Periodicals, Inc. J Comput Chem, 2011
ISSN: 0192-8651, 1096-987X
DOI: 10.1002/jcc.21906
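
Illustration: The abstract names the Huffman coding method but does not describe how the authors map codes to a graph. As a rough orientation only, the generic Huffman step can be sketched as follows. This is a minimal sketch, assuming that nucleotides are weighted by their frequency in the sequence; the function names (`huffman_codes`, `encode`, `decode`) and the toy sequence are hypothetical, and the code is not the authors' graphical representation or their multicomponent vector.

```python
import heapq
from collections import Counter


def huffman_codes(sequence):
    """Build a Huffman code table for the symbols in `sequence`.

    Assumption: symbols are weighted by how often they occur in the input.
    """
    counts = Counter(sequence)
    if not counts:
        return {}
    if len(counts) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie breaker, {symbol: partial code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; prefix their codes with 0 and 1.
        w0, _, left = heapq.heappop(heap)
        w1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(sequence, codes):
    """Concatenate the code word of every symbol in the sequence."""
    return "".join(codes[s] for s in sequence)


def decode(bits, codes):
    """Invert `encode`; works because Huffman codes are prefix-free."""
    inverse = {c: s for s, c in codes.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)


if __name__ == "__main__":
    toy = "AAAAAAAACCCCGGT"  # hypothetical toy sequence, not real data
    table = huffman_codes(toy)
    bits = encode(toy, table)
    print(table)
    print(bits, decode(bits, table) == toy)
```

Because the code is prefix-free, the bit string decodes back to exactly one sequence, which is the property that lets a code-based representation avoid loss of information.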