Computational localization of transcription factor binding sites using extreme learning machines


Bibliographic details
Published in: Soft computing (Berlin, Germany), 2012-09, Vol.16 (9), p.1595-1606
Main authors: Wang, Dianhui; Do, Hai Thanh
Format: Article
Language: English
Description
Summary: Computational localization of transcription factor binding sites (TFBSs, also termed motif instances) in DNA sequences helps biologists save experimental cost and time in motif discovery. The task can be formulated as a feature-based object location identification problem, which differs markedly from traditional pattern recognition tasks. This paper develops a machine learning approach to TFBS location prediction using feature-based classifiers. Specific features are extracted to characterize TFBSs and distinguish them from random k-mers. A sampling technique is then employed to generate dummy positives in the feature space to improve prediction performance. Three learner models are examined, and a simple ensemble method is adopted in the classifier design. Experimental results on eight benchmark datasets demonstrate that the proposed techniques have good potential for conserved motif detection. Comparative studies indicate that the extreme learning machine-based ensemble classifier outperforms the other learner models in terms of overall prediction accuracy and computational complexity.
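As a rough illustration of the kind of classifier the summary refers to, the following is a minimal sketch of a generic extreme learning machine (random, untrained hidden layer; output weights solved by least squares) combined by majority vote. It is not the authors' implementation: the one-hot k-mer encoding, the tanh activation, the hidden-layer size, and the names `ELM`, `one_hot`, and `ensemble_predict` are all assumptions made for this example, and the paper's specific features and dummy-positive sampling are not reproduced.

```python
import numpy as np


class ELM:
    """Generic extreme learning machine (illustrative, not the paper's code)."""

    def __init__(self, n_hidden=50, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden-layer activations; W and b are random and never trained.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        # y is expected to hold labels in {-1, +1}.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Output weights: least-squares solution via the pseudoinverse.
        self.beta = np.linalg.pinv(H) @ y
        return self

    def decision(self, X):
        return self._hidden(X) @ self.beta


def one_hot(kmer):
    """Assumed feature map: one-hot encode a DNA k-mer over A, C, G, T."""
    index = {"A": 0, "C": 1, "G": 2, "T": 3}
    v = np.zeros(4 * len(kmer))
    for i, base in enumerate(kmer):
        v[4 * i + index[base]] = 1.0
    return v


def ensemble_predict(models, X):
    """Majority vote over the sign of each model's decision value."""
    votes = np.sign(np.stack([m.decision(X) for m in models]))
    return np.where(votes.sum(axis=0) >= 0, 1, -1)
```

In this sketch, several `ELM` instances would be fitted with different seeds on encoded k-mers labeled +1 (binding site) or -1 (background) and then passed to `ensemble_predict`; using an odd number of models avoids tied votes.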
ISSN: 1432-7643, 1433-7479
DOI: 10.1007/s00500-012-0820-x