Parallel and distributed kmeans to identify the translation initiation site of proteins
Main authors: | , , , |
---|---|
Format: | Conference proceedings |
Language: | English |
Subjects: | |
Summary: | Predicting the translation initiation site is of vital importance in bioinformatics, since it makes it possible to understand the organic formation and metabolic behavior of living organisms. Sequential algorithms are not always viable because mRNA databases are normally very large, resulting in long processing times. Applying parallel and distributed computing resources to such databases could help reduce this time. This article presents a class balancing solution for translation initiation site prediction that uses parallel and distributed computing resources in a hybrid model. For the Homo sapiens database, the results show a speedup of up to 23 times over sequential methods, with accuracy, precision, sensitivity, specificity and adjusted accuracy of 91.15%, 39.83%, 89.11%, 88.93% and 89.02%, respectively. For the Drosophila melanogaster database, the speedup was 18.33 times and accuracy, precision, sensitivity, specificity and adjusted accuracy were 95.22%, 43.01%, 90.83%, 90.47% and 90.64%, respectively. On both databases, the solution presented in this article proved viable for the problem in question. |
---|---|
ISSN: | 1062-922X, 2577-1655 |
DOI: | 10.1109/ICSMC.2012.6377972 |
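
The summary reports five performance rates without defining them. The sketch below uses the standard confusion-matrix definitions and assumes that "adjusted accuracy" means the mean of sensitivity and specificity. That reading is an assumption, not something the record states, although it agrees with the reported figures to within rounding: (89.11 + 88.93) / 2 = 89.02. The names `ConfusionCounts`, `classification_metrics` and `adjusted_accuracy` are hypothetical, and the counts in the usage lines are made-up illustration values, not data from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary classifier (positive = true initiation site)."""
    tp: int  # positives predicted as positive
    fp: int  # negatives predicted as positive
    tn: int  # negatives predicted as negative
    fn: int  # positives predicted as negative


def adjusted_accuracy(sensitivity: float, specificity: float) -> float:
    # Assumed definition: the mean of sensitivity and specificity.
    return (sensitivity + specificity) / 2


def classification_metrics(c: ConfusionCounts) -> dict:
    """Return the five rates named in the summary as fractions in [0, 1]."""
    total = c.tp + c.fp + c.tn + c.fn
    sensitivity = c.tp / (c.tp + c.fn)  # share of real positives found
    specificity = c.tn / (c.tn + c.fp)  # share of real negatives rejected
    return {
        "accuracy": (c.tp + c.tn) / total,
        "precision": c.tp / (c.tp + c.fp),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "adjusted_accuracy": adjusted_accuracy(sensitivity, specificity),
    }


# Illustrative, imbalanced counts (hypothetical): 100 positives, 1000 negatives.
example = classification_metrics(ConfusionCounts(tp=90, fp=110, tn=890, fn=10))
```

With these illustrative counts, precision comes out far lower than sensitivity and specificity. That is the usual pattern when positives are rare, because even a small false-positive rate over many negatives can outnumber the true positives.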