The RNA String Kernel for siRNA Efficacy Prediction
Authors: | , |
---|---|
Format: | Conference proceedings |
Language: | English |
Subjects: | |
Summary: | String kernels model sequence similarity directly, without first extracting numerical features into a vector space. Because they better capture complex traits in the sequences, string kernels often achieve better prediction performance. RNA interference is a cell defense mechanism with many biological and therapeutic applications, where strings can represent target messenger RNAs and initiating short RNAs, and string kernels can be applied for training and prediction. While most existing string kernels were developed for general-purpose sequences and have been applied to text and protein classification, the RNA string kernel is specifically designed to model the mismatches, GU wobbles, and bulges of RNA biology and has been applied to RNAi off-target evaluation. We adapt the RNA string kernel to compute the similarity of siRNA sequences and use it in support vector regression to predict siRNA silencing efficacy. We evaluate the performance of the RNA kernel against the spectrum kernel, the string subsequence kernel of arbitrary mismatch, the randomized string kernel, and numerical kernels computed from numerical features extracted according to siRNA design rules. We also give insights into computational performance and into the common properties and differences of the RNA kernel as compared to other kernels. Empirical results on biological data sets demonstrate that the RNA string kernel compared favorably with most existing string kernels and achieved significant improvements over kernels computed from numerical descriptors extracted according to structural and thermodynamic rules. The string kernels also achieved favorable results relative to other methods in related work. Furthermore, the RNA string kernel is simple to implement and fast to compute. |
---|---|
DOI: | 10.1109/BIBE.2007.4375581 |
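
Illustrative note: the summary names the spectrum kernel as one of the baselines. The record does not give the definition of the RNA string kernel itself, so the sketch below shows only the standard k-spectrum kernel (the inner product of k-mer count vectors) as an example of how a string kernel compares siRNA sequences without an explicit feature table. The function names, the choice k=3, and the example sequences are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k):
    """Count every contiguous k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(s, t, k=3):
    """k-spectrum kernel: inner product of the two k-mer count vectors."""
    cs, ct = kmer_counts(s, k), kmer_counts(t, k)
    # Counter returns 0 for k-mers that do not occur in t.
    return sum(n * ct[kmer] for kmer, n in cs.items())


def normalized_spectrum_kernel(s, t, k=3):
    """Cosine-normalized variant, so that K(s, s) == 1 for any non-trivial s."""
    denom = sqrt(spectrum_kernel(s, s, k) * spectrum_kernel(t, t, k))
    return spectrum_kernel(s, t, k) / denom if denom else 0.0


def gram_matrix(seqs, k=3):
    """Pairwise kernel matrix over a list of sequences."""
    return [[normalized_spectrum_kernel(a, b, k) for b in seqs] for a in seqs]


# Hypothetical 19-nt siRNA sequences, made up for illustration only.
sirnas = [
    "GCAUGCAUGCAUGCAUGCA",
    "GCAUGCAUGGAUGCAUGCA",
    "UUAGCCGAUAGCUAGGCUA",
]
K = gram_matrix(sirnas)
```

A Gram matrix like `K` is what a kernel-based regressor consumes. Assuming scikit-learn is available, it could be passed to `SVR(kernel="precomputed")` together with measured efficacy values. This is one possible setup, not necessarily the one used in the paper.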