Regularization of sequence data for machine learning
Main authors: | , |
---|---|
Format: | Conference proceedings |
Language: | English |
Subjects: | |
Online access: | Order full text |
Summary: | We examine the problem of classifying biological sequences, and in particular the challenge of generalizing results to novel input data. We observe that the high dimensionality of sequence data representations results in an extremely sparsely populated input space. This motivates a need for regularization (a form of inductive bias) in order to achieve generalization. We discuss regularization in the context of regular neural networks, deep belief networks, and support vector machines, and provide experimental results for these architectures. Our results support the importance of using an effective regularization method and identify which methods work well on a real-world dataset. |
---|---|
DOI: | 10.1109/BIBMW.2011.6112350 |