Detection of Unknown Polymorphic Patterns Using Feature-Extracting Part of a Convolutional Autoencoder

Background: The present paper proposes a novel approach for detecting the presence of unknown polymorphic patterns in random symbol sequences that also comprise already known polymorphic patterns. Methods: We propose to represent rules that define the considered patterns as regular expressions and s...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Applied sciences 2023-10, Vol.13 (19), p.10842
Hauptverfasser: Kucharski, Przemysław, Ślot, Krzysztof
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Background: The present paper proposes a novel approach for detecting the presence of unknown polymorphic patterns in random symbol sequences that also comprise already known polymorphic patterns. Methods: We propose to represent rules that define the considered patterns as regular expressions and show how these expressions can be modeled using filter cascades of neural convolutional layers. We adopted a convolutional autoencoder (CAE) as a pattern detection framework. To detect unknown patterns, we first incorporated knowledge of known rules into the CAE’s convolutional feature extractor by fixing weights in some of its filter cascades. Then, we executed the learning procedure, where the weights of the remaining filters were driven by two different objectives. The first was to ensure correct sequence reconstruction, whereas the second was to prevent weights from learning the already known patterns. Results: The proposed methodology was tested on sample sequences derived from the human genome. The analysis of the experimental results provided statistically significant information on the presence or absence of polymorphic patterns that were not known in advance. Conclusions: The proposed method was able to detect the existence of unknown polymorphic patterns.
ISSN:2076-3417
2076-3417
DOI:10.3390/app131910842