Algorithmic Probability-guided Supervised Machine Learning on Non-differentiable Spaces
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | English |
Summary: | We show how complexity theory can be introduced into machine learning to help
bring together apparently disparate areas of current research. We show that
this approach requires less training data and generalizes better, as it
shows greater resilience to random attacks. We investigate the shape of the
discrete algorithmic space when performing regression or classification using a
loss function parametrized by algorithmic complexity, demonstrating that
differentiability is not necessary to achieve results similar to those
obtained using differentiable programming approaches such as deep learning.
To do so, we use examples that allow the two approaches to be compared
(small ones, given the computational power required to estimate algorithmic
complexity). We find and report that (i) machine learning can
successfully be performed on a non-smooth surface using algorithmic complexity;
(ii) parameter solutions can be found using an algorithmic-probability
classifier, establishing a bridge between a fundamentally discrete theory of
computability and a fundamentally continuous mathematical theory of
optimization methods; (iii) an algorithmically directed search technique in
non-smooth manifolds can be formulated and conducted; and (iv) exploitation
techniques and numerical methods for algorithmic search can be used to
navigate these discrete non-differentiable spaces. We apply these to
(a) the identification of generative rules from data observations;
(b) image classification problems, with solutions more resilient to pixel
attacks than neural networks; (c) the identification of equation parameters
from a small dataset in the presence of noise in a continuous ODE system;
and (d) the classification of Boolean NK networks by (1) network topology,
(2) underlying Boolean function, and (3) number of incoming
edges. |
DOI: | 10.48550/arxiv.1910.02758 |