Alpha-Net: Architecture, Models, and Applications

Bibliographic details
Published in:	arXiv.org 2020-06
Main authors:	Shaikh, Jishan; Sharma, Adya; Chouhan, Ankit; Mahawar, Avinash
Format:	Article
Language:	eng
Subjects:	Accuracy; Benchmarks; Computer architecture; Configurations; Datasets; Empirical analysis; Machine learning; Object recognition; Training
Online access:	Full text

Description
Abstract:	Deep learning network training is usually computationally expensive and intuitively complex. We present a novel network architecture for custom training and weight evaluations. We reformulate the layers as ResNet-like blocks, each with its own inputs and outputs. These blocks (called Alpha blocks) form a network of their own according to their connection configuration; combined with our novel loss function and normalization function, they form the complete Alpha-Net architecture. We provide an empirical mathematical formulation of the network loss function to support understanding of accuracy estimation and further optimization. We implemented Alpha-Net with 4 different layer configurations to express the behavior of the architecture comprehensively. On a custom dataset based on the ImageNet benchmark, we evaluate Alpha-Net v1, v2, v3, and v4 for image recognition and obtain accuracies of 78.2%, 79.1%, 79.5%, and 78.3%, respectively. Alpha-Net v3 improves accuracy by approximately 3% over the last state-of-the-art network, ResNet 50, on the ImageNet benchmark. We also present an analysis of our dataset with 256, 512, and 1024 layers and with different versions of the loss function. Input representation is also crucial for training, since initial preprocessing keeps only a handful of features, which makes training less complex than it would otherwise be. We also compare network behavior across different layer structures, loss functions, and normalization functions for better quantitative modeling of Alpha-Net.
ISSN:	2331-8422