Improvement accuracy in deep learning: An increasing neurons distance approach with the penalty term of loss function
Published in: | Information sciences 2023-10, Vol.644, p.119268, Article 119268 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | The increasing use of neural networks for solving complex tasks has emphasized the need to optimize their performance. In recent years, the development of neural networks has enabled them to tackle increasingly challenging problems for applications in various fields. Existing methods, such as L1 and L2 penalty terms, improve the neural network performance by reducing the neurons' weight and removing some neurons' output. However, these algorithms reduce the number of active neurons and the theoretical maximum capacity of the neural network. A loss function penalty term can improve the network performance by guiding the gradient descent direction of neurons during the training process. Moreover, it does not need to alter the network structure, increase the data, or change the training process. In this study, a novel algorithm is proposed to improve the performance of the network by adding a new loss function penalty term. The algorithm proposed in this study enhances the accuracy and anti-interference capabilities of neural networks by increasing the number of active neurons based on the finding that identical neurons extract identical features from the data, while distinct neurons extract unique features. The neurons are treated as hyperspace points and the distance between them is converted into repulsive forces to increase the distance between these points. To verify the validity of the proposed algorithm, we tested it on image classification datasets (CIFAR-100 and CALTECH-256), text datasets (SST-2 and THUCNews), and an image detection dataset (PASCAL VOC). The results show that the proposed algorithm can improve the network performance for a variety of neural network structures, data types, and task types. This algorithm has demonstrated its effectiveness in robustness against data noise and compatibility with pre-trained models such as Faster R-CNN, YOLO, and RetinaNet. |
---|---|
ISSN: | 0020-0255 |
DOI: | 10.1016/j.ins.2023.119268 |
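
The abstract describes treating neurons as points in a hyperspace and turning the distances between them into repulsive forces through a loss penalty term. The record does not give the paper's formula, so the following is only a minimal sketch of that general idea under stated assumptions: each neuron is represented by its weight vector, and the penalty is the mean inverse squared distance over all neuron pairs. The function names `repulsion_penalty` and `total_loss`, the inverse-square form, and the parameters `eps` and `lam` are hypothetical and are not taken from the article.

```python
from itertools import combinations


def repulsion_penalty(weights, eps=1e-8):
    """Mean inverse squared distance over all pairs of neuron weight vectors.

    ASSUMPTION: the inverse-square form is illustrative only; it is not the
    penalty defined in the article, whose formula is not in this record.
    """
    pairs = list(combinations(weights, 2))
    if not pairs:
        # Fewer than two neurons: nothing to push apart.
        return 0.0
    total = 0.0
    for a, b in pairs:
        sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
        # Nearby neurons contribute a large term, distant ones a small term,
        # so minimizing the penalty increases the distance between neurons.
        total += 1.0 / (sq_dist + eps)
    return total / len(pairs)


def total_loss(task_loss, weights, lam=0.01):
    """Task loss plus the weighted penalty (lam is a hypothetical coefficient)."""
    return task_loss + lam * repulsion_penalty(weights)


# Three nearly identical neurons versus three well-separated neurons.
close = [[1.0, 0.0], [1.0, 0.1], [0.9, 0.0]]
spread = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]
print(repulsion_penalty(close) > repulsion_penalty(spread))  # prints True
```

In this sketch the penalty only changes the loss value, which matches the abstract's point that a penalty term needs no change to the network structure, the data, or the training process. In practice the same expression would be written with a framework that supports automatic differentiation so that its gradient can steer the weights.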