Multiscale Low-Frequency Memory Network for Improved Feature Extraction in Convolutional Neural Networks
Main Authors: | , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online Access: | Order full text |
Summary: | Deep learning and Convolutional Neural Networks (CNNs) have driven major
transformations in diverse research areas. However, their limitations in
handling low-frequency information present obstacles in certain tasks like
interpreting global structures or managing smooth transition images. Despite
the promising performance of transformer structures in numerous tasks, their
intricate optimization complexities highlight the persistent need for refined
CNN enhancements using limited resources. Responding to these complexities, we
introduce a novel framework, the Multiscale Low-Frequency Memory (MLFM)
Network, with the goal of harnessing the full potential of CNNs while keeping
their complexity unchanged. The MLFM efficiently preserves low-frequency
information, enhancing performance in targeted computer vision tasks. Central
to our MLFM is the Low-Frequency Memory Unit (LFMU), which stores various
low-frequency data and forms a parallel channel to the core network. A key
advantage of MLFM is its seamless compatibility with various prevalent
networks, requiring no alterations to their original core structure. Testing on
ImageNet demonstrated substantial accuracy improvements in multiple 2D CNNs,
including ResNet, MobileNet, EfficientNet, and ConvNeXt. Furthermore, we
showcase MLFM's versatility beyond traditional image classification by
successfully integrating it into image-to-image translation tasks, specifically
in semantic segmentation networks like FCN and U-Net. In conclusion, our work
signifies a pivotal stride in the journey of optimizing the efficacy and
efficiency of CNNs with limited resources. This research builds upon the
existing CNN foundations and paves the way for future advancements in computer
vision. Our code is available at https://github.com/AlphaWuSeu/MLFM. |
---|---|
DOI: | 10.48550/arxiv.2403.08157 |
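
The abstract describes the architecture only at a high level: a Low-Frequency Memory Unit (LFMU) stores multiscale low-frequency information in a channel parallel to an unmodified core CNN, and the two streams are combined for the final prediction. Below is a minimal, hypothetical PyTorch sketch of that idea. The average-pool low-pass filtering, the ResNet-18 backbone, the concatenation-based fusion, and the class names (`LowFrequencyMemoryUnit`, `MLFMClassifier`) are all illustrative assumptions, not the authors' implementation; the actual design is in the paper and the linked repository.

```python
# Hypothetical sketch of the MLFM idea from the abstract: a parallel
# low-frequency channel alongside an unmodified CNN backbone. Everything
# here is an assumption for illustration, not the authors' LFMU design.
import torch
import torch.nn as nn
import torchvision.models as models


class LowFrequencyMemoryUnit(nn.Module):
    """Assumed LFMU: extract a low-frequency summary of the input at one
    scale via blur-and-downsample, then embed it with a light conv stack."""

    def __init__(self, scale: int, out_dim: int = 64):
        super().__init__()
        # Average pooling acts as a crude low-pass filter at this scale.
        self.lowpass = nn.AvgPool2d(kernel_size=scale, stride=scale)
        self.embed = nn.Sequential(
            nn.Conv2d(3, out_dim, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_dim),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),  # pool to a per-channel descriptor
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.embed(self.lowpass(x)).flatten(1)  # shape (B, out_dim)


class MLFMClassifier(nn.Module):
    """Assumed wrapper: the core CNN is left structurally untouched, and
    multiscale LFMU outputs are concatenated with its features before the
    final classifier, forming the parallel channel the abstract mentions."""

    def __init__(self, num_classes: int = 1000, scales=(2, 4, 8)):
        super().__init__()
        backbone = models.resnet18(weights=None)
        feat_dim = backbone.fc.in_features
        backbone.fc = nn.Identity()  # expose raw backbone features
        self.backbone = backbone
        self.lfmus = nn.ModuleList(LowFrequencyMemoryUnit(s) for s in scales)
        self.classifier = nn.Linear(feat_dim + 64 * len(scales), num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [self.backbone(x)] + [unit(x) for unit in self.lfmus]
        return self.classifier(torch.cat(feats, dim=1))


if __name__ == "__main__":
    model = MLFMClassifier(num_classes=10)
    logits = model(torch.randn(2, 3, 224, 224))
    print(logits.shape)  # torch.Size([2, 10])
```

Note the design choice this sketch mirrors from the abstract: the backbone's original structure is not altered; the low-frequency pathway only adds features at the fusion point, which is what makes the scheme compatible with ResNet, MobileNet, EfficientNet, ConvNeXt, and the segmentation networks mentioned above.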