Dedicated Inference Engine and Binary-Weight Neural Networks for Lightweight Instance Segmentation
Saved in:
Main authors: | , , , , , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Order full text |
Abstract: | Reducing computational costs is an important issue in the development of embedded systems. Binary-weight Neural Networks (BNNs), in which weights are binarized and activations are quantized, are employed to reduce the computational costs of a wide range of applications. In this paper, a design methodology for the hardware architecture of inference engines is proposed to handle modern BNNs with two operation modes. Multiply-Accumulate (MAC) operations can be simplified by replacing multiplications with bitwise operations. The proposed method effectively reduces the gate count of inference engines by removing part of the computational cost from the hardware system. The proposed MAC architecture computes the inference results of BNNs efficiently with only 52% of the hardware cost of related works. To show that the inference engine can handle practical applications, two lightweight networks that combine the backbones of SegNeXt with the decoder of SparseInst for instance segmentation are also proposed. The outputs of the lightweight networks are computed using only bitwise and add operations. The proposed inference engine has lower hardware costs than related works. The experimental results show that the proposed inference engine can handle the proposed instance-segmentation networks and achieves higher accuracy than YOLACT on the "Person" category, although its model size is 77.7× smaller than that of YOLACT. |
---|---|
DOI: | 10.48550/arxiv.2501.01841 |
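
The abstract states that the multiplications inside MAC operations are replaced with bitwise operations. As a rough illustration only (a generic formulation, not the paper's engine design; the function name is made up), a weight restricted to {+1, -1} can be stored as a single bit, and the product a·w then reduces to a bitwise conditional negation followed by an addition:

```python
def binary_mac(activations, weight_bits):
    """Accumulate sum(a * w) for binary weights.

    weight_bits[i] == 1 encodes w = -1, weight_bits[i] == 0 encodes w = +1,
    so each multiply becomes a two's-complement conditional negation:
    a * w == (a ^ -bit) + bit.
    """
    acc = 0
    for a, bit in zip(activations, weight_bits):
        mask = -bit                  # 0 when bit == 0, all-ones (-1) when bit == 1
        acc += (a ^ mask) + bit      # a when bit == 0, -a when bit == 1
    return acc

# Quantized activations with weights {+1, -1, +1, -1}:
assert binary_mac([3, -2, 5, 7], [0, 1, 0, 1]) == 3 + 2 + 5 - 7
```

Only the XOR-style negation and the adder remain in the inner loop, which is the kind of simplification that lets the paper remove multipliers from the hardware.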
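
The two lightweight networks are described as a SegNeXt backbone feeding a SparseInst decoder. A minimal structural sketch of that composition, assuming PyTorch and using placeholder modules rather than the authors' actual implementations:

```python
import torch
import torch.nn as nn

class LightweightInstanceSegNet(nn.Module):
    """Placeholder composition: a SegNeXt-style encoder + a SparseInst-style decoder."""

    def __init__(self, backbone: nn.Module, decoder: nn.Module):
        super().__init__()
        self.backbone = backbone   # e.g. a binary-weight SegNeXt-style feature extractor
        self.decoder = decoder     # e.g. a SparseInst-style instance decoder

    def forward(self, images: torch.Tensor):
        features = self.backbone(images)   # feature maps from the encoder
        return self.decoder(features)      # instance masks and class scores
```

Which SegNeXt variant is used and how the binary-weight constraint is applied are design choices of the paper; the sketch only shows the backbone-to-decoder data flow.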