Training large scale visual transducer neural network with variable tile size
Main authors: | , , , , , , , , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Summary: | The invention relates to training a large-scale visual transducer neural network with variable tile size. A method of training the neural network includes, at each training step: (1) obtaining a plurality of training images; (2) obtaining a corresponding target output for each training image; (3) selecting an image tile generation scheme from a plurality of image tile generation schemes, where each scheme generates a different number of tiles from a given input image and each tile includes a respective subset of the pixels of that image; (4) for each training image, generating a plurality of image tiles by applying the selected scheme to the training image and processing those tiles with the neural network to generate a network output; and (5) training the neural network on an objective function that measures, for each training image, the difference between the network output for the training image and the target output for the training image. |
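
The training step described in the summary can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the tile sizes in `TILE_SIZES`, the non-overlapping square tiling in `make_tiles`, the one-parameter stand-in model (`pooled_feature` scaled by `weight`), the squared-error objective, and the uniform random choice of scheme are all hypothetical choices made only to keep the example self-contained.

```python
import random

# Hypothetical tile sizes. Each size defines one tile generation scheme,
# and each scheme yields a different number of tiles for the same image.
TILE_SIZES = (2, 4, 8)


def make_tiles(image, tile_size):
    """Split an image (list of pixel rows) into non-overlapping square tiles."""
    height, width = len(image), len(image[0])
    if height % tile_size or width % tile_size:
        raise ValueError("tile size must divide the image dimensions")
    tiles = []
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            # Each tile holds a subset of the image's pixels.
            rows = image[top:top + tile_size]
            tiles.append([row[left:left + tile_size] for row in rows])
    return tiles


def pooled_feature(tiles):
    """Toy feature: the average of the per-tile mean intensities."""
    tile_means = [sum(map(sum, t)) / (len(t) * len(t[0])) for t in tiles]
    return sum(tile_means) / len(tile_means)


def training_step(images, targets, weight, rng, learning_rate=0.1):
    """Run one step: pick a scheme, tile every image, update the toy model."""
    tile_size = rng.choice(TILE_SIZES)  # one scheme is selected per step
    loss, grad = 0.0, 0.0
    for image, target in zip(images, targets):
        feature = pooled_feature(make_tiles(image, tile_size))
        output = weight * feature            # stand-in for the network output
        loss += (output - target) ** 2       # squared-error objective (assumed)
        grad += 2.0 * (output - target) * feature
    count = len(images)
    new_weight = weight - learning_rate * grad / count
    return new_weight, loss / count, tile_size


if __name__ == "__main__":
    rng = random.Random(0)
    images = [[[1.0] * 8 for _ in range(8)]]  # one 8 x 8 training image
    targets = [2.0]
    weight = 0.0
    for _ in range(50):
        weight, loss, tile_size = training_step(images, targets, weight, rng)
    print(f"weight={weight:.4f} loss={loss:.6f} last tile size={tile_size}")
```

Because the scheme is re-selected at every step, the same parameters are trained on inputs split into different numbers of tiles. A real network would need some way to accept tiles of varying size; the summary does not say how, so the toy model here sidesteps the question by pooling each tile to a single number.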