Resource allocation method and device for QoS (Quality of Service) perception in deep learning multi-model deployment scene

Bibliographic details
Main authors: YU TIAN, WU HENG, LUO DIAOHAN, WU YUEWEN, ZHANG WENBO
Format: Patent
Language: Chinese; English
Description
Summary: The invention relates to a QoS (Quality of Service)-aware resource allocation method and device for deep learning multi-model deployment scenarios. The method comprises the following steps: splitting a deep learning model into a plurality of serially dependent sub-models, and splitting the target task corresponding to the serially dependent sub-models into a plurality of sub-tasks; when the global task queue changes, inserting the sub-tasks into the queue according to the total response ratio of all the sub-tasks in the queue; and, when a sub-task is about to run, issuing tokens to it according to the current number of tasks of each type and the attributes of the sub-task, so as to obtain the execution result of the target task based on the deep learning model or the plurality of serially dependent sub-models. The method can effectively solve the problem of short tasks waiting too long because of long tasks, the resource allocation strategy for adjusting the...
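The summary does not give the patent's actual formula for the "total response ratio", so the following is only a minimal sketch of the general idea, assuming the classical highest-response-ratio-next definition, (waiting time + estimated run time) / estimated run time. The names SubTask, split_task, response_ratio and reorder_queue are hypothetical and do not come from the patent; token issuing is not modelled.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SubTask:
    name: str
    arrival: float      # time the sub-task entered the global queue (seconds)
    est_runtime: float  # estimated run time in seconds (assumed to be known)


def split_task(name: str, arrival: float, stage_runtimes: list[float]) -> list[SubTask]:
    """Split one target task into serially dependent sub-tasks, one per sub-model."""
    return [
        SubTask(f"{name}/stage{i}", arrival, runtime)
        for i, runtime in enumerate(stage_runtimes)
    ]


def response_ratio(task: SubTask, now: float) -> float:
    """Classical response ratio (an assumption, not the patent's formula)."""
    waiting = max(0.0, now - task.arrival)
    return (waiting + task.est_runtime) / task.est_runtime


def reorder_queue(queue: list[SubTask], now: float) -> list[SubTask]:
    """Re-rank the global queue whenever it changes, highest ratio first."""
    return sorted(queue, key=lambda t: response_ratio(t, now), reverse=True)


if __name__ == "__main__":
    long_task = SubTask("long", arrival=0.0, est_runtime=10.0)
    short_task = SubTask("short", arrival=8.0, est_runtime=1.0)
    for task in reorder_queue([long_task, short_task], now=10.0):
        print(task.name, response_ratio(task, now=10.0))
```

In this sketch a short sub-task that has waited only briefly can still overtake a long one, because its waiting time is large relative to its run time. That is the effect the summary attributes to the method, although the patented ranking rule may differ.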