Resource allocation method and device for QoS (Quality of Service) perception in deep learning multi-model deployment scene

Bibliographic details
Main authors:	YU TIAN, WU HENG, LUO DIAOHAN, WU YUEWEN, ZHANG WENBO
Format:	Patent
Language:	Chinese; English
Subjects:	CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; PHYSICS

Description
Summary:	The invention relates to a QoS (Quality of Service)-aware resource allocation method and device for deep learning multi-model deployment scenarios. The method comprises the following steps: splitting a deep learning model into a plurality of serially dependent sub-models, and splitting the target task corresponding to those sub-models into a plurality of sub-tasks; when the global task queue changes, inserting the sub-tasks into the queue according to the total response ratio of all sub-tasks in the queue; and, when a sub-task is about to run, issuing tokens to it according to the current number of tasks of each type and the attributes of the sub-task, so as to obtain the result of the target task from the deep learning model or the plurality of serially dependent sub-models. According to the method, the problem of short tasks waiting too long behind long tasks can be effectively solved, the resource allocation strategy for adjusting the
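
The queue-insertion step in the summary can be illustrated with a minimal Python sketch. The record does not define the response ratio or the insertion rule, so everything here is an assumption: the ratio is taken to be the classic (waiting time + service time) / service time, the queue is assumed to run serially in order, and a new sub-task is placed at the position that minimizes the summed ratio. The names `SubTask`, `total_response_ratio`, and `insert_by_total_response_ratio` are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SubTask:
    name: str
    service_time: float   # assumed: estimated run time of this sub-task
    waited: float = 0.0   # assumed: time already spent waiting in the queue


def total_response_ratio(queue: List[SubTask]) -> float:
    """Sum (waiting + service) / service over the queue.

    Assumes sub-tasks run one after another in queue order.
    """
    total = 0.0
    elapsed = 0.0  # time that passes before the current sub-task starts
    for task in queue:
        wait = task.waited + elapsed
        total += (wait + task.service_time) / task.service_time
        elapsed += task.service_time
    return total


def insert_by_total_response_ratio(queue: List[SubTask],
                                   new_task: SubTask) -> List[SubTask]:
    """Try every insertion position; keep the one with the smallest total ratio.

    This selection rule is an illustrative assumption, not the patented method.
    Ties are resolved in favour of the earliest position.
    """
    best_queue = None
    best_score = float("inf")
    for pos in range(len(queue) + 1):
        candidate = queue[:pos] + [new_task] + queue[pos:]
        score = total_response_ratio(candidate)
        if score < best_score:
            best_queue, best_score = candidate, score
    return best_queue


if __name__ == "__main__":
    queue = [SubTask("long", service_time=10.0)]
    queue = insert_by_total_response_ratio(queue, SubTask("short", service_time=1.0))
    print([t.name for t in queue])
```

Under these assumptions a short sub-task is placed ahead of a long one, because waiting behind the long sub-task would inflate the short one's ratio far more than the reverse. That matches the summary's stated goal of keeping short tasks from waiting too long behind long tasks.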