Resource allocation method and device for QoS (Quality of Service) awareness in deep learning multi-model deployment scenarios
Main authors: | , , , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Summary: | The invention relates to a QoS (Quality of Service)-aware resource allocation method and device for deep learning multi-model deployment scenarios. The method comprises the following steps: splitting a deep learning model into a plurality of serially dependent sub-models, and splitting the target task corresponding to those sub-models into a plurality of sub-tasks; whenever the global task queue changes, inserting the sub-tasks into the queue according to the total response ratio of all sub-tasks in the queue; and, when a sub-task is about to run, issuing tokens to it according to the current number of tasks of each type and the attributes of the sub-task, so as to obtain the operation result of the target task based on the deep learning model or the plurality of serially dependent sub-models. With this method, the problem of short tasks waiting excessively long behind long tasks can be effectively solved, the resource allocation strategy for adjusting the |
---|
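The queue-insertion step in the summary can be illustrated with a minimal Python sketch. It is one possible reading of the abstract and not the patented procedure. It assumes that a sub-task's response ratio is (waiting time + service time) / service time, as in classic highest-response-ratio-next scheduling, and that a new sub-task is placed at the position that minimizes the summed response ratio of the whole queue. The names `SubTask`, `total_response_ratio` and `insert_subtask` are hypothetical. Token issuing is not modelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SubTask:
    name: str
    arrival: float  # time at which the sub-task entered the system
    service: float  # estimated run time of the sub-task (assumed known)


def total_response_ratio(queue: List[SubTask], now: float) -> float:
    """Sum of (waiting + service) / service if the queue runs in order from `now`."""
    total = 0.0
    start = now
    for task in queue:
        waiting = start - task.arrival
        total += (waiting + task.service) / task.service
        start += task.service  # the next sub-task starts when this one finishes
    return total


def insert_subtask(queue: List[SubTask], task: SubTask, now: float) -> List[SubTask]:
    """Return a new queue with `task` at the position that minimizes the total.

    Assumption: a lower total response ratio is better. The abstract does not
    state the exact criterion.
    """
    candidates = [queue[:i] + [task] + queue[i:] for i in range(len(queue) + 1)]
    return min(candidates, key=lambda q: total_response_ratio(q, now))


if __name__ == "__main__":
    queue = [SubTask("long", arrival=0.0, service=10.0)]
    queue = insert_subtask(queue, SubTask("short", arrival=0.0, service=1.0), now=0.0)
    print([t.name for t in queue])
```

Under these assumptions a short sub-task that arrives while a long one is queued is placed ahead of it, which matches the stated goal of preventing short tasks from waiting too long behind long ones.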