CONFIGURATION OF COMPUTE RESOURCES TO PERFORM TASK USING ENSEMBLE
Main authors: | , , , , , , |
---|---|
Format: | Patent |
Language: | English |
Summary: | Computer-assisted configuration of compute resources to perform tasks of a given inference task type. For each of multiple model combinations, the computing system estimates 1) a compute level that can perform tasks of the given inference task type using the model combination, and 2) an accuracy of the model combination in performing tasks of the given inference task type. The computing system then selects a model combination for the given inference task type based on the estimated compute level and the estimated accuracy of that model combination. In response to the selection, an inference component is configured to respond to task requests of the given inference task type by using the selected model combination. Scheduling using batch size and input size may further improve accuracy and efficiency of the model combination. |
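The estimate, select, and configure steps described in the summary can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `Estimate`, `select_combination`, and `InferenceComponent`, the compute units, and the sample numbers are hypothetical. The record states only that selection is based on the estimated compute level and the estimated accuracy; the specific rule used here (the most accurate combination that fits a compute budget, with ties going to the cheaper one) is one possible reading, not the patented method. Scheduling by batch size and input size is not modeled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Estimate:
    """Hypothetical per-combination estimate (not taken from the patent text)."""
    combo: Tuple[str, ...]  # names of the models in the combination
    compute: float          # estimated compute level, arbitrary units
    accuracy: float         # estimated accuracy in [0, 1]


def select_combination(
    estimates: List[Estimate], compute_budget: float
) -> Optional[Estimate]:
    """Assumed rule: most accurate combination within budget, cheaper wins ties."""
    feasible = [e for e in estimates if e.compute <= compute_budget]
    if not feasible:
        return None
    return max(feasible, key=lambda e: (e.accuracy, -e.compute))


class InferenceComponent:
    """Maps each inference task type to the model combination that serves it."""

    def __init__(self) -> None:
        self._routes: Dict[str, Tuple[str, ...]] = {}

    def configure(self, task_type: str, combo: Tuple[str, ...]) -> None:
        self._routes[task_type] = combo

    def models_for(self, task_type: str) -> Tuple[str, ...]:
        return self._routes[task_type]


# Illustrative numbers only; real estimates would come from profiling.
estimates = [
    Estimate(("small",), compute=1.0, accuracy=0.80),
    Estimate(("small", "medium"), compute=3.0, accuracy=0.88),
    Estimate(("small", "medium", "large"), compute=8.0, accuracy=0.91),
]

chosen = select_combination(estimates, compute_budget=4.0)
component = InferenceComponent()
if chosen is not None:
    component.configure("image-classification", chosen.combo)
```

With a budget of 4.0 units, the three-model combination is excluded as too expensive, and the two-model combination is chosen over the single model because its estimated accuracy is higher.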