Inference method of artificial intelligence inference framework, computer equipment and medium

Bibliographic Details
First author: ZU CHUNSHAN
Format: Patent
Language: Chinese; English
Description
Summary: An embodiment of the invention discloses an inference method for an artificial intelligence (AI) inference framework, a computer device, and a medium. In a specific embodiment, the method comprises: obtaining an inference request; evaluating the inference performance of the AI inference framework according to the maximum allowable latency contained in the inference request and the computing resource occupancy rate of the framework, and configuring the number of instances of the inference model and the maximum batch size of each instance according to the evaluation result; and loading the inference model onto the instances according to the number of inference requests, the number of instances of the inference model, and the maximum batch size of each instance, so as to perform inference on the requests. According to the embodiment, dynamic inference performance optimization of an AI inference
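
The two steps named in the summary (configure the instance count and maximum batch size from the latency limit and resource occupancy, then distribute requests across instances) can be illustrated with a minimal sketch. The record does not give the patent's actual evaluation formula, so everything below is an assumption made for illustration: the linear latency model, the parameters `per_sample_ms`, `base_overhead_ms`, and `max_instances`, and the round-robin dispatch are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class InferenceConfig:
    instance_count: int   # number of model instances to run
    max_batch_size: int   # largest batch each instance may process


def evaluate_and_configure(max_latency_ms, utilization,
                           per_sample_ms=2.0, base_overhead_ms=5.0,
                           max_instances=8):
    """Hypothetical heuristic, not the patented evaluation.

    Assumes batch latency = base_overhead_ms + per_sample_ms * batch_size
    and scales the instance count by the unused share of compute.
    """
    # Largest batch whose assumed latency stays within the allowed limit.
    budget_ms = max_latency_ms - base_overhead_ms
    max_batch = max(1, int(budget_ms // per_sample_ms))
    # Fewer instances when the framework is already heavily occupied.
    spare = max(0.0, 1.0 - utilization)
    instances = max(1, int(max_instances * spare))
    return InferenceConfig(instances, max_batch)


def dispatch(requests, config):
    """Split requests into batches and assign them round-robin to instances."""
    size = config.max_batch_size
    batches = [requests[i:i + size] for i in range(0, len(requests), size)]
    assignment = {i: [] for i in range(config.instance_count)}
    for idx, batch in enumerate(batches):
        assignment[idx % config.instance_count].append(batch)
    return assignment


config = evaluate_and_configure(max_latency_ms=25.0, utilization=0.5)
plan = dispatch(list(range(45)), config)
```

With a 25 ms limit and 50% occupancy, this sketch yields 4 instances with a maximum batch size of 10, and 45 requests are split into five batches, two of which go to the first instance. A real framework would measure latency and occupancy at run time instead of relying on fixed constants.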