FPGA IMPLEMENTATION OF LOW LATENCY ARCHITECTURE OF XGBOOST FOR INFERENCE AND METHOD THEREFOR

Various embodiments disclosed herein provides method and system for low latency FPGA based system for inference such as recommendation models. Conventional models for inference have high latency and low throughput in decision making models/processes. The disclosed method and system exploits parallel...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: MANAVAR, Piyush, NAMBIAR, Manoj
Format: Patent
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Various embodiments disclosed herein provides method and system for low latency FPGA based system for inference such as recommendation models. Conventional models for inference have high latency and low throughput in decision making models/processes. The disclosed method and system exploits parallelism in processing of XGB models and hence enables minimum possible latency and maximum possible throughput. Additionally, the disclosed system uses a trained model that is (re)trained using only those features which the model had used during training, remaining features are discarded during retraining of the model. The use of such selected set of features thus leads to reduction in the size of digital circuit significantly for the hardware implementation, thereby greatly enhancing the system performance.