FPGA IMPLEMENTATION OF LOW LATENCY ARCHITECTURE OF XGBOOST FOR INFERENCE AND METHOD THEREFOR
Main authors: | , |
---|---|
Format: | Patent |
Language: | English |
Summary: | Various embodiments disclosed herein provide a method and system for low-latency, FPGA-based inference, for example with recommendation models. Conventional inference models suffer from high latency and low throughput in decision-making processes. The disclosed method and system exploit parallelism in the processing of XGB (XGBoost) models and thereby enable the minimum possible latency and maximum possible throughput. Additionally, the disclosed system uses a model that is retrained on only those features that the model actually used during its initial training; the remaining features are discarded during retraining. Using this reduced feature set significantly shrinks the digital circuit required for the hardware implementation, thereby greatly enhancing system performance. |
---|
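The feature-selection idea in the summary can be illustrated with a minimal software sketch. It is not the patented FPGA design: the tree representation (nested dictionaries), the helper names `split`, `used_features`, `remap`, `eval_tree` and `predict`, and the toy thresholds are all assumptions made for illustration. Simply re-indexing the trees here stands in for the retraining step described in the summary, and the prediction is simplified to a plain sum of leaf values. The sketch shows that features never referenced by any split can be dropped without changing the output, and that each tree is evaluated independently of the others, which is the property a parallel hardware implementation could exploit.

```python
# Hypothetical toy model: a node is either a leaf (float) or a split (dict).
def split(feature, threshold, left, right):
    """Build a split node that tests x[feature] < threshold."""
    return {"feature": feature, "threshold": threshold,
            "left": left, "right": right}

def used_features(trees):
    """Collect the indices of all features referenced by any split."""
    used, stack = set(), list(trees)
    while stack:
        node = stack.pop()
        if isinstance(node, dict):
            used.add(node["feature"])
            stack.extend((node["left"], node["right"]))
    return used

def remap(node, index_of):
    """Rewrite feature indices so the tree reads a compacted input vector."""
    if not isinstance(node, dict):
        return node
    return split(index_of[node["feature"]], node["threshold"],
                 remap(node["left"], index_of),
                 remap(node["right"], index_of))

def eval_tree(node, x):
    """Walk one tree from the root to a leaf and return the leaf value."""
    while isinstance(node, dict):
        go_left = x[node["feature"]] < node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node

def predict(trees, x):
    """Sum the leaf values; each tree is evaluated independently."""
    return sum(eval_tree(t, x) for t in trees)

# Two trees over a 5-feature input; only features 0 and 3 are ever tested.
TREES = [
    split(0, 0.5, -1.0, split(3, 2.0, 0.25, 1.0)),
    split(3, 1.0, -0.5, 0.5),
]
x = [0.7, 9.0, 9.0, 2.5, 9.0]

kept = sorted(used_features(TREES))            # features worth keeping
index_of = {f: i for i, f in enumerate(kept)}  # old index -> compact index
compact = [remap(t, index_of) for t in TREES]
x_small = [x[f] for f in kept]                 # input without unused features

print(kept, predict(TREES, x), predict(compact, x_small))
```

In this toy case the input shrinks from five values to two while the prediction is unchanged; in a hardware setting, fewer inputs would translate into fewer comparators and narrower data paths, which is the circuit-size reduction the summary refers to.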