SLICE BY SLICE AI/ML MODEL INFERENCE OVER COMMUNICATION NETWORKS

Bibliographic details
Main authors:	FILOCHE, Thierry, QUINQUIS, Cyril, LE GUYADEC, Pascal, FONTAINE, Patrick, LAMBERT, Anne, SCHNITZLER, Francois
Format:	Patent
Language:	eng ; fre ; ger
Subjects:	ELECTRIC COMMUNICATION TECHNIQUE ; ELECTRICITY ; TRANSMISSION

Description
Abstract:	In one implementation, the AI/ML model is first split into several unitary chunks, each corresponding to a sub-part of the model. The unitary chunks are then aggregated by considering the download time and inference time of the unitary chunks, and/or device constraints. The first split corresponds to a first chunk of AI/ML layers that, once downloaded, is usable as is and generates intermediate results from sensing/perception data. As soon as a new chunk arrives, it is used to generate new results from the intermediate data of the previous chunk. Because download and inference run in parallel, the final result can be generated earlier than with the fully sequential method, in which inference starts only after the whole model has been downloaded. In addition, as soon as inference on a chunk ends, that chunk may be removed from the device. Several AI/ML model split methods are provided to generate model subsets/chunks for different model architectures.
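The overlap of download and inference described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. All names (`Chunk`, `sequential_finish_time`, `pipelined_finish_time`, `run_slice_by_slice`) are hypothetical, each chunk is modelled as a plain callable, the "download" is simulated by a thread feeding a queue, and the timings are made-up numbers chosen only to show the arithmetic.

```python
import queue
import threading
from typing import Callable, Iterable, List, Sequence

# Hypothetical stand-in: one chunk is a callable mapping intermediate data to new data.
Chunk = Callable[[List[float]], List[float]]


def sequential_finish_time(download_s: Sequence[float], infer_s: Sequence[float]) -> float:
    """Download the whole model first, then run inference on every chunk."""
    return sum(download_s) + sum(infer_s)


def pipelined_finish_time(download_s: Sequence[float], infer_s: Sequence[float]) -> float:
    """Chunks download back to back; chunk i starts inference once it has
    arrived and chunk i-1 has finished inference."""
    arrived = 0.0
    infer_done = 0.0
    for d, t in zip(download_s, infer_s):
        arrived += d
        infer_done = max(arrived, infer_done) + t
    return infer_done


def run_slice_by_slice(chunks: Iterable[Chunk], sensed: List[float]) -> List[float]:
    """Consume chunks as they arrive, feeding each one the previous intermediate result."""
    inbox: queue.Queue = queue.Queue()

    def downloader() -> None:
        for chunk in chunks:   # assumption: iterating stands in for a network transfer
            inbox.put(chunk)
        inbox.put(None)        # sentinel: no more chunks

    threading.Thread(target=downloader, daemon=True).start()

    data = sensed
    while True:
        chunk = inbox.get()    # blocks until the next chunk has "arrived"
        if chunk is None:
            return data
        data = chunk(data)     # inference on this chunk only
        del chunk              # drop the reference once inference on it has ended


# Made-up timings (seconds) for three chunks.
download_s = [2.0, 2.0, 2.0]
infer_s = [1.0, 1.0, 1.0]
print(sequential_finish_time(download_s, infer_s))  # 9.0
print(pipelined_finish_time(download_s, infer_s))   # 7.0

# Three toy "layers" applied in order to toy sensing data.
toy_chunks = [
    lambda x: [v * 2 for v in x],
    lambda x: [v + 1 for v in x],
    lambda x: [sum(x)],
]
print(run_slice_by_slice(toy_chunks, [1.0, 2.0]))   # [8.0]
```

With these assumed timings, only the inference of the last chunk is left once the last download completes, so the pipelined schedule finishes at 7 s instead of 9 s. The gain shrinks when per-chunk inference is much slower than download, which is one reason the aggregation step weighs both times.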