Recommendation strategy optimization system, method and device and related equipment
Main authors: | , , , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Subjects: | |
Online access: | Order full text |
Summary: | The invention provides a reinforcement-learning-based recommendation strategy optimization method and device, and related equipment. In the method, a first scene recommendation network determines the current user preference feature sequence P1, computes the similarity between P1 and the candidate commodity feature sequences, outputs the commodity feature sequence P2 with the highest similarity score, and records the actions the user takes on P2. A second scene recommendation network receives P2, operates on it to obtain a commodity feature sequence P3 that is correlated with P2, and records the actions the user takes on P3. A scene recommendation decision network generates a state-action value function from P2, the user's actions on P2, P3, and the user's actions on P3, and optimizes the state-action value function by using a near-end optimiza |
---|
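
The similarity step described in the summary (comparing the user preference features P1 against candidate commodity features and keeping the best-scoring candidate as P2) can be illustrated with a minimal sketch. The record does not say which similarity measure the patent uses, so cosine similarity is an assumption here, as is the treatment of each feature sequence as a flat numeric vector. The function names `cosine_similarity` and `select_most_similar` are hypothetical and chosen only for this illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as having no similarity
    return dot / (norm_a * norm_b)


def select_most_similar(
    p1: Sequence[float], candidates: List[Sequence[float]]
) -> Tuple[int, float]:
    """Return (index, score) of the candidate most similar to p1.

    p1 stands in for the user preference feature sequence P1; the
    selected candidate plays the role of P2 in the summary.
    """
    if not candidates:
        raise ValueError("at least one candidate is required")
    scores = [cosine_similarity(p1, c) for c in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Illustrative data only; not taken from the patent.
p1 = [1.0, 0.0, 1.0]
candidates = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
index, score = select_most_similar(p1, candidates)
```

Here `index` identifies the candidate that would be passed on as P2. The later stages in the summary (the second scene network and the decision network's value function) are not sketched because the record's text is cut off before describing them fully.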