QUALITY ASSURANCE FOR DIGITAL TECHNOLOGIES USING LARGE LANGUAGE MODELS

Detailed Description

Bibliographic Details
Main Authors:	ITER, Dan; XU, Yichong; ZHU, Chenguang; ZHANG, Yi; LEE, Yin Tat; QIN, Lijuan; PRYZANT, Reid Allen; ELDAN, Ronen; HUANG, Xuedong; LI, Yuanzhi; ZENG, Nanshan; BUBECK, Sebastien; FANG, Yuwei
Format:	Patent
Language:	English; French

Description
Summary:	Systems and methods are provided for implementing quality assurance for digital technologies using language model ("LM")-based artificial intelligence ("AI") and/or machine learning ("ML") systems. In various embodiments, a first prompt is provided to an LM actor or attacker to cause the LM actor or attacker to generate interaction content for interacting with test software. Responses from the test software are then evaluated by an LM evaluator to produce evaluation results. In some examples, a second prompt is generated that includes the responses from the test software along with the evaluation criteria for the test software. When the second prompt is provided to the LM evaluator, the LM evaluator generates the evaluation results.
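
The flow described in the summary (first prompt to an LM actor, interaction with the test software, second prompt combining responses and evaluation criteria, LM evaluator output) can be illustrated with a minimal sketch. This is not the patent's implementation. Every name here (`LanguageModel`, `run_qa_round`, `build_evaluator_prompt`, and the three stub functions) is hypothetical, and the language models are replaced by simple stand-in functions so the example runs without any external service.

```python
from dataclasses import dataclass
from typing import Callable, List

# Assumption: a language model is modeled as any function from prompt text to output text.
LanguageModel = Callable[[str], str]


@dataclass
class EvaluationResult:
    interaction: str  # content generated by the LM actor or attacker
    response: str     # what the test software returned
    verdict: str      # evaluation result generated by the LM evaluator


def build_evaluator_prompt(response: str, criteria: List[str]) -> str:
    """Build the second prompt: the software's response plus the evaluation criteria."""
    criteria_text = "\n".join(f"- {c}" for c in criteria)
    return (
        "Evaluate the response below against these criteria.\n"
        f"Criteria:\n{criteria_text}\n"
        f"Response:\n{response}\n"
    )


def run_qa_round(
    actor: LanguageModel,
    test_software: Callable[[str], str],
    evaluator: LanguageModel,
    first_prompt: str,
    criteria: List[str],
) -> EvaluationResult:
    # Step 1: the first prompt causes the actor to generate interaction content.
    interaction = actor(first_prompt)
    # Step 2: the interaction content is sent to the software under test.
    response = test_software(interaction)
    # Step 3: the second prompt combines the response with the criteria.
    second_prompt = build_evaluator_prompt(response, criteria)
    # Step 4: the evaluator generates the evaluation result from the second prompt.
    verdict = evaluator(second_prompt)
    return EvaluationResult(interaction, response, verdict)


# Hypothetical stand-ins for the two language models and the test software.
def stub_actor(prompt: str) -> str:
    return "What is 2 + 2?"


def stub_test_software(user_input: str) -> str:
    return "4" if "2 + 2" in user_input else "I do not know."


def stub_evaluator(prompt: str) -> str:
    return "PASS" if "Response:\n4" in prompt else "FAIL"


result = run_qa_round(
    stub_actor,
    stub_test_software,
    stub_evaluator,
    first_prompt="Act as a user asking a basic arithmetic question.",
    criteria=["answers arithmetic correctly"],
)
print(result)
```

In a real setup the stubs would be replaced by calls to actual models and to the software under test. The sketch keeps the two prompts separate so that the evaluation criteria reach only the evaluator, as the summary describes.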