SYSTEM AND METHOD USING A LARGE LANGUAGE MODEL (LLM) AND/OR REGULAR EXPRESSIONS FOR FEATURE EXTRACTIONS FROM UNSTRUCTURED OR SEMI-STRUCTURED DATA TO GENERATE ONTOLOGICAL GRAPH

Bibliographic Details
Main authors:	Zawadowskiy, Andrew; Bessonov, Oleg; Parla, Vincent
Format:	Patent
Language:	eng
Subjects:	CALCULATING; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; PHYSICS

Description
Summary:	A system and method are provided for generating a cybersecurity behavioral graph from log files and/or other telemetry data, which can be unstructured or semi-structured. The log files are applied to a machine learning (ML) model (e.g., a large language model (LLM)) that extracts entities, and the relationships between those entities, from the log files. The entities and relationships can be constrained using a cybersecurity ontology or schema to ensure that the results are meaningful in a cybersecurity context. A graph is then generated by mapping the extracted entities to nodes in the graph and the relationships to edges connecting the nodes. To extract the entities and relationships from the log files more efficiently, an LLM is used to generate regular expressions for the format of the log files. Once generated, the regular expressions can rapidly parse the log files to extract the entities and relationships.
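
The pipeline described in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation. The log format, the field names, the `LOG_PATTERN` expression (a hand-written stand-in for a regular expression that an LLM might propose), the toy `ONTOLOGY`, and the helpers `extract_triples` and `build_graph` are all assumptions made for illustration. The sketch parses each log line with the regular expression, emits only the relationships that the ontology allows, and maps entities to nodes and relationships to edges.

```python
import re
from collections import defaultdict

# Hypothetical log lines in an invented key=value format (illustration only).
SAMPLE_LOGS = [
    "2024-01-01T10:00:00Z host=ws01 user=alice process=powershell.exe dst=10.0.0.5",
    "2024-01-01T10:00:05Z host=ws01 user=alice process=cmd.exe dst=10.0.0.9",
    "malformed line that the pattern does not match",
]

# Stand-in for a regular expression that an LLM might generate for this format.
# Named groups double as entity types.
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+host=(?P<host>\S+)\s+user=(?P<user>\S+)"
    r"\s+process=(?P<process>\S+)\s+dst=(?P<ip>\S+)$"
)

# Toy ontology (assumed): allowed (source type, relationship, target type) triples.
ONTOLOGY = {
    ("user", "logged_on_to", "host"),
    ("user", "executed", "process"),
    ("process", "connected_to", "ip"),
}


def extract_triples(line):
    """Return (source, relationship, target) triples for one log line.

    Lines that do not match the pattern yield no triples. Only relationships
    listed in ONTOLOGY are emitted, which keeps the output within the schema.
    """
    match = LOG_PATTERN.match(line)
    if match is None:
        return []
    fields = match.groupdict()
    triples = []
    for src_type, relation, dst_type in sorted(ONTOLOGY):
        src = (src_type, fields[src_type])
        dst = (dst_type, fields[dst_type])
        triples.append((src, relation, dst))
    return triples


def build_graph(lines):
    """Map entities to nodes and relationships to edges.

    Edges carry a count of how many log lines produced them.
    """
    nodes = set()
    edges = defaultdict(int)
    for line in lines:
        for src, relation, dst in extract_triples(line):
            nodes.update((src, dst))
            edges[(src, relation, dst)] += 1
    return nodes, dict(edges)


nodes, edges = build_graph(SAMPLE_LOGS)
for (src, relation, dst), count in sorted(edges.items()):
    print(f"{src[0]}:{src[1]} -[{relation} x{count}]-> {dst[0]}:{dst[1]}")
```

In this sketch the regular expression is compiled once and then reused for every line, which reflects the efficiency argument in the summary: the comparatively expensive model would be consulted once per log format rather than once per log line.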