P-TIMA: a framework of Twitter threat intelligence mining and analysis based on a prompt-learning NER model

Bibliographic details
Published in: Computer journal 2024-12, Vol.67 (12), p.3221-3238
Main authors: You, Yizhe, Jiang, Zhengwei, Yang, Peian, Jiang, Jun, Zhang, Kai, Wang, Xuren, Tu, Chenpeng, Feng, Huamin
Format: Article
Language: English
Online access: Full text
Description
Abstract: Open-source information platforms such as Twitter continuously provide the latest threat intelligence, including new vulnerabilities and in-the-wild exploitations of advanced persistent threat (APT) groups. Automated extraction of threat intelligence from Twitter has become crucial for defenders to access up-to-date threat knowledge. However, existing studies mainly rely on supervised learning methods to extract threat intelligence knowledge, such as entities, which require a large amount of annotated data. This paper presents Threat Intelligence Mining and Analysis based on Prompt Learning (P-TIMA), a framework specifically crafted for extracting and analyzing threat intelligence from Twitter. P-TIMA employs our innovative few-shot entity recognition method, SecEntPrompt (SEP), built on prompt learning, to extract vulnerability intelligence from Twitter. Additionally, P-TIMA analyzes and profiles the overarching vulnerability intelligence obtained from Twitter, along with in-the-wild exploitation intelligence of APT groups. The SEP improves the average entity recognition F1 score by 3.62-4.40 compared with the best-performing comparison model and outperforms the method based on the large language model on recognition performance and inference time. To validate our framework, we apply P-TIMA to extract vulnerability-related threat intelligence from real Twitter data. Through case studies, we then analyze trends in vulnerability threats and the exploitation capabilities of APT groups. In conclusion, our framework provides a more efficient and accurate method for extracting threat intelligence from Twitter, enabling defenders to stay up-to-date with the latest threat trends and helping them improve their defense strategies against cyber attacks.
ISSN: 0010-4620, 1460-2067
DOI: 10.1093/comjnl/bxae084