Tanium Reveal: a federated search engine for querying unstructured file data on large enterprise networks

Bibliographic details
Published in: Proceedings of the VLDB Endowment 2021-08, Vol.14 (12), p.3096-3109
Main authors: Stoddard, Josh; Mustafa, Adam; Goela, Naveen
Format: Article
Language: English
Online access: Full text
Description
Summary: Tanium Reveal is a federated search engine deployed on large-scale enterprise networks that is capable of executing data queries across billions of private data files within 60 seconds. Data resides at the edge of networks, potentially distributed on hundreds of thousands of endpoints. The anatomy of the search engine consists of local inverse indexes on each endpoint and a global communication platform called Tanium for issuing search queries to all endpoints. Reveal enables asynchronous parsing and indexing on endpoints without noticeable impact to the endpoints' primary functionality. The engine harnesses the Tanium platform, which is based on a self-organizing, fault-tolerant, scalable, linear chain communication scheme. We demonstrate a multi-tier workflow for executing search queries across a network and for viewing matching snippets of text on any endpoint. We analyze metrics for federated indexing and searching in multiple environments including a production network with 1.05 billion searchable files distributed across 4236 endpoints. While primarily focusing on Boolean, phrase, and similarity query types, Reveal is compatible with further automation (e.g., semantic classification based on machine learning). Lastly, we discuss safeguards for sensitive information within Reveal including cryptographic hashing of private text and role-based access control (RBAC).
ISSN:2150-8097
DOI:10.14778/3476311.3476386
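
Illustrative note: the summary describes a local index on each endpoint plus a platform that issues one query to all endpoints. The sketch below is a toy model of that general idea only, not Reveal's or Tanium's implementation. The names `EndpointIndex`, `add_file`, `search_and`, and `federated_search` are hypothetical, the tokenizer is a simplistic assumption, and a plain loop stands in for the linear chain communication scheme, which is not modeled here.

```python
import re
from collections import defaultdict


class EndpointIndex:
    """Toy stand-in for one endpoint's local index (hypothetical)."""

    def __init__(self, name):
        self.name = name
        # term -> set of file paths on this endpoint that contain the term
        self.postings = defaultdict(set)

    def add_file(self, path, text):
        # Assumed tokenizer: lowercase runs of ASCII letters and digits.
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            self.postings[term].add(path)

    def search_and(self, terms):
        # Boolean AND: a file matches only if it contains every term.
        sets = [self.postings.get(t.lower(), set()) for t in terms]
        if not sets:
            return set()
        return set.intersection(*sets)


def federated_search(endpoints, terms):
    """Send the same query to each endpoint and keep non-empty answers."""
    results = {}
    for ep in endpoints:
        hits = ep.search_and(terms)
        if hits:
            results[ep.name] = sorted(hits)
    return results


if __name__ == "__main__":
    a = EndpointIndex("laptop-01")
    a.add_file("/docs/q3.txt", "Quarterly revenue report, confidential")
    a.add_file("/docs/notes.txt", "Meeting notes about revenue")
    b = EndpointIndex("server-02")
    b.add_file("/srv/readme.txt", "Public readme")
    print(federated_search([a, b], ["revenue", "confidential"]))
```

In this model each endpoint answers from its own index, so file contents never leave the endpoint; only the matching paths are returned to the caller.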