Efficient Indexing of Top-k Entities in Systems of Engagement with Extensions for Geo-tagged Entities

Bibliographic details
Published in:	Data science and engineering 2021-12, Vol.6 (4), p.411-433
Main authors:	Mondal, Anirban; Kakkar, Ayaan; Padhariya, Nilesh; Mohania, Mukesh
Format:	Article
Language:	English
Subjects:	Algorithm Analysis and Problem Complexity; Artificial Intelligence; Cellular telephones; Chemistry and Earth Sciences; Computer Science; Customers; Data Mining and Knowledge Discovery; Database Management; Indexes (documentation); Keywords; Management systems; Performance indices; Physics; Product reviews; Single parents; Social networks; Statistics for Engineering; Supply chains; Systems and Data Security
Online access:	Full text

Description
Abstract:	Next-generation enterprise management systems are beginning to be developed based on the Systems of Engagement (SOE) model. We visualize an SOE as a set of entities. Each entity is modeled by a single parent document with dynamic embedded links (i.e., child documents) that contain multi-modal information about the entity from various networks. Since entities in an SOE are generally queried using keywords, our goal is to efficiently retrieve the top-k entities related to a given keyword-based query by considering the relevance scores of both their parent and child documents. Furthermore, we extend this problem to the case where the entities are geo-tagged. The main contributions of this work are four-fold. First, it proposes an efficient bitmap-based approach for quickly identifying the candidate set of entities, i.e., those whose parent documents contain all queried keywords. A variant of this approach is also proposed to reduce memory consumption by exploiting skews in keyword popularity. Second, it proposes the two-tier HI-tree index, which uses both hashing and inverted indexes, for efficient document relevance score lookups. Third, it proposes an R-tree-based approach that extends the aforementioned approaches to the case where the entities are geo-tagged. Fourth, it performs comprehensive experiments with both real and synthetic datasets to demonstrate that the proposed schemes are effective in providing good top-k result recall within acceptable query response times.
ISSN:	2364-1185 2364-1541
DOI:	10.1007/s41019-021-00173-1
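
Illustrative note: the abstract describes identifying candidate entities as those whose parent documents contain all queried keywords, using bitmaps, and then ranking by parent and child relevance scores. The following is a minimal sketch of that general idea only, not the authors' implementation. It assumes one bitmap per keyword stored as a Python integer, whitespace tokenization, and a ranking score formed by adding the parent score to the sum of child scores; the function names, the scoring combination, and the sample data are all hypothetical and do not come from the article.

```python
import heapq


def build_bitmaps(parent_docs):
    """Map each keyword to an int bitmap; bit i is set when the parent
    document of entity i contains that keyword (hypothetical layout)."""
    bitmaps = {}
    for entity_id, text in enumerate(parent_docs):
        for word in set(text.lower().split()):
            bitmaps[word] = bitmaps.get(word, 0) | (1 << entity_id)
    return bitmaps


def candidates(bitmaps, query, num_entities):
    """AND the bitmaps of all query keywords and return the ids of the
    entities whose bit is still set."""
    result = (1 << num_entities) - 1  # start with every entity as a candidate
    for word in query.lower().split():
        result &= bitmaps.get(word, 0)  # an unknown keyword empties the set
    return [i for i in range(num_entities) if (result >> i) & 1]


def top_k(candidate_ids, parent_score, child_scores, k):
    """Rank candidates by parent score plus summed child scores.
    This combination is an assumption made for illustration."""
    def total(i):
        return parent_score[i] + sum(child_scores[i])
    return heapq.nlargest(k, candidate_ids, key=total)


# Hypothetical sample data: three entities, one parent document each.
parent_docs = [
    "acme phone battery review",
    "acme laptop battery",
    "zen phone camera",
]
bitmaps = build_bitmaps(parent_docs)
matching = candidates(bitmaps, "acme battery", len(parent_docs))
best = top_k(matching, [0.5, 0.9, 0.1], [[0.6, 0.2], [0.1], [0.3]], k=1)
```

Using one integer per keyword keeps the conjunctive match to a single bitwise AND per query keyword. The memory-saving variant and the HI-tree index mentioned in the abstract are not modeled here.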