Learning heterogeneous subgraph representations for team discovery

Bibliographic details
Published in: Information retrieval (Boston) 2023-12, Vol.26 (1-2), p.8, Article 8
Main authors: Hamidi Rad, Radin; Nguyen, Hoang; Al-Obeidat, Feras; Bagheri, Ebrahim; Kargar, Mehdi; Srivastava, Divesh; Szlichta, Jaroslaw; Zarrinkalam, Fattane
Format: Article
Language: English
Online access: Full text
Description
Summary: The team discovery task is concerned with finding a group of experts from a collaboration network who collectively cover a desired set of skills. Most prior work on team discovery adopts either graph-based or neural mapping approaches. Graph-based approaches are computationally intractable, often leading to sub-optimal team selection. Neural mapping approaches perform better but are still limited, as they learn individual representations for skills and experts and are often prone to overfitting given the sparsity of collaboration networks. Thus, we define the team discovery task as one of learning subgraph representations from a heterogeneous collaboration network, where the subgraphs represent teams and are then used to identify relevant teams for a given set of skills. As such, our approach captures local (node interactions with each team) and global (subgraph interactions between teams) characteristics of the representation network and allows us to easily map between any homogeneous and heterogeneous subgraphs in the network to effectively discover teams. Our experiments over two real-world datasets from different domains, namely the DBLP bibliographic dataset with 10,647 papers and IMDB with 4,882 movies, illustrate that our approach outperforms the state-of-the-art baselines on a range of ranking and quality metrics. More specifically, in terms of ranking metrics, our approach is superior to the best baseline by approximately 15% on the DBLP dataset and by approximately 20% on the IMDB dataset. Further, our findings illustrate that our approach consistently shows a robust performance improvement over the baselines.
ISSN: 1386-4564, 1573-7659
DOI: 10.1007/s10791-023-09421-6
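
Illustrative sketch: the team discovery task described in the summary (ranking candidate teams against a requested set of skills) can be shown with a toy example. This is not the article's method, which learns subgraph representations; plain set overlap is swapped in as the scoring function purely to make the task concrete. The data and the names `TEAMS`, `coverage`, and `rank_teams` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy data: each past team (e.g. a paper or a movie)
# is recorded with its experts and the skills it exercised.
TEAMS: Dict[str, Dict[str, Set[str]]] = {
    "t1": {"experts": {"alice", "bob"}, "skills": {"graphs", "ranking"}},
    "t2": {"experts": {"carol"}, "skills": {"ranking"}},
    "t3": {"experts": {"dave", "erin"}, "skills": {"vision", "graphs", "ranking", "nlp"}},
}


def coverage(required: Set[str], offered: Set[str]) -> float:
    """Fraction of the required skills that a team covers (0.0 if none are required)."""
    if not required:
        return 0.0
    return len(required & offered) / len(required)


def rank_teams(required: Set[str]) -> List[Tuple[str, float]]:
    """Rank teams by skill coverage, highest first; ties are broken by team id."""
    scored = [(team_id, coverage(required, team["skills"])) for team_id, team in TEAMS.items()]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    for team_id, score in rank_teams({"graphs", "ranking"}):
        print(team_id, sorted(TEAMS[team_id]["experts"]), score)
```

A learned approach such as the one summarized above would presumably replace `coverage` with a similarity between learned representations of the skill set and of each team subgraph; that substitution is an assumption here, not something this record specifies.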