PTM4Tag+: Tag Recommendation of Stack Overflow Posts with Pre-trained Models
Saved in:
Main authors: | , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Abstract: | Stack Overflow is one of the most influential Software Question & Answer
(SQA) websites, hosting millions of programming-related questions and answers.
Tags play a critical role in organizing the content on Stack Overflow
efficiently and are vital to supporting a range of site operations, e.g.,
querying relevant content. Poorly selected tags often cause problems such as
tag ambiguity and tag explosion, so a precise and accurate automated tag
recommendation technique is needed.
Inspired by the recent success of pre-trained models (PTMs) in natural
language processing (NLP), we present PTM4Tag+, a tag recommendation framework
for Stack Overflow posts that utilizes PTMs for language modeling. PTM4Tag+ is
implemented with a triplet architecture, which models three key components of
a post, i.e., Title, Description, and Code, with independent PTMs. We evaluate
a number of popular pre-trained models, including BERT-based models (e.g.,
BERT, RoBERTa, CodeBERT, BERTOverflow, and ALBERT) and encoder-decoder models
(e.g., PLBART, CoTexT, and CodeT5). Our results show that leveraging CodeT5
under the PTM4Tag+ framework achieves the best performance among the eight
considered PTMs and outperforms the state-of-the-art Convolutional Neural
Network-based approach by a substantial margin in terms of average
Precision@k, Recall@k, and F1-score@k (k ranges from 1 to 5). Specifically,
CodeT5 improves F1-score@1 to F1-score@5 by 8.8%, 12.4%, 15.3%, 16.4%, and
16.6%, respectively. Moreover, to address the concern about inference latency,
we experiment with PTM4Tag+ using smaller PTMs (i.e., DistilBERT,
DistilRoBERTa, CodeBERT-small, and CodeT5-small). We find that although the
smaller PTMs cannot outperform the larger ones, they still maintain over
93.96% of the performance on average while shortening the mean inference time
by more than 47.2%. |
---|---|
DOI: | 10.48550/arxiv.2408.02311 |
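
The abstract describes a triplet architecture in which the Title, Description, and Code of a post are encoded by three independent PTMs and the resulting embeddings are combined for multi-label tag prediction. Below is a minimal sketch of such an architecture, assuming BERT-style encoders (CodeBERT is used here as a placeholder), first-token pooling, a linear classification head, and a tag-vocabulary size of 5,000; these are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of a triplet tag-recommendation model: three independent pre-trained
# encoders (Title, Description, Code) whose sequence embeddings are concatenated
# and passed to a multi-label classifier over the tag vocabulary.
import torch
import torch.nn as nn
from transformers import AutoModel


class TripletTagRecommender(nn.Module):
    def __init__(self, encoder_name="microsoft/codebert-base", num_tags=5000):
        super().__init__()
        # One independent encoder per post component (assumption: same
        # architecture, separately fine-tuned weights).
        self.title_enc = AutoModel.from_pretrained(encoder_name)
        self.desc_enc = AutoModel.from_pretrained(encoder_name)
        self.code_enc = AutoModel.from_pretrained(encoder_name)
        hidden = self.title_enc.config.hidden_size
        # Multi-label classification head over the tag vocabulary.
        self.classifier = nn.Linear(3 * hidden, num_tags)

    def forward(self, title_inputs, desc_inputs, code_inputs):
        # Use the first token's hidden state as the sequence embedding
        # (a common choice for BERT-style encoders; an assumption here).
        t = self.title_enc(**title_inputs).last_hidden_state[:, 0]
        d = self.desc_enc(**desc_inputs).last_hidden_state[:, 0]
        c = self.code_enc(**code_inputs).last_hidden_state[:, 0]
        logits = self.classifier(torch.cat([t, d, c], dim=-1))
        # Sigmoid gives an independent probability per tag; the top-k tags
        # form the recommendation list.
        return torch.sigmoid(logits)
```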
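
The evaluation reports Precision@k, Recall@k, and F1-score@k for k from 1 to 5. The following sketch shows how these metrics are commonly computed for a single post from a ranked recommendation list, assuming the standard top-k definitions; the paper may use adjusted variants, and the tags in the usage example are hypothetical.

```python
# Illustrative computation of Precision@k, Recall@k, and F1-score@k for one post.
def metrics_at_k(recommended, ground_truth, k):
    """recommended: ranked list of predicted tags; ground_truth: set of true tags."""
    top_k = recommended[:k]
    hits = sum(1 for tag in top_k if tag in ground_truth)
    precision = hits / k
    recall = hits / len(ground_truth) if ground_truth else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1


# Hypothetical example: top-5 recommendations for a post whose true tags are
# {"python", "pandas"}. Averaging these values over all test posts yields the
# reported average Precision@k, Recall@k, and F1-score@k.
p, r, f1 = metrics_at_k(
    ["python", "numpy", "pandas", "csv", "dataframe"],
    {"python", "pandas"},
    k=5,
)
print(f"Precision@5={p:.2f}, Recall@5={r:.2f}, F1@5={f1:.2f}")
```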