Datasets & utils for paper USING PRE-TRAINED MODELS TO PARTIALLY AUTOMATE CODE REVIEW ACTIVITIES
Format: | Dataset |
---|---|
Language: | English |
Summary: | Raw and processed datasets and configuration files for pre-training and fine-tuning T5 models. Pre-training dataset: obtained by mining Stack Overflow and CodeSearchNet data. Fine-tuning datasets: we fine-tune our T5 small model on different datasets obtained by mining code review data from Gerrit and GitHub repositories. Fine-tuning dataset v1 (small): the same dataset used by Tufano et al., with abstracted code and raw comments. Fine-tuning dataset v2 (small): the same dataset used by Tufano et al., with non-abstracted code and cleaned comments. Fine-tuning dataset (large): our new large dataset. |
DOI: | 10.5281/zenodo.4812784 |