Improving Model Evaluation using SMART Filtering of Benchmark Datasets

One of the most challenging problems facing NLP today is evaluation. Some of the most pressing issues pertain to benchmark saturation, data contamination, and diversity in the quality of test examples. To address these concerns, we propose Selection Methodology for Accurate, Reduced, and Targeted (S...

Bibliographic Details
Published in:	arXiv.org 2024-10
Main Authors:	Gupta, Vipul; Ross, Candace; Pantoja, David; Passonneau, Rebecca J; Ung, Megan; Williams, Adina
Format:	Article
Language:	English
Subjects:	Benchmarks, Datasets, Filtration, Regeneration