LLMs as Workers in Human-Computational Algorithms? Replicating Crowdsourcing Pipelines with LLMs

Bibliographic Details
Main Authors:	Wu, Tongshuang; Zhu, Haiyi; Albayrak, Maya; Axon, Alexis; Bertsch, Amanda; Deng, Wenxing; Ding, Ziqi; Guo, Bill; Gururaja, Sireesh; Kuo, Tzu-Sheng; Liang, Jenny T; Liu, Ryan; Mandal, Ihita; Milbauer, Jeremiah; Ni, Xiaolin; Padmanabhan, Namrata; Ramkumar, Subhashini; Sudjianto, Alexis; Taylor, Jordan; Tseng, Ying-Jui; Vaidos, Patricia; Wu, Zhijin; Wu, Wei; Yang, Chenyang
Format:	Article
Language:	English
Subjects:	Computer Science - Computation and Language; Computer Science - Human-Computer Interaction

Description
Summary:	CHI 2025 Case Study Track. LLMs have shown promise in replicating human-like behavior in crowdsourcing tasks that were previously thought to be exclusive to human abilities. However, current efforts focus mainly on simple atomic tasks. We explore whether LLMs can replicate more complex crowdsourcing pipelines. We find that modern LLMs can simulate some of crowdworkers' abilities in these "human computation algorithms," but the level of success is variable and influenced by requesters' understanding of LLM capabilities, the specific skills required for sub-tasks, and the optimal interaction modality for performing these sub-tasks. We reflect on humans' and LLMs' different sensitivities to instructions, stress the importance of enabling human-facing safeguards for LLMs, and discuss the potential of training humans and LLMs with complementary skill sets. Crucially, we show that replicating crowdsourcing pipelines offers a valuable platform to investigate 1) the relative LLM strengths on different tasks (by cross-comparing their performances on sub-tasks) and 2) LLMs' potential in complex tasks, where they can complete part of the tasks while leaving others to humans.
DOI:	10.48550/arxiv.2307.10168