Possibilities and challenges in the moral growth of large language models: a philosophical perspective


Bibliographic details
Published in: Ethics and Information Technology 2025, Vol. 27 (1)
Authors: Wang, Guoyu, Wang, Wei, Cao, Yiqin, Teng, Yan, Guo, Qianyu, Wang, Haofen, Lin, Junyu, Ma, Jiajie, Liu, Jin, Wang, Yingchun
Format: Article
Language: English
Subjects:
Online access: Full text
Description
Abstract: With the rapid expansion of parameters in large language models (LLMs) and the application of Reinforcement Learning with Human Feedback (RLHF), there has been a noticeable growth in the moral competence of LLMs. However, several questions warrant further exploration: Is it really possible for LLMs to fully align with human values through RLHF? How can the current moral growth be philosophically contextualized? We identify similarities between LLMs’ moral growth and Deweyan ethics in terms of the discourse of human moral development. We then attempt to use Dewey’s theory on an experimental basis to examine and further explain the extent to which the current alignment pathway enables the development of LLMs. A beating experiment serves as the foundational case for analyzing LLMs’ moral competence across various parameters and stages, including basic moral cognition, moral dilemma judgment, and moral behavior. The results demonstrate that the moral competence of the GPT series has improved significantly, and Dewey’s Impulse-Habit-Character theory of moral development can explain this: the moral competence of LLMs has been enhanced through experience-based learning, supported by human feedback. Nevertheless, LLMs’ moral development through RLHF remains constrained and does not reach the character stage described by Dewey, possibly owing to their lack of self-consciousness. This fundamental difference between humans and LLMs underscores both the limitations of LLMs’ moral growth and the challenges of applying RLHF for AI alignment. It also emphasizes the need for external societal governance and legal regulation.
ISSN:1388-1957
1572-8439
DOI:10.1007/s10676-024-09818-x