"Genlangs" and Zipf's Law: Do languages generated by ChatGPT statistically look human?
Main author: | |
---|---|
Format: | Article |
Language: | English |
Summary: | OpenAI's GPT-4 is a Large Language Model (LLM) that can generate coherent
constructed languages, or "conlangs," which we propose be called "genlangs"
when generated by Artificial Intelligence (AI). The genlangs created by ChatGPT
for this research (Voxphera, Vivenzia, and Lumivoxa) each have unique features,
appear facially coherent, and plausibly "translate" into English. This study
investigates whether genlangs created by ChatGPT follow Zipf's law. Zipf's law
approximately holds across all natural and artificially constructed human
languages. According to Zipf's law, the word frequencies in a text corpus are
inversely proportional to their rank in the frequency table. This means that
the most frequent word appears about twice as often as the second most frequent
word, three times as often as the third most frequent word, and so on. We
hypothesize that Zipf's law will hold for genlangs because (1) genlangs created
by ChatGPT fundamentally operate in the same way as human language with respect
to the semantic usefulness of certain tokens, and (2) ChatGPT has been trained
on a corpus of text that includes many different languages, all of which
exhibit Zipf's law to varying degrees. Through statistical linguistics, we aim
to understand if LLM-based languages statistically look human. Our findings
indicate that genlangs adhere closely to Zipf's law, supporting the hypothesis
that genlangs created by ChatGPT exhibit similar statistical properties to
natural and artificial human languages. We also conclude that with human
assistance, AI is already capable of creating the world's first
fully functional genlang, and we call for its development. |
DOI: | 10.48550/arxiv.2304.12191 |
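
The summary states Zipf's law as word frequency being inversely proportional to rank, so on a log-log plot frequency against rank should fall roughly on a line with slope near -1. The following is a minimal sketch of how such a check could be run on any text. It is not the authors' method: the record does not say how the study tokenized its corpora or fitted the distribution, so the regex tokenizer, the least-squares fit, and the helper names `rank_frequency` and `zipf_slope` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def rank_frequency(text):
    """Return word counts sorted from most to least frequent.

    Tokenization is an assumption: lowercase, split on runs of word characters.
    """
    words = re.findall(r"\w+", text.lower())
    return [count for _, count in Counter(words).most_common()]


def zipf_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank).

    The idealized form of Zipf's law predicts a slope close to -1.
    """
    if len(frequencies) < 2:
        raise ValueError("need at least two distinct words")
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Synthetic corpus (not data from the study): word k appears about 60 / k times.
toy = " ".join(" ".join([f"w{k}"] * (60 // k)) for k in range(1, 11))
freqs = rank_frequency(toy)
slope = zipf_slope(freqs)
print(freqs[:3], round(slope, 2))
```

On real text the fitted slope depends heavily on corpus size and tokenization, so a value near -1 is suggestive rather than conclusive; a fuller analysis would also examine goodness of fit.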