Large Language Models are Zero Shot Hypothesis Proposers
Main authors: | , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Abstract: | Significant scientific discoveries have driven the progress of human
civilisation. The explosion of scientific literature and data has created
information barriers across disciplines that have slowed the pace of scientific
discovery. Large Language Models (LLMs) hold a wealth of global and
interdisciplinary knowledge that promises to break down these information
barriers and foster a new wave of scientific discovery. However, the potential
of LLMs for scientific discovery has not been formally explored. In this paper,
we start by investigating whether LLMs can propose scientific hypotheses. To
this end, we construct a dataset consisting of background knowledge and hypothesis
pairs from biomedical literature. The dataset is divided into training, seen,
and unseen test sets based on the publication date to control visibility. We
subsequently evaluate the hypothesis generation capabilities of various
top-tier instructed models in zero-shot, few-shot, and fine-tuning settings,
including both closed and open-source LLMs. Additionally, we introduce an
LLM-based multi-agent cooperative framework with different role designs and
external tools to enhance the capabilities related to generating hypotheses. We
also design four metrics through a comprehensive review to evaluate the
generated hypotheses for both ChatGPT-based and human evaluations. Through
experiments and analyses, we arrive at the following findings: 1) LLMs
surprisingly generate untrained yet validated hypotheses from testing
literature. 2) Increasing uncertainty facilitates candidate generation,
potentially enhancing zero-shot hypothesis generation capabilities. These
findings strongly support the potential of LLMs as catalysts for new scientific
discoveries and guide further exploration. |
---|---|
DOI: | 10.48550/arxiv.2311.05965 |