LLaSA: A Multimodal LLM for Human Activity Analysis Through Wearable and Smartphone Sensors
Main authors:
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: Integrating inertial measurement units (IMUs) with large language models (LLMs) expands the potential of multimodal AI, enabling more nuanced human activity analysis. In this paper, we introduce LLaSA (Large Language and Sensor Assistant), a multimodal large language model built on LIMU-BERT and Llama, designed to interpret and answer queries related to human activities and motion analysis by leveraging sensor data and contextual reasoning. To develop LLaSA, we introduce two key datasets: SensorCaps, a comprehensive collection of 35,960 IMU-derived narratives with handcrafted features, and OpenSQA, an instruction-following dataset containing 179,727 question-answer pairs aware of sensor and human activity context. These datasets provide diverse and rich inputs to train LLaSA for complex sensor-based queries. To optimize LLaSA's performance, we apply a unique hyperparameter tuning method that significantly enhances its effectiveness in contextual question-answering tasks. Extensive evaluations, including a human-led assessment of the question answering, demonstrate that LLaSA achieves superior data interpretation and context-aware responses compared to GPT-3.5-Turbo and Vicuna-1.5-13b-16K. These contributions advance the frontier of sensor-aware LLMs and create new opportunities for impactful multimodal research in healthcare, sports science, and human-computer interaction. Our code repository and datasets can be found at https://github.com/BASHLab/LLaSA.
DOI: 10.48550/arxiv.2406.14498
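
The abstract describes LLaSA as a multimodal model built on LIMU-BERT (an IMU encoder) and Llama (an LLM). The sketch below illustrates the generic "sensor encoder → projection → LLM embedding space" pattern that such models typically follow; the module names, dimensions, and toy encoder are illustrative assumptions for exposition only, not code or details taken from LLaSA, LIMU-BERT, or the authors' repository.

```python
# Hypothetical sketch of the generic sensor-encoder-to-LLM wiring that multimodal
# sensor-language models commonly use. All names and sizes below are assumptions,
# NOT LLaSA's actual implementation.
import torch
import torch.nn as nn


class ToyIMUEncoder(nn.Module):
    """Stand-in for a pretrained IMU encoder such as LIMU-BERT (shape-level only)."""

    def __init__(self, in_channels: int = 6, hidden_dim: int = 72):
        super().__init__()
        self.input_proj = nn.Linear(in_channels, hidden_dim)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=4, batch_first=True),
            num_layers=2,
        )

    def forward(self, imu: torch.Tensor) -> torch.Tensor:
        # imu: (batch, time, channels) -> (batch, time, hidden_dim)
        return self.encoder(self.input_proj(imu))


class SensorToLLMAdapter(nn.Module):
    """Projects sensor embeddings into the LLM token-embedding space and
    prepends them to the embedded text prompt (prefix-style conditioning)."""

    def __init__(self, sensor_dim: int, llm_dim: int):
        super().__init__()
        self.projector = nn.Linear(sensor_dim, llm_dim)

    def forward(self, sensor_emb: torch.Tensor, text_emb: torch.Tensor) -> torch.Tensor:
        # sensor_emb: (batch, T_s, sensor_dim), text_emb: (batch, T_t, llm_dim)
        sensor_tokens = self.projector(sensor_emb)
        # The concatenated sequence would be fed to the LLM in place of plain text embeddings.
        return torch.cat([sensor_tokens, text_emb], dim=1)


if __name__ == "__main__":
    batch, time_steps, channels = 2, 120, 6  # e.g. a window of 3-axis accel + 3-axis gyro
    llm_dim = 4096                           # typical Llama hidden size (assumed)

    encoder = ToyIMUEncoder(in_channels=channels)
    adapter = SensorToLLMAdapter(sensor_dim=72, llm_dim=llm_dim)

    imu_window = torch.randn(batch, time_steps, channels)
    text_embeddings = torch.randn(batch, 32, llm_dim)  # placeholder for embedded prompt tokens

    fused = adapter(encoder(imu_window), text_embeddings)
    print(fused.shape)  # torch.Size([2, 152, 4096])
```

In LLaSA itself these roles are filled by LIMU-BERT and Llama; the toy modules above only show where a projection between the two embedding spaces would sit in this family of architectures.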