Boosting Private Domain Understanding of Efficient MLLMs: A Tuning-free, Adaptive, Universal Prompt Optimization Framework
Main authors: | , , , , , , , , , , , , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Order full text |
Summary: | Efficient multimodal large language models (EMLLMs), in contrast to multimodal large language models (MLLMs), reduce model size and computational cost and are often deployed on resource-constrained devices. However, due to data privacy concerns, existing open-source EMLLMs rarely have access to private domain-specific data during pre-training, making them difficult to apply directly in device-specific domains, such as certain business scenarios. To address this weakness, this paper focuses on the efficient adaptation of EMLLMs to private domains along two lines: 1) how to reduce data requirements, and 2) how to avoid parameter fine-tuning. Specifically, we propose a tunIng-free, aDaptivE, universAL Prompt Optimization Framework, abbreviated as \ourmethod{}, which consists of two stages: 1) Predefined Prompt, which uses a reinforcement searching strategy to generate a prompt optimization strategy tree and acquire optimization priors; 2) Prompt Reflection, which initializes the prompt from these optimization priors and then applies self-reflection to further search and refine the prompt. In this way, \ourmethod{} elegantly generates the "ideal prompts" for processing private domain-specific data. Note that our method requires no parameter fine-tuning and only a small amount of data to quickly adapt to the distribution of the private data. Extensive experiments across multiple tasks demonstrate that our proposed \ourmethod{} significantly improves both efficiency and performance compared to baselines. |
DOI: | 10.48550/arxiv.2412.19684 |
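
The summary above outlines a two-stage procedure: a strategy-tree search that yields optimization priors, followed by self-reflective refinement of an initialized prompt. The minimal Python sketch below illustrates that flow under stated assumptions only; `query_mllm`, `score_prompt`, and the strategy list are hypothetical placeholders, and a simple greedy beam search stands in for the paper's reinforcement searching strategy. It is not the authors' implementation.

```python
"""Hedged sketch of the two-stage prompt optimization described in the summary.

Stage 1 ("Predefined Prompt"): explore a small tree of prompt-editing strategies
on a few private-domain examples and keep the best-scoring branches as
optimization priors (a greedy beam search stands in here for the paper's
reinforcement searching strategy). Stage 2 ("Prompt Reflection"): initialize a
prompt from those priors, then iteratively ask the model to critique and refine it.

`query_mllm` and `score_prompt` are hypothetical stand-ins for the deployed
EMLLM and a task metric; they are not part of the paper's actual API.
"""
import random

STRATEGIES = [                      # hypothetical prompt-editing operators
    lambda p: p + " Answer concisely.",
    lambda p: p + " Use the terminology of the private domain.",
    lambda p: "Step by step: " + p,
]

def query_mllm(prompt: str, image, question: str) -> str:
    """Placeholder for the frozen EMLLM; no parameters are fine-tuned."""
    return f"[model answer to '{question}' under prompt '{prompt[:30]}...']"

def score_prompt(prompt: str, dev_set) -> float:
    """Placeholder metric on a small private-domain dev set."""
    return random.random()

def stage1_predefined_prompt(seed_prompt, dev_set, depth=2, beam=2):
    """Expand a small strategy tree and return the top prompts as priors."""
    frontier = [seed_prompt]
    for _ in range(depth):
        children = [s(p) for p in frontier for s in STRATEGIES]
        children.sort(key=lambda p: score_prompt(p, dev_set), reverse=True)
        frontier = children[:beam]
    return frontier

def stage2_prompt_reflection(priors, dev_set, rounds=3):
    """Start from the best prior and refine it via self-reflection."""
    best = max(priors, key=lambda p: score_prompt(p, dev_set))
    for _ in range(rounds):
        critique = query_mllm(best, None, "How could this prompt be improved?")
        candidate = best + " " + critique          # fold the critique back in
        if score_prompt(candidate, dev_set) > score_prompt(best, dev_set):
            best = candidate
    return best

if __name__ == "__main__":
    dev = []                                       # a few private-domain samples
    priors = stage1_predefined_prompt("Describe the image.", dev)
    print(stage2_prompt_reflection(priors, dev))
```

Even in this rough sketch, the design point claimed in the summary is visible: the deployed EMLLM stays frozen, and only the prompt is searched and refined, so adaptation needs just a small private-domain set for scoring.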