Learning Uniformly Distributed Embedding Clusters of Stylistic Skills for Physically Simulated Characters
Main authors: | , , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Learning natural and diverse behaviors from human motion datasets remains
challenging in physics-based character control. Existing conditional
adversarial models often suffer from tight and biased embedding distributions
where embeddings from the same motion are closely grouped in a small area and
shorter motions occupy even less space. Our empirical observations indicate
this limits the representational capacity and diversity under each skill. An
ideal latent space should be maximally packed with the embedding clusters of all
motions. In this paper, we propose a skill-conditioned controller that learns
diverse skills with expressive variations. Our approach leverages the Neural
Collapse phenomenon, a natural outcome of the classification-based encoder, to
uniformly distribute cluster centers. We additionally propose a novel
Embedding Expansion technique to form stylistic embedding clusters for diverse
skills that are uniformly distributed on a hypersphere, maximizing the
representational area occupied by each skill and minimizing unmapped regions.
This maximally packed and uniformly distributed embedding space ensures that
embeddings within the same cluster generate behaviors conforming to the
characteristics of the corresponding motion clips, yet exhibiting noticeable
variations within each cluster. Compared to existing methods, our controller
not only generates high-quality, diverse motions covering the entire dataset
but also achieves superior controllability, motion coverage, and diversity
under each skill. Both qualitative and quantitative results confirm these
traits, enabling our controller to be applied to a wide range of downstream
tasks and to serve as a cornerstone for diverse applications. |
---|---|
DOI: | 10.48550/arxiv.2411.06459 |
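
The summary describes skill cluster centers that are uniformly distributed on a hypersphere as a consequence of Neural Collapse. As a rough illustration of that geometry only, the sketch below builds a simplex equiangular tight frame, the configuration commonly associated with Neural Collapse, in which every pair of unit-norm centers has the same cosine similarity of -1/(K-1). This is not the authors' implementation and does not cover the Embedding Expansion technique; the function names `simplex_etf` and `pairwise_cosines` and the choice of five skills are assumptions made for the example.

```python
import numpy as np


def simplex_etf(num_skills: int) -> np.ndarray:
    """Return num_skills unit vectors (one per row) forming a simplex ETF.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if num_skills < 2:
        raise ValueError("at least two skills are required")
    k = num_skills
    # Remove the mean direction from the standard basis, then rescale so
    # that each row has unit length.
    centered = np.eye(k) - np.full((k, k), 1.0 / k)
    return np.sqrt(k / (k - 1)) * centered


def pairwise_cosines(centers: np.ndarray) -> np.ndarray:
    """Cosine similarity between every pair of rows in `centers`."""
    unit = centers / np.linalg.norm(centers, axis=1, keepdims=True)
    return unit @ unit.T


# Assumed example: five skills, hence five cluster centers.
centers = simplex_etf(5)
cosines = pairwise_cosines(centers)
off_diagonal = cosines[~np.eye(5, dtype=bool)]
```

With five centers, every off-diagonal cosine equals -1/4, so no two centers are closer together than any other pair. That equal spacing is the sense in which the centers are "uniformly distributed" here.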