Sketch-pix2seq: a Model to Generate Sketches of Multiple Categories
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Sketching is an important medium for humans to communicate ideas, and it
reflects the superiority of human intelligence. Studies on sketches can be
roughly divided into recognition and generation. Existing image recognition
models failed to achieve satisfactory performance on sketch classification.
For sketch generation, however, a recent study proposed a sequence-to-sequence
variational autoencoder (VAE) model called sketch-rnn, which was able to
generate sketches based on human inputs. The model achieved impressive results
when asked to learn a single category of object, such as an animal or a
vehicle. However, its performance dropped when multiple categories were fed
into the model. Here, we propose a model called sketch-pix2seq that can learn
and draw multiple categories of sketches. Two modifications were made to the
sketch-rnn model: the first replaces the bidirectional recurrent neural
network (BRNN) encoder with a convolutional neural network (CNN); the second
removes the Kullback-Leibler (KL) divergence from the VAE objective function.
Experimental results showed that models with CNN encoders outperformed those
with RNN encoders in generating human-style sketches. Visualization of the
latent space illustrated that removing the KL divergence led the encoder to
learn a posterior over the latent space that reflected the features of
different categories. Moreover, the combination of a CNN encoder and the
removal of the KL divergence, i.e., the sketch-pix2seq model, performed better
in learning and generating sketches of multiple categories and showed
promising results in creativity tasks. |
---|---|
DOI: | 10.48550/arxiv.1709.04121 |
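
The summary names removal of the KL divergence from the VAE objective as one of the two modifications. As a minimal sketch of what that means, assuming a diagonal Gaussian posterior and a standard normal prior (the usual VAE setup, not confirmed by this record), the objective can be written as a reconstruction loss plus a weighted KL term, so that a weight of zero drops the term. The function names, the `kl_weight` parameter, and the sample numbers are hypothetical and are not taken from the paper.

```python
import math


def gaussian_kl(mu, log_var):
    """KL(N(mu, exp(log_var)) || N(0, 1)), averaged over latent dimensions.

    Assumes a diagonal Gaussian posterior; this is an illustrative choice.
    """
    terms = [1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var)]
    return -0.5 * sum(terms) / len(terms)


def vae_objective(reconstruction_loss, mu, log_var, kl_weight):
    """Reconstruction loss plus a weighted KL term.

    kl_weight = 0.0 removes the KL term entirely, leaving only the
    reconstruction loss, which mirrors the modification in the summary.
    """
    return reconstruction_loss + kl_weight * gaussian_kl(mu, log_var)


# Hypothetical encoder outputs for a two-dimensional latent vector.
mu = [0.5, -1.0]
log_var = [0.0, 0.2]

with_kl = vae_objective(2.0, mu, log_var, kl_weight=1.0)
without_kl = vae_objective(2.0, mu, log_var, kl_weight=0.0)
```

With the KL term removed, nothing pulls the posterior toward the prior, which is consistent with the summary's observation that the encoder then learns a latent space that separates categories.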