Exploiting Contextual Information with Deep Neural Networks
Main author: | |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Context matters! Nevertheless, there has not been much research on exploiting
contextual information in deep neural networks. For the most part, the use
of contextual information has been limited to recurrent neural networks.
Attention models and capsule networks are two recent ways of introducing
contextual information into non-recurrent models; however, both of these
algorithms were developed after this work had started.
In this thesis, we show that contextual information can be exploited in two
fundamentally different ways: implicitly and explicitly. In the DeepScore
project, where the use of context is very important for the recognition of
many tiny objects, we show that by carefully crafting convolutional
architectures, we can achieve state-of-the-art results while also implicitly
and correctly distinguishing between objects that are virtually
identical but have different meanings depending on their surroundings. In parallel,
we show that by explicitly designing algorithms (motivated by graph theory
and game theory) that take into consideration the entire structure of the
dataset, we can achieve state-of-the-art results in different topics such as
semi-supervised learning and similarity learning.
To the best of our knowledge, we are the first to integrate graph-theoretical
modules that are carefully crafted for the problem of similarity learning and
designed to consider contextual information, not only outperforming other
models but also gaining a speed improvement while using a smaller number of
parameters. |
---|---|
DOI: | 10.48550/arxiv.2006.11706 |