Ethosight: A Reasoning-Guided Iterative Learning System for Nuanced Perception based on Joint-Embedding & Contextual Label Affinity
Main authors: | , , , , , , , , , , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Traditional computer vision models often necessitate extensive data
acquisition, annotation, and validation. These models frequently struggle in
real-world applications, resulting in high false positive and negative rates,
and exhibit poor adaptability to new scenarios, often requiring costly
retraining. To address these issues, we present Ethosight, a flexible and
adaptable zero-shot video analytics system. Ethosight begins from a clean slate
based on user-defined video analytics, specified through natural language or
keywords, and leverages joint embedding models and reasoning mechanisms
informed by ontologies such as WordNet and ConceptNet. Ethosight operates
effectively on low-cost edge devices and supports enhanced runtime adaptation,
thereby offering a new approach to continuous learning without catastrophic
forgetting. We provide empirical validation of Ethosight's promising
effectiveness across diverse and complex use cases, while highlighting areas
for further improvement. A significant contribution of this work is the release
of all source code and datasets to enable full reproducibility and to foster
further innovation in both the research and commercial domains. |
---|---|
DOI: | 10.48550/arxiv.2307.10577 |