Abstract: Traditional computer vision models often require extensive manual effort for data acquisition, annotation, and validation, particularly when detecting subtle behavioral nuances or events. The difficulty of distinguishing routine behaviors from potential risks in real-world applications, such as telling routine shopping apart from potential shoplifting, further complicates the process. Moreover, these models may exhibit high false positive rates and imprecise event detection when exposed to real-world scenarios that differ significantly from the conditions of the training data. To overcome these hurdles, we present Ethosight, a novel zero-shot computer vision system. Ethosight begins from a clean slate based on user requirements and semantic knowledge of interest. Using localized label affinity calculations and a reasoning-guided iterative learning loop, Ethosight infers scene details and iteratively refines the label set. Reasoning mechanisms can be derived from large language models such as GPT-4, symbolic reasoners such as OpenNARS\cite{wang2013,wang2006}, or hybrid systems. Our evaluations demonstrate Ethosight's efficacy across 40 complex use cases spanning domains such as health, safety, and security. Detailed results and case studies in the main body of this paper and its appendix underscore a promising trajectory towards enhancing the adaptability and resilience of computer vision models in detecting and extracting subtle and nuanced behaviors.
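To make the abstract's description concrete, the following is a minimal, hypothetical Python sketch of a label-affinity scoring step combined with a reasoning-guided refinement loop. It assumes embedding functions and a reasoner callback (embed_text, propose_labels) that are illustrative placeholders, not Ethosight's actual API.

```python
# Hypothetical sketch of a label-affinity scoring and refinement loop.
# embed_text and propose_labels are assumed callbacks, not Ethosight's real interface.
import numpy as np

def label_affinities(image_emb: np.ndarray, label_embs: np.ndarray) -> np.ndarray:
    """Cosine similarity between one image embedding and each label embedding."""
    image_emb = image_emb / np.linalg.norm(image_emb)
    label_embs = label_embs / np.linalg.norm(label_embs, axis=1, keepdims=True)
    return label_embs @ image_emb

def iterative_refinement(image_emb, labels, embed_text, propose_labels,
                         rounds: int = 3, top_k: int = 5):
    """Score the current label set, ask a reasoner for refined candidates, repeat."""
    top = []
    for _ in range(rounds):
        affinities = label_affinities(image_emb, embed_text(labels))
        top = [labels[i] for i in np.argsort(affinities)[::-1][:top_k]]
        # The reasoner (e.g., an LLM or symbolic reasoner) suggests new labels
        # conditioned on the current high-affinity labels.
        labels = sorted(set(labels) | set(propose_labels(top)))
    return top
```

The loop structure (score, reason, extend the label set, rescore) is the part grounded in the abstract; the specific similarity measure and stopping criterion are assumptions for illustration.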
Abstract: The ideal long-term memory mechanism for Large Language Model (LLM) based chatbots would lay the foundation for continual learning and complex reasoning, and would allow sequential and temporal dependencies to be learnt. Creating this type of memory mechanism is an extremely challenging problem. In this paper, we explore different methods of achieving the effect of long-term memory. We propose a new architecture focused on creating adaptable and updatable long-term memory for AGI systems. Through various experiments, we demonstrate the benefits of the RecallM architecture, particularly the improved temporal understanding of knowledge it provides.
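As a hedged illustration of what "adaptable and updatable long-term memory" with temporal understanding can mean, the sketch below implements a toy concept store in which newer statements about a concept supersede older ones at recall time. This is not the RecallM implementation; the class, method names, and storage scheme are assumptions made purely for illustration.

```python
# Minimal, hypothetical sketch of an updatable long-term memory with temporal ordering.
# Not the RecallM architecture; it only illustrates newer knowledge superseding older knowledge.
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class ConceptMemory:
    facts: Dict[str, List[Tuple[int, str]]] = field(default_factory=dict)  # concept -> [(t, statement)]
    clock: int = 0

    def update(self, concept: str, statement: str) -> None:
        """Store a statement about a concept with a monotonically increasing timestamp."""
        self.clock += 1
        self.facts.setdefault(concept, []).append((self.clock, statement))

    def recall(self, concept: str, k: int = 3) -> List[str]:
        """Return the k most recent statements about the concept, newest first."""
        history = sorted(self.facts.get(concept, []), reverse=True)
        return [statement for _, statement in history[:k]]

memory = ConceptMemory()
memory.update("office", "The office is in London.")
memory.update("office", "The office moved to Berlin.")
print(memory.recall("office", k=1))  # ['The office moved to Berlin.']
```

The design choice illustrated here, ordering recalled knowledge by when it was learned rather than by retrieval similarity alone, is one simple way a memory mechanism can support the kind of temporal understanding the abstract refers to.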