The Pursuit of Knowledge:
Discovering and Localizing new concepts using Dual Memory
We tackle object category discovery, which is the problem of discovering and localizing novel objects in a large unlabeled dataset. While existing methods show results on datasets with less cluttered scenes and fewer object instances per image, we present our results on the challenging COCO dataset. Moreover, we argue that, rather than discovering new categories from scratch, discovery algorithms can benefit from identifying what is already known and focusing their attention on the unknown. We propose a method to use prior knowledge about certain object categories to discover new categories by leveraging two memory modules, namely Working and Semantic memory. We show the performance of our detector on the COCO minival dataset to demonstrate its in-the-wild capabilities. More results can be found here.
This framework consists of three main modules: encoding, storage, and retrieval, which interact with one another. The encoding module extracts image regions and their representations. The storage module holds memory slots that represent concepts and are continually updated. The retrieval module takes as input the current storage and the output of the encoding module, and outputs a decision that updates the memory slots. Finally, a memory consolidation operation merges the objects discovered in Working Memory into Semantic Memory.
Encoding: The goal of the encoding module is to process an input image and extract representations to be used by the subsequent discovery pipeline.
Storage: The storage module consists of two memory blocks: Semantic Memory and Working Memory.
Retrieval: The retrieval module is where our method makes its decisions. It takes as input the current state of storage and the encoded representation of the region being considered from the encoding module, and decides whether the region belongs to 1) a known object (an 'Update' of Semantic Memory in the figure), 2) a previously encountered novel object (an 'Update' of Working Memory), or 3) a newly encountered novel object (a 'Create' in Working Memory).
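The three-way decision can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from our method's description: the cosine similarity, the single threshold of 0.8, the running-mean slot update, and the names Slot, Storage, and retrieve are all hypothetical.

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class Slot:
    """One memory slot: a prototype vector and the number of regions seen."""
    prototype: list
    count: int = 1

    def update(self, feature):
        # Assumed update rule: running mean of the features assigned so far.
        self.count += 1
        self.prototype = [p + (f - p) / self.count
                          for p, f in zip(self.prototype, feature)]


@dataclass
class Storage:
    semantic: list = field(default_factory=list)  # known categories
    working: list = field(default_factory=list)   # novel categories


def best_match(slots, feature):
    """Return (index, similarity) of the closest slot, or (-1, -inf) if none."""
    best_idx, best_sim = -1, -math.inf
    for i, slot in enumerate(slots):
        sim = cosine(slot.prototype, feature)
        if sim > best_sim:
            best_idx, best_sim = i, sim
    return best_idx, best_sim


def retrieve(storage, feature, threshold=0.8):
    """Decide which memory a region belongs to and update storage in place."""
    idx, sim = best_match(storage.semantic, feature)
    if idx >= 0 and sim >= threshold:
        storage.semantic[idx].update(feature)   # 1) known object
        return "update_semantic"
    idx, sim = best_match(storage.working, feature)
    if idx >= 0 and sim >= threshold:
        storage.working[idx].update(feature)    # 2) previously seen novel object
        return "update_working"
    storage.working.append(Slot(list(feature)))  # 3) newly seen novel object
    return "create_working"
```

Semantic Memory is checked first in this sketch so that a region is treated as novel only after it fails to match a known category.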
Memory Consolidation: We propose a memory consolidation step, where representations formed in the Working Memory are added to the Semantic Memory, extending our repertoire of known categories.
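As a rough illustration of consolidation, the sketch below moves slots from one dictionary to another. The dictionary layout, the consolidate function, and the min_count filter (promoting only slots that have accumulated enough regions) are hypothetical choices for this example, not details stated above.

```python
def consolidate(semantic, working, min_count=1):
    """Move Working Memory slots into Semantic Memory.

    Both memories are dicts mapping a concept name to a slot dict with
    'prototype' and 'count' keys (an assumed layout). Slots with fewer
    than min_count regions stay in Working Memory.
    Returns the names of the promoted concepts.
    """
    promoted = []
    for name in list(working):  # copy the keys, since we pop while iterating
        if working[name]["count"] >= min_count:
            semantic[name] = working.pop(name)
            promoted.append(name)
    return promoted
```

After this step the promoted concepts count as known categories, so later regions that match them would update Semantic Memory rather than Working Memory.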
Concepts discovered by our method in COCO 2014 train set that can be evaluated using ground truth annotations
Concepts discovered by our method in COCO 2014 train set that cannot be evaluated using ground truth annotations
To demonstrate the performance of our approach on unseen data, we evaluate the detectors obtained from it on COCO minival. The detectors display a lot of intra-class variation. We achieve the highest AP of 17.38% for the bear class and the lowest AP of 0.08% for traffic lights. Check this out for more qualitative results.