The Pursuit of Knowledge:
Discovering and Localizing Novel Concepts using Dual Memory


ICCV 2021

Approach

The framework consists of three main modules that interact with one another: encoding, storage, and retrieval. The encoding module extracts image regions and their representations. The storage module holds memory slots that represent concepts and are continually updated. The retrieval module takes as input the current storage and the output of the encoding module, and outputs a decision that updates the memory slots. Finally, a memory consolidation operation merges the objects discovered in the Working Memory (Mw) into the Semantic Memory (Ms).

Encoding: The goal of the encoding module is to process an input image and extract representations for use by the subsequent discovery pipeline.

Storage: The storage module consists of two memory blocks: Semantic Memory (Ms) and Working Memory (Mw).
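
As a rough illustration of the two memory blocks, the sketch below models each slot as a prototype vector plus a count of the regions assigned to it. The names Slot and DualMemory, and the running-mean update, are assumptions made for this example; the text above does not specify how slots are represented or updated.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """One memory slot (hypothetical layout): a prototype and its support."""
    prototype: list          # running mean of the features assigned so far
    count: int = 1           # number of features folded into the prototype

    def update(self, feature):
        """Fold a new feature into the running mean (assumed update rule)."""
        self.count += 1
        self.prototype = [
            p + (f - p) / self.count for p, f in zip(self.prototype, feature)
        ]


@dataclass
class DualMemory:
    """Container for the two memory blocks."""
    semantic: list = field(default_factory=list)  # Ms: known concepts
    working: list = field(default_factory=list)   # Mw: novel concepts


memory = DualMemory()
memory.working.append(Slot(prototype=[0.0, 2.0]))
memory.working[0].update([2.0, 4.0])
```

A running mean keeps each slot at constant size regardless of how many regions it has absorbed, which is one simple way to make slots that are "continually updated".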

Retrieval: The retrieval module is the decision-making center of our method. It takes as input the current state of storage and the encoded representation of the region under consideration from the encoding module, and decides whether the region belongs to 1) a known object (resulting in 'Update Ms' in the figure), 2) a previously encountered novel object ('Update Mw'), or 3) a newly encountered novel object ('Create Mw').
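
The three-way decision can be sketched as follows. This is a minimal illustration, not the method itself: it assumes each slot is a plain feature vector, that regions are matched to slots by cosine similarity, and that a single hypothetical threshold separates a match from a miss. None of these choices is stated in the description above.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(feature, memory):
    """Return (slot_index, similarity) of the closest slot, or (None, -1.0)."""
    best_idx, best_sim = None, -1.0
    for idx, slot in enumerate(memory):
        sim = cosine(feature, slot)
        if sim > best_sim:
            best_idx, best_sim = idx, sim
    return best_idx, best_sim


def retrieve(feature, semantic, working, threshold=0.8):
    """Decide which memory update a region triggers.

    `threshold` is an assumed hyperparameter, not a value from the source.
    """
    idx, sim = best_match(feature, semantic)
    if sim >= threshold:
        return ("Update Ms", idx)       # 1) known object
    idx, sim = best_match(feature, working)
    if sim >= threshold:
        return ("Update Mw", idx)       # 2) previously encountered novel object
    return ("Create Mw", len(working))  # 3) newly encountered novel object


semantic = [[1.0, 0.0, 0.0]]  # Ms with one known concept
working = [[0.0, 1.0, 0.0]]   # Mw with one novel concept

known = retrieve([0.9, 0.1, 0.0], semantic, working)
seen_novel = retrieve([0.1, 0.9, 0.0], semantic, working)
new_novel = retrieve([0.0, 0.0, 1.0], semantic, working)
```

Checking Ms before Mw mirrors the order of the three cases: a region is treated as novel only when no known concept explains it.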

Memory Consolidation: We propose a memory consolidation step, where representations formed in the Working Memory are added to the Semantic Memory, extending our repertoire of known categories.
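
The consolidation step could look like the sketch below. It assumes each Working Memory slot records how many regions support it and that a slot is promoted once that support reaches a hypothetical min_support; the actual promotion criterion is not given in the description above.

```python
def consolidate(semantic, working, min_support=2):
    """Move well-supported Working Memory slots into Semantic Memory.

    semantic:    list of prototype vectors (Ms).
    working:     list of dicts with 'prototype' and 'support' keys (Mw).
    min_support: assumed promotion criterion, not taken from the source.
    Returns (new_semantic, remaining_working); the inputs are not mutated.
    """
    new_semantic = list(semantic)
    remaining = []
    for slot in working:
        if slot["support"] >= min_support:
            new_semantic.append(slot["prototype"])  # becomes a known category
        else:
            remaining.append(slot)                  # stays in Mw for now
    return new_semantic, remaining


semantic = [[1.0, 0.0]]
working = [
    {"prototype": [0.0, 1.0], "support": 5},
    {"prototype": [0.5, 0.5], "support": 1},
]
new_semantic, remaining = consolidate(semantic, working)
```

After the call, the well-supported slot appears in Ms as a new known category, while the weakly supported one remains in Mw to gather more evidence.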