Poster
POET: Prompt Offset Tuning for Continual Human Action Adaptation
Prachi Garg · Joseph K J · Vineeth N Balasubramanian · Necati Cihan Camgoz · Chengde Wan · Kenrick Kin · Weiguang Si · Shugao Ma · Fernando de la Torre
# 211
Strong Double Blind |
As extended reality (XR) is redefining how users interact with computing devices, research in human action recognition is gaining prominence. Typically, models deployed on immersive computing devices are static and limited to their default set of classes. The goal of our research is to provide users and developers with the capability to personalize their experience by adding new action classes to their device models continually. Importantly, a user should be able to add new classes in a low-shot and efficient manner, while this process should not require storing or replaying any of user's sensitive training data. We formalize this problem as privacy aware few-shot continual action recognition. Towards this end, we propose POET: Prompt Offset Tuning. While existing prompt tuning approaches have shown great promise for continual learning of image, text, and video modalities; they demand access to extensively pretrained transformers. Breaking away from this assumption, POET demonstrates the efficacy of prompt tuning a significantly lightweight backbone, pretrained exclusively on the base class data. We propose a novel spatio-temporal learnable prompt selection approach, and are the first to apply this prompting technique to Graph Neural Networks. To evaluate our method, we introduce two new benchmarks: (i) NTU RGB+D dataset for activity recognition (ii) SHREC-2017 dataset for hand gesture recognition. The code will be released upon acceptance.