Poster
Long-Tail Temporal Action Segmentation with Group-wise Temporal Logit Adjustment
Zhanzhong Pang · Fadime Sener · Shrinivas Ramasubramanian · Angela Yao
# 240
Strong Double Blind
Temporal action segmentation assigns an action label to each frame of an untrimmed procedural activity video. Such videos often exhibit a long-tailed action distribution because actions vary in both frequency and duration. However, state-of-the-art temporal action segmentation methods typically overlook the long-tail problem and fail to recognize tail actions. Existing long-tail methods make class-independent assumptions and fall short in identifying tail classes when applied to temporal action segmentation frameworks. To address these limitations, we propose a novel group-wise temporal logit adjustment (G-TLA) framework that combines a group-wise softmax formulation, which leverages activity information, with a temporal logit adjustment, which utilizes action order. We evaluate our approach on five temporal segmentation benchmarks, where it demonstrates significant improvements in tail-class recognition.
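For readers unfamiliar with logit adjustment, the following is a minimal sketch of the generic, class-prior-based post-hoc variant applied to a single frame. It is not the G-TLA method itself: the group-wise softmax and the use of action order are not reproduced here, and the function names, the add-one smoothing, and the toy frame counts are all assumptions made for illustration.

```python
import math

def class_priors(frame_labels, num_classes):
    """Estimate per-class priors from frame-level labels.

    Add-one smoothing (an assumption of this sketch) keeps every prior
    strictly positive so that log(prior) is always defined.
    """
    counts = [0] * num_classes
    for label in frame_labels:
        counts[label] += 1
    total = len(frame_labels) + num_classes
    return [(c + 1) / total for c in counts]

def adjust_logits(logits, priors, tau=1.0):
    """Subtract tau * log(prior) from each logit.

    Rare classes have small priors, so they receive a larger boost.
    tau = 0 leaves the logits unchanged.
    """
    return [z - tau * math.log(p) for z, p in zip(logits, priors)]

def predict(logits, priors, tau=1.0):
    """Return the index of the largest adjusted logit."""
    adjusted = adjust_logits(logits, priors, tau)
    return max(range(len(adjusted)), key=adjusted.__getitem__)

# Toy long-tailed label set: class 0 is the head, class 2 is the tail.
labels = [0] * 90 + [1] * 8 + [2] * 2
priors = class_priors(labels, num_classes=3)

# Hypothetical logits for one frame.
frame_logits = [2.0, 1.5, 0.5]
print(predict(frame_logits, priors, tau=0.0))  # plain argmax
print(predict(frame_logits, priors, tau=1.0))  # prior-adjusted argmax
```

In this toy setting the unadjusted prediction favors the head class, while the adjusted prediction shifts toward the tail class. The strength of the shift is controlled by `tau`, which in practice would need to be tuned.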