Poster
LNL+K: Enhancing Learning with Noisy Labels Through Noise Source Knowledge Integration
Siqi Wang · Bryan Plummer
# 16
Learning with noisy labels (LNL) aims to train a high-performing model using a noisy dataset. We observe that noise for a given class often comes from a limited set of categories, yet many LNL methods overlook this. For example, an image mislabeled as a cheetah is more likely a leopard than a hippopotamus due to its visual similarity. Thus, we explore Learning with Noisy Labels with noise source Knowledge integration (LNL+K), which takes advantage of knowledge about likely source(s) of label noise that is often already provided in a dataset's meta-data. We find that integrating noise source knowledge boosts performance even in settings where LNL methods typically fail. For example, LNL+K methods are effective on datasets where noise represents the majority of samples, which breaks a critical premise of most methods developed for LNL. We also find that LNL+K methods can boost performance even when the noise sources are estimated rather than provided in the meta-data. Our experiments provide several baseline LNL+K methods that integrate noise source knowledge into state-of-the-art LNL models evaluated across six diverse datasets and two types of noise, where we report gains of up to 23% compared to the unadapted methods. Critically, we show that LNL methods fail to generalize on some real-world datasets, even when adapted to integrate noise source knowledge, highlighting the importance of directly exploring LNL+K.