


Workshops
Workshop
Aron Monszpart · Eric Brachmann · Map-free Workshop
Abstract

The Map-free Visual Relocalization workshop investigates topics related to metric visual relocalization relative to a single reference image instead of relative to a map. This problem is of major importance to many higher-level applications, such as Augmented/Mixed Reality, SLAM, and 3D reconstruction. It is timely because both industry and academia are debating whether and how to build HD maps of the world for these tasks. Our community is working to reduce the need for such maps in the first place.

We host the first Map-free Visual Relocalization Challenge 2024 competition with two tracks: map-free metric relative pose from a single query image to a single reference image (proposed by Arnold et al. in ECCV 2022) and from a query sequence to a single reference image (new). While the former is the more challenging and thus more interesting research topic, the latter represents a more realistic relocalization scenario, in which the querying system may fuse information from query images and tracking poses over a short time window and baseline. We invite papers to be submitted to the workshop.
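
A metric relative pose is typically scored by rotation angle and translation error in meters. As a minimal illustration (not the challenge's official metric, which the organizers define separately), the sketch below computes the geodesic rotation error and Euclidean translation error between an estimated and a ground-truth pose, with rotations given as 3x3 matrices:

```python
import math

def rotation_angle_deg(R_est, R_gt):
    """Geodesic angle (degrees) between two 3x3 rotation matrices.

    Uses trace(R_est^T @ R_gt) = Frobenius inner product of the matrices.
    """
    trace = sum(R_est[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp for numerical safety before acos.
    c = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(c))

def translation_error_m(t_est, t_gt):
    """Euclidean distance (meters) between estimated and true camera positions."""
    return math.dist(t_est, t_gt)
```

For example, an estimate rotated 90 degrees about the z-axis relative to ground truth yields a 90-degree rotation error, regardless of translation.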

Workshop
Shiqi Yang
Abstract

In recent years, we have witnessed significant advancements in the field of visual generation, which have shaped the research landscape presented at computer vision conferences such as ECCV, ICCV, and CVPR. However, in a world where information is conveyed through a rich tapestry of sensory experiences, the fusion of audio and visual modalities has become essential for understanding and replicating the intricacies of human perception and for diverse real-world applications. Indeed, the integration of audio and visual information has emerged as a critical area of research in computer vision and machine learning, with numerous applications across domains such as multimedia analysis, virtual reality, immersive gaming environments, lifelike simulations for medical training, advertising, and cinema.

Despite these strong motivations, little attention has been given to research focusing on understanding and generating audio-visual modalities compared to traditional, vision-only approaches and applications. Given the recent prominence of multi-modal foundation models, embracing the fusion of audio and visual data is expected to further advance current research efforts and practical applications within the computer vision community, which makes this workshop an encouraging addition to ECCV that will catalyze advancements in this burgeoning field.

In this workshop, we aim to shine …

Workshop
Robin Hesse
Abstract

Deep neural networks (DNNs) are an essential component in the field of computer vision and achieve state-of-the-art results in almost all of its sub-disciplines. While DNNs excel at predictive performance, they are often too complex to be understood by humans, leading to them often being referred to as “black-box models”. This is of particular concern when DNNs are applied in safety-critical domains such as autonomous driving or medical applications. With this problem in mind, explainable artificial intelligence (XAI) aims to gain a better understanding of DNNs, ultimately leading to more robust, fair, and interpretable models. To this end, a variety of different approaches, such as attribution maps, intrinsically explainable models, and mechanistic interpretability methods, have been developed. While this important field of research is gaining more and more traction, there is also justified criticism of the way in which the research is conducted. For example, the term “explainability” in itself is not properly defined and is highly dependent on the end user and the task, leading to ill-defined research questions and no standardized evaluation practices. The goals of this workshop are thus two-fold:

1. Discussion and dissemination of ideas at the cutting-edge of XAI research (“Where are we?”)
2. A …

Workshop
Tomas Hodan
Abstract
Workshop
Andrea Fusiello
Abstract
Workshop
Martin R Oswald
Abstract
Workshop
Yichen Li
Abstract
Workshop
Niclas Zeller
Abstract
Workshop
Despoina Paschalidou
Abstract
Workshop
Mohamed Elhoseiny
Abstract
Workshop
Iuri Frosio
Abstract

Our scope is to bring together people working in Computer Vision (CV) and, more broadly, Artificial Intelligence (AI) to discuss the adoption of CV/AI methods for videogames, which represent both a large capital market within the creative industries and a crucial domain for AI research. Our workshop will cover various aspects of videogame development and consumption, ranging from game creation, game servicing, and player experience management to bot creation, cheat detection, and human-computer interaction mediated by large language models. We believe that focusing on CV for videogames will cohesively bring together related works with foreseeable and practical impact on today's market. We will therefore give priority to submissions specifically devoted to the application of state-of-the-art CV/AI methods FOR videogames, and lower priority to submissions on the adoption of videogames as test beds for the creation and testing of CV/AI methods. We also plan to favour the presentation of novel datasets that can spark further research in this field.

The committee and keynote speakers include multiple genders and researchers from different geographical areas (USA, EU, Asia), from both industry (NVIDIA, Activision, Blockade Labs, Microsoft, Snap) and academia (Universities of Trento, Malta, …

Workshop
Deblina Bhattacharjee
Abstract
Workshop
Stuart James
Abstract
Workshop
Roberto Pierdicca
Abstract
Workshop
Andre Araujo
Abstract
Workshop
Henghui Ding
Abstract
Workshop
Hongxu Yin
Abstract
Workshop
Giuseppe Fiameni
Abstract
Workshop
Yao Feng
Abstract
Workshop
Leena Mathur
Abstract
Workshop
Linlin Yang
Abstract

Our HANDS workshop will gather vision researchers working on perceiving hands performing actions, including 2D & 3D hand detection, segmentation, pose/shape estimation, tracking, etc. We will also cover related applications including gesture recognition, hand-object manipulation analysis, hand activity understanding, and interactive interfaces.

The eighth edition of this workshop will emphasize the use of large foundation models (e.g., CLIP, Point-E, Segment Anything, Latent Diffusion Models) for hand-related tasks. These models have revolutionized the capabilities of AI and made groundbreaking contributions to multimodal understanding, zero-shot learning, and transfer learning. However, there remains untapped potential in exploring their applications to hand-related tasks. Our official website is https://hands-workshop.org.

Workshop
Alexander Krull
Abstract
Workshop
Lucia Schiatti
Abstract
Workshop
Anand Bhattad
Abstract
Workshop
Michael Dorkenwald
Abstract

From GPT to DINO to diffusion models, the past years have seen major advances in self-supervised learning, with many new methods reaching astounding performances on standard benchmarks. Still, the field of SSL is rapidly evolving, with new learning paradigms emerging at unprecedented speed. At the same time, works on coupled data, such as image-text pairs, have shown large potential in producing even stronger models capable of zero-shot tasks and benefiting from the methodology developed in SSL. Despite this progress, it is also apparent that major challenges remain unresolved and that it is not clear what the next step is going to be. In this workshop, we want to highlight and provide a forum to discuss potential research directions, from radically new self-supervision tasks, data sources, and paradigms to surprising counter-intuitive results. Through invited speakers and oral paper talks, our goal is to provide a forum to discuss and exchange ideas where both the leaders in this field and the new, younger generation can equally contribute to discussing the future of this field.

Workshop
Andrea Pilzer
Abstract

This UNcertainty quantification for Computer Vision (UNCV) Workshop aims to raise awareness and generate discussion regarding how predictive uncertainty can, and should, be effectively incorporated into models within the vision community. The workshop will bring together experts from machine learning and computer vision to create a new generation of well-calibrated and effective methods that "know when they do not know".
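
One standard way to quantify whether a model "knows when it does not know" is the expected calibration error (ECE), which compares predicted confidence to empirical accuracy within confidence bins. The sketch below is a minimal, stdlib-only illustration of that idea (the workshop itself does not prescribe this metric):

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: bin predictions by confidence, then take the weighted average
    of |average confidence - accuracy| over the bins."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # top edge goes to last bin
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(1 for _, ok in b if ok) / len(b)
        ece += (len(b) / total) * abs(avg_conf - accuracy)
    return ece
```

A perfectly calibrated model (e.g., 90% confidence and 90% accuracy within a bin) scores 0, while a model that is 90% confident yet always wrong scores 0.9.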

Workshop
Diego Garcia-Olano
Abstract
Workshop
Shangzhe Wu
Abstract
Workshop
Vasileios Belagiannis
Abstract
Workshop
Lucia Cascone
Abstract
Workshop
Federica Proietto Salanitri
Abstract
Workshop
Zane Durante
Abstract
Workshop
Viorica Patraucean
Abstract

Following the successful 2023 edition, we organise the second Perception Test Challenge to benchmark multimodal perception models on the Perception Test (blog, github) - a diagnostic benchmark created by Google DeepMind to comprehensively probe the abilities of multimodal models across:
* video, audio, and text modalities
* four skill areas: Memory, Abstraction, Physics, Semantics
* four types of reasoning: Descriptive, Explanatory, Predictive, Counterfactual
* six computational tasks: multiple-choice video-QA, grounded video-QA, object tracking, point tracking, action localisation, sound localisation

Workshop
Ryuichiro Hataya
Abstract
Workshop
Antonio Alliegro
Abstract
Workshop
Marco Cotogni
Abstract

In an era of rapid advancements in Artificial Intelligence, the imperative to foster Trustworthy AI has never been more critical. The first "Trust What You learN (TWYN)" workshop seeks to create a dynamic forum for researchers, practitioners, and industry experts to explore and advance the intersection of Trustworthy AI and DeepFake Analysis within the realm of Computer Vision. The workshop aims to delve into the multifaceted dimensions of building AI systems that are not only technically proficient but also ethical, transparent, and accountable. The dual focus on Trustworthy AI and DeepFake Analysis reflects the workshop's commitment to addressing the challenges posed by the proliferation of deepfake technologies while simultaneously promoting responsible AI practices.

Workshop
Mamatha Thota
Abstract
Workshop
Vivek Sharma
Abstract

The focus of this workshop is to bring together researchers from industry and academia who focus on both distributed and privacy-preserving machine learning for vision and imaging. These topics are of increasingly large commercial and policy interest. It is therefore important to build a community for this research area, which involves collaborating researchers that share insights, code, data, benchmarks, training pipelines, etc., and together aim to improve the state of privacy in computer vision.

Workshop
Yiming Wang
Abstract
Workshop
Tzofi Klinghoffer
Abstract

Neural fields have been widely adopted for learning novel view synthesis and 3D reconstruction from RGB images by modeling transport of light in the visible spectrum. This workshop focuses on neural fields beyond conventional cameras, including (1) learning neural fields from data from different sensors across the electromagnetic spectrum and beyond, such as lidar, cryo-electron microscopy (cryoEM), thermal, event cameras, acoustic, and more, and (2) modeling associated physics-based differentiable forward models and/or the physics of more complex light transport (reflections, shadows, polarization, diffraction limits, optics, scattering in fog or water, etc.). Our goal is to bring together a diverse group of researchers using neural fields across sensor domains to foster learning and discussion in this growing area.
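
The "physics-based differentiable forward models" mentioned above often reduce, for conventional cameras, to the classic emission-absorption volume-rendering quadrature that NeRF-style methods differentiate through. As a hedged, sensor-agnostic sketch (real implementations vectorize this and add sensor-specific physics), rendering one ray from sampled densities and colors looks like:

```python
import math

def volume_render(sigmas, colors, deltas):
    """Discrete emission-absorption quadrature along a single ray.

    sigmas: volume densities at samples, colors: radiance at samples,
    deltas: distances between consecutive samples.
    Returns (accumulated color, remaining transmittance).
    """
    color = 0.0
    transmittance = 1.0  # fraction of light surviving to the current sample
    for sigma, c, delta in zip(sigmas, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        color += transmittance * alpha * c
        transmittance *= 1.0 - alpha
    return color, transmittance
```

Swapping this forward model for one matching another sensor (e.g., time-of-flight returns for lidar, or projection integrals for cryoEM) while keeping the learned field is precisely the kind of extension the workshop targets.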

Workshop
Francesca Palermo
Abstract

As Smart Eyewear devices become increasingly prevalent, optimizing their functionality and user experience through sophisticated computer vision applications is crucial. These devices must not only effectively process real-time data but also operate under power and computational constraints while ensuring user privacy and ethical standards are upheld.

The "Eyes of the Future: Integrating Computer Vision in Smart Eyewear (ICVSE)" workshop, at ECCV 2024, aims to advance the field of Smart Eyewear by integrating cutting-edge computer vision technologies. This workshop addresses the need to bridge theoretical research and practical implementations in Smart Eyewear, a technology that will transform user interactions in everyday life through enhanced perception and augmented reality experiences.

The need for this workshop stems from the rapid advancements in both computer vision and wearable technology sectors, necessitating a dedicated forum where interdisciplinary insights and experiences can be shared to accelerate practical applications. Thus, ICVSE not only aims to showcase novel research but also to inspire a roadmap for future developments in Smart Eyewear technology.

Workshop
Guido Borghi · Marcella Cornia · Federico Becattini · Claudio Ferrari · Tomaso Fontanini
Abstract