Registration Desk: Registration Sun 29 Sep 08:00 a.m.
Workshop: Eyes of the Future: Integrating Computer Vision in Smart Eyewear Sun 29 Sep 09:00 a.m.
As Smart Eyewear devices become increasingly prevalent, optimizing their functionality and user experience through sophisticated computer vision applications is crucial. These devices must not only effectively process real-time data but also operate under power and computational constraints while ensuring user privacy and ethical standards are upheld.

The "Eyes of the Future: Integrating Computer Vision in Smart Eyewear (ICVSE)" workshop, at ECCV 2024, aims to advance the field of Smart Eyewear by integrating cutting-edge computer vision technologies. This workshop addresses the need to bridge theoretical research and practical implementations in Smart Eyewear, a technology that will transform user interactions in everyday life through enhanced perception and augmented reality experiences.

The need for this workshop stems from the rapid advancements in both computer vision and wearable technology sectors, necessitating a dedicated forum where interdisciplinary insights and experiences can be shared to accelerate practical applications. Thus, ICVSE not only aims to showcase novel research but also to inspire a roadmap for future developments in Smart Eyewear technology.
Workshop: Critical Evaluation of Generative Models and their Impact on Society Sun 29 Sep 09:00 a.m.
Workshop: Recovering 6D Object Pose Sun 29 Sep 09:00 a.m.
Workshop: The Second Perception Test Challenge Sun 29 Sep 09:00 a.m.
Following the successful 2023 edition, we organise the second Perception Test Challenge to benchmark multimodal perception models on the Perception Test (blog, github) - a diagnostic benchmark created by Google DeepMind to comprehensively probe the abilities of multimodal models across:
* video, audio, and text modalities
* four skill areas: Memory, Abstraction, Physics, Semantics
* four types of reasoning: Descriptive, Explanatory, Predictive, Counterfactual
* six computational tasks: multiple-choice video-QA, grounded video-QA, object tracking, point tracking, action localisation, sound localisation
AI for Visual Arts Workshop and Challenges (AI4VA) Sun 29 Sep 09:00 a.m.
Workshop: 3D Vision and Modeling Challenges in eCommerce Sun 29 Sep 09:00 a.m.
Workshop: 3rd edition of Computer Vision for Metaverse (CV4Metaverse) Sun 29 Sep 09:00 a.m.
ACVR2024 - 12th International Workshop on Assistive Computer Vision and Robotics Sun 29 Sep 09:00 a.m.
Workshop: BioImage Computing (BIC) Sun 29 Sep 09:00 a.m.
Workshop: Fairness and ethics towards transparent AI: facing the chalLEnge through model Debiasing (FAILED) Sun 29 Sep 09:00 a.m.
Workshop: Self-Supervised Learning - What is next? Sun 29 Sep 09:00 a.m.
From GPT to DINO to diffusion models, the past years have seen major advances in self-supervised learning, with many new methods achieving astounding performance on standard benchmarks. Still, the field of SSL is rapidly evolving, with new learning paradigms emerging at an unprecedented speed. At the same time, work on coupled data, such as image-text pairs, has shown large potential for producing even stronger models capable of zero-shot tasks and benefiting from the methodology developed in SSL. Despite this progress, it is also apparent that major challenges remain unresolved, and it is not clear what the next step is going to be. In this workshop, we want to highlight and provide a forum to discuss potential research directions, from radically new self-supervision tasks, data sources, and paradigms to surprising counter-intuitive results. Through invited talks and oral paper presentations, our goal is to provide a forum for discussing and exchanging ideas, where both the leaders in this field and the new, younger generation can contribute equally to shaping its future.
The First Workshop on Expressive Encounters: Co-speech gestures across cultures in the wild Sun 29 Sep 09:00 a.m.
Workshop on Artificial Social Intelligence Sun 29 Sep 09:00 a.m.
Workshop on Spatial AI Sun 29 Sep 09:00 a.m.
Visual object tracking and segmentation challenge VOTS2024 workshop Sun 29 Sep 09:00 a.m.
Workshop: Beyond Euclidean: Hyperbolic and Hyperspherical Learning for Computer Vision Sun 29 Sep 09:00 a.m.
9th Workshop on Computer Vision in Plant Phenotyping and Agriculture (CVPPA) Sun 29 Sep 09:00 a.m.
Workshop: Scalable 3D Scene Generation and 3D Geometric Scene Understanding Sun 29 Sep 09:00 a.m.
2nd International Workshop on Privacy-Preserving Computer Vision Sun 29 Sep 09:00 a.m.
The focus of this workshop is to bring together researchers from industry and academia who focus on both distributed and privacy-preserving machine learning for vision and imaging. These topics are of increasingly large commercial and policy interest. It is therefore important to build a community for this research area, which involves collaborating researchers that share insights, code, data, benchmarks, training pipelines, etc., and together aim to improve the state of privacy in computer vision.
Workshop: T-CAP - Towards a Complete Analysis of People: Fine-grained Understanding for Real-World Applications Sun 29 Sep 02:00 p.m.
Workshop: AVGenL: Audio-Visual Generation and Learning Sun 29 Sep 02:00 p.m.
In recent years, we have witnessed significant advancements in the field of visual generation, which have molded the research landscape presented at computer vision conferences such as ECCV, ICCV, and CVPR. However, in a world where information is conveyed through a rich tapestry of sensory experiences, the fusion of audio and visual modalities has become essential for understanding and replicating the intricacies of human perception and for diverse real-world applications. Indeed, the integration of audio and visual information has emerged as a critical area of research in computer vision and machine learning, with numerous applications across domains ranging from immersive gaming environments and lifelike simulations for medical training to multimedia analysis, virtual reality, advertising, and cinematic applications.

Despite these strong motivations, little attention has been given to research on understanding and generating audio-visual modalities compared to traditional, vision-only approaches and applications. Given the recent prominence of multi-modal foundation models, embracing the fusion of audio and visual data is expected to further advance current research efforts and practical applications within the computer vision community, which makes this workshop an encouraging addition to ECCV that will catalyze advancements in this burgeoning field.

In this workshop, we aim to shine a spotlight on this exciting yet under-investigated field by prioritizing new approaches to audio-visual generation, as well as covering a wide range of topics related to audio-visual learning, where the convergence of auditory and visual signals unlocks a plethora of opportunities for advancing creativity, understanding, and machine perception. We hope our workshop can bring together researchers, practitioners, and enthusiasts from diverse disciplines in both academia and industry to delve into the latest developments, challenges, and breakthroughs in audio-visual generation and learning.
Workshop: AI4DH: Artificial Intelligence for Digital Humanities Sun 29 Sep 02:00 p.m.
The Third ROAD Workshop & Challenge: Event Detection for Situation Awareness in Autonomous Driving Sun 29 Sep 02:00 p.m.
OpenSUN3D: 3rd Workshop on Open-Vocabulary 3D Scene Understanding Sun 29 Sep 02:00 p.m.
The ability to perceive, understand, and interact with arbitrary 3D environments is a long-standing research goal with applications in AR/VR, health, robotics, and beyond. Current 3D scene understanding models are largely limited to low-level recognition tasks such as object detection or semantic segmentation, and do not generalize well beyond a pre-defined set of training labels. More recently, large vision-language models (VLMs), such as CLIP, have demonstrated impressive capabilities trained solely on internet-scale image-language pairs. Initial works have shown that these models have the potential to extend 3D scene understanding not only to open-set recognition, but also to additional applications such as affordances, materials, activities, and properties of unseen environments. The goal of this workshop is to bundle these efforts and to discuss and establish clear task definitions, evaluation metrics, and benchmark datasets.
The First Workshop on: Computer Vision for Videogames (CV2) Sun 29 Sep 02:00 p.m.
Our scope is to bring together people working in Computer Vision (CV) and, more broadly, Artificial Intelligence (AI) to discuss the adoption of CV/AI methods for videogames, which represent both a large capital market within the creative industries and a crucial domain for AI research. Our workshop will cover various aspects of videogame development and consumption, ranging from game creation, game servicing, and player experience management to bot creation, cheat detection, and human-computer interaction mediated by large language models. We believe that focusing on CV for videogames will cohesively bring together related works with foreseeable and practical impact on today's market. We will therefore give priority to submissions specifically devoted to the application of state-of-the-art CV/AI methods FOR videogames, and lower priority to submissions on the adoption of videogames as test beds for the creation and testing of CV/AI methods. We also plan to favour the presentation of novel datasets that can spark further research in this field.

The committee and keynote speakers include multiple genders, researchers with origins from different geographical areas (USA, EU, Asia), from both industry (NVIDIA, Activision, Blockade Labs, Microsoft, Snap) and academia (Universities of Trento, Malta, Stanford), and different levels of research experience (from PhD students to full professors and managers). We intend to promote cross-disciplinarity and diversity not only among the members of the organizing committee, but also in the list of topics covered by the workshop.

Furthermore, a latest-generation GPU sponsored by NVIDIA will be awarded to the best academic paper, to help researchers who may not have access to significant computational resources. The workshop will foster the sharing and discussion of different points of view on the future of CV in videogames in a friendly environment.
Workshop: Traditional Computer Vision in the Age of Deep Learning (TradiCV) Sun 29 Sep 02:00 p.m.
Workshop: Explainable AI for Computer Vision: Where Are We and Where Are We Going? Sun 29 Sep 02:00 p.m.
Deep neural networks (DNNs) are an essential component in the field of computer vision and achieve state-of-the-art results in almost all of its sub-disciplines. While DNNs excel at predictive performance, they are often too complex to be understood by humans, leading to them often being referred to as “black-box models”. This is of particular concern when DNNs are applied in safety-critical domains such as autonomous driving or medical applications. With this problem in mind, explainable artificial intelligence (XAI) aims to gain a better understanding of DNNs, ultimately leading to more robust, fair, and interpretable models. To this end, a variety of approaches, such as attribution maps, intrinsically explainable models, and mechanistic interpretability methods, have been developed. While this important field of research is gaining more and more traction, there is also justified criticism of the way in which the research is conducted. For example, the term “explainability” itself is not properly defined and is highly dependent on the end user and the task, leading to ill-defined research questions and no standardized evaluation practices. The goals of this workshop are thus two-fold:

1. Discussion and dissemination of ideas at the cutting edge of XAI research (“Where are we?”)
2. A critical introspection on the challenges faced by the community and the way to go forward (“Where are we going?”)