Timezone: America/Los_Angeles
SAT 28 SEP
11 p.m.
(ends 9:00 AM)
SUN 29 SEP
midnight
Workshop:
(ends 4:00 AM)
Workshop:
(ends 4:00 AM)
Workshop:
(ends 4:00 AM)
Workshop:
(ends 4:00 AM)
Workshop:
(ends 4:00 AM)
1:30 a.m.
4 a.m.
5 a.m.
Workshop:
(ends 9:00 AM)
Workshop:
(ends 9:00 AM)
Workshop:
(ends 9:00 AM)
Workshop:
(ends 9:00 AM)
6:30 a.m.
11 p.m.
(ends 9:00 AM)
MON 30 SEP
midnight
Workshop:
(ends 4:00 AM)
Workshop:
(ends 4:00 AM)
Workshop:
(ends 4:00 AM)
Workshop:
(ends 4:00 AM)
Tutorial:
(ends 4:00 AM)
1:30 a.m.
4 a.m.
5 a.m.
Workshop:
(ends 9:00 AM)
Workshop:
(ends 9:00 AM)
Workshop:
(ends 9:00 AM)
6:30 a.m.
10 p.m.
(ends 9:30 AM)
11 p.m.
(ends 12:00 AM)
TUE 1 OCT
midnight
Orals 12:00-1:20
[12:00]
Towards Scene Graph Anticipation
[12:10]
OP-Align: Object-level and Part-level Alignment for Self-supervised Category-level Articulated Object Pose Estimation
[12:20]
PDiscoFormer: Relaxing Part Discovery Constraints with Vision Transformers
[12:30]
Bi-directional Contextual Attention for 3D Dense Captioning
[12:40]
OmniNOCS: A unified NOCS dataset and model for 3D lifting of 2D objects
[12:50]
ABC Easy as 123: A Blind Counter for Exemplar-Free Multi-Class Class-agnostic Counting
[1:00]
A Fair Ranking and New Model for Panoptic Scene Graph Generation
[1:10]
Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph Generation via Visual-Concept Alignment and Retention
(ends 1:30 AM)
Orals 12:00-1:20
[12:00]
Making Large Language Models Better Planners with Reasoning-Decision Alignment
[12:10]
MapTracker: Tracking with Strided Memory Fusion for Consistent Vector HD Mapping
[12:20]
M^2Depth: Self-supervised Two-Frame Multi-camera Metric Depth Estimation
[12:30]
H-V2X: A Large Scale Highway Dataset for BEV Perception
[12:40]
Adaptive Bounding Box Uncertainties via Two-Step Conformal Prediction
[12:50]
DriveLM: Driving with Graph Visual Question Answering
[1:00]
RealGen: Retrieval Augmented Generation for Controllable Traffic Scenarios
[1:10]
Mask2Map: Vectorized HD Map Construction Using Bird's Eye View Segmentation Masks
(ends 1:30 AM)
Orals 12:00-1:20
[12:00]
Integer-Valued Training and Spike-driven Inference Spiking Neural Network for High-performance and Energy-efficient Object Detection
[12:10]
Latent Diffusion Prior Enhanced Deep Unfolding for Snapshot Spectral Compressive Imaging
[12:20]
SEA-RAFT: Simple, Efficient, Accurate RAFT for Optical Flow
[12:30]
Photon Inhibition for Energy-Efficient Single-Photon Imaging
[12:40]
Minimalist Vision with Freeform Pixels
[12:50]
Flying with Photons: Rendering Novel Views of Propagating Light
[1:00]
A Simple Low-bit Quantization Framework for Video Snapshot Compressive Imaging
[1:10]
GazeXplain: Learning to Predict Natural Language Explanations of Visual Scanpaths
(ends 1:30 AM)
Demonstrations 12:00-3:30
(ends 3:30 AM)
1:30 a.m.
(ends 3:30 AM)
3 a.m.
3:30 a.m.
4:30 a.m.
Orals 4:30-6:20
[4:30]
EDTalk: Efficient Disentanglement for Emotional Talking Head Synthesis
[4:40]
TexDreamer: Towards Zero-Shot High-Fidelity 3D Human Texture Generation
[4:50]
LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation
[5:00]
FlashTex: Fast Relightable Mesh Texturing with LightControlNet
[5:10]
TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering
[5:20]
LLMGA: Multimodal Large Language Model based Generation Assistant
[5:30]
Accelerating Image Generation with Sub-path Linear Approximation Model
[5:40]
SphereHead: Stable 3D Full-head Synthesis with Spherical Tri-plane Representation
[5:50]
Bridging the Gap: Studio-like Avatar Creation from a Monocular Phone Capture
[6:00]
Zero-Shot Detection of AI-Generated Images
[6:10]
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
(ends 6:30 AM)
Orals 4:30-6:20
[4:30]
Efficient Bias Mitigation Without Privileged Information
[4:40]
Fast Diffusion-Based Counterfactuals for Shortcut Removal and Generation
[4:50]
MobileNetV4: Universal Models for the Mobile Ecosystem
[5:00]
Momentum Auxiliary Network for Supervised Local Learning
[5:10]
From Fake to Real: Pretraining on Balanced Synthetic Images to Prevent Spurious Correlations in Image Recognition
[5:20]
Dataset Enhancement with Instance-Level Augmentations
[5:30]
Adaptive Parametric Activation
[5:40]
Relation DETR: Exploring Explicit Position Relation Prior for Object Detection
[5:50]
Projecting Points to Axes: Oriented Object Detection via Point-Axis Representation
[6:00]
CLIFF: Continual Latent Diffusion for Open-Vocabulary Object Detection
[6:10]
On Calibration of Object Detectors: Pitfalls, Evaluation and Baselines
(ends 6:30 AM)
Orals 4:30-6:20
[4:30]
Physics-Free Spectrally Multiplexed Photometric Stereo under Unknown Spectral Composition
[4:40]
COMO: Compact Mapping and Odometry
[4:50]
Smoothness, Synthesis, and Sampling: Re-thinking Unsupervised Multi-View Stereo with DIV Loss
[5:00]
ADen: Adaptive Density Representations for Sparse-view Camera Pose Estimation
[5:10]
SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments
[5:20]
Six-Point Method for Multi-Camera Systems with Reduced Solution Space
[5:30]
Scene Coordinate Reconstruction: Posing of Image Collections via Incremental Learning of a Relocalizer
[5:40]
Grounding Image Matching in 3D with MASt3R
[5:50]
ConDense: Consistent 2D-3D Pre-training for Dense and Sparse Features from Multi-View Images
[6:00]
Correspondences of the Third Kind: Camera Pose Estimation from Object Reflection
[6:10]
Camera Calibration using a Collimator System
(ends 6:30 AM)
5:30 a.m.
Demonstrations 5:30-9:00
(ends 9:00 AM)
6:30 a.m.
Keynote:
Lourdes Agapito · Vittorio Ferrari
(ends 7:30 AM)
7:30 a.m.
Posters 7:30-9:30
SPHINX: A Mixer of Weights, Visual Embeddings and Image Scales for Multi-modal Large Language Models
Enhancing Source-Free Domain Adaptive Object Detection with Low-confidence Pseudo Label Distillation
(ends 9:30 AM)
9:30 a.m.
11 p.m.
(ends 9:30 AM)
WED 2 OCT
midnight
(ends 3:30 AM)
Orals 12:00-1:20
[12:00]
PetFace: A Large-Scale Dataset and Benchmark for Animal Identification
[12:10]
UniIR: Training and Benchmarking Universal Multimodal Information Retrievers
[12:20]
Towards Model-Agnostic Dataset Condensation by Heterogeneous Models
[12:30]
Parrot Captions Teach CLIP to Spot Text
[12:40]
Towards Open-ended Visual Quality Comparison
[12:50]
VETRA: A Dataset for Vehicle Tracking in Aerial Imagery - New Challenges for Multi-Object Tracking
[1:00]
Insect Identification in the Wild: The AMI Dataset
[1:10]
MarineInst: A Foundation Model for Marine Image Analysis with Instance Visual Description
(ends 1:30 AM)
Orals 12:00-1:20
[12:00]
PathMMU: A Massive Multimodal Expert-Level Benchmark for Understanding and Reasoning in Pathology
[12:10]
Self-Supervised Video Desmoking for Laparoscopic Surgery
[12:20]
CardiacNet: Learning to Reconstruct Abnormalities for Cardiac Disease Assessment from Echocardiogram Videos
[12:30]
Rethinking Deep Unrolled Model for Accelerated MRI Reconstruction
[12:40]
Adaptive Correspondence Scoring for Unsupervised Medical Image Registration
[12:50]
Revisiting Adaptive Cellular Recognition Under Domain Shifts: A Contextual Correspondence View
[1:00]
SparseSSP: 3D Subcellular Structure Prediction from Sparse-View Transmitted Light Images
[1:10]
Knowledge-enhanced Visual-Language Pretraining for Computational Pathology
(ends 1:30 AM)
Orals 12:00-1:20
[12:00]
HGL: Hierarchical Geometry Learning for Test-time Adaptation in 3D Point Cloud Segmentation
[12:10]
PointLLM: Empowering Large Language Models to Understand Point Clouds
[12:20]
RISurConv: Rotation Invariant Surface Attention-Augmented Convolutions for 3D Point Cloud Classification and Segmentation
[12:30]
DVLO: Deep Visual-LiDAR Odometry with Local-to-Global Feature Fusion and Bi-Directional Structure Alignment
[12:40]
KeypointDETR: An End-to-End 3D Keypoint Detector
[12:50]
Rethinking Data Augmentation for Robust LiDAR Semantic Segmentation in Adverse Weather
[1:00]
RAPiD-Seg: Range-Aware Pointwise Distance Distribution Networks for 3D LiDAR Segmentation
[1:10]
Equi-GSPR: Equivariant SE(3) Graph Network Model for Sparse Point Cloud Registration
(ends 1:30 AM)
1:30 a.m.