

Poster

CadVLM: Bridging Language and Vision in the Generation of Parametric CAD Sketches

Sifan Wu · Amir Hosein Khasahmadi · Mor Katz · Pradeep Kumar Jayaraman · Yewen Pu · Karl D.D. Willis · Bang Liu

Strong Double Blind: this paper was not made available on public preprint services during the review process.
Thu 3 Oct 1:30 a.m. PDT — 3:30 a.m. PDT

Abstract:

Parametric Computer-Aided Design (CAD) is central to contemporary mechanical design. We harness pre-trained foundation models, renowned for their successes in natural language processing and computer vision, to develop generative models specifically for CAD. These models are adept at understanding complex geometries and design reasoning, a crucial advancement in CAD technology. In this paper, we propose CadVLM, an end-to-end vision-language model for CAD generation. Our approach adapts pre-trained foundation models to manipulate engineering sketches effectively, integrating both sketch primitive sequences and sketch images. Extensive experiments demonstrate superior performance on multiple CAD sketch generation tasks, such as CAD autocompletion, CAD autoconstraint, and image-conditioned generation. To our knowledge, this is the first instance of a multimodal Large Language Model (LLM) being successfully applied to parametric CAD generation, representing a pioneering step in computer-aided mechanical design. The code is available at https://anonymous.4open.science/r/CadVLM.
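The abstract describes fusing two modalities, a rendered sketch image and a sequence of sketch-primitive tokens, in a single generative model. Below is a minimal, self-contained PyTorch sketch of that general idea only; every module, dimension, and the token vocabulary are illustrative assumptions, not CadVLM's actual architecture.

    # Hypothetical illustration of image + primitive-token fusion; not CadVLM's design.
    import torch
    import torch.nn as nn

    VOCAB_SIZE = 512   # assumed vocabulary of quantized sketch-primitive tokens
    D_MODEL = 256      # assumed model width

    class ToyCadVLM(nn.Module):
        def __init__(self):
            super().__init__()
            # Image branch: a small CNN stands in for a pre-trained vision encoder.
            self.image_encoder = nn.Sequential(
                nn.Conv2d(1, 32, 4, stride=4), nn.ReLU(),
                nn.Conv2d(32, D_MODEL, 4, stride=4), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
            # Primitive branch: embeddings for the token sequence.
            self.token_embed = nn.Embedding(VOCAB_SIZE, D_MODEL)
            self.pos_embed = nn.Embedding(1024, D_MODEL)
            layer = nn.TransformerEncoderLayer(D_MODEL, nhead=8, batch_first=True)
            self.decoder = nn.TransformerEncoder(layer, num_layers=4)
            self.head = nn.Linear(D_MODEL, VOCAB_SIZE)

        def forward(self, image, tokens):
            # image: (B, 1, H, W) rendered sketch; tokens: (B, T) primitive tokens.
            img = self.image_encoder(image).unsqueeze(1)           # (B, 1, D)
            pos = self.pos_embed(torch.arange(tokens.size(1), device=tokens.device))
            seq = self.token_embed(tokens) + pos                   # (B, T, D)
            x = torch.cat([img, seq], dim=1)                       # image as prefix token
            # Causal mask: each position attends only to earlier positions.
            mask = nn.Transformer.generate_square_subsequent_mask(x.size(1)).to(x.device)
            x = self.decoder(x, mask=mask)
            return self.head(x[:, 1:])                             # next-token logits

    model = ToyCadVLM()
    logits = model(torch.randn(2, 1, 64, 64), torch.randint(0, VOCAB_SIZE, (2, 16)))
    print(logits.shape)  # torch.Size([2, 16, 512])

Prepending the image embedding as a prefix token is one common way to condition an autoregressive decoder on a second modality; the paper itself should be consulted for how CadVLM actually combines the two streams.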
