Expo Talk Panel
West Ballroom C

Generative AI has emerged as a transformative force capable of creating new multimodal content (including text, speech, images, video, and 3D) while handling complex dialogues and problem-solving tasks. This disruptive technology is reshaping traditional methodologies across application domains and redefining the user interfaces of computing devices. Its impact spans industries, promising substantial advances in utility, productivity, and efficiency.

As the adoption of generative AI accelerates, its computational demands are increasing dramatically, making on-device processing more essential than ever. Currently, most generative AI applications run in the cloud, placing significant strain on resources and incurring high equipment and operational costs. These workloads are motivating a reevaluation of how AI models are best deployed. One promising approach is to shift AI workloads to edge devices, such as phones, laptops, and XR headsets, where some, albeit limited, processing capability is available. This transition not only reduces the cost of cloud operations, but also enhances privacy, reduces communication bandwidth needs, and facilitates more streamlined access. However, enabling generative AI on resource-limited devices requires AI models to be optimized for edge devices, leveraging their available AI accelerators.

In this talk, we will explore the pivotal role of deploying generative AI on-device and the full-stack optimizations necessary to facilitate this shift. The presentation will feature hands-on demonstrations, showcasing live-action, industrial-grade examples of generative AI models operating on edge devices.

Highlights include: Self-Speculative Decoding; Visual Content Generation: Generative 3D Diffusers; Multimodal Generative Models on Edge; Parameter-Efficient Personalization on Edge.
