Creative AI
Creative AI Session 1
East Ballroom C
Jean Oh · Marcelo Coelho · Lia Coleman · Yingtao Tian
Charting the Shapes of Stories with Game Theory
Constantinos Daskalakis · Ian Gemp · Yanchen Jiang · Renato Leme · Christos Papadimitriou · Georgios Piliouras
Stories are records of our experiences, and their analysis reveals insights into the nature of being human. Successful analyses are often interdisciplinary, leveraging mathematical tools to extract structure from stories and insights from structure. Historically, these tools have been restricted to 1-D charts and dynamic social networks; modern AI, however, offers the possibility of identifying more fully the plot structure, character incentives, and, importantly, the counterfactual plot lines that the story could have taken but did not. In this work, we use AI to model the structure of stories as game-theoretic objects amenable to quantitative analysis. This allows us not only to interrogate each character's decision making but also, potentially, to peer into the original author's conception of the characters' world. We demonstrate the proposed technique on Shakespeare's famous Romeo and Juliet. We conclude with a discussion of how our analysis could be replicated in broader contexts, including real-life scenarios.
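The abstract does not specify a representation; as a purely hypothetical illustration of casting one story beat as a game-theoretic object, the Python sketch below encodes a single Romeo-and-Juliet decision point as a 2x2 normal-form game with invented payoffs and brute-force searches for pure-strategy Nash equilibria.

```python
# Hypothetical illustration (not from the paper): one story beat as a
# two-player normal-form game, with invented payoffs, plus a brute-force
# search for pure-strategy Nash equilibria.
import numpy as np

# Row player: Romeo  -- strategies: ["wait for word", "act on rumor"]
# Col player: Juliet -- strategies: ["send messenger", "fake death alone"]
# romeo[i, j] / juliet[i, j] = each player's utility; values are made up.
romeo = np.array([[ 3,  1],
                  [-1, -5]])
juliet = np.array([[ 3,  0],
                   [ 1, -5]])

def pure_nash(a, b):
    """Return all (row, col) pairs where neither player gains by deviating."""
    eqs = []
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            if a[i, j] == a[:, j].max() and b[i, j] == b[i, :].max():
                eqs.append((i, j))
    return eqs

print(pure_nash(romeo, juliet))  # [(0, 0)]: both cautious strategies
```

In this framing, a counterfactual plot line corresponds to a strategy profile the characters did not play, which can be probed by perturbing the payoffs.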
Designed by a Computer, Built by a Designer: Lessons from the Fabrication of a Chair Generated by an ML Model
Tomas Cabezon Pedroso
This work explores the mismatches between the way machine learning (ML) models and humans understand everyday objects, and their potential impact on the future of design and fabrication. We present an experiment in which a generative ML model is trained on a dataset of 6k+ objects and a random output is selected for fabrication. The fabrication, the translation of volumetric digital data into physical form, serves as a speculative experiment into the shifts in design and authorship that follow from embedding ML in design. We discuss our process and the challenges encountered in representing 3D objects in digital form, as well as the reciprocal challenge of transforming ML-generated 3D digital data into physical objects. The result of this experiment is a chair designed by a computer and built by a designer. The material and technique decisions made during fabrication highlight discrepancies between human and computer perceptions of the essential features of an object, leading to a contemplative artwork that raises questions about the future of design and the roles of humans and machines in the creative process. The experiment demonstrates the potential, but also the limitations, of ML models to generate not just digital models but physical objects that are functional and visually appealing, and that encode the history of design.
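One concrete step the abstract mentions, translating volumetric digital data into a physical form, can be illustrated with a minimal Python sketch. This is not the author's pipeline: the voxel grid below is random noise standing in for a generated chair, and marching cubes plus an STL export stand in for whatever fabrication preparation was actually used.

```python
# Minimal sketch (assumed workflow, not the author's): extract a surface
# mesh from a voxel grid and export it in a fabricable format.
import numpy as np
import trimesh
from skimage import measure

volume = np.random.rand(64, 64, 64)  # stand-in for generative-model output
verts, faces, normals, _ = measure.marching_cubes(volume, level=0.5)
mesh = trimesh.Trimesh(vertices=verts, faces=faces, vertex_normals=normals)
mesh.export("chair.stl")  # ready for downstream CAM / 3D-printing prep
```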
Exploratory Study Of Human-AI Interaction For Hindustani Music
Nithya Shikarpur · Anna Huang
This paper presents a study of participants interacting with and using GaMaDHaNi, a novel hierarchical generative model for Hindustani vocal contours. As one of the first models capable of generating the continuous melodic contours inherent to the Hindustani music idiom, we aim to study the expectations, reactions, and preferences of practicing musicians while interacting with the model. Through a user study with three participants, we identify their main challenges as (1) the lack of constraints on model output and (2) the incoherence of model output. We situate these challenges in the context of Hindustani music and suggest future directions for the model design to address these gaps.
Forrest Dance
Forrest Dance is an art video made with advanced AI movement-transfer techniques; submitted are excerpts from two of its parts (the beginning and the end). It speaks to how trees wake each morning and dance (main video) and, by nightfall, eventually become pure light dancers (supplementary video 1). It is a piece about the environment and nature, and how all beings have consciousness in their own way. I grew up in NYC but now love that I live in Deep Cove, BC, deep in the woods. I hike every day and have certain trees that I have come to view as friends over the years, such as these (supplementary images 2 and 3). I photograph and video them and use them in my artwork in the way that I see them in my head. Much of my 30+ years of AI still, installation, and video art brings in movement, emotion, and dance, and a love of nature. While this art video is 15 minutes long, I am open to editing it for the art exhibit.
Shadow puppetry, or shadow play, invites bodily participation in the process of linguistic storytelling, yet the potential of multi-modal interaction through shadow play has not been fully explored in existing large-language-model-based creative tools. We propose Narratron, a generative story-making tool that co-creates and co-performs children's stories from shadow play using the Claude 2 model. The system is designed to recognize hand-gesture inputs as the main character and to develop the story plot in accordance with character changes. Through our system, we seek to stimulate creativity in shadow-play storytelling and to facilitate multi-modal human-AI collaboration.
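As a hedged sketch of the interaction loop the abstract describes, the Python fragment below pairs a placeholder shadow recognizer with a call to the Claude 2 model via the Anthropic SDK. The prompt, model name, and function boundaries are assumptions, not Narratron's actual code.

```python
# Hypothetical sketch of the loop the abstract describes: a recognized hand
# shadow becomes the protagonist, and the LLM advances the plot when the
# character changes. `recognize_shadow` is a stand-in, not the real recognizer.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def recognize_shadow(frame):
    """Placeholder: map a camera frame to a character label, e.g. 'rabbit'."""
    raise NotImplementedError

def advance_story(story_so_far, character):
    response = client.messages.create(
        model="claude-2.1",  # assumed identifier for the Claude 2 model
        max_tokens=200,
        messages=[{
            "role": "user",
            "content": f"Continue this children's story so that the "
                       f"{character} drives the next scene:\n{story_so_far}",
        }],
    )
    return response.content[0].text
```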
OpenOpenAI and Alternate Altman: Democratizing the Future of AI through Speculative Participatory Media
Gauri Agarwal · Kevin Dunnell · Pat Pataranutaporn · Andrew Lippman · Patricia Maes
In an era where tech leaders wield unprecedented influence over the trajectory of artificial intelligence, this paper explores a radical reimagining of AI governance through speculative design and interactive media. We introduce "OpenOpenAI," an interactive web platform where "Alternate Altmans"—AI-driven personas with distinct visions for the future and ethical frameworks—serve as avatars for public discourse on AI's future. Participants engage in dialogues, vote on directions, and witness the simulated consequences of their decisions in the form of a mock keynote presentation from Sam Altman, CEO of OpenAI, collectively shaping the narrative of technological evolution. This work critiques the concentration of power in the tech industry, advocating for a democratized approach to AI development that harnesses the diverse perspectives of global citizens. Through this experimental platform, we aim to foster public engagement and ignite conversations about the future of democracy and technology.
Recent advances across the field of machine learning have created a world in which publicly available models, training tools, and compute give small entities unprecedented access to model building and deployment. This development simultaneously creates a number of novel dangers and an opportunity for relatively under-resourced artistic cooperatives, and even individual artists, to create interactive and performance art outside the scope we conventionally see in the "AI Art" community today. Inspired by this development, we argue for the artistic merit of expanding what we think of as AI Art far beyond what has been exhibited in venues such as NeurIPS and in museums, toward art that is more directly provocative, that centers the individuals rather than the "AI", and that engages with model training rather than simply inference. As a proof of concept, we describe a fictional interactive exhibit, the Penametron, which invites visitors to interact with, and contribute to the training of, a model that estimates the length of a (fully clothed) visitor's penis.
Secure & Personalized Music-to-Video Generation via CHARCHA
Mehul Agarwal · Gauri Agarwal · Santiago Benoit · Andrew Lippman · Jean Oh
Music is a deeply personal experience and our aim is to enhance this with a fully-automated pipeline for personalized music video generation. Our work allows listeners to not just be consumers but co-creators in the music video generation process by creating personalized, consistent and context-driven visuals based on lyrics, rhythm and emotion in the music. The pipeline combines multimodal translation and generation techniques and utilizes low-rank adaptation on listeners' images to create immersive music videos that reflect both the music and the individual. To ensure the ethical use of users' identity, we also introduce CHARCHA, a facial identity verification protocol that protects people against unauthorized use of their face while at the same time collecting authorized images from users for personalizing their videos. This paper thus provides a secure and innovative framework for creating deeply personalized music videos.
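The abstract does not spell out the pipeline; the sketch below illustrates only the "low-rank adaptation on listeners' images" idea, loading hypothetical per-user LoRA weights into an off-the-shelf Stable Diffusion pipeline (via Hugging Face diffusers) to render a single lyric-conditioned frame. The model choice, weight path, and prompt are assumptions, not CHARCHA's implementation.

```python
# Illustrative sketch only: render one personalized, lyric-derived frame
# using hypothetical LoRA weights trained on a verified user's photos.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("path/to/user_lora")  # hypothetical per-user weights

frame = pipe(
    "a portrait of <user> dancing under neon rain, cinematic",  # from lyrics
    num_inference_steps=30,
).images[0]
frame.save("frame_0001.png")
```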
The Tale of Punyakoti: An AI-Enhanced Audio Experience
Harini S · Yashaswini Viswanath · Kaustubha V
This project presents an AI-enhanced audio version of the legendary story of Punyakoti, a cherished tale from Indian folklore renowned for its moral lessons on integrity and compassion. Utilizing advanced AI technology, the audio rendition brings the story to life with vivid narrations and immersive sound design. The AI-generated voice modulation and auditory effects enrich the storytelling experience, highlighting the emotional and ethical dimensions of Punyakoti’s narrative. By blending traditional elements with modern audio techniques, this project aims to make the tale accessible and engaging for contemporary audiences, offering a fresh perspective on a timeless legend. The audio format invites listeners to experience the enduring values of honesty and self-sacrifice through innovative storytelling.
Tuning Music Education: AI-Powered Personalization in Learning Music
Mayank Sanganeria · Rohan Gala
Recent AI-driven step-function advances on several longstanding problems in music technology are opening up new avenues for the next generation of music education tools. Creating personalized, engaging, and effective learning experiences is a continuously evolving challenge in music education. Here we present two case studies that use such advances in music technology to address these challenges. In the first case study, we showcase an application that uses Automatic Chord Recognition to generate personalized exercises from audio tracks, connecting traditional ear training with real-world musical contexts. In the second, we prototype adaptive piano method books that use Automatic Music Transcription to generate exercises at different skill levels while retaining a close connection to musical interests. These applications demonstrate how recent AI developments can democratize access to high-quality music education and promote rich interaction with music in the age of generative AI. We hope this work inspires other efforts in the community aimed at removing barriers to high-quality music education and fostering human participation in musical expression.
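As an illustration of the first case study's idea, the sketch below turns a song's chord track into an ear-training quiz. `recognize_chords` is a placeholder for any Automatic Chord Recognition model and returns hard-coded output here; nothing below is the authors' implementation.

```python
# Hedged sketch: blank out chords in a recognized progression and quiz the
# learner, who names them by ear while the audio plays.
import random

def recognize_chords(audio_path):
    """Placeholder ACR output: (start_sec, end_sec, label) per segment."""
    return [(0.0, 2.0, "C"), (2.0, 4.0, "Am"), (4.0, 6.0, "F"), (6.0, 8.0, "G")]

def make_exercise(chords, n_blanks=2):
    """Hide n chords; keep the answers for grading."""
    blanks = set(random.sample(range(len(chords)), n_blanks))
    prompt = ["?" if i in blanks else label
              for i, (_, _, label) in enumerate(chords)]
    answers = {i: chords[i][2] for i in blanks}
    return prompt, answers

prompt, answers = make_exercise(recognize_chords("song.wav"))
print(prompt)  # e.g. ['C', '?', 'F', '?']
```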
This graphic score was created using MidJourney version 3, a commercially available artificial intelligence system that generates images from text-based prompts. Using text prompts provided by Schedel, Yager’s AI account produced a series of visuals, which the authors then refined and organized into a score for performance. Musicians are invited to interpret these images, transforming elements such as color, shape, and direction into sound. Each performer establishes their own set of rules for this translation, maintaining consistency within a performance but allowing for variation between performances. This ensures that every interpretation is unique, offering a fresh perspective on the visual-to-sonic conversion. Performers navigate the score one part at a time, without overlapping, and the score can be projected to enhance audience engagement. This piece explores the dynamic interaction between AI-generated visuals and human musical interpretation, encouraging an evolving dialogue between sight, sound, and computational creativity. The piece can be exhibited and/or performed.
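A performer's private rule set for the visual-to-sonic conversion could itself be written as code. The Python sketch below is one hypothetical mapping, not anything prescribed by the score: scan the image left to right and map each column's hue to a MIDI pitch and its brightness to loudness.

```python
# One invented rule set for translating the score: hue -> pitch,
# brightness -> loudness, one note per image column.
from PIL import Image
import colorsys

def columns_to_notes(img_path, n_steps=16, low=48, high=84):  # MIDI C3..C6
    img = Image.open(img_path).convert("RGB").resize((n_steps, 1))
    notes = []
    for x in range(n_steps):
        r, g, b = img.getpixel((x, 0))
        h, _, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        pitch = int(low + h * (high - low))  # hue chooses the pitch
        velocity = int(v * 127)              # brightness chooses the loudness
        notes.append((pitch, velocity))
    return notes

print(columns_to_notes("score.png"))
```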