[ Virtual ]
While deep learning has driven impressive progress, one of the toughest remaining challenges is generalization beyond the training distribution. Few-shot learning is an area of research that aims to address this by striving to build models that can learn new concepts rapidly in a more "human-like" way. While many influential few-shot learning methods were based on meta-learning, recent progress has been made by simpler transfer learning algorithms, and it has in fact been suggested that few-shot learning might be an emergent property of large-scale models. In this talk, I will give an overview of the evolution of few-shot learning methods and benchmarks from my point of view, and discuss the evolving role of meta-learning for this problem. I will discuss lessons learned from using larger and more diverse benchmarks for evaluation, as well as trade-offs between different approaches, closing with a discussion of open questions.
Link to slides: https://drive.google.com/file/d/1ZIULjhFjyNqjSS10p-5CDaqgzlrZcaGD/view?usp=sharing
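To make the contrast with meta-learning concrete, below is a minimal sketch (not from the talk) of one simple transfer-learning-style baseline: embed a handful of labeled support examples with a pretrained encoder and classify queries by their nearest class centroid. The `embed` function here is only a stand-in for a real pretrained feature extractor.

```python
# Minimal sketch of a nearest-centroid ("prototype") few-shot classifier.
# `embed` is a placeholder for a pretrained encoder.
import numpy as np

def embed(x: np.ndarray) -> np.ndarray:
    """Stand-in for a pretrained encoder mapping raw inputs to feature vectors."""
    return x  # identity embedding, for illustration only

def fit_prototypes(support_x: np.ndarray, support_y: np.ndarray) -> dict:
    """Average the embeddings of the few labeled examples of each class."""
    feats = embed(support_x)
    return {c: feats[support_y == c].mean(axis=0) for c in np.unique(support_y)}

def predict(prototypes: dict, query_x: np.ndarray) -> np.ndarray:
    """Assign each query to the class whose prototype is closest in feature space."""
    feats = embed(query_x)
    classes = np.array(list(prototypes.keys()))
    protos = np.stack([prototypes[c] for c in classes])                # (C, D)
    dists = ((feats[:, None, :] - protos[None, :, :]) ** 2).sum(-1)    # (N, C)
    return classes[dists.argmin(axis=1)]

# 2-way, 3-shot toy episode with 2-D features.
rng = np.random.default_rng(0)
support_x = np.vstack([rng.normal(0, 1, (3, 2)), rng.normal(4, 1, (3, 2))])
support_y = np.array([0, 0, 0, 1, 1, 1])
query_x = np.vstack([rng.normal(0, 1, (5, 2)), rng.normal(4, 1, (5, 2))])
print(predict(fit_prototypes(support_x, support_y), query_x))
```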
[ Virtual ]
Foundation models, which follow the deep learning methodology of pre-training on large-scale unlabeled data and fine-tuning with task-specific supervision, are becoming a mainstream technique in machine learning. Although foundation models hold many promises in learning general representations and few-shot/zero-shot generalization across domains and data modalities, they also raise unprecedented challenges and considerable risks in robustness and privacy due to the excessive volume of data and the complex neural network architectures they rely on. This tutorial is delivered as a Coursera-like online course containing comprehensive lectures, a hands-on and interactive Jupyter/Colab live coding demo, and a panel discussion on different aspects of trustworthiness in foundation models.
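As a flavour of the kind of robustness check such a demo concerns, here is a minimal sketch (not the tutorial's notebook) of a one-step FGSM probe; the tiny linear model and the epsilon value are placeholders for a fine-tuned foundation model and a realistic perturbation budget.

```python
# Illustrative one-step FGSM robustness probe for a classifier.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Linear(16, 3)          # placeholder for a large pretrained model
x = torch.randn(8, 16)                  # a batch of inputs
y = torch.randint(0, 3, (8,))           # their labels
epsilon = 0.1                           # perturbation budget (hypothetical value)

x_adv = x.clone().requires_grad_(True)
loss = F.cross_entropy(model(x_adv), y)
loss.backward()
# Fast Gradient Sign Method: step in the direction that increases the loss.
x_adv = (x_adv + epsilon * x_adv.grad.sign()).detach()

clean_acc = (model(x).argmax(1) == y).float().mean().item()
adv_acc = (model(x_adv).argmax(1) == y).float().mean().item()
print(f"clean accuracy: {clean_acc:.2f}, accuracy under FGSM: {adv_acc:.2f}")
```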
[ Virtual ]
Incrementally learning new information from a non-stationary stream of data, referred to as lifelong learning, is a key feature of natural intelligence, but an open challenge for deep learning. For example, when artificial neural networks are trained on samples from a new task or data distribution, they tend to rapidly lose previously acquired capabilities, a phenomenon referred to as catastrophic forgetting. In stark contrast, humans and other animals are able to incrementally learn new skills without compromising those that were learned before. Numerous deep learning methods for lifelong learning have been proposed in recent years, yet a substantial gap remains between the lifelong learning abilities of artificial and biological neural networks.
In this tutorial, we start by asking what key capabilities a successful lifelong learning machine should have. We then review the current literature on lifelong learning, and we ask how far we have come. We do this in two parts. First, we review the popular benchmarks and setups currently used in the literature, and we critically assess to what extent they measure progress relevant for lifelong learning applications in the real world. Second, we review the strategies for lifelong learning that have been explored so far, and we …
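The following toy sketch (not from the tutorial) illustrates the catastrophic forgetting described above: a small network trained on a first task, then on a second task with a different rule over the same inputs, loses most of its accuracy on the first task.

```python
# Toy illustration of catastrophic forgetting under sequential training.
import torch
import torch.nn.functional as F

torch.manual_seed(0)

def make_task(w):
    """Binary task: label is the sign of a linear rule w applied to Gaussian inputs."""
    x = torch.randn(2000, 2)
    y = ((x @ w) > 0).long()
    return x, y

task_a = make_task(torch.tensor([1.0, 1.0]))
task_b = make_task(torch.tensor([1.0, -1.0]))    # a different rule on the same inputs

model = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.ReLU(), torch.nn.Linear(32, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

def train(x, y, steps=300):
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()
        opt.step()

def accuracy(x, y):
    return (model(x).argmax(1) == y).float().mean().item()

train(*task_a)
print("task A accuracy after learning A:", round(accuracy(*task_a), 2))  # ~1.0
train(*task_b)                                   # continue training on task B only
print("task A accuracy after learning B:", round(accuracy(*task_a), 2))  # drops sharply
```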
[ Virtual ]
This tutorial will provide an overview of recent advances in Neurosymbolic Programming. The objective in this area is to learn neurosymbolic programs, which combine elements of both neural networks and classical symbolic programs with the aim of inheriting the benefits of both. A key advantage of the neurosymbolic programming approach is that one learns models that look more like the models domain experts write by hand in code, yet are more expressive than classical interpretable models in machine learning. Neurosymbolic programs can also more easily incorporate prior knowledge and are easier to analyze and verify. From the point of view of techniques, neurosymbolic programming combines ideas from machine learning and program synthesis and represents an exciting new contact point between the two communities. This tutorial will cover a broad range of basic concepts in the area, including neurosymbolic architectures, domain-specific languages, architecture/program search algorithms, meta-learning algorithms such as library learning, and applications to science and autonomy. Our panel will discuss open challenges in the field and ways in which machine learning and programming languages researchers can come together to address them. The tutorial is an abridged version of the tutorial at the Neurosymbolic Programming summer school …
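As a minimal, purely illustrative sketch of a neurosymbolic program, the hypothetical example below fixes a symbolic if-then-else template over a tiny hand-written DSL and learns only its condition as a neural module, trained end-to-end with a soft gate.

```python
# Hypothetical neurosymbolic program: symbolic structure, learned condition.
import torch

torch.manual_seed(0)

# Symbolic primitives of a tiny hand-written DSL.
def double(x): return 2.0 * x
def negate(x): return -x

class NeurosymbolicProgram(torch.nn.Module):
    """Program sketch: `if cond_net(x) then double(x) else negate(x)`."""
    def __init__(self):
        super().__init__()
        self.cond_net = torch.nn.Linear(1, 1)    # learned condition

    def forward(self, x):
        gate = torch.sigmoid(self.cond_net(x))   # soft branch selection for training
        return gate * double(x) + (1 - gate) * negate(x)

# Target behaviour: double positive inputs, negate non-positive ones.
x = torch.linspace(-2, 2, 200).unsqueeze(1)
y = torch.where(x > 0, double(x), negate(x))

prog = NeurosymbolicProgram()
opt = torch.optim.Adam(prog.parameters(), lr=0.1)
for _ in range(200):
    opt.zero_grad()
    ((prog(x) - y) ** 2).mean().backward()
    opt.step()
print("learned condition parameters:", [p.data.tolist() for p in prog.cond_net.parameters()])
```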
[ Virtual ]
Recent advances in Natural Language Processing (NLP) have propelled the state of the art to new highs. One such advance is the use of external memory to support reasoning in deep learning models such as Transformers.
Without external memory to store sufficient background knowledge, reasoning in NLP systems must be performed based on limited information, leading to poor performance on knowledge-rich tasks. Conversely, NLP systems with access to external memory have achieved significant performance gains on many important tasks, including question answering (QA) and tasks associated with QA such as fact verification and entity linking. The tutorial will present: 1) an overview of state-of-the-art approaches for representing background knowledge in addressable memory, and 2) applications in the healthcare domain.
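As a toy illustration of the memory-augmented setup (not code from the tutorial), the sketch below retrieves the most relevant stored facts for a question and prepends them to the reader's input; a real system would use dense embeddings and a learned reader rather than simple word overlap.

```python
# Toy retrieval from an external memory of facts, scored by word overlap.
memory = [
    "Paris is the capital of France.",
    "The Eiffel Tower is located in Paris.",
    "Mount Everest is the highest mountain on Earth.",
]

def retrieve(question: str, memory: list[str], k: int = 2) -> list[str]:
    """Return the k stored facts sharing the most words with the question."""
    q_words = set(question.lower().split())
    scored = sorted(memory, key=lambda fact: -len(q_words & set(fact.lower().split())))
    return scored[:k]

question = "What is the capital of France?"
context = retrieve(question, memory)
augmented_input = " ".join(context) + " [SEP] " + question   # fed to a reader/QA model
print(augmented_input)
```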
[ Virtual ]
In several real-world scenarios, decision making involves complex reasoning, i.e., the ability to answer complex probabilistic queries. Moreover, in many sensitive domains like healthcare and economic decision making, the result of these queries is required to be exact, as approximations without guarantees would make the decision making process brittle. In all these scenarios, tractable probabilistic inference and learning are becoming more and more indispensable. In this tutorial, we will introduce the framework of probabilistic circuits (PCs), under which one can learn deep generative models that guarantee exact inference in polynomial (often linear) time. Thanks to recent algorithmic and theoretical results, which we will discuss in this tutorial, PCs have achieved impressive results in probabilistic modeling, sometimes outperforming intractable models such as variational autoencoders. We will present the syntax and semantics of PCs and show how several commonly used ML models -- from Gaussian mixture models to HMMs and decision trees -- can be understood as computational graphs within the PC framework. We will discuss how PCs are special cases of neural networks, and how restricting these networks with certain structural properties enables different tractability scenarios. This unified view of probabilistic ML models opens up a range of ways to learn PCs …
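For intuition, here is a minimal sketch (not the tutorial's material) of a probabilistic circuit over two binary variables: a sum node mixing two product nodes of Bernoulli leaves. Because the circuit is smooth and decomposable, marginals are exact: the leaf of a marginalized variable simply outputs 1.

```python
# Tiny probabilistic circuit with exact marginal inference in one pass.
def bernoulli_leaf(p, value):
    """Leaf node: returns P(X=value), or 1 if the variable is marginalized out."""
    if value is None:          # None means "marginalize this variable"
        return 1.0
    return p if value == 1 else 1.0 - p

def circuit(x1, x2):
    """Sum node over two product nodes; mixture weights sum to 1."""
    component_a = bernoulli_leaf(0.9, x1) * bernoulli_leaf(0.2, x2)   # product node
    component_b = bernoulli_leaf(0.1, x1) * bernoulli_leaf(0.7, x2)   # product node
    return 0.6 * component_a + 0.4 * component_b                      # sum node

print("P(X1=1, X2=0) =", circuit(1, 0))
print("P(X1=1)       =", circuit(1, None))        # exact marginal in one pass
total = sum(circuit(a, b) for a in (0, 1) for b in (0, 1))
print("sums to 1:    ", round(total, 10))
```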
[ Virtual ]
As machine learning models permeate every aspect of decision making systems in consequential areas such as healthcare, banking, hiring and education, it has become critical for these models to satisfy trustworthiness desiderata such as fairness, privacy, robustness and interpretability. While these topics were initially studied in isolation, recent work has emerged at their intersection, leading to interesting questions about how fairness can be achieved under privacy, interpretability and robustness constraints. Given the interesting questions that emerge at the intersection of these different fields, this tutorial aims to investigate how these topics relate, and how they can augment each other to provide better or more suited definitions and mitigation strategies for algorithmic fairness. We are particularly interested in addressing open questions in the field, such as: how is algorithmic fairness compatible with privacy constraints? What are the trade-offs when we consider algorithmic fairness together with robustness? Can we develop fair and explainable models? We will also articulate some limitations of technical approaches to algorithmic fairness, and discuss critiques coming from outside of computer science.
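As one small, illustrative example of the kind of fairness definition whose interaction with privacy and robustness the tutorial examines, the sketch below computes a demographic parity gap on synthetic scores; the data, group skew, and threshold are all hypothetical.

```python
# Illustrative demographic parity gap on synthetic model scores.
import numpy as np

rng = np.random.default_rng(0)
group = rng.integers(0, 2, size=1000)               # sensitive attribute (0 or 1)
scores = rng.normal(loc=0.1 * group, scale=1.0)     # model scores, slightly group-skewed
decisions = scores > 0.0                            # positive decision = above threshold

rate_g0 = decisions[group == 0].mean()
rate_g1 = decisions[group == 1].mean()
print(f"positive rate, group 0: {rate_g0:.3f}")
print(f"positive rate, group 1: {rate_g1:.3f}")
print(f"demographic parity gap: {abs(rate_g1 - rate_g0):.3f}")
```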
[ Virtual ]
Many engineering, scientific, and industrial applications, including automated machine learning (e.g., hyper-parameter tuning), involve making design choices to optimize one or more expensive-to-evaluate objectives. Some examples include tuning the knobs of a compiler to optimize performance and efficiency of a set of software programs; designing new materials to optimize strength, elasticity, and durability; and designing hardware to optimize performance, power, and area. Bayesian Optimization (BO) is an effective framework to solve black-box optimization problems with expensive function evaluations. The key idea behind BO is to build a cheap surrogate model (e.g., a Gaussian Process) using the real experimental data, and to employ it to intelligently select the sequence of function evaluations using an acquisition function, e.g., expected improvement (EI).
The goal of this tutorial is to present recent advances in BO by focusing on challenges, principles, algorithmic ideas and their connections, and important real-world applications. Specifically, we will cover recent work on acquisition functions, BO methods for discrete and hybrid spaces, BO methods for high-dimensional input spaces, causal BO, and key innovations in the BoTorch toolbox, along with a hands-on demonstration.
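To make the surrogate-plus-acquisition loop concrete, here is a minimal sketch of BO on a 1-D toy objective using a Gaussian Process surrogate and expected improvement; it illustrates the core idea rather than the BoTorch workflow demonstrated in the tutorial.

```python
# Minimal Bayesian optimization loop: GP surrogate + expected improvement (EI).
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def objective(x):                      # pretend this is expensive to evaluate
    return np.sin(3 * x) + 0.5 * x

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(3, 1))    # a few initial evaluations
y = objective(X).ravel()
candidates = np.linspace(-2, 2, 500).reshape(-1, 1)

for _ in range(10):                    # BO iterations
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)   # cheap surrogate
    mu, sigma = gp.predict(candidates, return_std=True)
    best = y.min()
    z = (best - mu) / np.maximum(sigma, 1e-9)
    ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)        # expected improvement
    x_next = candidates[np.argmax(ei)].reshape(1, 1)            # most promising point
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next).ravel())

print("best value found:", y.min(), "at x =", X[np.argmin(y)])
```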
[ Virtual ]
When an algorithm can make consequential decisions for people's lives, people have an incentive to respond to the algorithm strategically in order to obtain a more desirable decision. This means that unless the algorithm adapts to this strategizing, it may end up inducing policy decisions that are incompatible with the original policy's goal. This observation motivates the rapidly growing research area of incentive-aware Machine Learning (ML). In this tutorial, we introduce this area to the broader ML community. After a primer on the basic background needed, we introduce the audience to the four perspectives that have been studied so far: the robustness perspective (where the decision-maker tries to create algorithms that are robust to strategizing), the fairness perspective (where we study the inequalities that arise or are reinforced as a result of strategizing), the improvement perspective (where the learner tries to incentivize effort toward genuinely improving the agents' features), and the performativity perspective (where the decision-maker wishes to achieve a notion of stability in these settings).
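The toy simulation below (not from the tutorial) illustrates the basic strategizing phenomenon: applicants who know a linear threshold rule inflate their reported feature at a cost, so acceptance decisions no longer track true qualification. All quantities are hypothetical.

```python
# Toy strategic-response simulation against a known threshold classifier.
import numpy as np

rng = np.random.default_rng(0)
true_quality = rng.normal(size=2000)                         # unobserved qualification
feature = true_quality + rng.normal(scale=0.3, size=2000)    # what the classifier sees
threshold = 1.0                                  # accept if feature >= threshold
cost_per_unit = 0.5                              # cost of inflating the feature by 1
benefit = 1.0                                    # value of being accepted

gap = threshold - feature
# An agent games the classifier when the needed shift is positive and worth the cost.
games = (gap > 0) & (gap * cost_per_unit < benefit)
strategic_feature = np.where(games, threshold, feature)

print("acceptance rate, honest reports:   ", (feature >= threshold).mean())
print("acceptance rate, strategic reports:", (strategic_feature >= threshold).mean())
print("fraction of accepted who gamed:    ",
      games[strategic_feature >= threshold].mean())
```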
[ Virtual ]
The efficient communication of information has enormous societal and environmental impact, and stands to benefit from the machine learning revolution seen in other fields. Through this tutorial, we hope to disseminate the ideas of information theory and compression to a broad audience, overview the core methodologies in learning-based compression (i.e., neural compression), and present the relevant technical challenges and open problems defining a new frontier of probabilistic machine learning. Besides covering the technical grounds, we will also explore the broader underlying themes and future research in our panel discussion, focusing on the interplay between computation and communication, the role of machine learning, and societal considerations such as fairness, privacy, and energy footprint as we strive to make our learning and information processing systems more efficient.
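As a very condensed, hedged sketch of learned transform coding, the example below trains a toy autoencoder on a rate-distortion objective, using the standard additive-uniform-noise relaxation of quantization and a fixed standard-normal prior as a stand-in for the learned entropy models used in practice.

```python
# Toy neural compression: rate-distortion training of a quantized autoencoder.
import torch

torch.manual_seed(0)
encoder = torch.nn.Linear(16, 4)
decoder = torch.nn.Linear(4, 16)
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
lam = 0.01                                   # rate-distortion trade-off weight
prior = torch.distributions.Normal(0.0, 1.0)

for step in range(1000):
    x = torch.randn(64, 16)                  # stand-in data batch
    z = encoder(x)
    z_noisy = z + torch.rand_like(z) - 0.5   # differentiable proxy for rounding
    distortion = ((decoder(z_noisy) - x) ** 2).mean()
    rate = -prior.log_prob(z_noisy).mean()   # proxy for bits needed to encode z
    loss = distortion + lam * rate
    opt.zero_grad()
    loss.backward()
    opt.step()

# At "deployment" the latents are actually rounded before entropy coding.
x = torch.randn(1, 16)
z_hat = torch.round(encoder(x))
print("quantized code:", z_hat.tolist(),
      "reconstruction error:", ((decoder(z_hat) - x) ** 2).mean().item())
```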
[ Virtual ]
Creative domains make up a large part of modern society and have a significant influence on the economy and cultural life. During the last decade, the rapid development of ML technologies such as generative models has led to the creation of multiple creative applications. In this tutorial, we talk about co-creativity and generative art in computer vision, NLP, and interactive music generation, and the interplay between these modalities.
While there are opportunities for ML to empower artists to create and distribute their work, there are risks and harms when using these technologies in cultural contexts. These include harms arising from tools intended to support the creative process (e.g. biased or unsafe output, such as deepfakes) or harms incurred in creative output (e.g. individual or systemic inequity). At the same time, these systems can have broader and long-term social implications on the shape and diversity of culture more generally, which we discuss in this tutorial.
Finally, we discuss open questions on the co-creation process, the interplay between modalities, the assessment of creative systems, and the broader impact of these technologies, including potential harms that can stem from these models.
[ Virtual ]
Algorithmic rankings have become increasingly common in the online world. From social media to streaming services and from e-commerce to hiring, ranking has become the primary way online users ingest information. In many cases, recommendations have implications for both the users and the items (or creators) being ranked. These systems are also increasingly personalized to viewers, relying on imperfect information to learn and respond to user preferences. In recent years, the machine learning community has become increasingly aware of potential harms to consumers (e.g. echo chambers, addictive design, virality of harmful content) and creators (e.g. access to opportunity, misattribution and appropriation). In this tutorial we will explore the current state of research on responsible recommendations and the primary challenges with understanding, evaluating and training these systems for users and content providers. This tutorial additionally presents the primary challenges in applying this research in practice. The perspectives and methods presented in this tutorial apply to recommendation systems generally and will not include any specific information regarding actual recommendation products. The tutorial is designed to be accessible to a broad audience of machine learning practitioners; some background in predictive systems and ranking is beneficial but not required.
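One simple lens on the creator-side concerns above is position-based exposure; the illustrative sketch below (not from the tutorial) computes each provider's share of exposure in a hypothetical ranking under a standard logarithmic position-bias model.

```python
# Position-based exposure shares of content providers in a toy ranking.
import numpy as np

# Hypothetical ranking: the item at rank r gets exposure 1 / log2(r + 1);
# each ranked item belongs to one provider.
providers = ["A", "A", "B", "A", "C", "B"]        # providers of the top-6 ranked items
exposure = 1.0 / np.log2(np.arange(2, len(providers) + 2))

totals = {}
for p, e in zip(providers, exposure):
    totals[p] = totals.get(p, 0.0) + e
share = {p: round(v / exposure.sum(), 3) for p, v in totals.items()}
print("share of total exposure per provider:", share)
```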
[ Virtual ]
Data is one of the key drivers of progress in machine learning. Modern datasets require scale far beyond the ability of individual domain experts to produce. To overcome this limitation, a wide variety of techniques have been developed to build large datasets efficiently, including crowdsourcing, automated labeling, weak supervision, and many more. This tutorial describes classical and modern methods for building datasets beyond manual hand-labeling. It covers both theoretical and practical aspects of dataset construction. Theoretically, we discuss guarantees for a variety of crowdsourcing, active learning-based, and weak supervision techniques, with a particular focus on generalization properties of downstream models trained on the resulting datasets. Practically, we describe several popular systems implementing such techniques and their use in industry and beyond. We cover both the promise and potential pitfalls of using such methods. Finally, we offer a comparison of automated dataset construction versus other popular approaches to dealing with a lack of large amounts of labeled data, including few- and zero-shot methods enabled by foundation models.
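As a minimal sketch of one programmatic-labeling idea mentioned above, the example below aggregates the votes of several noisy labeling functions by a weighted majority; real weak-supervision systems estimate the source accuracies from data rather than assuming the weights used here.

```python
# Weighted majority vote over noisy labeling functions (weak supervision sketch).
import numpy as np

# Hypothetical labeling-function outputs: rows = examples, columns = sources,
# values in {1, 0, -1} where -1 means "abstain".
votes = np.array([
    [ 1,  1, -1],
    [ 0,  1,  0],
    [-1,  0,  0],
    [ 1, -1,  1],
])
weights = np.array([1.0, 0.5, 1.0])     # assumed per-source reliabilities

def aggregate(votes, weights):
    labels = []
    for row in votes:
        mask = row != -1                               # ignore abstentions
        if not mask.any():
            labels.append(-1)                          # no signal for this example
            continue
        score = np.dot(weights[mask], np.where(row[mask] == 1, 1.0, -1.0))
        labels.append(int(score > 0))
    return np.array(labels)

print("aggregated weak labels:", aggregate(votes, weights))
```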