Workshop
Statistical Frontiers in LLMs and Foundation Models
Anastasios Angelopoulos · Stephen Bates · Alexander D'Amour · Jessica Hullman · Fanny Yang · Sophia Sun · Tatsunori Hashimoto
West Ballroom A
Sat 14 Dec, 9 a.m. PST
We propose a workshop on the emerging frontier at the intersection of statistics and foundation models. Rigorous evaluation of large foundation models such as LLMs is necessary for reliable deployment, but it poses a towering challenge due to a lack of datasets and the black-box nature of many such models. The proposed workshop brings together the community working on understanding and improving LLMs with new statistical methodologies, and explores topics including benchmarking, measuring and correcting bias, automatic evaluation, watermarking, model and data auditing, and uncertainty quantification.
Timezone: America/Los_Angeles
Schedule
Sat 9:00 a.m. - 9:30 a.m.
|
Opening Remarks ( Intro ) > SlidesLive Video | 🔗 |
Sat 9:30 a.m. - 10:15 a.m.
|
Invited Talk #1: Bernhard Schölkopf ( Invited Talk ) > SlidesLive Video | 🔗 |
Sat 10:15 a.m. - 11:15 a.m.
|
Unstructured Time ( Unstructured Time ) > | 🔗 |
Sat 11:15 a.m. - 12:00 p.m.
|
Invited Talk #2: Mihaela van der Schaar ( Invited Talk ) > SlidesLive Video | 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Data-Adaptive Tradeoffs among Multiple Risks in Distribution-Free Prediction ( Poster ) > link | Drew Nguyen · Reese Pathak · Anastasios Angelopoulos · Stephen Bates · Michael Jordan 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Enhancing Semantic Clustering for Uncertainty Quantification & Conformal Prediction by LLMs ( Poster ) > link | Ramneet Kaur · Colin Samplawski · Adam Cobb · Anirban Roy · Brian Matejek · Manoj Acharya · Daniel Elenius · Alexander Berenbeim · John Pavlik · Nathaniel Bastian · Susmit Jha |
Sat 12:00 p.m. - 12:45 p.m.
|
UniGen: A Unified Framework for Textual Dataset Generation Using Large Language Models ( Poster ) > link | Siyuan Wu · Yue Huang · Gao Chujie · Dongping Chen · Qihui Zhang · Yao Wan · Tianyi Zhou · Xiangliang Zhang · Jianfeng Gao · Chaowei Xiao · Lichao Sun |
Sat 12:00 p.m. - 12:45 p.m.
|
Infilling Score: A Pretraining Data Detection Algorithm for Large Language Models ( Poster ) > link | Negin Raoof · Litu Rout · Giannis Daras · Sujay Sanghavi · Constantine Caramanis · Sanjay Shakkottai · Alex Dimakis 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Harnessing Large Language Models for Market Research: A Data-augmentation Approach ( Poster ) > link | Mengxin Wang · Dennis Zhang · Heng Zhang 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
CPP-UT-Bench: Can LLMs Write Complex Unit Tests in C++? ( Poster ) > link | Vaishnavi Bhargava · Rajat Ghosh · Debojyoti Dutta 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Mind the Gap: A Surgical Study on the Self-improvement Capabilities of LLMs ( Poster ) > link | Yuda Song · Hanlin Zhang · Udaya Ghai · Carson Eisenach · Sham Kakade · Dean Foster 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Protected Test-Time Adaptation via Online Entropy Matching ( Poster ) > link | Yarin Bar · Yaniv Romano · Shalev Shaer 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Weak-to-Strong Confidence Prediction ( Poster ) > link | Yukai Yang · Tracy Zhu · Marco Morucci · Tim G. J. Rudner 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Just rephrase it! Uncertainty estimation in closed-source language models via multiple rephrased queries ( Poster ) > link | Adam Yang · CHEN CHEN · Konstantinos Pitas 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Automated Social Science: Language Models as Scientist and Subjects ( Poster ) > link | Kehang Zhu · John Horton · Benjamin Manning 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Scheduling in LLM Inference with Blowed-up Memory Constraints ( Poster ) > link | Zijie Zhou · Jiashuo Jiang 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
CLUE: Concept-Level Uncertainty Estimation for Large Language Models ( Poster ) > link | Yu-Hsiang Wang · Andrew Bai · Che-Ping Tsai · Cho-Jui Hsieh 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
AutoBench-V: Can Large Vision-Language Models Benchmark Themselves? ( Poster ) > link | Han Bao · Yanbo Wang · Jiayi Ye · Yue Huang · Xiangqi Wang · Xiangliang Zhang 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
MMMT-IF: A Challenging Multimodal Multi-Turn Instruction Following Benchmark ( Poster ) > link | Elliot Epstein · Kaisheng Yao · Jing Li · Xinyi Bai · Hamid Palangi 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Cannot or Should Not? Automatic Analysis of Refusal Composition in IFT/RLHF Datasets and Refusal Behavior of Black-Box LLMs ( Poster ) > link | Alexander von Recum · Christoph Schnabl · Gabor Hollbeck · Marvin von Hagen · Silas Alberti · Philip Blinde 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
A Statistical Approach to Quantifying LLM Human Alignment ( Poster ) > link | Harbin Hong · Liu Leqi · Sebastian Caldas 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
H-POPE: Hierarchical Polling-based Probing Evaluation of Hallucinations in Large Vision-Language Models ( Poster ) > link | Nhi Pham · Michael Schott 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
CriticAL: Model Criticism Automation with Language Models ( Poster ) > link | Michael Li · Noah Goodman · Emily Fox 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Robust Conformal Prediction Using Privileged Information ( Poster ) > link | Shai Feldman · Yaniv Romano 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
LLMs as Emotion Analyzers for Causal Models: Partial Identification with Fuzzy Interval Data ( Poster ) > link | Huidi Ma · Wendao Xue · Yifan Yu 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Detecting Watermark Spoofing Attacks ( Poster ) > link | Eliot Cowan · Max Daniels 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
MisMo: More is More in Alignment ( Poster ) > link | Benjamin Feuer · Micah Goldblum · Teresa Datta · Raz Besaleli · Samuel Dooley · Max Cembalest · John Dickerson 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Learning to Generate Verbalized Confidences ( Poster ) > link | Sophia Hager · Nicholas Andrews 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Black-box Uncertainty Quantification Method for LLM-as-a-Judge ( Poster ) > link | Nico Wagner · Michael Desmond · Rahul Nair · Zahra Ashktorab · Elizabeth Daly · Qian Pan · Martín Santillán Cooper · J Johnson · Werner Geyer 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Mitigate the Gap: Investigating Approaches for Improving Cross-Modal Alignment in CLIP ( Poster ) > link | Sedigheh (Sarah) Eslami · Gerard de Melo 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
FEET: A Framework for Evaluating Embedding Techniques ( Poster ) > link | Simon Lee · John Lee 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Advancing Conversational Psychotherapy: Integrating Privacy, Dual-Memory, and Domain Expertise with Large Language Models ( Poster ) > link | XiuYu Zhang · Zening Luo 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Distribution-based sensitivity analysis for large language models ( Poster ) > link | Paulius Rauba · Qiyao Wei · Mihaela van der Schaar 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
To Believe or Not to Believe Your LLM ( Poster ) > link | Yasin Abbasi Yadkori · Ilja Kuzborskij · András György · Csaba Szepesvari 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
SureMap: Simultaneous mean estimation for single-task and multi-task disaggregated evaluation ( Poster ) > link | Misha Khodak · Lester Mackey · Miro Dudik · Alexandra Chouldechova 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
LLM-RankFusion: Mitigating Intrinsic Inconsistency in LLM-based Ranking ( Poster ) > link | Yifan Zeng · Ojas Tendolkar · Raymond Baartmans · Qingyun Wu · Lizhong Chen · Huazheng Wang 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Learning to Localize: Practical Algorithms for Online Weighted Conformal Prediction ( Poster ) > link | Tiffany Ding · Anastasios Angelopoulos · Michael Jordan · Ryan Tibshirani 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Towards Probabilistically-Sound Beam Search with Masked Language Models ( Poster ) > link | Anna Sappington · Robert Calef · Creston Brooks · Charlie Cowen-Breen 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
A Step Towards Mixture of Grader: Statistical Analysis of Existing Automatic Evaluation Metrics ( Poster ) > link | Yun Joon Soh · Jishen Zhao 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Towards the Effect of Examples on In-Context Learning: A Theoretical Case Study ( Poster ) > link | Pengfei He · Yingqian Cui · Han Xu · Hui Liu · Makoto Yamada · Jiliang Tang · Yue XING 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Uncertainty-Penalized Direct Preference Optimization ( Poster ) > link | Sam Houliston · Alexander Immer · Alizée Pace · Gunnar Rätsch 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Pearls from Pebbles: Improved Confidence Functions for Auto-labeling ( Poster ) > link | Harit Vishwakarma · Yi Chen · Sui Jiet Tay · Satya Sai Srinath Namburi · Frederic Sala · Ramya Korlakai Vinayak 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Consistency-based Black-box Uncertainty Quantification for Text-to-SQL ( Poster ) > link | Debarun Bhattacharjya · Balaji Ganesan · Michael Glass · Junkyu Lee · Radu Marinescu · Katya Mirylenka · Xiao Shou 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Statistically Valid Information Bottleneck via Multiple Hypothesis Testing ( Poster ) > link | Amirmohammad Farzaneh · Osvaldo Simeone 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Skilling laws: scaling laws for LLM benchmark performance ( Poster ) > link | Felipe Maia Polo · Seamus Somerstep · Leshem Choshen · Yuekai Sun · Mikhail Yurochkin 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Monty Hall and Score Optimization in Conformal Prediction to Improve LLMs for MCQs ( Poster ) > link | Harit Vishwakarma · Alan Mishler · Thomas Cook · Niccolo Dalmasso · Natraj Raman · Sumitra Ganesh 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
A teacher-teacher framework for clinical language representation learning ( Poster ) > link | Feiqing Huang · Shenghan Zhang · Sara Sweet · Tianxi Cai 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Hessian-Free Laplace in Bayesian Deep Learning ( Poster ) > link | James McInerney · Nathan Kallus 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
When is Differentially Private Finetuning Actually Private? ( Poster ) > link | Roy Rinberg · Martin Pawelczyk 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Conformal Alignment: Knowing When to Trust Foundation Models with Guarantees ( Poster ) > link | Yu Gui · Ying Jin · Zhimei Ren 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Towards Optimal Statistical Watermarking ( Poster ) > link | Baihe Huang · Hanlin Zhu · Banghua Zhu · Kannan Ramchandran · Michael Jordan · Jason Lee · Jiantao Jiao 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Optimizing Adversarial Samples for Tighter Privacy Auditing in Final Model-Only Settings ( Poster ) > link | Sangyeon Yoon · Wonje Jeung · Albert No 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
ICScore: Metrics for Evaluating Interestingness and Creativity of Stories ( Poster ) > link | Junha Lee · Jaeshin Cho · Youngjin Cho · Hyewon Jin · Hyemin Lee · Min Song 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Conformal Reasoning: Uncertainty Estimation in Interactive Environments ( Poster ) > link | Eric Frankel · Stella Li · Lillian Ratliff · Yulia Tsvetkov · Sewoong Oh · Pang Wei Koh 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Adaptive and Robust Watermark for Generative Tabular Data ( Poster ) > link | Dung Ngo · Daniel Scott · Saheed Obitayo · Vamsi Potluru · Manuela Veloso 🔗 |
Sat 12:00 p.m. - 12:45 p.m.
|
Poster Session #1 ( Poster Session ) > | 🔗 |
Sat 2:00 p.m. - 2:45 p.m.
|
Invited Talk #3: Weijie Su ( Invited Talk ) > SlidesLive Video | 🔗 |
Sat 3:00 p.m. - 3:45 p.m.
|
Invited Talk #4: Virginia Smith ( Invited Talk ) > SlidesLive Video | 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs ( Poster ) > link | Ruijia Niu · Dongxia Wu · Rose Yu · Yian Ma 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
An empirical study of in-context uncertainty quantification with conformal prediction ( Poster ) > link | Zhe Huang · Simone Rossi · Rui Yuan · Thomas Hannagan 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Evaluating language models as risk scores ( Poster ) > link | André F. Cruz · Moritz Hardt · Celestine Mendler-Dünner 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
A Watermark for Black-Box Language Models ( Poster ) > link | Dara Bahri · John Wieting · Dana Alon · Donald Metzler 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Mitigating Hallucination in Large Language Models with Explanatory Prompting ( Poster ) > link | Alexander Braverman · Weitong Zhang · Quanquan Gu 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Source Attribution for Large Language Model-Generated Data ( Poster ) > link | Xinyang Lu · Jingtan Wang · Zitong Zhao · Zhongxiang Dai · Chuan Sheng Foo · See-Kiong Ng · Bryan Kian Hsiang Low 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Mitigating LLM Hallucinations via Conformal Abstention ( Poster ) > link | Yasin Abbasi Yadkori · Ilja Kuzborskij · David Stutz · András György · Adam Fisch · Arnaud Doucet · Iuliya Beloshapka · Wei-Hung Weng · Yao-Yuan Yang · Csaba Szepesvari · Taylan Cemgil · Nenad Tomasev |
Sat 3:45 p.m. - 4:30 p.m.
|
SCIURus: Shared Circuits for Interpretable Uncertainty Representations in Language Models ( Poster ) > link | Carter Teplica · Yixin Liu · Arman Cohan · Tim G. J. Rudner 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Taming False Positives in Out-of-Distribution Detection with Human Feedback ( Poster ) > link | Harit Vishwakarma · Heguang Lin · Ramya Korlakai Vinayak 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Length Optimization in Conformal Prediction ( Poster ) > link | Shayan Kiyani · George J. Pappas · Hamed Hassani 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Conformal Prediction Adaptive to Unknown Subpopulation Shifts ( Poster ) > link | Nien-Shao Wang · Sai Praneeth Karimireddy 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Bayesian Concept Bottleneck Models with LLM Priors ( Poster ) > link | Jean Feng · Avni Kothari · Lucas Zier · Chandan Singh · Yan Shuo Tan 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Statistical Uncertainty Quantification for Aggregate Performance Metrics in Machine Learning Benchmarks ( Poster ) > link | Rachel Longjohn · Giri Gopalan · Emily Casleton 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
A Framework for Evaluating LLMs Under Task Indeterminacy ( Poster ) > link | Luke Guerdan · Hanna Wallach · Solon Barocas · Alexandra Chouldechova 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Evaluating Generative AI Systems is a Social Science Measurement Challenge ( Poster ) > link | Hanna Wallach · Meera Desai · Nicholas Pangakis · A. Feder Cooper · Angelina Wang · Solon Barocas · Alexandra Chouldechova · Chad Atalla · Su Lin Blodgett · Emily Corvi · Alex Dow · Jean Garcia-Gathright · Alexandra Olteanu · Stefanie Reed · Emily Sheng · Dan Vann · Jennifer Wortman Vaughan · Matthew Vogel · Hannah Washington · Abigail Jacobs |
Sat 3:45 p.m. - 4:30 p.m.
|
Privately Learning from Graphs with Applications in Fine-tuning Large Pretrained Models ( Poster ) > link | Haoteng YIN · Rongzhe Wei · Eli Chien · Pan Li 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Quantifying Uncertainty in Large Language Models: Applications in Molecular Chemistry Tasks ( Poster ) > link | Zizhang Chen · Pengyu Hong · Sandeep Madireddy 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Predictive Inference in Multi-environment Scenarios ( Poster ) > link | John Duchi · Suyash Gupta · Kuanhao Jiang · Pragya Sur 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Back-to-Basics Revisited: Benchmarking an Expanded Set of RLHF Algorithms ( Poster ) > link | Lucas Spangher · Rama Kumar Pasumarthi · Nick Masiewicki · Peter Grabowski · Eugene Ie · William Arnold · Daniele Calandriello · Bilal Piot 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Conformal Language Model Reasoning with Coherent Factuality ( Poster ) > link | Maya Gambhir · Maxon Rubin-Toles · Keshav Ramji · Aaron Roth · Surbhi Goel 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Adversarial Robust Deep Reinforcement Learning is Neither Robust Nor Safe ( Poster ) > link | Ezgi Korkmaz 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
ReFeR: A Hierarchical Framework of Models as Evaluative and Reasoning Agents ( Poster ) > link | Yaswanth Narsupalli · Abhranil Chandra · Sreevatsa Muppirala · Manish Gupta · Pawan Goyal 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Formal Analysis and Unification of Generalization in Deep Reinforcement Learning ( Poster ) > link | Ezgi Korkmaz 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Interactive Semantic Interventions for VLMs: A Human-in-the-Loop Approach to Interpretability ( Poster ) > link | Lukas Klein · Kenza Amara · Carsten Lüth · Hendrik Strobelt · Mennatallah El-Assady · Paul Jaeger 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
MarkMyWords: Analyzing and Evaluating Language Model Watermarks ( Poster ) > link | Julien Piet · Chawin Sitawarin · Vivian Fang · Norman Mu · David Wagner 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Deep Limit Model-free Prediction in Regression ( Poster ) > link | Kejin Wu · Dimitris Politis 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Fast yet Safe: Early-Exiting with Risk Control ( Poster ) > link | Metod Jazbec · Alexander Timans · Tin Hadži Veljković · Kaspar Sakmann · Dan Zhang · Christian Andersson Naesseth · Eric Nalisnick 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Conversational Question-Answering for process task guidance in manufacturing ( Poster ) > link | Ramesh Manuvinakurike · Elizabeth Watkins · Celal Savur · Anthony Rhodes · Sovan Biswas · Richard Beckwith · Gesem Mejia · Saurav Sahay · Giuseppe Raffa · Lama Nachman 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Towards LLM-guided Efficient and Interpretable Multi-linear Tensor Network Rank Selection ( Poster ) > link | Giorgos Iacovides · Wuyang Zhou · Danilo Mandic 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Auto-Evaluation with Few Labels through Post-hoc Regression ( Poster ) > link | Benjamin Eyre · David Madras 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
vTune: Verifiable fine-tuning Through Backdooring ( Poster ) > link | Eva Zhang · Akilesh Potti · Micah Goldblum 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Diffusion-Powered Image Super-Resolution That You Can Actually Trust ( Poster ) > link | Daniel Csillag · Eduardo Adame · Guilherme Tegoni Goedert 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners ( Poster ) > link | Bowen Jiang · Yangxinyu Xie · Zhuoqun Hao · Xiaomeng Wang · Tanwi Mallick · Weijie Su · Camillo Taylor · Dan Roth 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Scalable Subsampling Inference for Deep Neural Networks ( Poster ) > link | Kejin Wu · Dimitris Politis 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
HuLLMI: Human vs. LLM Identification with Explainability ( Poster ) > link | Prathamesh Dinesh Joshi · Sahil Pocker · Raj Dandekar · Rajat Dandekar · Sreedath Panat 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
A shared standard for valid measurement of generative AI systems' capabilities, risks, and impacts ( Poster ) > link | Alexandra Chouldechova · Chad Atalla · Solon Barocas · A. Feder Cooper · Emily Corvi · Alex Dow · Jean Garcia-Gathright · Nicholas Pangakis · Stefanie Reed · Emily Sheng · Dan Vann · Matthew Vogel · Hannah Washington · Hanna Wallach |
Sat 3:45 p.m. - 4:30 p.m.
|
Benchmark Self-Evolving: A Multi-Agent Framework for Dynamic LLM Evaluation ( Poster ) > link | Siyuan Wang · Zhuohan Long · Zhihao Fan · Xuanjing Huang · zhongyu wei 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Obtaining Conformal Prediction-like guarantees by standard concentration: an observation ( Poster ) > link | Emmanouil Seferis 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Reexpress: Similarity-Distance-Magnitude Calibration ( Poster ) > link | Allen Schmaltz 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
LLMs for Causal Inference ( Poster ) > link | Jonathan Choi 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Uncertainty Quantification for Inverse Problems with Generative Priors under Distribution Shift ( Poster ) > link | Sara Fridovich-Keil 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Estimating and Correcting for Misclassification Error in Empirical Textual Research ( Poster ) > link | Jonathan Choi 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Are Police Biased? An NLP Approach ( Poster ) > link | Jonathan Choi 🔗 |
Sat 3:45 p.m. - 4:30 p.m.
|
Poster Session #2 ( Poster Session ) > | 🔗 |
Sat 4:30 p.m. - 5:15 p.m.
|
Closing Remarks and Discussions ( Discussions ) > | 🔗 |