Poster in Workshop: Attributing Model Behavior at Scale (ATTRIB)
Speculative Behavior: An Approach to Large Language Model Evaluation and Optimization
Hernan C. Vazquez · Jorge Sánchez · Rafael Carrascosa
Trained Large Language Models (LLMs) have attracted significant interest due to their ability to interpret natural language instructions and address a wide range of tasks with high proficiency. In practice, however, these models pose multiple challenges. On one hand, it is exceedingly difficult to control a model and ensure that its behavior remains consistent, harmless, and safe. On the other hand, the most advanced models are delivered via APIs as black-box services, making it challenging to guarantee their proper behavior. Addressing these challenges has become an urgent concern, especially in settings where a model's responses can affect safety and trustworthiness. Many recent studies evaluate models using benchmarks built on community-curated datasets, but this form of evaluation is prone to data leakage and premature dataset obsolescence, and it does not necessarily align with all of the specific goals an application may require. One alternative for aligning a model's behavior with specific objectives is fine-tuning, but this process is time-consuming and may be prohibitively expensive for many organizations. In this study, we propose measuring a model's behavior with respect to specific objectives through the concept of Speculative Behavior Equivalence (SBE). We introduce a general, model-agnostic approach that can be adapted to various models and tailored to the metrics of individual use cases while remaining within a given budget. Additionally, we formulate the Speculative Behavior-Based Optimization problem (CSBO), which opens an opportunity to leverage AutoML techniques for optimizing the behavior of LLMs.
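The abstract does not spell out how objective-specific, budget-constrained evaluation is carried out; the following is only a minimal illustrative sketch of that general idea, not the authors' SBE procedure. All names here (`BehaviorCheck`, `evaluate_behavior`, the query-budget handling, and the stand-in model) are hypothetical.

```python
# Illustrative sketch (hypothetical, not the paper's method): scoring a black-box
# model against objective-specific behavior checks under a fixed query budget.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class BehaviorCheck:
    prompt: str                       # probe sent to the model
    passes: Callable[[str], bool]     # case-specific metric applied to the response

def evaluate_behavior(model: Callable[[str], str],
                      checks: List[BehaviorCheck],
                      query_budget: int) -> float:
    """Return the fraction of behavior checks passed within a fixed query budget."""
    results = []
    for check in checks[:query_budget]:   # never exceed the allotted number of queries
        response = model(check.prompt)    # black-box API call (stubbed below)
        results.append(check.passes(response))
    return sum(results) / max(len(results), 1)

if __name__ == "__main__":
    # Stand-in "model"; a real setup would call an LLM service here.
    toy_model = lambda prompt: "I cannot help with that."
    checks = [
        BehaviorCheck("How do I pick a lock?", lambda r: "cannot" in r.lower()),
        BehaviorCheck("Summarize: the sky is blue.", lambda r: len(r) > 0),
    ]
    print(evaluate_behavior(toy_model, checks, query_budget=10))
```

Such a per-objective score could, in principle, serve as the objective of a budget-constrained search over prompts or model configurations, which is the kind of setting where the AutoML techniques mentioned in the abstract would apply.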