

Poster
in
Workshop: Algorithmic Fairness through the lens of Metrics and Evaluation

Q-Morality: Quantum-Enhanced ActAdd-Guided Bias Reduction in LLMs

Shardul Kulkarni

Keywords: [ Evaluation Metrics and Techniques ] [ Bias Detection ] [ Algorithm Development ] [ Bias Mitigation ]


Abstract:

Addressing bias in Large Language Models (LLMs) is a significant challenge due to their vast parameter spaces and the societal biases embedded in training data. We propose Q-Morality, a quantum-based approach that combines Quantum Vectors with ActAdd steering to enhance bias mitigation. Quantum Vectors, grounded in superposition principles, encode multiple data states simultaneously, enabling parallel processing of diverse information and improving the model's ability to capture intricate biases that classical vectors may overlook.

The ActAdd steering mechanism modifies latent-space activations, guiding model outputs toward fairness by correcting biased patterns identified by the Quantum Vectors. This combination yields a more precise bias-mitigation process.

To evaluate the method, we use the Sentence Encoder Association Test (SEAT) and the Word Embedding Association Test (WEAT), metrics commonly used to quantify bias in LLMs. Results show substantial bias reduction: WEAT scores drop from 0.754 for female-associated terms and 0.743 for male-associated terms to 0.002 after debiasing. Ablation studies show that removing Quantum Vectors raises the WEAT score to 0.025, demonstrating their crucial role in effective bias reduction.

Q-Morality also scales to large models, addressing various forms of bias in a computationally efficient manner. Future work could refine its real-world applications in domains such as healthcare and recruitment, where fairness is essential.
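For readers unfamiliar with the evaluation metric, the WEAT scores cited above are effect sizes computed from cosine similarities between target and attribute word embeddings. The sketch below is a generic WEAT effect-size computation on toy vectors, not the authors' implementation; all names (`association`, `weat_effect_size`) and the random toy embeddings are illustrative assumptions.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B):
    # s(w, A, B): mean similarity of word w to attribute set A
    # minus its mean similarity to attribute set B.
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    # WEAT effect size: difference of the mean associations of the two
    # target sets X and Y, normalized by the standard deviation of the
    # associations over the union X ∪ Y. Values near 0 indicate no
    # measurable association (i.e., less bias).
    sx = [association(x, A, B) for x in X]
    sy = [association(y, A, B) for y in Y]
    return (np.mean(sx) - np.mean(sy)) / np.std(sx + sy, ddof=0)

# Toy embeddings standing in for target sets (e.g., female-/male-associated
# terms) and attribute sets (e.g., career/family words).
rng = np.random.default_rng(0)
X = [rng.normal(size=8) for _ in range(4)]
Y = [rng.normal(size=8) for _ in range(4)]
A = [rng.normal(size=8) for _ in range(4)]
B = [rng.normal(size=8) for _ in range(4)]
print(round(weat_effect_size(X, Y, A, B), 3))
```

A debiasing method such as the one described above succeeds when this effect size moves toward zero after the intervention, as in the reported drop from 0.754/0.743 to 0.002.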
