Poster in Workshop: Compositional Learning: Perspectives, Methods, and Paths Forward
Relational composition during attribute retrieval in GPT is not purely linear
Michael McCoy · Anna Leshinskaya
Keywords: compositionality, large language models, relational combination, systematicity
The longstanding question of how neural networks might implement relational composition has been buoyed by recent evidence of relational abstraction in transformer-based large language models (LLMs). We address recent findings of partial but imperfect generalization of linear composition during knowledge retrieval of attributive triplets such as has-color(banana, yellow) in GPT-J (Hernandez et al., 2024, Linearity of relation decoding in transformer language models, arXiv:2308.09124). We report that these limits on relational generalization are explained by two systematic factors. First, relational combinations that are retrieved more accurately generalize better than uncertain or inaccurate ones. Second, relational generalization scales with the semantic similarity of the entities being bound, showing that it depends non-linearly on component meanings rather than being invariant to them. This aligns with longstanding findings that human judgments of adjectival combinations are likewise non-linearly interactive.
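To make the setup concrete, below is a minimal sketch of the kind of analysis the abstract describes: fitting an affine (linear) relational operator on subject hidden states, measuring how faithfully it predicts held-out object states, and relating per-item faithfulness to each test subject's semantic similarity to the training subjects. This is not the authors' code; the arrays are synthetic stand-ins for hidden states that would in practice be extracted from GPT-J, and all dimensions and names are illustrative assumptions.

# Sketch (with synthetic stand-ins) of testing whether relation decoding
# is linear, in the spirit of Hernandez et al. (2024, arXiv:2308.09124).
import numpy as np

rng = np.random.default_rng(0)
d = 64                      # hidden-state dimensionality (placeholder; GPT-J uses 4096)
n_train, n_test = 40, 10

# Placeholder "subject" states S and "object" states O for one relation
# (e.g. has-color). A perfectly linear relation would satisfy O = S @ W + b.
S = rng.normal(size=(n_train + n_test, d))
W_true = rng.normal(size=(d, d)) / np.sqrt(d)
O = S @ W_true + 0.1 * rng.normal(size=(n_train + n_test, d))

S_tr, O_tr = S[:n_train], O[:n_train]
S_te, O_te = S[n_train:], O[n_train:]

# Fit an affine map (W, b) by least squares on the training pairs.
S_aug = np.hstack([S_tr, np.ones((n_train, 1))])
coef, *_ = np.linalg.lstsq(S_aug, O_tr, rcond=None)
W, b = coef[:-1], coef[-1]

# Faithfulness: does the fitted linear map predict held-out object states?
pred = S_te @ W + b
cos = np.sum(pred * O_te, axis=1) / (
    np.linalg.norm(pred, axis=1) * np.linalg.norm(O_te, axis=1))
print(f"mean held-out cosine similarity: {cos.mean():.3f}")

# Second factor from the abstract: relate each test item's faithfulness to
# its mean cosine similarity to the training subjects in the same space.
sim_to_train = (S_te @ S_tr.T) / (
    np.linalg.norm(S_te, axis=1, keepdims=True)
    * np.linalg.norm(S_tr, axis=1)[None, :])
r = np.corrcoef(sim_to_train.mean(axis=1), cos)[0, 1]
print(f"correlation(similarity to training set, faithfulness): {r:.3f}")

With real hidden states, a positive correlation in the final step would correspond to the abstract's second finding: generalization of the linear operator tracks the semantic similarity of the entities being bound rather than being invariant to them.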