Abstract:
Humans occasionally reason using logic and abstract categories, and yet most state-of-the-art neural models rely on continuous distributed representations. These representations are impressive in their learning capabilities but have proven difficult to interpret or to compare to biological representations. Continuous representations can nevertheless sometimes be interpreted symbolically: a distributed code may be constructed by composing abstract categories. We ask whether it is possible to detect and recover this structure, and we answer that it is, at least in part. The demixing problem is equivalent to factorizing the data into a continuous part $\mathbf{W}$ and a binary part $\mathbf{S}$, $\mathbf{X} = \mathbf{W}\mathbf{S}^T$. After establishing some general facts and intuitions, we present two algorithms that work on low-rank or full-rank data, assess their reliability on extensive simulated data, and use them to interpret neural word embeddings, where we expect some compositional structure. We hope this problem proves interesting and that our simple algorithms provide a promising direction for solving it.
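As a minimal sketch of the setup (our own illustration, not an algorithm from the paper; the variable names and sizes are assumptions), the following Python snippet synthesizes data of the form $\mathbf{X} = \mathbf{W}\mathbf{S}^T$ with continuous $\mathbf{W}$ and binary $\mathbf{S}$; the demixing task is then to recover both factors from $\mathbf{X}$ alone.

```python
import numpy as np

# Illustrative synthetic instance of the factorization X = W S^T
# (dimensions and names are assumptions, not the paper's notation).
rng = np.random.default_rng(seed=0)

n_features, n_samples, n_components = 50, 500, 8

# Continuous mixing matrix W: one continuous vector per abstract category.
W = rng.normal(size=(n_features, n_components))

# Binary code matrix S: each sample composes a subset of the categories.
S = rng.integers(0, 2, size=(n_samples, n_components))

# Observed data: each column of X is the sum of the category vectors
# switched on by the corresponding row of S.
X = W @ S.T  # shape (n_features, n_samples)

# The demixing problem: given only X, recover W (continuous) and
# S (binary), up to a permutation of the n_components columns.
```

Note that the recovered factors are at best identifiable up to a relabeling of the components, since permuting the columns of W and S together leaves X unchanged.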