Poster in Workshop: Time Series in the Age of Large Models
Efficient Time Series Processing for Transformers and State-Space Models through Token Merging
Leon Götz · Marcel Kollovieh · Stephan Günnemann · Leo Schwinn
Transformer architectures and state-space models have shown promising results in time series analysis. However, processing very long sequences imposes significant computational requirements. Token merging, which replaces multiple tokens with a single one computed as their linear combination, has been shown to considerably improve the throughput of vision transformer architectures while maintaining accuracy. In this work, we present the first investigation of token merging in time series analysis. We further introduce local merging, a domain-specific token merging algorithm that selectively combines tokens within a local neighborhood, achieving two major benefits: a) Local merging can adjust its computational complexity from quadratic to linear based on the neighborhood size, effectively scaling token merging to long sequences; b) Local merging is the first causal merging scheme, enabling token merging in transformer decoders. Our comprehensive empirical evaluation demonstrates that token merging offers substantial computational benefits with minimal impact on accuracy across various models and datasets. On the recently proposed Chronos foundation model, we achieve accelerations of up to 5400% with only minor accuracy degradation.
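The following is a minimal sketch of the local-merging idea described above, not the authors' implementation: it assumes tokens are merged by averaging the most similar adjacent pair within each fixed-size neighborhood, so similarity is only computed locally (linear in sequence length) and no token attends to information from later positions, which keeps the operation causal. The function and parameter names (local_merge, window_size) are illustrative.

```python
import torch
import torch.nn.functional as F


def local_merge(x: torch.Tensor, window_size: int) -> torch.Tensor:
    """Merge one token per local neighborhood of `window_size` consecutive tokens.

    x: tensor of shape (batch, seq_len, dim).
    Returns a shorter sequence where, in each full window, the most similar
    adjacent pair has been replaced by its average (a linear combination).
    """
    b, n, d = x.shape
    out = []
    for start in range(0, n, window_size):
        window = x[:, start:start + window_size]  # (b, k, d), local neighborhood
        if window.shape[1] < 2:
            out.append(window)  # nothing to merge in a trailing 1-token window
            continue
        # Cosine similarity of each token with its right neighbor: (b, k-1).
        sims = F.cosine_similarity(window[:, :-1], window[:, 1:], dim=-1)
        # Simplification: pick one merge position shared across the batch.
        idx = sims.mean(0).argmax().item()
        merged = 0.5 * (window[:, idx] + window[:, idx + 1])  # linear combination
        keep = [window[:, i] for i in range(window.shape[1]) if i not in (idx, idx + 1)]
        tokens = keep[:idx] + [merged] + keep[idx:]  # preserve temporal order
        out.append(torch.stack(tokens, dim=1))
    return torch.cat(out, dim=1)


# Example: one merge per window of 4 tokens shortens a length-512 sequence to 384.
x = torch.randn(2, 512, 64)
y = local_merge(x, window_size=4)
print(y.shape)  # torch.Size([2, 384, 64])
```

Because each window only looks at its own tokens, the cost of the similarity computation grows linearly with sequence length for a fixed neighborhood size, and shrinking or enlarging the window trades merging aggressiveness against fidelity.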