Poster
in
Workshop: Time Series in the Age of Large Models

Mixture of Experts for Time Series Foundation Models

Xu Liu · Juncheng Liu · Gerald Woo · Ibrahim Taha Aksu · Chenghao Liu · Silvio Savarese · Caiming Xiong · Doyen Sahoo


Abstract:

Time series foundation models, such as MOIRAI, have shown exceptional zero-shot forecasting capabilities. However, they enable cross-frequency learning by employing multiple linear projection layers, each specialized for handling time series at a specific frequency. This design has two major limitations: (1) Time series data are imbalanced across frequencies, leading to insufficient training of parameters for underrepresented frequencies and diminishing the effectiveness of cross-frequency learning. (2) Specialization at the frequency level is coarse-grained. For instance, time series with similar patterns but different frequencies can produce undesirable, distinct embeddings. Additionally, time series data of the same frequency can exhibit diverse patterns, and a single linear layer lacks the capacity to handle such complexity. To address these issues holistically, this paper proposes MOIRAI-MOE, which uses a single projection layer and delegates the modeling of diverse time series patterns to a mixture of experts (MoE) within the Transformer layers. By leveraging experts for token-level specialization, MOIRAI-MOE achieves superior unified learning capabilities and delivers significant improvements in both in-distribution and zero-shot evaluations.
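To make the token-level specialization idea concrete, the sketch below shows a generic sparsely gated mixture-of-experts feed-forward layer of the kind the abstract describes: a shared input projection handles all frequencies, and a router assigns each time series token to a small subset of expert FFNs inside the Transformer block. This is a minimal illustration, not the official MOIRAI-MOE implementation; the class name MoEFeedForward and hyperparameters (d_model, n_experts, top_k) are assumptions chosen for readability.

```python
# Minimal sketch of a token-level mixture-of-experts (MoE) feed-forward layer.
# Hypothetical names and sizes; not the MOIRAI-MOE reference code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoEFeedForward(nn.Module):
    def __init__(self, d_model=256, d_hidden=1024, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Router scores each token against every expert.
        self.router = nn.Linear(d_model, n_experts)
        # Each expert is an independent two-layer FFN.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_hidden),
                nn.GELU(),
                nn.Linear(d_hidden, d_model),
            )
            for _ in range(n_experts)
        )

    def forward(self, x):                      # x: (batch, seq_len, d_model)
        b, s, d = x.shape
        tokens = x.reshape(-1, d)              # flatten to (num_tokens, d_model)
        logits = self.router(tokens)           # (num_tokens, n_experts)
        weights, indices = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # renormalize over selected experts

        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            # (token, slot) pairs routed to expert e.
            mask = indices == e                # (num_tokens, top_k)
            if mask.any():
                token_ids, slot_ids = mask.nonzero(as_tuple=True)
                expert_out = expert(tokens[token_ids])
                out[token_ids] += weights[token_ids, slot_ids].unsqueeze(-1) * expert_out

        return out.reshape(b, s, d)


if __name__ == "__main__":
    # Drop-in replacement for the dense FFN inside a Transformer block:
    # each of the 64 time series tokens is routed to its own top-k experts.
    layer = MoEFeedForward()
    patches = torch.randn(4, 64, 256)          # (batch, tokens, d_model)
    print(layer(patches).shape)                # torch.Size([4, 64, 256])
```

Because routing happens per token rather than per frequency, two series with similar local patterns can share experts even if their sampling frequencies differ, which is the finer-grained specialization the abstract argues for.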