

Oral
in
Workshop: Time Series in the Age of Large Models

Scaling-laws for Large Time-series Models

JUSTIN ALSING · Thomas Edwards · Benjamin Wandelt · James Alvey · Nam Nguyen

[ Project Page ]
Sun 15 Dec 3:30 p.m. PST — 3:42 p.m. PST

Abstract:

Scaling laws for large language models (LLMs) have provided useful guidance in training ever larger models for predictable performance gains. Time series forecasting shares a sequential structure similar to that of language, and is amenable to large-scale transformer architectures. Here we show that foundational decoder-only time series transformer models exhibit scaling behavior analogous to LLMs, with architectural details (aspect ratio and number of heads) having a minimal effect over broad ranges. We assemble a large corpus of heterogeneous time series data on which to train, and establish for the first time power-law scaling with parameter count, dataset size, and training compute, spanning five orders of magnitude.
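The power-law scaling described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' method: the constants, model sizes, and losses below are made up for the example. A power law L(N) = a · N^(−α) is linear in log-log space, so its exponent can be recovered with a simple linear fit.

```python
import numpy as np

# Hypothetical illustration: fit a power law L(N) = a * N**(-alpha)
# to synthetic (parameter count, loss) pairs, in the spirit of
# LLM-style scaling-law fits. All numbers here are invented.
N = np.array([1e5, 1e6, 1e7, 1e8, 1e9])   # model sizes, five orders of magnitude
true_alpha, true_a = 0.08, 5.0
L = true_a * N ** (-true_alpha)           # noiseless synthetic losses

# log L = log a - alpha * log N, so fit a line in log-log space.
slope, intercept = np.polyfit(np.log(N), np.log(L), 1)
alpha_hat, a_hat = -slope, np.exp(intercept)
print(alpha_hat, a_hat)
```

With noiseless data the fit recovers the exponent exactly; with real training runs one would fit the same form to measured losses across model sizes, dataset sizes, or compute budgets.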
