Poster
Layer-Adaptive $\mathcal{H}_\infty$ State Pruning for Deep Diagonal State Space Model Compression
Minseon Gwak · Seongrok Moon · Joohwan Ko · PooGyeon Park
Wed 11 Dec 4:30 p.m. PST — 7:30 p.m. PST
Abstract:
Due to the lack of state dimension optimization methods, deep state space models (SSMs) have sacrificed model capacity, training search space, or stability to alleviate the computational costs caused by large state dimensions. In this work, we propose a deep SSM pruning method, Layer-Adaptive $\mathcal{H}_{\infty}$ STate pruning (LAST), which optimizes the state dimension of a deep diagonal SSM (DSSM) in terms of model-level energy loss. LAST evaluates the importance of states across the layers of a deep DSSM, extending modal truncation for a single DSSM. To minimize the model-level energy loss incurred by pruning states in a layer, the LAST score is computed from the $\mathcal{H}_{\infty}$ norms of the subsystems associated with each state, followed by layer-wise energy normalization. The LAST score serves as a global pruning criterion, enabling cross-layer comparison and layer-adaptive pruning. Across various sequence benchmarks, LAST optimizes previous deep DSSMs, revealing the redundancy and compressibility of their state spaces. Notably, we demonstrate that pruning $33\%$ of states on average incurs only a $0.52\%$ accuracy loss in multi-input multi-output DSSMs without retraining. Code is available at https://anonymous.4open.science/r/LAST-3D52.
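To make the scoring idea concrete, the sketch below computes per-state $\mathcal{H}_{\infty}$ norms for a continuous-time diagonal SSM (for a diagonal state matrix, the transfer function splits into rank-one modal subsystems whose norms have a closed form), normalizes them within each layer, and prunes against a single cross-layer threshold. This is a minimal illustration under assumed conventions: the function names, the sum-based layer normalization, and the continuous-time parameterization are illustrative choices, not the authors' implementation (see the linked code for that).

```python
import numpy as np

def subsystem_hinf_norms(A_diag, B, C):
    """Per-state H-infinity norms for a continuous-time diagonal SSM.

    With A = diag(a_1, ..., a_N), the transfer function decomposes into
    rank-one modal subsystems G_n(s) = C[:, n] B[n, :] / (s - a_n), whose
    H-infinity norm is ||C[:, n]|| * ||B[n, :]|| / |Re(a_n)|.
    Shapes: A_diag (N,), B (N, n_in), C (n_out, N); assumes Re(a_n) < 0.
    """
    decay = np.abs(A_diag.real)           # distance of each stable pole to the imaginary axis
    c_norm = np.linalg.norm(C, axis=0)    # output gain per state
    b_norm = np.linalg.norm(B, axis=1)    # input gain per state
    return c_norm * b_norm / decay

def last_scores(layers):
    """Layer-normalized importance scores (normalization choice is illustrative)."""
    scores = []
    for A_diag, B, C in layers:
        h = subsystem_hinf_norms(A_diag, B, C)
        scores.append(h / h.sum())        # divide by the layer's total modal energy
    return scores

def global_prune_masks(scores, prune_ratio=0.33):
    """Keep states whose score exceeds one threshold shared by all layers."""
    thresh = np.quantile(np.concatenate(scores), prune_ratio)
    return [s > thresh for s in scores]

# Toy usage on three random stable layers.
rng = np.random.default_rng(0)
layers = []
for _ in range(3):
    n = 16
    A = -rng.uniform(0.1, 1.0, n) + 1j * rng.normal(size=n)  # stable diagonal poles
    layers.append((A, rng.normal(size=(n, 4)), rng.normal(size=(4, n))))
masks = global_prune_masks(last_scores(layers))
print([int(m.sum()) for m in masks])      # states kept per layer (layer-adaptive)
```

Because the threshold is shared across layers, layers whose states carry little of the model-level energy lose more states than layers with concentrated energy, which is the layer-adaptive behavior the abstract describes.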