Poster in Workshop: Workshop on Machine Learning and Compression
Information-theoretic Generalization Analysis for Vector-Quantized VAEs
Futoshi Futami · Masahiro Fujisawa
Encoder-decoder models, which transform input data into latent variables, have achieved significant success in machine learning. While the generalization ability of these models has been theoretically analyzed in supervised learning, with a focus on the complexity of the latent variables, the role of latent variables in generalization and data generation performance is less explored theoretically in unsupervised learning. To address this gap, our study leverages information-theoretic generalization error analysis (IT analysis). Using the supersample setting from recent IT analysis, we demonstrate that the generalization gap for the reconstruction loss can be bounded by a mutual information term involving the posterior distribution of the latent variables conditioned on the input data, without relying on the decoder's information. We also introduce a novel permutation-symmetric supersample setting, which extends the existing IT analysis and shows that regularizing the encoder's capacity leads to generalization. Finally, we bound the 2-Wasserstein distance between the data distribution and the distribution of the generated data, offering insights into the model's data generation capabilities.
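For background on the supersample setting referenced above, a minimal sketch of the standard conditional-mutual-information (CMI) bound of Steinke and Zakynthinou, on which recent IT analyses build. The symbols here (supersample $\tilde{Z}$, selector $U$, hypothesis $W$, loss $\ell$ bounded in $[0,1]$) are illustrative assumptions; this is the generic template only, not the paper's result, which instead evaluates the gap through information about the encoder's posterior over latent variables.

\[
\left|\,\mathbb{E}\!\left[\frac{1}{n}\sum_{i=1}^{n}\Bigl(\ell\bigl(W,\tilde{Z}_{i,\bar{U}_i}\bigr)-\ell\bigl(W,\tilde{Z}_{i,U_i}\bigr)\Bigr)\right]\right|
\;\le\;
\sqrt{\frac{2\,I\bigl(W;\,U \mid \tilde{Z}\bigr)}{n}},
\]

where $\tilde{Z}\in\mathcal{Z}^{n\times 2}$ is a supersample of $2n$ i.i.d. data points arranged in $n$ pairs, $U\in\{0,1\}^{n}$ selects which element of each pair forms the training set (the other forms the test set), and the left-hand side is the expected generalization gap.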