Poster
in
Workshop: Workshop on Advancing Neural Network Training (WANT): Computational Efficiency, Scalability, and Resource Optimization
Patch Gradient Descent: Training Neural Networks on Very Large Images
Deepak Gupta · Gowreesh Mago · Arnav Chavan · Dilip K. Prasad · Rajat Thomas
Current deep learning models struggle with very large images, largely due to prohibitive compute and memory demands. Patch Gradient Descent (PatchGD) is a learning strategy that makes it feasible to train deep learning models end-to-end on such large-scale images. It builds on the standard feedforward-backpropagation scheme, but instead of processing the entire image in a single step, it constructs and updates a deep latent representation of the image using only small portions (patches) of it at a time, so that most of the image is covered over the course of several iterations. This yields substantial savings in memory and computation. Evaluated on the high-resolution PANDA and UltraMNIST datasets with ResNet50 and MobileNetV2 backbones, PatchGD is markedly more stable and efficient than standard gradient descent, especially under tight memory budgets, making it a promising approach for handling very large images with limited resources.
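To make the patch-wise idea concrete, the following is a minimal PyTorch-style sketch of one possible training step in the spirit of the abstract: a latent block Z caches patch-level features for the whole image, and each inner step refreshes only a sampled subset of patches before the classification head scores the full Z. This is not the authors' implementation; the encoder, head, grid size, patch count, and the choice to step the optimizer at every inner iteration are all illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative dimensions (assumptions, not taken from the paper).
PATCH = 128             # patch side length in pixels
GRID = 8                # image is split into a GRID x GRID patch grid
LATENT = 256            # channels of the latent block Z
NUM_CLASSES = 6
PATCHES_PER_STEP = 16   # subset of patches processed per inner step


class PatchEncoder(nn.Module):
    """Encodes one image patch into a single latent vector."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, LATENT),
        )

    def forward(self, x):
        return self.net(x)


class Head(nn.Module):
    """Classifies the full latent block Z of shape (LATENT, GRID, GRID)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(LATENT, 128, 1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, NUM_CLASSES),
        )

    def forward(self, z):
        return self.net(z)


def patchgd_like_step(encoder, head, optimizer, image, label, inner_steps=4):
    """One patch-wise update on a single large image (batch size 1 for brevity).

    Z caches features for every patch of the image; each inner step re-encodes
    only a random subset of patches, so peak memory scales with that subset
    rather than with the full image.
    """
    z = torch.zeros(1, LATENT, GRID, GRID)            # latent block, kept detached
    coords = [(i, j) for i in range(GRID) for j in range(GRID)]

    for _ in range(inner_steps):
        optimizer.zero_grad()
        idx = torch.randperm(len(coords))[:PATCHES_PER_STEP]
        z_step = z.clone()                            # start from cached features
        for k in idx.tolist():
            i, j = coords[k]
            patch = image[:, :, i * PATCH:(i + 1) * PATCH, j * PATCH:(j + 1) * PATCH]
            z_step[:, :, i, j] = encoder(patch)       # only these entries carry gradients
        logits = head(z_step)
        loss = F.cross_entropy(logits, label)
        loss.backward()                               # gradients flow only through sampled patches
        optimizer.step()
        z = z_step.detach()                           # cache refreshed features for the next step
    return loss.item()


# Usage sketch with a synthetic 1024x1024 image and a dummy label.
encoder, head = PatchEncoder(), Head()
optimizer = torch.optim.SGD(list(encoder.parameters()) + list(head.parameters()), lr=1e-3)
image = torch.randn(1, 3, GRID * PATCH, GRID * PATCH)
label = torch.tensor([2])
print(patchgd_like_step(encoder, head, optimizer, image, label))
```

In this sketch only PATCHES_PER_STEP patches are ever held in the computation graph at once, which is what gives the memory advantage; the paper's actual schedule for when model weights are updated relative to the inner patch iterations may differ from the per-step update shown here.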