Optimal Data Augmentation Strategy Search

A key challenge of object detection in practice is that only a limited number of images is available for training. Data augmentation techniques such as cropping, translation, and horizontal flipping are commonly used in deep learning for computer vision tasks, and data augmentation also acts as a regularizer to combat overfitting. However, less attention has been paid to discovering data augmentation policies for continually evolving data. We propose an efficient data augmentation procedure that leverages an exploit-and-explore strategy to improve object detection accuracy and reduce runtime compared to competitive approaches.

Update

September 2, 2021

Overview

Popular models for object detection include SSD, RetinaNet, Faster R-CNN, and Mask R-CNN. However, all of them require a large amount of training data. Because training data continually evolves, shifting across geolocations and over time, we use data augmentation techniques to make the models more robust to unseen data.

A common data augmentation approach is random perturbation: for example, randomly rotating or cropping the image, randomly adjusting color, hue, and saturation, or randomly distorting the image. More recently, however, studies have shown that instead of randomly picking from these augmentation strategies, it is better to search this huge space for stronger policies. We propose an efficient data augmentation procedure that leverages an exploit-and-explore strategy to improve model accuracy and reduce runtime compared to competitive approaches.
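
To make the search space concrete, here is a minimal sketch of a parameterized augmentation policy in Python. The operation names, parameter ranges, and the build_policy helper are illustrative assumptions rather than the actual implementation; torchvision transforms are used only as a convenient example.

```python
# Minimal sketch: a policy is a point in a continuous hyperparameter
# space rather than a random grab-bag of operations. All names and
# ranges here are hypothetical.
import random
from torchvision import transforms

def build_policy(params):
    """params: dict like {"rotate_deg": 15.0, "jitter": 0.4, "crop_scale": 0.8}."""
    return transforms.Compose([
        transforms.RandomRotation(degrees=params["rotate_deg"]),
        transforms.ColorJitter(
            brightness=params["jitter"],
            saturation=params["jitter"],
            hue=min(params["jitter"], 0.5),  # hue must stay in [0, 0.5]
        ),
        # Clamp so the scale range stays valid after later perturbations.
        transforms.RandomResizedCrop(32, scale=(min(params["crop_scale"], 1.0), 1.0)),
        transforms.RandomHorizontalFlip(),
    ])

# The random-perturbation baseline samples each hyperparameter once:
random_params = {
    "rotate_deg": random.uniform(0.0, 30.0),
    "jitter": random.uniform(0.0, 0.5),
    "crop_scale": random.uniform(0.5, 1.0),
}
baseline_policy = build_policy(random_params)
```

Because every knob is a continuous value, nearby policies behave similarly, which is the kind of smooth parameterization the search procedure described next relies on.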

The new method consists of two steps. In Step 1, a fixed-size population of image augmentation policies is randomly initialized and trained in parallel. In Step 2, after training for a certain amount of time, we record the few top-performing policies (i.e., exploitation), then perturb the hyperparameters of the recorded policies to search the surrounding hyperparameter space (i.e., exploration). We repeat these exploit-and-explore steps as needed. The challenge in our problem is defining a smooth parameterization of the augmentation policy so that the algorithm can incrementally adapt augmentations to improve performance. We show that the proposed method can match the performance of state-of-the-art methods on CIFAR-10 and CIFAR-100 with at least two orders of magnitude less overall compute.
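
The loop below is a minimal sketch of these two steps under stated assumptions: train_for_epochs and evaluate are hypothetical placeholders for the actual training and validation code, and build_policy is the illustrative helper from the sketch above.

```python
# Minimal sketch of the exploit-and-explore loop, in the spirit of
# population-based training. train_for_epochs and evaluate are
# hypothetical placeholders, not part of any real codebase.
import copy
import random

def exploit_and_explore(population, num_rounds, top_k=4, perturb=0.2):
    """population: list of dicts, each holding a 'model' and its 'params'."""
    for _ in range(num_rounds):
        # Step 1: train every member for a fixed budget, then score it.
        for member in population:
            train_for_epochs(member["model"], build_policy(member["params"]))
            member["score"] = evaluate(member["model"])

        # Step 2a, exploit: copy the top performers over the rest.
        population.sort(key=lambda m: m["score"], reverse=True)
        for loser in population[top_k:]:
            winner = random.choice(population[:top_k])
            loser["model"] = copy.deepcopy(winner["model"])
            loser["params"] = dict(winner["params"])

            # Step 2b, explore: nudge each hyperparameter, relying on the
            # smooth parameterization so nearby policies behave similarly.
            for key in loser["params"]:
                loser["params"][key] *= random.uniform(1 - perturb, 1 + perturb)
    return population[0]  # best member after the final round
```

Copying both weights and hyperparameters means poorly performing members resume from a strong checkpoint, so the augmentation schedule can change over the course of training rather than being fixed up front.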

Links

Contact email: xwdai@berkeley.edu