Meta-Reinforced Synthetic Data for One-Shot Fine-Grained Visual Recognition

@inproceedings{metasatoshi19,
    author = {Satoshi Tsutsui and Yanwei Fu and David Crandall},
    booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
    title = {{Meta-Reinforced Synthetic Data for One-Shot Fine-Grained Visual Recognition}},
    year = {2019}
}

We introduce an effective way to employ an ImageNet-pre-trained image generator to improve fine-grained one-shot classification when data is scarce. As a way to fine-tune the pre-trained generator, our pilot study finds that adjusting only the scale and shift parameters in batch normalization can produce visually realistic images. This approach works with a single image, making the method less dependent on the number of available images. Furthermore, although naively adding the generated images to the training set does not improve performance, we show that performance does improve if we properly mix the generated images with the original image. To learn the parameters of this mixing, we adapt a meta-learning framework. We implement this idea and demonstrate a consistent and significant improvement over several classifiers on two fine-grained benchmark datasets.
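To make the idea of adjusting only scale and shift parameters concrete, here is a toy sketch in plain Python. It is not the authors' code and does not use BigGAN: the "generator" is a hypothetical frozen elementwise weighting followed by a normalization step, and the names `generate`, `fine_tune_scale_shift`, `gamma`, and `beta` are assumptions made for illustration. Only `gamma` (scale) and `beta` (shift) are updated to fit a single target; the frozen weights are never touched.

```python
def normalize(xs, eps=1e-5):
    """Standardize a vector to zero mean and (approximately) unit variance."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [(x - mean) / (var + eps) ** 0.5 for x in xs]


def hidden_features(z, frozen_weights):
    """Frozen 'pre-trained' part of the toy generator (never updated)."""
    return normalize([w * v for w, v in zip(frozen_weights, z)])


def generate(z, frozen_weights, gamma, beta):
    """Apply the trainable scale (gamma) and shift (beta) after normalization."""
    return [gamma * h + beta for h in hidden_features(z, frozen_weights)]


def fine_tune_scale_shift(z, frozen_weights, target, steps=200, lr=0.1):
    """Fit gamma and beta to one target by gradient descent on squared error."""
    gamma, beta = 1.0, 0.0
    h = hidden_features(z, frozen_weights)
    n = len(target)
    for _ in range(steps):
        out = [gamma * hi + beta for hi in h]
        # Gradients of the mean squared error with respect to gamma and beta.
        grad_gamma = sum(2 * (o - t) * hi for o, t, hi in zip(out, target, h)) / n
        grad_beta = sum(2 * (o - t) for o, t in zip(out, target)) / n
        gamma -= lr * grad_gamma
        beta -= lr * grad_beta
    return gamma, beta
```

In a real generator the same principle would apply per channel inside each batch normalization layer, with every other weight kept fixed; this toy version only shows that two parameters can be enough to move the output toward a single example.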

Framework

Our Meta Image Reinforcing Network (MetaIRNet) has two modules: an image fusion network and a one-shot classification network. The image fusion network reinforces generated images to make them beneficial for the one-shot classifier, while the one-shot classifier learns representations suited to classifying unseen examples from only a few labeled examples. Both networks are trained end-to-end, so the loss back-propagates from the classifier to the fusion network.
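As a rough illustration of what mixing a generated image with the original could look like, the sketch below blends two grayscale images block by block. The block-wise convex combination, the function name `fuse_images`, and the grid of `weights` are assumptions for illustration; the text above does not specify the mixing rule. In MetaIRNet the mixing parameters are learned through the classifier loss, whereas here they are simply passed in by hand.

```python
def fuse_images(original, generated, weights):
    """Blend two equally sized grayscale images block by block.

    `weights` is a small grid of values in [0, 1]. Each cell controls one
    rectangular block: 1.0 keeps the original pixel, 0.0 keeps the generated one.
    """
    height, width = len(original), len(original[0])
    rows, cols = len(weights), len(weights[0])
    fused = []
    for y in range(height):
        block_row = y * rows // height  # which weight row covers this pixel row
        fused_row = []
        for x in range(width):
            block_col = x * cols // width  # which weight column covers this pixel
            w = weights[block_row][block_col]
            fused_row.append(w * original[y][x] + (1.0 - w) * generated[y][x])
        fused.append(fused_row)
    return fused


# Usage: a 4x4 original of ones, a 4x4 generated image of zeros, 2x2 weights.
original = [[1.0] * 4 for _ in range(4)]
generated = [[0.0] * 4 for _ in range(4)]
weights = [[1.0, 0.0], [0.5, 0.25]]
fused = fuse_images(original, generated, weights)
```

Because every fused pixel is a differentiable function of the weights, a network that predicts them can in principle receive gradients from the classifier, which is the property that end-to-end training relies on.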

Fine-tune BigGAN with a single image

Reinforced Images by MetaIRNet

The IU Computer Vision Lab's projects and activities have been funded, in part, by grants and contracts from the Air Force Office of Scientific Research (AFOSR), the Defense Threat Reduction Agency (DTRA), Dzyne Technologies, EgoVid, Inc., ETRI, Facebook, Google, Grant Thornton LLP, IARPA, the Indiana Innovation Institute (IN3), the IU Data to Insight Center, the IU Office of the Vice Provost for Research through an Emerging Areas of Research grant, the IU Social Sciences Research Commons, the Lilly Endowment, NASA, National Science Foundation (IIS-1253549, CNS-1834899, CNS-1408730, BCS-1842817, CNS-1744748, IIS-1257141, IIS-1852294), NVidia, ObjectVideo, Office of Naval Research (ONR), Pixm, Inc., and the U.S. Navy. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the U.S. Government, or any sponsor.