Neural Rendering Model: Joint Generation and Prediction for Semi-Supervised Learning
Unsupervised and semi-supervised learning are important problems, yet they remain challenging for complex data such as natural images. Progress on these problems would accelerate if we had access to appropriate generative models under which to pose the associated inference tasks.
Given the success of Convolutional Neural Networks (CNNs) for prediction on images, we design a new class of probabilistic generative models, the Neural Rendering Models (NRMs), whose inference corresponds to any given CNN architecture. An NRM uses the given CNN to design the prior distribution in the probabilistic model. We show that this leads to efficient semi-supervised learning, which uses less labeled data while maintaining good prediction performance. An NRM generates images from coarse to fine scales. It introduces a small set of latent variables at each level and enforces dependencies among all the latent variables via a conjugate prior distribution. This conjugate prior yields a new regularizer for training CNNs, based on the paths rendered in the generative model: the Rendering Path Normalization (RPN). We demonstrate that this regularizer improves generalization, both in theory and in practice. Furthermore, likelihood estimation in the NRM yields training losses for CNNs; inspired by this, we design a new loss, the Max-Min cross-entropy, which outperforms the traditional cross-entropy loss for object classification. The Max-Min cross-entropy suggests a new deep network architecture, the Max-Min network, to realize this loss. Numerical experiments demonstrate that the NRM with RPN and Max-Min cross-entropy exceeds or matches the state of the art on benchmarks including SVHN, CIFAR10, and CIFAR100 for semi-supervised and supervised learning tasks.
N. Ho, T. Nguyen (co-first author), A. B. Patel, A. Anandkumar, M. I. Jordan, R. G. Baraniuk. Neural Rendering Model: Joint Generation and Prediction for Semi-Supervised Learning. Submitted, 2018.
N. Ho, T. Nguyen (co-first author), A. B. Patel, A. Anandkumar, M. I. Jordan, R. G. Baraniuk. The Latent-Dependent Deep Rendering Model. Workshop on Theoretical Foundations and Applications of Deep Generative Models at ICML, 2018.
Learning Image Classifiers from (Limited) Real and (Abundant) Synthetic Data
While deep learning’s biggest successes in computer vision rely on massive datasets of labeled images, acquiring and annotating such voluminous data is often costly or infeasible in practice. One promising solution is to train models on synthetic data, for which the true labels are known, and then deploy these models in real-world scenarios. Unfortunately, supervised learning techniques perform poorly when the training and test distributions diverge: the subtle differences between real and synthetic data significantly degrade performance. To learn models without real-world labels, we propose a two-part solution: (i) we employ a synthetic renderer capable of generating large amounts of realistically varying synthetic images; and (ii) we propose a domain adaptation strategy to bridge the gap between synthetic and real images. By mixing synthetic and real data in each minibatch during training, we improve test accuracy on object classification tasks. Finally, we propose the Mixed-Reality Generative Adversarial Network (MrGAN), which maps between synthetic and real data via a multi-stage, iterative process. The result of the optimization is a shared space into which both real and synthetic images can be mapped. After training in the shared space, our models generalize better from synthetic to real data. We validate the advantages of using synthetic data and MrGANs on our CIFAR-based datasets for domain adaptation. Using both synthetic data and MrGANs, we achieve an improvement of 8.85% in test accuracy.
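The per-minibatch mixing of real and synthetic examples described above can be sketched as follows. This is a generic illustration, not the paper's exact recipe: the `real_frac` ratio, the function name, and uniform sampling are assumptions for the sketch.

```python
import numpy as np

def mixed_minibatch(real_x, real_y, syn_x, syn_y, batch_size=64,
                    real_frac=0.25, rng=None):
    """Draw one training minibatch that mixes (scarce) real examples
    with (abundant) synthetic ones. real_frac is a hypothetical
    hyperparameter controlling the share of real data per batch."""
    rng = rng if rng is not None else np.random.default_rng(0)
    n_real = int(batch_size * real_frac)
    n_syn = batch_size - n_real
    ri = rng.choice(len(real_x), size=n_real, replace=False)
    si = rng.choice(len(syn_x), size=n_syn, replace=False)
    x = np.concatenate([real_x[ri], syn_x[si]])
    y = np.concatenate([real_y[ri], syn_y[si]])
    # Shuffle so real and synthetic examples are interleaved in the batch.
    perm = rng.permutation(batch_size)
    return x[perm], y[perm]
```

In practice the resulting batches feed a standard supervised training loop; only the sampling changes, which makes the idea easy to bolt onto an existing classifier.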
T. Nguyen, H. Chen, Z. C. Lipton, L. Dirac, S. Soatto, A. Anandkumar. Learning Image Classifiers from (Limited) Real and (Abundant) Synthetic Data. Submitted, 2018.
A Probabilistic Framework for Deep Learning
A grand challenge in machine learning is the development of computational algorithms that match or outperform humans in perceptual inference tasks complicated by nuisance variation. For instance, visual object recognition must contend with unknown object position, orientation, and scale, while speech recognition must contend with unknown voice pronunciation, pitch, and speed. Recently, a new breed of deep learning algorithms has emerged for high-nuisance inference tasks that routinely yield pattern recognition systems with near- or super-human capabilities. But a fundamental question remains: why do they work? Intuitions abound, but a coherent framework for understanding, analyzing, and synthesizing deep learning architectures has remained elusive. We answer this question by developing a new probabilistic framework for deep learning based on the Deep Rendering Model: a generative probabilistic model that explicitly captures latent nuisance variation. By relaxing the generative model to a discriminative one, we can recover state-of-the-art deep convolutional neural networks, providing insights into their successes and shortcomings, as well as a principled route to their improvement.
A. B. Patel, T. Nguyen, and R. G. Baraniuk. A Probabilistic Framework for Deep Learning. NIPS, 2016.
Semi-supervised Learning with the Deep Rendering Mixture Model
Semi-supervised learning algorithms reduce the high cost of acquiring labeled training data by using both labeled and unlabeled data during learning. Deep Convolutional Networks (DCNs), which have achieved great success in supervised tasks, have recently made progress in semi-supervised learning. However, since a probabilistic generative model underlying DCNs is missing, there is no principled way to enable DCNs to learn from unlabeled data.
In this paper, we develop a new semi-supervised learning algorithm based on the recently developed Deep Rendering Mixture Model (DRMM), a probabilistic generative model whose inference algorithm corresponds to the computations in a DCN.
We derive the Expectation Maximization algorithm for the DRMM and use it to learn from both labeled and unlabeled data. We employ variational inference and a novel non-negativity constraint inspired by the DRMM theory to dramatically improve performance. Our DRMM-based semi-supervised learning algorithm achieves state-of-the-art performance on the MNIST and SVHN datasets and competitive results on CIFAR10 amongst all methods that do not use data augmentation.
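The Expectation Maximization loop referenced above can be illustrated on a much simpler model. The sketch below runs EM on a one-dimensional Gaussian mixture, not the DRMM itself, to show the alternation the abstract alludes to: an E-step computing posterior responsibilities and an M-step re-estimating parameters from them.

```python
import numpy as np

def em_gmm_1d(x, k=2, iters=100):
    """EM for a 1-D Gaussian mixture (illustrative stand-in for the
    far richer DRMM E/M steps in the paper)."""
    # Deterministic initialization: spread means across the data range.
    mu = np.linspace(x.min(), x.max(), k)
    var = np.full(k, x.var() + 1e-6)
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibility of each component for each data point.
        log_p = (-0.5 * (x[:, None] - mu) ** 2 / var
                 - 0.5 * np.log(2 * np.pi * var) + np.log(pi))
        log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
        r = np.exp(log_p)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from the responsibilities.
        nk = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
        pi = nk / len(x)
    return mu, var, pi
```

In the semi-supervised setting, labeled points would pin their responsibilities to the known class while unlabeled points use the soft E-step posteriors, which is the sense in which both kinds of data contribute to the same M-step.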
T. Nguyen, W. Liu, E. Perez, R. G. Baraniuk, and A. B. Patel. Semi-supervised Learning with the Deep Rendering Mixture Model. Submitted, 2017.
Towards a Cortically Inspired Deep Learning Model: Semi-Supervised Learning, Divisive Normalization, and Synaptic Pruning
Deep learning has driven dramatic advances in performance on a wide range of difficult machine perception tasks, and its applications abound. Yet for many natural tasks it still lags far behind the mammalian brain in terms of performance and efficiency. Building a brain-inspired learning system to narrow the gap between artificial and biological neural networks has been a long-sought goal in both the neuroscience and machine learning communities. To take a step towards a neurally plausible learning system, we build a class of models that use functional elements and computational principles of the cortex for more robust and versatile machine learning. In particular, we incorporate three major neural features into Deep Convolutional Networks (DCNs): semi-supervised learning, divisive normalization, and synaptic pruning. These neural features are derived from a recently developed generative model underlying DCNs, the Deep Rendering Mixture Model (DRMM). Our semi-supervised learning algorithm achieves state-of-the-art performance on the MNIST and SVHN datasets and competitive results on CIFAR10 among all methods that do not use data augmentation. Our divisive normalization enables faster and more stable training. Using our synaptic pruning method, we can compress the model significantly with little loss in accuracy.
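Two of the cortical operations named above have standard textbook forms. The sketch below shows generic divisive normalization and magnitude-based synaptic pruning on plain NumPy tensors; these are common formulations assumed for illustration, not the DRMM-derived versions used in the paper.

```python
import numpy as np

def divisive_normalization(x, sigma=1.0, p=2.0):
    """Divide each unit's response by the pooled activity of its pool
    (here: all units along the last axis), a canonical cortical computation.
    sigma is a semi-saturation constant; p controls the pooling norm."""
    pooled = (np.abs(x) ** p).sum(axis=-1, keepdims=True)
    return x / (sigma ** p + pooled) ** (1.0 / p)

def magnitude_prune(w, sparsity=0.5):
    """Synaptic pruning sketch: zero out the smallest-magnitude fraction
    of weights, keeping the strongest connections."""
    thresh = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) >= thresh, w, 0.0)
```

Because the normalization denominator is at least sigma, responses are bounded and gradients stay well-scaled, which is consistent with the faster, more stable training the abstract reports; pruning trades a controlled fraction of weights for model compression.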
T. Nguyen, W. Liu, F. Sinz, R. G. Baraniuk, A. A. Tolias, X. Pitkow, A. B. Patel. Towards a Cortically Inspired Deep Learning Model: Semi-Supervised Learning, Divisive Normalization, and Synaptic Pruning. Conference on Cognitive Computational Neuroscience (CCN), 2017.